IntegratedDesktopSearch

Revision 12 as of 2006-06-13 20:34:55

Launchpad spec: [https://launchpad.net/distros/ubuntu/+spec/integrated-desktop-search]

A powerful search mechanism deeply integrated with the desktop.

Possible points of integration:

  • Nautilus search (Konqueror?)
  • Filechooser
  • Panel applet
  • Web browser (index page content)
  • General search interface capable of searching all available contents
  • Tagging of anything, files, music, images, emails, news
  • Others?

Resources

[http://beaglewiki.org Beagle] is the obvious candidate for the indexing engine, but there is also the dark horse [http://freedesktop.org/wiki/Software/Tracker Tracker], currently developed by [http://jamiemcc.livejournal.com Jamie McCracken]. Beagle is the more mature of the two, but Tracker has some advantages over Beagle. A few notable points:

  • Beagle is developed by Novell, which means it will have plenty of corporate support
  • Beagle is written in C#, which makes for a rather large dependency (storage-wise)
  • Tracker is written in C and communicates with other apps over a D-Bus interface
  • Tracker is extremely lightweight, and might even be suitable for embedded systems
  • Tracker uses mature technologies such as MySQL and libextractor instead of in-development technologies
  • Tracker is able to store file metadata on demand
  • Tracker has an integrated keyword/tagging functionality for all first class objects

Comments

  • KillerKiwi : I think this is all covered by Beagle / Nautilus Search and Deskbar in Dapper

    • MikkelKamstrupErlandsen : Nautilus is not linked against Beagle, so it does not utilize its searching capabilities. Also, Beagle is not integrated into the filechooser (as I think Novell is doing), or into the web browser for that matter. The keyword is integration.

  • MikkelKamstrupErlandsen : The recent developments on Tracker show that it is maturing really fast. I have been using 0.0.4 on several boxes for a good while, and I'd say that it matches (or outperforms) Beagle on stability, memory and speed. The only setback is the number of "backends", where Tracker still misses things such as news, emails and Tomboy notes - they will arrive eventually though. A very energetic move would be for Ubuntu/Canonical to sponsor Jamie to work on Tracker. This would position Ubuntu/Canonical as bleeding-edge technology contributors to the free desktop alongside Novell and Red Hat.

  • FryerFox : Does Tracker allow you to create more complex searches such as (type: pdf) and (dir: /reference/rfc) and (tcp near ip)? Beagle is an excellent application, but a major problem I have with Desktop Search is the lack of complex searching facilities. This makes it quite unsuitable for a file search which you might want to be restricted to a single directory or only deal with the file name and not the contents, etc.

    • MikkelKamstrupErlandsen : Tracker allows for RDF queries, so - yes, it allows for far more flexible searches than Beagle. You can for example search for all JPEGs with sizes between 800x600 and 1024x720 that were created before 2004.

      • FryerFox : Well, that for me seems to be the most important criterion for a long-term searching solution. If the search solution is not extremely versatile, then it will (obviously) have limited application and can't be used as the founding infrastructure for an integrated system searching facility. I think we should keep in mind possible uses of a search engine, rather than common current uses. For example, in GConf, return me a key that matches var[\_]*[\d]+ and is of boolean type, or in Epiphany, find me the web pages I visited in the last week that contained a reference to a PDF and the words 'superposition principle' but not 'blind decomposition'.

        • MikkelKamstrupErlandsen : Well, Tracker can do all of that. Tracker is not a search engine as such. You should think of it as metadata storage and indexing. You can get, set, and search arbitrary metadata on any first class object (emails, documents, conversations etc.).

  • JackWasey : The problem is really that there is no consistent rich metadata. Each application and desktop environment has its own method, usually involving vast numbers of .hidden folders. Using something like user_xattr or ["Reiser4"] extensively would vastly improve search and desktop usability. Imagine being able to tag any file or folder like [http://del.icio.us]... A good desktop system would automatically tag things it knew about, e.g. mp3 -> music, but you could add your own. This would beat the useless forced categorisation of mp3 genre and id3 tags; and of course there are powerful use cases in other areas, especially multimedia.

    • MikkelKamstrupErlandsen : Extended attributes and Reiser4 are not a really portable way of storing metadata - it's kinda Linux only. IMHO it is also better to use user space tools for such tasks than kernel space. Also, Tracker is designed to do exactly this - and frankly, I'd trust Tracker over Reiser4 when it comes to stability. Why would there need to be a music tag on mp3 files when you can search by mime-types? Or simply use the built-in Music service of Tracker? I know that there is a Rhythmbox patch for Tracker tagging in the melting pot, and it is a simple matter for other apps to follow suit...
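
To make the compound search discussed above more concrete, here is a minimal Python sketch of a query of the form (type: pdf) and (dir: /reference/rfc) and (tcp near ip), run over a small in-memory list of records. It is only an illustration of the idea: the Record class and the near, in_dir and search helpers are hypothetical names invented for this example, and none of it reflects the actual Tracker or Beagle API or query syntax. The sample records are made up as well.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class Record:
    """Hypothetical metadata record; not Tracker's or Beagle's data model."""
    path: str
    mime: str
    text: str = ""


def near(text, a, b, window=5):
    """True if words a and b occur within `window` words of each other."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def in_dir(path, directory):
    """True if path lies inside directory, at any depth."""
    return PurePosixPath(directory) in PurePosixPath(path).parents


def search(records, mime=None, directory=None, near_words=None):
    """Return the records matching every criterion that was given."""
    hits = []
    for r in records:
        if mime and r.mime != mime:
            continue
        if directory and not in_dir(r.path, directory):
            continue
        if near_words and not near(r.text, *near_words):
            continue
        hits.append(r)
    return hits


# Made-up sample data, for illustration only.
records = [
    Record("/reference/rfc/rfc793.pdf", "application/pdf", "TCP runs on top of IP"),
    Record("/reference/rfc/rfc2616.pdf", "application/pdf", "HTTP is an application protocol"),
    Record("/home/user/notes.txt", "text/plain", "tcp and ip notes"),
]

results = search(records, mime="application/pdf",
                 directory="/reference/rfc", near_words=("tcp", "ip"))
print([r.path for r in results])
```

Each criterion is optional, so the same function also covers the simpler cases mentioned in the comments, such as restricting a search to a single directory without looking at file contents. A real indexer would answer such queries from its index rather than by scanning every record.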