Coming in InterMine 0.94 - new search
One of the new features in the upcoming InterMine 0.94 release is a new keyword search. This provides a fast search across all text fields in the database. Results are faceted, much like an Amazon product search, currently by type (the class of object) and organism. Clicking on a facet restricts the results to just that category. Boolean operations, wildcards and phrases are all supported.
The search is based on Lucene and uses Bobo for the faceting. This is the same technology used by LinkedIn to power their profile search.
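To make the faceting idea concrete, here is a minimal sketch in plain Python. It is not how Lucene or Bobo implement faceting, and the hit records, field names and helper functions are all made up for illustration. It only shows the two operations described above: counting results per facet value, and restricting results when a facet is clicked.

```python
from collections import Counter

# Hypothetical search hits; ids, field names and values are illustrative only.
hits = [
    {"id": 1, "type": "Gene", "organism": "D. melanogaster"},
    {"id": 2, "type": "Gene", "organism": "D. simulans"},
    {"id": 3, "type": "Protein", "organism": "D. melanogaster"},
    {"id": 4, "type": "Publication", "organism": None},
]


def facet_counts(hits, field):
    """Count how many hits fall under each value of a facet field."""
    return Counter(h[field] for h in hits if h[field] is not None)


def restrict(hits, field, value):
    """Keep only the hits that belong to the clicked facet category."""
    return [h for h in hits if h[field] == value]


# Facet counts shown next to the full result list.
type_counts = facet_counts(hits, "type")

# Clicking the "Gene" facet narrows the results; the remaining
# facets are then recounted over the narrowed list.
genes = restrict(hits, "type", "Gene")
gene_organisms = facet_counts(genes, "organism")
```

A real faceting engine computes these counts from the index rather than from a list of hits in memory, but the behaviour seen by the user is the same.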
Indexing
Indexing the database runs as a post-process step, which creates the index in a directory. The index is then zipped and stored in the database. When you deploy a webapp pointing at the database, it extracts the index again. For FlyMine, indexing takes less than an hour and covers a large proportion of the database.
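The zip-and-restore round trip can be sketched as follows. This is an illustration in Python using only the standard library, not InterMine's own code: the function names are hypothetical, the "database" is just a bytes value held in a variable, and the index file is a placeholder.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_directory(index_dir):
    """Pack every file under index_dir into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(index_dir).rglob("*")):
            if path.is_file():
                # Store paths relative to the index directory.
                archive.write(path, path.relative_to(index_dir).as_posix())
    return buffer.getvalue()


def extract_index(blob, target_dir):
    """Unpack a zipped index into target_dir, as a webapp might on deployment."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        archive.extractall(target_dir)


# Round trip with a placeholder index file (assumption: a real index
# directory would contain the files written by the indexer instead).
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    (Path(src) / "segments.txt").write_text("demo index data")
    stored_blob = zip_directory(src)  # the value that would be saved in the database
    extract_index(stored_blob, dst)
    restored = (Path(dst) / "segments.txt").read_text()
```

Keeping the index inside the database means a webapp only needs the database connection to obtain a matching index.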
By default, the index includes the text fields of all objects in the database. Each object in the database becomes a document in the index, with its text attributes attached. You can configure classes to ignore, such as locations and scores, which don't provide text information. You can also add related information to an object. For example, you can configure the synonyms, pathways and GO terms to be included in a Gene's entry.
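The mapping from objects to index documents might look roughly like the sketch below. It is written in Python for brevity and is only an assumption about the general shape: the configuration names (`IGNORED_CLASSES`, `RELATED_FIELDS`), the `build_document` function and the sample objects are hypothetical, not InterMine's real configuration keys or data.

```python
# Hypothetical configuration; these are not InterMine's real config keys.
IGNORED_CLASSES = {"Location", "Score"}
RELATED_FIELDS = {"Gene": ["synonyms", "pathways", "goTerms"]}


def build_document(obj):
    """Turn one database object into an index document, or None if its class is ignored."""
    cls = obj["class"]
    if cls in IGNORED_CLASSES:
        return None
    # Start with the object's own text attributes.
    texts = [
        value
        for key, value in obj.items()
        if key not in ("class", "id") and isinstance(value, str)
    ]
    # Append text from the related objects configured for this class.
    for field in RELATED_FIELDS.get(cls, []):
        texts.extend(obj.get(field, []))
    return {"id": obj["id"], "type": cls, "text": " ".join(texts)}


# Made-up sample objects for illustration.
gene = {
    "class": "Gene",
    "id": 1,
    "symbol": "abc",
    "name": "example gene",
    "synonyms": ["abc-1"],
    "goTerms": ["binding"],
}
location = {"class": "Location", "id": 2, "start": 100, "end": 200}

gene_doc = build_document(gene)
location_doc = build_document(location)
```

With this shape, a search for a synonym or GO term matches the Gene's document directly, while ignored classes never reach the index.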
More Details
The faceted search system was implemented by Nils Kölling, a summer intern with InterMine. See the talk he gave for more technical details.
The new search system is one of the new features in the upcoming InterMine 0.94 release.
Attachments
flymine_search_example.png (110.7 KB), added by rns: An example search in FlyMine

