What is InterMine?

InterMine is a powerful open source data warehouse system. Using InterMine, you can create databases of biological data accessed by sophisticated web query tools. InterMine can be used to create databases from a single data set or can integrate multiple sources of data. Support is provided for several common biological formats and there is a framework for adding your own data. InterMine includes an attractive, user-friendly web interface that works 'out of the box' and can be easily customised for your specific needs.

InterMine is actively developed by the FlyMine & Cambridge UK ModENCODE Teams at the Cambridge Systems Biology Centre.

Why would I use InterMine?

  1. I have some data and I would like to let people query it on the web.
  2. I would like to build an integrated biological database so I can query multiple types of data at once.
  3. I have a database and I would like a companion data warehouse for flexible querying and list operations.

What does it do?

InterMine makes it easy to integrate multiple data sources into a single data warehouse. It has a core data model based on the Sequence Ontology and supports several biological data formats; you just configure which organisms or data files are required. It is also easy to extend the data model and integrate your own data. There are Java and Perl APIs and an XML format to help import data.

A sophisticated web application provides query access to users. The interface lets users create custom queries, run template queries (web forms for 'canned' queries), and upload and operate on lists of data. It is possible to create and add widgets to analyse lists with graphs and enrichment statistics. An important feature is that an admin user can publish new template queries, change report pages and create public lists at any time without any programming. Many aspects of the web app can be configured and branded.

How does it work?

When an InterMine instance is built, data are parsed from source formats and loaded into a central database. Queries are executed on this database with no need to access the original source data. Overlapping data sets are integrated by common identifiers; for example, genes from different sources may be merged by identifier or symbol.
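
As a rough illustration of merging by a common identifier, the following Python sketch combines gene records from two sources that share a key. It is not InterMine's own integration code (InterMine is written in Java); the function name merge_by_key, the field names and the gene identifiers are all hypothetical, and the rule that earlier sources take priority is an assumption made for this example.

```python
from typing import Dict, List


def merge_by_key(sources: List[List[dict]], key: str = "identifier") -> Dict[str, dict]:
    """Merge records from several sources that share the same key value.

    Hypothetical helper for illustration only; not part of InterMine.
    """
    merged: Dict[str, dict] = {}
    for records in sources:
        for record in records:
            # Records with the same key value end up in the same merged entry.
            target = merged.setdefault(record[key], {})
            for field, value in record.items():
                # Assumption: earlier sources take priority, so a field is
                # only filled in when it has not been set already.
                target.setdefault(field, value)
    return merged


# Made-up example data: two sources describing an overlapping set of genes.
source_a = [{"identifier": "GENE0001", "symbol": "abc"}]
source_b = [
    {"identifier": "GENE0001", "symbol": "abc-old", "length": 1200},
    {"identifier": "GENE0002", "symbol": "xyz"},
]

genes = merge_by_key([source_a, source_b])
```

Here the two records for GENE0001 are combined into a single gene that keeps the symbol from the first source and gains the length from the second, while GENE0002 is carried over unchanged. Passing key="symbol" would merge on the symbol instead.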

InterMine is built in Java and uses the open source PostgreSQL database system. The web application is based on Struts using JSP pages and Ajax. All code and dependencies are freely available and open source. We welcome contributions from other developers.

How do I get started?

The following documents explain the software requirements, how to check out the code and how to run the tests. A MalariaMine tutorial works through creating an example InterMine instance.

  • GettingStarted - guides to setting up and running InterMine
  • MalariaMine - a tutorial to build a data warehouse and web interface with example P. falciparum data

If you have any questions about using InterMine, please contact dev[at]flymine.org or join the MailingList.