Using the InterMine Query Optimiser as a standalone module
The InterMine Query Optimiser can be used by itself in a number of ways, depending on your needs. All options require the intermine.jar file to be in your classpath. Background
Before you use the Optimiser, it is important to know a little about how it operates.
- The Optimiser is only suitable for use with read-only databases. The optimiser builds precomputed tables, which contain derivative data constructed from other tables in the database. If the other tables in the database are altered, then these precomputed tables will be inconsistent with the data, and queries run through the optimiser will return incorrect results. The expected mode of operation is to completely build the data and never alter it, then build the precomputed tables. Note however that the precomputed tables can be built after the database has been opened to read access.
- The Optimiser requires write access to the database if it is to build any precomputed tables. At the moment, precomputed tables are only created when the user manually calls a method - the database should be write-enabled while this is happening. However, in future versions, we intend the Optimiser to be able to spot frequent slow queries and automatically create precomputed tables to speed them up, in which case the database must also be write-enabled. The Optimiser never attempts to alter any of the original data tables.
- The precomputed tables will take a long time to create. The whole point of creating a precomputed table is for the effort to be optimistically expended, so that when a real query comes in it can be serviced rapidly. The system is also expected to lend efficiency by the fact that the precomputed table only needs to be built once, but may speed up many requests.
- Each precomputed table takes the form of an SQL query, and is stored in the database along with the results of that query. The optimiser inspects the list of available precomputed tables to see if any of the results can be used as a shortcut by the database program when producing results for a real query.
- The optimiser currently only supports PostgreSQL as an underlying database.
Setting up a database
The database must be set up in a particular order:
- Create your database as normal. Completely set up the entire database, populate it with all data, and commit to never changing that data again.
- Configure the InterMine Optimiser to access that database. This is done by editing the intermine.properties file to include a description of the database. An example intermine.properties file is included in the distribution with example values.
- Create some precomputed tables. This can be done by writing a program in Java as described below, and running it. Precomputed tables can be created before, during, or after enabling read access to the database.
- Enable read access to the database, through the Optimiser. This is also described below.
Building precomputed tables
All precomputed tables must be built by using the PrecomputedTableManager class, which maintains an index of the tables it has created, along with the SQL used to create the precomputed table. For example:
import org.intermine.sql.Database;
import org.intermine.sql.DatabaseFactory;
import org.intermine.sql.query.Query;
import org.intermine.sql.precompute.PrecomputedTable;
import org.intermine.sql.precompute.PrecomputedTableManager;
// Create a PrecomputedTableManager
Database db = DatabaseFactory.getDatabase("db.main"); // configure db.main in intermine.properties
PrecomputedTableManager ptm = new PrecomputedTableManager.getInstance(db);
// Add a precomputed table to the database
Query q1 = new Query("select a.col1 as a_col1, a.col2 as a_col2, b.col1 as b_col1, b.col2 as b_col2"
+ " from a, b where a.col3 = b.col3");
PrecomputedTable pt1 = new PrecomputedTable(q1, "precomp1")
ptm.add(pt1);
// Add another precomputed table to the database
Query q2 = new Query("select c.column1 as column1, count(*) as c from c group by c.column1");
PrecomputedTable pt2 = new PrecomputedTable(q2, "precomp2")
ptm.add(pt2);
Query Optimising
From within an existing Java application:
Simply call the function optimisedSqlString = QueryOptimiser.optimise(sqlString, database) (where database is an org.intermine.sql.Database as created in the above code snippet), wherever you have an sql string that needs optimising. Run the resulting optimised SQL string as usual.
