Example queries for InterMine

This document contains a few examples of IQL queries that can be used in the InterMine system. IQL queries all map onto the Java query system.

InterMine queries are similar to those formulated in SQL, but object oriented in that they refer to (classes of) objects rather than tables, and simpler because they have a more restricted syntax. A query consists of a SELECT list (of QueryClasses or QueryEvaluables which form the results of the query), a FROM list (of SubQueries or QueryClasses that are involved in the rest of the query) and a WHERE list of constraints on the objects involved (including fields of these objects and expressions involving them). In addition the results can be grouped or ordered (by QueryClass or QueryEvaluable).

QueryClasses

A QueryClass represents all objects of a particular class in the database, including instances of all subclasses of that class. Therefore, to retrieve all objects of a particular type from the database, select that QueryClass:

SELECT Employee FROM Employee

A QueryClass on the FROM list can have an alias, and then all other parts of the query must refer to that alias instead of the class name. This is useful when you have more than one QueryClass of the same class, or to avoid typing:

SELECT e FROM Employees AS e

QueryEvaluables

A QueryEvaluable is a value in the database that is a primitive value, like a String or a number. The most commonly used QueryEvaluable will be a QueryField, which is a field of a QueryClass (or other type with fields, like a subquery). It refers to one of the fields of the objects of the class that the QueryClass represents. So, to retrieve all the names of all the employees in the database, select that QueryField:

SELECT e.name AS name FROM Employee AS e

Another QueryEvaluable is a QueryValue, which refers to a fixed value, and a QueryExpression is another which represents some kind of calculation or manipulation of other QueryEvaluables. Most natural-looking arithmetic can be just typed into a query:

SELECT e.name AS name, 65 - e.age AS retirement FROM Employee AS e

A QueryFunction is a function that applies itself to more than one row of the results, and collapses all those rows into a single row. For example:

SELECT COUNT(*) AS c FROM Employee

The QueryFunction "COUNT(*)" takes all the rows of the result, counts them, and replaces them with a single row saying how many rows there were. You can use the "GROUP BY" clause to control which rows are grouped together for the QueryFunction. For example:

SELECT e.name AS name, COUNT(*) AS c FROM Employee AS e GROUP BY e.name

The rows of the Employee table are grouped together by the "name" field, and each group has the number of rows counted, producing results saying how many of each employee name there is. A similar operation is to make the query distinct, which removes all the duplicated rows in the results:

SELECT DISTINCT e.name AS name FROM Employee AS e

This will return all the names of all the employees in the database, but only each name once. It contains the same data as the query with the COUNT, except without the count data.

Subqueries

This brings up another point - we already know how to count the employees, but what if we want to count the number of distinct employee names? Because the DISTINCT operation is applied after the GROUP BY operation, you need to reverse that by using a subquery:

SELECT COUNT(*) AS c FROM (SELECT DISTINCT e.name AS name FROM Employee AS e) AS subquery

A subquery is a way of taking the results of a query, and treating it like a QueryClass in another query. The subquery should be placed inside parentheses and placed in the FROM list of the main query, and it must be given an alias. Fields from inside the subquery can be referred to in the main query:

SELECT subquery.name AS name, subquery.e.name AS samename FROM (SELECT e.name AS name, e FROM Employee AS e) AS subquery

Constraints

We do not often need to retrieve the entire contents of the database in a query - usually we want to restrict the results that we get. This is done with constraints, in a WHERE clause. A SimpleConstraint restricts the values of an object's field:

SELECT e FROM Employee AS e WHERE e.name = 'Fred'
SELECT e FROM Employee AS e WHERE e.age > 65

A ContainsConstraint restricts the objects in a QueryClass according to a reference or collection from another QueryClass:

SELECT e, a FROM Employee AS e, Address AS a WHERE e.address CONTAINS a
SELECT d, e FROM Department AS d, Employee AS e WHERE d.employees CONTAINS e