Last modified on 10/06/09 17:00:14
InterMine Project Description File Format
The project.xml file is found in the root directory of your Mine (e.g. malariamine/project.xml). This is where you configure which data sources are loaded in the Mine when the database is built and where the source files to load are located. See more info on integrating data into the Mine.
The Project XML File Format
- properties
  - target.model
    - the name of the model created and used by this project
  - source.location
    - the location of the sources used by this project. This should contain at least ../bio/sources to get the common InterMine sources. If you create your own sources, you can specify more directories using paths relative to the project.xml file.
    - The XML file can contain several source locations, but each one needs its own property tag. See flymine/project.xml for an example.
  - common.os.prefix
    - used by the integration system to choose which database properties to use
  - intermine.properties.file
    - the name of the file in the home directory that contains properties such as the database name, user names and passwords.
    - These properties override the contents of default.intermine.properties.file.
  - default.intermine.properties.file
    - the location of the default properties for this mine
- sources
  - The sources element lists the sources to integrate, along with any properties specific to those sources.
  - Properties within the <source> tag are used only when processing the given source and override any properties in the source's project.properties file.
  - See BioSources for more information about which sources are currently available in InterMine.
  - NOTE - if you specify a relative path for the src.data.dir or src.data.file attributes, it is resolved relative to the integrate sub-directory, not relative to the location of project.xml. Using absolute paths is usually clearer.
- Post processing
  - These are tasks that run after data loading is completed.
  - They are used to calculate and set fields that are difficult to fill in during data loading or that require multiple sources to be loaded.
  - See PostProcessing for more information about each postprocess.
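The note on relative paths can be illustrated with a small fragment. This is only a sketch: the source names and type (my-source, my-other-source) and both data paths are hypothetical, and it assumes a mine whose project.xml sits at malariamine/project.xml with its integrate sub-directory at malariamine/integrate.

```xml
<sources>
  <!-- "my-source" is a hypothetical source name and type, used for illustration only -->
  <source name="my-source" type="my-source">
    <!-- Absolute path (hypothetical): points at the same directory wherever the build runs -->
    <property name="src.data.dir" location="/shared/data/my-source/current"/>
  </source>
  <source name="my-other-source" type="my-source">
    <!-- Relative path (hypothetical): resolved from malariamine/integrate, so this
         points at malariamine/data/my-source. If it were resolved from the directory
         holding project.xml it would point at data/my-source beside malariamine. -->
    <property name="src.data.dir" location="../data/my-source"/>
  </source>
</sources>
```

Both entries use the src.data.dir property named in the note. The absolute form avoids having to remember which directory the path is resolved from.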
A short example
<project type="bio">
  <property name="target.model" value="genomic"/>
  <property name="source.location" location="../bio/sources"/>
  <property name="source.location" location="../../../../mysources"/>
  <property name="common.os.prefix" value="common"/>
  <property name="intermine.properties.file" value="flymine.properties"/>
  <property name="default.intermine.properties.file" location="../default.intermine.integrate.properties"/>
  <sources>
    <source name="wormbase-identifiers" type="wormbase-identifiers">
      <property name="src.data.dir" location="/shared/data/wormbase/current"/>
    </source>
  </sources>
  <post-processing>
    <post-process name="make-spanning-locations"/>
    <post-process name="create-chromosome-locations-and-lengths"/>
    <post-process name="create-overlap-relations-flymine" dump="true"/>
    <post-process name="do-sources"/>
  </post-processing>
</project>
For a more complete example, see flymine/project.xml which covers all the projects currently available in the model.
See RunningABuild for more information about running a database build.
