Data files to integrate
All data required to build MalariaMine is included in trunk/bio/tutorial/malariamine/malaria-data.tar.gz. Copy the archive to a directory of your choice and extract it:
cp bio/tutorial/malariamine/malaria-data.tar.gz my_data_dir
cd my_data_dir
tar -zxvf malaria-data.tar.gz
Edit 'malariamine/project.xml' so that every occurrence of my_data_dir points to the location you chose. For example:
<sources>
  <source name="malaria-gff" type="malaria-gff">
    <property name="gff3.taxonId" value="36329"/>
    <property name="src.data.dir" location="my_data_dir/malaria/malaria-genome/gff"/>
  </source>
...
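The placeholder substitution can also be scripted. The following is a minimal sketch, not part of the tutorial: set_data_dir is a hypothetical helper name, and the sketch assumes the chosen path contains no '|', '&' or backslash characters, which sed would treat specially.

```shell
# Hypothetical helper (not part of InterMine): replace the my_data_dir
# placeholder in a project.xml with a real directory.
# Assumes the directory path contains no '|', '&' or backslash characters.
set_data_dir() {
  project_xml="$1"   # e.g. malariamine/project.xml
  data_dir="$2"      # directory into which the archive was extracted
  # Write to a temporary file and move it into place, because in-place
  # editing flags differ between sed implementations.
  sed "s|my_data_dir|${data_dir}|g" "$project_xml" > "${project_xml}.tmp" &&
    mv "${project_xml}.tmp" "$project_xml"
}
```

For instance, calling set_data_dir malariamine/project.xml /home/user/malaria-data (an illustrative path) would make the location attribute above start with /home/user/malaria-data.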
The archive contains the following data:
/malaria-genome
The malaria genome as gff3 and fasta, originally downloaded from (...)
/uniprot
UniProt XML with protein information and sequences from Swiss-Prot and TrEMBL. Downloaded from: http://www.ebi.ac.uk/uniprot/database/download.html and filtered on taxon ID 36329.
/psi/intact
All protein interaction data available from IntAct in PSI format. Downloaded from: ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psi1/species.
/inparanoid
InParanoid orthologues between P. falciparum and S. pombe. Downloaded from: http://inparanoid.cgb.ki.se/download/current/sqltables/.
/gene_ontology
The Gene Ontology structure. Downloaded from http://www.geneontology.org/
/go_annotation
GO term assignments for P. falciparum and S. pombe. Downloaded from http://www.geneontology.org/
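After extraction, a quick check that every directory listed above is present can catch an incomplete unpack. This is only a sketch: check_malaria_data is a hypothetical name, and the directory to pass is assumed to be my_data_dir/malaria, as the src.data.dir path in the project.xml example suggests.

```shell
# Hypothetical helper (not part of InterMine): report which of the
# expected data directories are missing below a base directory.
check_malaria_data() {
  base="$1"      # assumed to be my_data_dir/malaria
  missing=0
  for d in malaria-genome uniprot psi/intact inparanoid gene_ontology go_annotation; do
    if [ ! -d "$base/$d" ]; then
      echo "missing: $base/$d"
      missing=1
    fi
  done
  # Return 0 when every directory exists, 1 otherwise.
  return "$missing"
}
```

Running check_malaria_data my_data_dir/malaria prints one line per absent directory and returns a non-zero status if any are missing.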
