Perl Item XML generation modules
Introduction
In the source:trunk/intermine/perl directory we provide a Perl library for creating files in InterMine "Item XML" format. Files in this format can be loaded into an InterMine database by creating a "source".
The Perl code depends on XML::Parser::PerlSAX for parsing the model file and on XML::Writer for output. Install both with: sudo cpan -i XML::Parser::PerlSAX XML::Writer
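To confirm that both dependencies are available before running anything, a small check along these lines can help. This is a minimal sketch; the subroutine name missing_modules is hypothetical and not part of the InterMine library.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the InterMine library): return the names
# of the modules from the argument list that cannot be loaded.
sub missing_modules {
    my @modules = @_;
    return grep { !eval "require $_; 1" } @modules;
}

my @missing = missing_modules('XML::Parser::PerlSAX', 'XML::Writer');
print @missing
    ? "Missing modules: @missing\n"
    : "All required modules are installed.\n";
```

Any module reported as missing can be installed with the cpan command given above.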
See also: ItemsAPIJava
Usage
Most code using these modules will follow this pattern:
- make a model:
my $model = new InterMine::Model(file => $model_file);
- make an item factory:
my $factory = new InterMine::ItemFactory(model => $model);
- make an item:
my $gene = $factory->make_item("Gene");
- set some attributes, references and/or collections:
$gene->set("identifier", "CG10811");
- repeat the previous step as necessary
- output with XML::Writer
FlyMine example
Example using the FlyMine model:
use strict;
use warnings;
use XML::Writer;
use InterMine::Model;
use InterMine::Item;
use InterMine::ItemFactory;
my $model_file = $ARGV[0];
# stop with a usage message if no model file was given
die "usage: $0 <model_file>\n" unless defined $model_file;
my $model = new InterMine::Model(file => $model_file);
my $factory = new InterMine::ItemFactory(model => $model);
my $gene = $factory->make_item("Gene");
# set an attribute
$gene->set("identifier", "CG10811");
my $organism = $factory->make_item("Organism");
$organism->set("taxonId", 7227);
# set a reference
$gene->set("organism", $organism);
my $pub1 = $factory->make_item("Publication");
$pub1->set("pubMedId", 11700288);
my $pub2 = $factory->make_item("Publication");
$pub2->set("pubMedId", 16496002);
# set a collection
$gene->set("publications", [$pub1, $pub2]);
# write as InterMine Items XML
my @items_to_write = ($gene, $organism, $pub1, $pub2);
my $writer = new XML::Writer(DATA_MODE => 1, DATA_INDENT => 3);
$writer->startTag("items");
for my $item (@items_to_write) {
    $item->as_xml($writer);
}
$writer->endTag("items");
# check that the document is complete and finish the output
$writer->end();
Output:
<items>
   <item id="0_1" class="" implements="http://www.flymine.org/model/genomic#Gene">
      <attribute name="identifier" value="CG10811" />
      <collection name="publications">
         <reference ref_id="0_3" />
         <reference ref_id="0_4" />
      </collection>
      <reference name="organism" ref_id="0_2" />
   </item>
   <item id="0_2" class="" implements="http://www.flymine.org/model/genomic#Organism">
      <attribute name="taxonId" value="7227" />
   </item>
   <item id="0_3" class="" implements="http://www.flymine.org/model/genomic#Publication">
      <attribute name="pubMedId" value="11700288" />
   </item>
   <item id="0_4" class="" implements="http://www.flymine.org/model/genomic#Publication">
      <attribute name="pubMedId" value="16496002" />
   </item>
</items>
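The write loop in the example can be wrapped in a reusable subroutine. The following is a minimal sketch: the name write_items is hypothetical and not part of the InterMine library, and it relies only on the startTag, endTag and as_xml calls already shown in the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the InterMine library): write the given
# items inside a single <items> root element, using only the calls that
# appear in the FlyMine example.
sub write_items {
    my ($writer, @items) = @_;
    $writer->startTag("items");
    for my $item (@items) {
        $item->as_xml($writer);
    }
    $writer->endTag("items");
    return scalar @items;    # number of items written
}
```

With the objects from the example this would be called as write_items($writer, $gene, $organism, $pub1, $pub2). By default XML::Writer prints to standard output; to write to a file instead, pass an IO::File handle through its OUTPUT option when constructing the writer.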
Longer example
In source:trunk/bio/scripts there is a longer example: intermine_items_example.pl
The script takes three arguments, in this order:
- a string describing a DataSet
- a taxon id
- the path to a genomic model file
If you install XML::Writer, the script should run as is from the bio/scripts/ directory.
Example command line:
./intermine_items_example.pl "FlyMine" 5833 ../../flymine/dbmodel/build/model/genomic_model.xml
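For a script of your own that takes the same three arguments, the command line might be read as sketched below. This is an assumption for illustration only: the subroutine parse_args and the variable names are hypothetical, and the real intermine_items_example.pl may handle its arguments differently.

```perl
use strict;
use warnings;

# Hypothetical argument handling (not taken from intermine_items_example.pl):
# expect a DataSet description, a numeric taxon id and a model file path.
sub parse_args {
    my @args = @_;
    die "usage: $0 <data_set> <taxon_id> <model_file>\n" unless @args == 3;
    my ($data_set, $taxon_id, $model_file) = @args;
    die "taxon id must be a number, got '$taxon_id'\n"
        unless $taxon_id =~ /^\d+$/;
    return ($data_set, $taxon_id, $model_file);
}

# In a script this would be called as:
#   my ($data_set, $taxon_id, $model_file) = parse_args(@ARGV);
```

The model file path can then be passed to InterMine::Model as in the FlyMine example above.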
