Last modified on 25/06/09 14:32:07

BioSources > fasta

The fasta source loads features and their sequences: it creates one feature for each entry in a fasta file and sets that feature's sequence.

To configure a fasta source, add an entry to the project.xml file, like so:

    <!-- example from flymine/project.xml -->
    <source name="uniprot-fasta" type="fasta">
      <property name="fasta.taxonId" value="7227 7237 6239 7165 7460 4932 9606 10090"/>
      <property name="fasta.className" value="org.intermine.model.bio.Protein"/>
      <property name="fasta.classAttribute" value="primaryAccession"/>
      <property name="fasta.dataSetTitle" value="UniProt data set"/>
      <property name="fasta.dataSourceName" value="UniProt"/>
      <property name="src.data.dir" location="/data/uniprot/current"/>
      <property name="fasta.includes" value="uniprot_sprot_varsplic.fasta"/>
      <property name="fasta.sequenceType" value="protein" />
      <property name="fasta.loaderClassName"
                value="org.intermine.bio.dataconversion.UniProtFastaLoaderTask"/>
    </source>

attribute        content                                              purpose
---------------  ---------------------------------------------------  -----------------------------------------------------
taxonId          space-delimited list of taxonIds                     only features with the listed taxonIds will be loaded
className        fully-qualified class name                           determines which feature will be loaded
classAttribute   identifier field from className                      determines which field from the feature will be set
dataSetTitle     name of dataset                                      determines name of dataset object
dataSourceName   name of datasource                                   determines name of datasource object
src.data.dir     location of the fasta data file                      these data will be loaded into the database
includes         name of data file                                    this data file will be loaded into the database
sequenceType     class name                                           type of sequence to be loaded
loaderClassName  name of Java file that will process the fasta files  only use if you have created a custom fasta loader
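
Because loaderClassName is only needed for a custom fasta loader, a source that relies on the standard loader can presumably leave it out. The entry below is a minimal sketch under that assumption, restricted to a single organism. The source name, data set title, data source name, directory and file name are hypothetical placeholders, not values from any real project.xml; the taxonId, className, classAttribute and sequenceType values are reused from the FlyMine example above.

```xml
<!-- hypothetical example: all names and paths marked below are placeholders -->
<source name="my-protein-fasta" type="fasta">                                    <!-- hypothetical source name -->
  <property name="fasta.taxonId" value="7227"/>                                  <!-- load features for this taxonId only -->
  <property name="fasta.className" value="org.intermine.model.bio.Protein"/>     <!-- feature class to create -->
  <property name="fasta.classAttribute" value="primaryAccession"/>               <!-- identifier field set on each feature -->
  <property name="fasta.dataSetTitle" value="My protein fasta data set"/>        <!-- hypothetical data set title -->
  <property name="fasta.dataSourceName" value="My data source"/>                 <!-- hypothetical data source name -->
  <property name="src.data.dir" location="/data/my-fasta/current"/>              <!-- hypothetical directory -->
  <property name="fasta.includes" value="my_proteins.fasta"/>                    <!-- hypothetical file name -->
  <property name="fasta.sequenceType" value="protein"/>
  <!-- fasta.loaderClassName is omitted: assumed unnecessary without a custom loader -->
</source>
```

Note that src.data.dir takes a location attribute rather than value, as in the FlyMine example.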

datasets and datasources

Proteins, genes, and chromosomes have a datasets collection. A dataset is a set of results or data from a datasource. Each dataset has a reference to its datasource, which is the organisation the data came from.

See FlyMine's project.xml file for more examples.

