General specifications

From WormBaseWiki
Revision as of 18:42, 17 November 2011 by Vanaukenk (talk | contribs)

General specifications for Textpresso-based CCC Curation


Input


location of files on Textpresso for retrieving paper titles and abstracts

Arabidopsis (titles): /data2/data-processing/data/arabidopsis/Data/processedfiles/title/

Arabidopsis (abstracts): /data2/data-processing/data/arabidopsis/Data/processedfiles/abstract/


source files for curatable sentences - supplied by Textpresso team, stored on tazendra


gene name to gene identifier mapping file

Arabidopsis: /home/acedb/kimberly/ccc_tair/tair_ccc_datafiles


Curation


web-based curation form

Arabidopsis: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/tair/tair_ccc.cg


postgres table for storing curation

Arabidopsis: ccc_tair_gene_comp_go


Output - three-column or GAF


gene name to gene identifier mapping file

Arabidopsis: /home/acedb/kimberly/ccc_tair/tair_ccc_datafiles


GO term - GO ID mappings

All implementations: http://www.geneontology.org/ontology/obo_format_1_2/gene_ontology_ext.obo
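
As an illustration of how the GO term to GO ID mapping could be derived from the OBO file, here is a minimal Python sketch. It assumes the file follows the usual OBO 1.2 layout, with [Term] stanzas containing "id:" and "name:" lines. The function name parse_obo_terms and the inline sample text are hypothetical and are not part of the curation pipeline; synonyms and obsolete terms are not handled.

```python
def parse_obo_terms(lines):
    """Return {term name: GO ID} built from the [Term] stanzas of OBO text."""
    name_to_id = {}
    in_term = False
    term_id = None
    term_name = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts: store the previous [Term], if complete.
            if in_term and term_id and term_name:
                name_to_id[term_name] = term_id
            in_term = (line == "[Term]")
            term_id = None
            term_name = None
        elif in_term and line.startswith("id: "):
            term_id = line[len("id: "):]
        elif in_term and line.startswith("name: "):
            term_name = line[len("name: "):]
    # Store the final stanza if the file ends inside a [Term].
    if in_term and term_id and term_name:
        name_to_id[term_name] = term_id
    return name_to_id


# Hypothetical sample in OBO 1.2 layout (not read from the real file).
sample = """format-version: 1.2

[Term]
id: GO:0005634
name: nucleus
namespace: cellular_component

[Term]
id: GO:0005737
name: cytoplasm
namespace: cellular_component

[Typedef]
id: part_of
name: part of
""".splitlines()

go_ids = parse_obo_terms(sample)
```

For the real file, the same function could be applied to an open file handle, since it only iterates over lines. Non-[Term] stanzas such as [Typedef] are skipped so that relationship names do not end up in the mapping.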


paper identifier mapping file

Arabidopsis: /data2/data-processing/data/arabidopsis/Data/processedfiles/accession/


NCBI taxon ID

Arabidopsis: 3702

dictyBase: 44689

FlyBase: 7227
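
For GAF output, the taxon column is written with a "taxon:" prefix in front of the NCBI taxon ID. The following Python sketch shows one way the IDs listed above might be looked up and formatted. The dictionary keys and the function name gaf_taxon are hypothetical choices for illustration, not names used by the curation form.

```python
# NCBI taxon IDs as listed above, keyed by implementation name.
NCBI_TAXON_IDS = {
    "Arabidopsis": 3702,
    "dictyBase": 44689,
    "FlyBase": 7227,
}


def gaf_taxon(implementation):
    """Format the taxon column value for a GAF line, e.g. 'taxon:3702'."""
    try:
        taxon_id = NCBI_TAXON_IDS[implementation]
    except KeyError:
        raise ValueError("no NCBI taxon ID known for %r" % implementation)
    return "taxon:%d" % taxon_id


arabidopsis_taxon = gaf_taxon("Arabidopsis")
```

Raising an error for an unknown implementation, rather than emitting an empty column, is a design choice of this sketch; it keeps malformed GAF lines from being written silently.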