Automated descriptions for C. briggsae
From WormBaseWiki
Revision as of 19:56, 30 October 2014 by Rkishore (talk | contribs) (→Source file for Process, Molecular function and Sub-cellular localization (cell component)data)
http://textpresso-dev.caltech.edu/concise_descriptions/
Location of the concise description files for C. elegans:
- For viewing the latest dump:
http://tazendra.caltech.edu/~postgres/cgi-bin/data/concise_dump_new.ace
- Script: /home/postgres/work/citace_upload/concise/dump_concise.pl
- File location: /home/postgres/public_html/cgi-bin/data/concise_dump_new.ace
Semantic categories in a Concise Description for C. briggsae
1. Orthology/Similarity to C. elegans and human
2. Processes
3. Molecular Function
4. Sub-cellular localization (Cell component)
Source files for homology data
1. Orthologs file:
2. Best BlastP hits file: ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS245/species/c_briggsae/PRJNA10731/c_briggsae.PRJNA10731.WS245.best_blastp_hits.txt.gz
- Contact: Michael Paulini
Source file for Process, Molecular function and Sub-cellular localization (cell component) data
- Gene association file for C. briggsae: ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS245/ONTOLOGY/gene_association.WS245.wb.c_briggsae
- Process:
- Need data from these rows:
- where column 9 has value 'P' (Process)
- column 2: DB_Object ID, e.g., WBGene00000307
- column 3: DB_Object symbol, e.g., Cbr-bli-4
- column 5: GO ID, e.g., GO:0006508
- column 6: DB:Reference, e.g., PMID:12062106; take all references, which are pipe-separated
- column 7: Evidence code, e.g., IEA
- column 8: 'With (or) From', e.g., INTERPRO:IPR000209
- Molecular Function:
- Need data from these rows:
- where column 9 has value 'F' (Molecular Function)
- column 2: DB_Object ID, e.g., WBGene00000307
- column 3: DB_Object symbol, e.g., Cbr-bli-4
- column 5: GO ID, e.g., GO:0004252
- column 6: DB:Reference, e.g., PMID:12520011; take all references, which are pipe-separated
- column 7: Evidence code, e.g., IEA
- column 8: 'With (or) From', e.g., INTERPRO:IPR000209
- Sub-cellular localization (cell component):
- Need data from these rows:
- where column 9 has value 'C' (Cellular Component)
- column 2: DB_Object ID, e.g., WBGene00000324
- column 3: DB_Object symbol, e.g., Cbr-exp-2
- column 5: GO ID, e.g., GO:0008076
- column 6: DB:Reference, e.g., PMID:12520011; take all references, which are pipe-separated
- column 7: Evidence code, e.g., IEA
- column 8: 'With (or) From', e.g., INTERPRO:IPR000209
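The column selection above could be sketched in Python roughly as follows. This is only an illustration, not the script actually used for the descriptions: the function name `parse_gaf_lines`, the dictionary keys, and the sample row are assumptions made here, and the sample row is assembled from the example values listed above rather than copied from the real file. It assumes the gene association file is tab-delimited with comment lines starting with '!'.

```python
from collections import defaultdict

# Column 9 (aspect) values mapped to the three semantic categories.
ASPECTS = {"P": "process", "F": "molecular_function", "C": "cellular_component"}


def parse_gaf_lines(lines):
    """Group annotation rows by aspect (column 9), then by gene ID (column 2).

    Hypothetical helper: `lines` is any iterable of text lines from a
    tab-delimited gene association file.
    """
    by_aspect = {name: defaultdict(list) for name in ASPECTS.values()}
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[8] not in ASPECTS:
            continue  # skip short rows and rows with an unknown aspect
        record = {
            "gene_id": cols[1],                # column 2: DB_Object ID
            "symbol": cols[2],                 # column 3: DB_Object symbol
            "go_id": cols[4],                  # column 5: GO ID
            "references": cols[5].split("|"),  # column 6: all pipe-separated references
            "evidence": cols[6],               # column 7: Evidence code
            "with_from": cols[7],              # column 8: With (or) From
        }
        by_aspect[ASPECTS[cols[8]]][record["gene_id"]].append(record)
    return by_aspect


# Illustrative row only: built from the example values on this page,
# not taken from the actual WS245 file.
sample_lines = [
    "!gaf-version: 2.0",
    "\t".join([
        "WB", "WBGene00000307", "Cbr-bli-4", "", "GO:0006508",
        "PMID:12062106|PMID:12520011", "IEA", "INTERPRO:IPR000209", "P",
    ]),
]

if __name__ == "__main__":
    for aspect, genes in parse_gaf_lines(sample_lines).items():
        for gene_id, records in genes.items():
            print(aspect, gene_id, [r["go_id"] for r in records])
```

For a locally downloaded copy of the file, an open file handle can be passed directly, e.g. `parse_gaf_lines(open(path))`, since the function only iterates over lines.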
Template for a C. briggsae gene description
For the test phase, order of sentences:
- Orthology
- Process
- Function/identity
- Component
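A minimal sketch of how sentences might be assembled in this order, assuming each category's sentence has already been generated as a string. The names `SENTENCE_ORDER` and `assemble_description` and the category keys are hypothetical, chosen here for illustration; the sentence text itself is not specified on this page, so the example uses placeholders.

```python
# Test-phase order of sentences: Orthology, Process, Function/identity, Component.
SENTENCE_ORDER = ["orthology", "process", "function", "component"]


def assemble_description(sentences):
    """Join the available sentences in the test-phase order.

    `sentences` maps a category key to its sentence; missing or empty
    categories are skipped.
    """
    ordered = [
        sentences[key].strip()
        for key in SENTENCE_ORDER
        if sentences.get(key, "").strip()
    ]
    return " ".join(ordered)


if __name__ == "__main__":
    # Placeholder text only; real sentences would come from the source files.
    print(assemble_description({
        "component": "<component sentence>",
        "orthology": "<orthology sentence>",
        "process": "<process sentence>",
    }))
```

Skipping empty categories means a gene with, say, no cell component annotation still gets a description built from the remaining sentences.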