Specifications for CCC Curation from Textpresso Search Page
Requirements for Using Textpresso Search Results in General CCC Curation
These specifications describe how a curator can search any Textpresso implementation using the CCC categories, submit the resulting sentences to a curation form, make annotations, and download the annotations in gene_association file format.
This pipeline would make use of the XML format of a returned sentence. An XML version of sample search results from WBPaper00037859 was edited:
1) Removed all category names between the <annotation> tags.
2) Kept all information within the <bibliography> tags.
3) Removed the information within the <field_references> tags. This was a scrambled sentence; is this how such sentences are typically identified?
4) Potentially curatable sentences are found within the <field_results> tags.
5) Going from XML to curation form:
Display information within <bibliography> at the top of the page:
Title:
Authors:
Journal:
Year:
DocID:
Type:
Literature:
Accession (PMID):
Abstract:
6) Working from left to right on the curation form:
First box: all entities within the protein or gene tag. The exact tag name for this will be different for each implementation, for example:
protein_celegans
genes_arabidopsis
dicty_genes
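As a rough illustration of steps 5 and 6, the sketch below reads one XML result, builds the header lines from <bibliography>, and collects the entities for the first box from <field_results>. The exact XML layout is not given here, so everything beyond the tag names <bibliography>, <field_results>, <annotation>, and the three gene/protein tags is an assumption: the <result> and <sentence> wrappers, the child element names under <bibliography>, the implementation keys, and the placeholder values are all hypothetical and would need to be matched to the real Textpresso output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Wrapper elements, child names and values are
# placeholders for illustration, not real Textpresso output.
SAMPLE_XML = """
<result>
  <bibliography>
    <title>Placeholder title</title>
    <authors>Placeholder A; Placeholder B</authors>
    <journal>Placeholder Journal</journal>
    <docid>WBPaper00037859</docid>
  </bibliography>
  <field_results>
    <sentence><protein_celegans>GENE-1</protein_celegans> and
      <protein_celegans>GENE-2</protein_celegans> are found in the
      <annotation>placeholder component</annotation>.</sentence>
    <sentence><protein_celegans>GENE-1</protein_celegans> again.</sentence>
  </field_results>
</result>
"""

# Form label -> assumed child element name under <bibliography>.
HEADER_FIELDS = [
    ("Title", "title"),
    ("Authors", "authors"),
    ("Journal", "journal"),
    ("Year", "year"),
    ("DocID", "docid"),
    ("Type", "type"),
    ("Literature", "literature"),
    ("Accession (PMID)", "accession"),
    ("Abstract", "abstract"),
]

# Implementation key (hypothetical) -> gene/protein tag name from the list above.
GENE_TAGS = {
    "celegans": "protein_celegans",
    "arabidopsis": "genes_arabidopsis",
    "dicty": "dicty_genes",
}


def bibliography_header(root):
    """Return 'Label: value' lines for the top of the curation form."""
    bib = root.find("bibliography")
    lines = []
    for label, tag in HEADER_FIELDS:
        # Missing fields are shown with an empty value rather than dropped.
        value = bib.findtext(tag, default="") if bib is not None else ""
        lines.append(f"{label}: {value.strip()}")
    return lines


def first_box_entities(root, implementation):
    """Return unique entities inside the implementation's gene/protein tag."""
    tag = GENE_TAGS[implementation]
    results = root.find("field_results")
    entities = []
    if results is None:
        return entities
    for element in results.iter(tag):
        name = (element.text or "").strip()
        if name and name not in entities:  # keep first-seen order
            entities.append(name)
    return entities


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_XML.strip())
    print("\n".join(bibliography_header(root)))
    print(first_box_entities(root, "celegans"))
```

Keeping the tag name in a per-implementation lookup means the form code itself would not need to change when a new Textpresso implementation is added; only the table would.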