CCC Form 2.0 Specifications

From WormBaseWiki

Revision as of 20:44, 24 October 2012

This page documents the specifications for the next version of the Textpresso for Cellular Component Curation (CCC) tool. The changes to the tool and its pipeline were suggested by curators, and they are also part of the broader plan for Textpresso-based curation pipelines and the GO's Common Annotation Framework.

Tool Features

Textpresso search specifications

  • Frequency
  • Corpus
  • Categories
  • Filtering (Textpresso)
  1. Journal
  2. Date
  3. Document IDs
  • Filtering (non-Textpresso)
  1. SVM
  2. Gene Ontology Gene Association File
  • Ranking search results
  • Naming search results file
  • Storing search histories
  1. Recording versions of pdf2text conversion
  2. Recording version of categories used
  3. Recording search criteria, i.e. categories, corpus, filters
  4. Recording curator and date of search
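
The items under "Storing search histories" could be captured as one record per search. The following is a minimal sketch in Python, not a description of the actual tool: the class name `SearchRecord`, its field names, and the JSON serialization are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class SearchRecord:
    """One stored search (hypothetical layout, not the tool's real schema)."""
    curator: str                 # who ran the search
    search_date: str             # ISO date string, e.g. "2012-10-24"
    pdf2text_version: str        # version of the pdf2text conversion
    categories_version: str      # version of the Textpresso categories
    categories: list             # categories used in the search
    corpus: str                  # corpus that was searched
    filters: dict = field(default_factory=dict)  # journal, date, document IDs, ...


def to_json(record: SearchRecord) -> str:
    """Serialize a record so the search can be reviewed or re-run later."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text: str) -> SearchRecord:
    """Rebuild a record from its JSON form."""
    return SearchRecord(**json.loads(text))
```

Keeping the software and category versions next to the search criteria would make it possible to tell whether two result files are comparable.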

Curation form

  • Curator login
  • Import of search results files
  • Organization of search results file
  • Selection of search results file for curation
  • Display of paper bibliographic information
  • Curation of selected sentences file
  • Search functionality
  1. Gene
  2. Paper
  3. Curator
  4. Annotation date
  5. Component term in sentence
  6. GO term used for annotation
  • Curation when all entities are recognized
  • Curation when one or more entities are not recognized and need to be added
  • Feedback from form to Textpresso
  1. Enter a new gene name and identifier
  2. Enter a new component term in sentence, add to Textpresso cellular component category
  • Edit a previous annotation
  • Edit relationship index
  • Delete a search results file
  • Export annotations
  1. To a MOD
  2. To Protein2GO
  3. As a file - GO Gene Association File (GAF)
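
For the file export, a sketch of writing one annotation as a GAF row may help. It assumes the GAF 2.0 layout (17 tab-separated columns, preceded by a `!gaf-version: 2.0` header line); this page does not say which GAF version the tool will target. The `Annotation` class and its field names are hypothetical, and the aspect column is fixed to `C` because the tool curates cellular component annotations.

```python
from dataclasses import dataclass
from datetime import date

GAF_HEADER = "!gaf-version: 2.0"


@dataclass
class Annotation:
    """A curated annotation (hypothetical structure for illustration)."""
    db: str                  # e.g. "WB" or "UniProtKB"
    object_id: str           # MOD or UniProtKB identifier
    symbol: str              # gene or protein symbol
    go_id: str               # GO term used for annotation
    reference: str           # e.g. "PMID:..."
    evidence: str            # GO evidence code, e.g. "IDA"
    taxon: str               # e.g. "taxon:6239"
    annotated_on: date       # annotation date
    assigned_by: str         # group that made the annotation
    object_type: str = "protein"
    qualifier: str = ""
    with_from: str = ""
    name: str = ""
    synonyms: tuple = ()


def to_gaf_line(a: Annotation) -> str:
    """Format one annotation as a 17-column GAF 2.0 row."""
    columns = [
        a.db, a.object_id, a.symbol, a.qualifier, a.go_id,
        a.reference, a.evidence, a.with_from,
        "C",                                  # aspect: cellular component
        a.name, "|".join(a.synonyms), a.object_type, a.taxon,
        a.annotated_on.strftime("%Y%m%d"),    # GAF dates are YYYYMMDD
        a.assigned_by,
        "",                                   # annotation extension (unused here)
        "",                                   # gene product form ID (unused here)
    ]
    return "\t".join(columns)
```

A full export would write `GAF_HEADER` once and then one `to_gaf_line` row per annotation.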

Files needed

  1. Mapping file for gene names and synonyms to MOD identifier and UniProtKB identifier
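
The layout of the mapping file is not specified on this page. As a sketch only, assume one tab-separated row per gene name or synonym, with three columns: name, MOD identifier, UniProtKB identifier. Under that assumption a lookup table could be built as follows; the function name `load_mapping` is hypothetical.

```python
import csv
from collections import defaultdict


def load_mapping(handle):
    """Map each gene name or synonym to a set of (MOD ID, UniProtKB ID) pairs.

    Assumes tab-separated rows of: name, MOD identifier, UniProtKB identifier.
    A name can map to several pairs, because synonyms may be ambiguous.
    """
    lookup = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows with missing columns.
        if not row or row[0].startswith("#") or len(row) < 3:
            continue
        name, mod_id, uniprot_id = (value.strip() for value in row[:3])
        lookup[name].add((mod_id, uniprot_id))
    return dict(lookup)
```

Returning a set per name keeps ambiguous synonyms visible, so the curation form could ask the curator to choose rather than silently picking one identifier.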