Difference between revisions of "Textpresso Central"

From WormBaseWiki

Revision as of 21:38, 6 November 2011

General considerations: It is important to specify the data models, markup languages, and flow now.


Searching and Category/Ontology Development

  • (+) Control panel: loading papers from existing corpora into a viewer, incorporation of PubMed queries; search results will be used to import full text from PMC or journal site
  • (+) Development of NLP toolbox: pattern matching, statistics, SVM, HMM, CRF
  • (+) Index of all NLP results for faster querying
  • (+) Textpresso Ontology viewer and editor
  • (+) Ontology development
  • (+) Robust back-end infrastructure with internal Textpresso database holding all annotations
  • (+) Querying the curation status of papers
  • (+) Add searching for previously made annotations in papers to search capabilities


Viewing

  • (+) Viewer: selecting terms, importing them into OA, prepopulating entries of forms; display results from NLP tools; initiate new NLP analyses (pattern matching, statistical, machine learning)


Annotating and Curating

  • (+) OA and its interaction with TC

Curators would like to be able to view the search results while curating and make annotations from the true positive sentences.

Currently, the options for doing this are:

1) the CCC (Cellular Component Curation) form

Pros:

  • sentences are seen on the same page as annotations
  • form pre-populates curation fields with protein names, category terms, and suggested annotations
  • easy to mark sentences if not curatable

Cons:

  • duplicating or making multiple annotations is cumbersome
  • don't see term info for proteins or GO terms
  • don't see additional annotations for proteins mentioned in sentences

2) the interaction configuration of the OA

Any others? Ask other WB curators.

  • (+) Robust back-end infrastructure with internal Textpresso database holding all annotations.