Difference between revisions of "Sentence Saver for Category Seed"

From WormBaseWiki

Revision as of 14:40, 22 July 2009

Basic Functionality

  • The sentence saver for seeding categories is intended as a starting point for building new categories that can then be tested in Textpresso searches.
  • The input is a list of paper IDs for known positive papers for a given data type.
  • The initial output is a list of words sorted according to their frequency and distribution in a subset of sentences within the input papers.
  • The final output is one or more categories of words that can be uploaded to a Textpresso implementation for testing.
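
The frequency and distribution list described above could be computed roughly as follows. This is a minimal sketch, not the tool's actual implementation: the function name term_stats, the tokenizing rule, the sort order, the placeholder paper IDs, and the example sentences are all assumptions made for illustration.

```python
import re
from collections import Counter


def term_stats(sentences_by_paper):
    """Return (term, total_count, n_sentences, n_papers) rows.

    Rows are sorted by total count, then by number of papers, then by term.
    """
    total = Counter()         # how often each term occurs overall
    in_sentences = Counter()  # how many sentences contain the term
    in_papers = Counter()     # how many papers contain the term
    for paper_id, sentences in sentences_by_paper.items():
        paper_terms = set()
        for sentence in sentences:
            # Assumed tokenizer: lowercase alphanumeric runs, hyphens allowed inside
            words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())
            total.update(words)
            in_sentences.update(set(words))
            paper_terms.update(words)
        in_papers.update(paper_terms)
    rows = [(t, total[t], in_sentences[t], in_papers[t]) for t in total]
    rows.sort(key=lambda r: (-r[1], -r[3], r[0]))
    return rows


# Hypothetical saved sentences, keyed by made-up placeholder paper IDs
saved = {
    "WBPaper00000001": [
        "The protein has kinase activity in vitro.",
        "Kinase activity was lost in the mutant.",
    ],
    "WBPaper00000002": ["We detected phosphatase activity."],
}
stats = term_stats(saved)
for term, count, n_sentences, n_papers in stats[:5]:
    print(term, count, n_sentences, n_papers)
```

Reporting the number of sentences and papers alongside the raw count helps separate terms that are common across the positive papers from terms that are merely repeated within one paper.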

Use

  • The sentence saver is used when curators already know papers from which they want to save sentences.
  • For example, creating new categories for curation of GO Molecular Function terms related to enzymatic activity by selecting sentences from papers already curated.
  • Alternatively, saving sentences from papers initially being read for another purpose, such as concise descriptions or reference genome curation.

Sample Work Scenarios

Brand New Categories

  • Curator logs in.
  • Enters a list of 30 WBPaper IDs for papers containing sentences positive for a previously uncurated data type.
  • Clicks on 'Load Sentence File and Query Papers !'.
  • Selects ~100 sentences from the positive papers.
  • Saves sentences, giving sentence file a name.
  • Examines frequency/distribution of terms.
  • Selects terms for one or more categories.
  • Names categories and saves them (somehow connecting the new categories to the saved sentence file).
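
One possible way to connect a saved category to its sentence file, as the last step suggests, is to store the sentence file name alongside the category's terms. The sketch below is only an assumption about how this might look: the Category class, the build_category function, and the example names are hypothetical, and the format Textpresso expects for uploaded categories is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Category:
    name: str
    terms: list
    source_sentence_file: str  # name of the sentence file the terms came from


def build_category(name, selected_terms, sentence_file_name):
    """Create a category from curator-selected terms, removing duplicates."""
    terms = sorted({term.lower() for term in selected_terms})
    return Category(name, terms, sentence_file_name)


# Hypothetical category and sentence file names
category = build_category(
    "enzymatic_activity",
    ["kinase", "Phosphatase", "kinase"],
    "enzyme_activity",
)
print(category)
```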

Adding more sentences to an existing file

  • Curator wants to add more sentences to an existing file to improve frequency analysis.
  • Logs in.
  • Enters a list of new papers from which to save sentences to an already existing file.


Key Features

  • Users need to be able to keep adding sentences to an existing file over multiple sessions, as it is unlikely that all sentences will be selected in one sitting.
  • Users need to be able to see the sentence file prior to uploading a list of papers. This will help to keep track of the list of papers from which sentences have been selected.
  • Users need to be able to edit the sentence file, deleting individual sentences or whole papers if necessary.
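
The features above suggest a sentence file that records which paper each sentence came from. The following is a minimal sketch under that assumption: the SentenceFile class, its method names, the JSON storage format, and the example IDs and sentences are all hypothetical and do not describe the actual tool.

```python
import json
import tempfile
from pathlib import Path


class SentenceFile:
    """Saved sentences grouped by the paper they were selected from."""

    def __init__(self, name, sentences_by_paper=None):
        self.name = name
        self.sentences_by_paper = sentences_by_paper or {}

    def papers(self):
        """List the papers already used, so curators can track progress."""
        return sorted(self.sentences_by_paper)

    def add(self, paper_id, sentence):
        existing = self.sentences_by_paper.setdefault(paper_id, [])
        if sentence not in existing:  # avoid saving the same sentence twice
            existing.append(sentence)

    def delete_sentence(self, paper_id, sentence):
        sentences = self.sentences_by_paper.get(paper_id, [])
        if sentence in sentences:
            sentences.remove(sentence)
        if paper_id in self.sentences_by_paper and not sentences:
            del self.sentences_by_paper[paper_id]  # drop papers left empty

    def delete_paper(self, paper_id):
        self.sentences_by_paper.pop(paper_id, None)

    def save(self, directory):
        path = Path(directory) / f"{self.name}.json"
        path.write_text(json.dumps(self.sentences_by_paper, indent=2))
        return path

    @classmethod
    def load(cls, directory, name):
        path = Path(directory) / f"{name}.json"
        return cls(name, json.loads(path.read_text()))


with tempfile.TemporaryDirectory() as tmp:
    # First sitting: save one sentence under a made-up paper ID
    first = SentenceFile("enzyme_activity")
    first.add("WBPaper00000001", "The protein has kinase activity in vitro.")
    first.save(tmp)

    # Later sitting: reload, add more sentences, then remove a whole paper
    later = SentenceFile.load(tmp, "enzyme_activity")
    later.add("WBPaper00000002", "We detected phosphatase activity.")
    later.add("WBPaper00000003", "Expression was observed in neurons.")
    later.delete_paper("WBPaper00000003")
    later.save(tmp)

    papers_on_file = SentenceFile.load(tmp, "enzyme_activity").papers()
    print(papers_on_file)
```

Keying the sentences by paper ID makes it straightforward both to show which papers have already been processed and to delete a whole paper in one step.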

Details