Difference between revisions of "How to make a new Textpresso category"

From WormBaseWiki
Revision as of 15:23, 20 November 2009

Do I need to make a new category?

    Textpresso categories are an excellent tool for finding specific information in the literature.  The current C. elegans
    implementation has over 100 categories for use in searching and retrieval.  
    For any Textpresso implementation, there is likely some vocabulary that is specific to that implementation and will need
    to be included as categories.  For example, categories of organism-specific gene and protein names (official as well as
    synonyms) are essential.  In addition, categories of anatomy terms, both gross and subcellular, as well as life cycle or
    life stage terms are critical for retrieving curatable information.
    But what about other terms?  How do I know when an existing category is sufficient, when an existing category is close but
    could use some additional terms or editing, or when I need a new category entirely?
    Idea: a version of the category editor that runs a word frequency analysis on a set of positive sentences and maps the
    results onto existing categories.
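
    As a rough illustration of that idea, the sketch below takes word counts from a set of positive sentences and
    sorts them into hits per existing category and words that no category covers. It is not part of Textpresso:
    the function name map_to_categories, the category names, and the term lists are all hypothetical, and real
    categories would have to be loaded from wherever the implementation stores them.

```python
from collections import Counter


def map_to_categories(word_counts, categories):
    """Split word counts into per-category hits and uncovered words.

    word_counts: mapping of word -> frequency in the positive sentences.
    categories:  mapping of category name -> set of terms (hypothetical data).
    """
    hits = {name: Counter() for name in categories}
    uncovered = Counter()
    for word, count in word_counts.items():
        matched = False
        for name, terms in categories.items():
            if word in terms:
                hits[name][word] = count
                matched = True
        if not matched:
            uncovered[word] = count
    return hits, uncovered


# Hypothetical categories and counts, for illustration only.
example_categories = {
    "anatomy": {"pharynx", "neuron"},
    "life stage": {"embryo", "larva"},
}
example_counts = Counter({"expressed": 5, "pharynx": 3, "embryo": 2})

hits, uncovered = map_to_categories(example_counts, example_categories)
for name, counter in hits.items():
    print(name, dict(counter))
print("uncovered", dict(uncovered))
```

    One possible reading of the output: frequent words that already fall in a category suggest the category is
    sufficient, a few uncovered words related to a category suggest editing it, and many uncovered frequent words
    may point to a new category.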

How do I edit an existing category for my implementation?

    Ideas: Are existing categories easily downloaded as text files?  
           What are the prospects for a simple category editor?

How do I create a new category?

    The best way to create a new Textpresso category is to collect a set of sentences that describe the data type or 
    information you'd like to curate and perform a word frequency analysis.  
    WormBase has a sentence saver tool (http://textpresso-dev.caltech.edu/cgi-bin/azurebrd/sentence_word_category.cgi)
    that helps curators do this.
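
    For curators without access to that tool, a word frequency analysis over collected sentences can be sketched
    in a few lines of Python. This is a minimal sketch and not the WormBase tool's method: the function name
    word_frequencies, the short stop-word list, the tokenizing pattern, and the example sentences are all
    assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "it", "to", "that", "was", "with"}


def word_frequencies(sentences, stop_words=STOP_WORDS):
    """Count how often each non-stop word occurs across the sentences."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase, then keep runs of letters, digits, and hyphens (e.g. "unc-54").
        words = re.findall(r"[a-z0-9][a-z0-9\-]*", sentence.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


# Made-up example sentences describing a data type of interest.
example_sentences = [
    "The gene is expressed in the pharynx.",
    "Expression was observed in body wall muscle.",
    "The reporter is expressed in head neurons.",
]

for word, count in word_frequencies(example_sentences).most_common(5):
    print(word, count)
```

    The most frequent words that remain after removing stop words are candidate terms for the new category.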