Difference between revisions of "Overview"

From WormBaseWiki

Revision as of 22:28, 23 October 2014

Intro

Phenotype curation at WormBase entails the assignment of a phenotype term from the C. elegans Phenotype Ontology to

  • variations
  • transgenes
  • strains and
  • rearrangements

Sources for allele phenotype curation

papers - see below
unsolicited submission through webform:  allele_form@tazendra.caltech.edu
solicited submission through author first pass form: first_pass.cgi@tazendra.caltech.edu
National Bioresource Project of Japan (NBP): sent quarterly as a text file from Mary Ann, and processed by script into Postgres

Identifying papers for phenotype curation

All papers flagged for phenotype curation can be queried through the curation status CGI, curation_status.cgi, which is described in detail on the Curation_progress wiki. You can also use this form to check whether a particular paper, or a list of papers, has been flagged for a given datatype. To get the list of uncurated papers with phenotype data, use the form to filter on all papers flagged for phenotype but not curated.
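
The "flagged but not curated" filter amounts to a set difference. The following is a minimal sketch, assuming the flagged and curated paper IDs have already been exported as plain lists. The function name and the example lists are illustrative and are not part of curation_status.cgi.

```python
def uncurated_papers(flagged, curated):
    """Return flagged paper IDs that have not been curated, in sorted order."""
    return sorted(set(flagged) - set(curated))


# Hypothetical example data; real lists would come from the curation status form.
flagged_for_phenotype = ["WBPaper00044457", "WBPaper00044453", "WBPaper00044456"]
curated_for_phenotype = ["WBPaper00044456"]

for paper_id in uncurated_papers(flagged_for_phenotype, curated_for_phenotype):
    print(paper_id)
```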

Papers are flagged for phenotype curation through different pipelines (NOTE: the phenotype flag is called 'newmutant'):

  • Manual first pass - from 200? until 2009, one curator manually flagged every paper that was uploaded by WormBase for containing any of over 30 datatypes. Many of the papers flagged in this way for containing mutant analysis data remain as a backlog of work that still needs to be tackled. As of 2011, over 800 of these papers had not been curated.
  • Support Vector Machine (SVM) algorithm - an automated flagging pipeline, started at the end of 2009, that typically runs every couple of weeks. The algorithm was tuned specifically to rank papers as no, low, medium, or high with respect to containing curatable phenotype data. A link to each run's results is sent out to all involved curators; however, you can also access all SVM results.

The results are listed as shown below. The '.concat' suffix indicates that the paper and its supplementary materials were scanned together.

WBPaper00044453.concat	high	24220508
WBPaper00044456.concat	high	24223195
WBPaper00044457.concat	low	24223727
WBPaper00044458.concat	medium	24223821
WBPaper00044468.concat	low	24225442
WBPaper00044470.concat	high	24231678
WBPaper00044471.concat	high	24231804
WBPaper00044482.concat	high	24243022
WBPaper00044484.concat	high	24244204
WBPaper00044486.concat	high	24244732
WBPaper00044487.concat	high	24244862
WBPaper00044491.concat	low	24252776
WBPaper00044495.concat	high	24239117
WBPaper00044501.concat	high	24258276
WBPaper00044502.concat	high	24260022
WBPaper00044504.concat	high	24260346
WBPaper00044506.concat	high	24262006
WBPaper00044509.concat	high	22403392
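
A listing in this layout can be read into records and filtered by rank. The sketch below assumes whitespace-separated columns as shown above. The meaning of the third column is not stated on this page, so it is kept as an uninterpreted string. All names in the code are illustrative.

```python
from collections import Counter
from typing import NamedTuple


class SvmResult(NamedTuple):
    paper_id: str          # e.g. "WBPaper00044453"
    with_supplement: bool  # True when the name ends in ".concat"
    rank: str              # "no", "low", "medium", or "high"
    identifier: str        # third column, meaning not stated in the source


def parse_svm_line(line):
    """Parse one whitespace-separated result line into an SvmResult."""
    name, rank, identifier = line.split()
    paper_id, _, suffix = name.partition(".")
    return SvmResult(paper_id, suffix == "concat", rank, identifier)


def parse_svm_results(text):
    """Parse every non-blank line of a results listing."""
    return [parse_svm_line(line) for line in text.splitlines() if line.strip()]


sample = """\
WBPaper00044453.concat\thigh\t24220508
WBPaper00044457.concat\tlow\t24223727
WBPaper00044458.concat\tmedium\t24223821
"""

results = parse_svm_results(sample)
high = [r.paper_id for r in results if r.rank == "high"]
print(high)
print(Counter(r.rank for r in results))
```

Sorting or grouping by rank in this way makes it easy to start with the 'high' papers from each run.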


  • Author first pass - with the inception of automated methods to flag papers, an author first pass form was introduced as a safety net for catching false negatives. The form is sent out weekly (on Thursdays) to the first e-mail address that can be found in a paper, if an e-mail hasn't been sent already. For the most part this targets papers that have recently been added to our local database. If no e-mail address can be found in the paper, the form is sent later, once an author has verified the paper as theirs, to the newly verified e-mail address. This form is a simpler version of the curator first pass form used during manual curation.
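
The address selection rule for the author first pass e-mail can be sketched as follows. This is an illustration only: the regular expression is a deliberately simple pattern and not the one used by the actual pipeline, and `already_sent` and `verified_email` are hypothetical inputs standing in for whatever the local database records.

```python
import re

# Deliberately simple pattern; an assumption, not the pipeline's real matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def first_email(paper_text):
    """Return the first e-mail address found in the text, or None."""
    match = EMAIL_RE.search(paper_text)
    return match.group(0) if match else None


def address_to_contact(paper_id, paper_text, already_sent, verified_email=None):
    """Pick the address for the author first pass form, or None to send nothing.

    already_sent   -- hypothetical set of paper IDs that were e-mailed before
    verified_email -- hypothetical address of an author who verified the paper
    """
    if paper_id in already_sent:
        return None
    # Prefer the first address in the paper; fall back to a verified author.
    return first_email(paper_text) or verified_email
```

A paper with no address in its text and no verified author yields None, so it is simply retried on a later weekly run.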

Once a paper has been entered in the OA and attached to an object-phenotype connection, it should disappear from the curation status form.

--kjy (talk) 00:33, 22 January 2014 (UTC)

Curating phenotype

This curation duty is specific to single mutants and does not cover phenotypes due to interactions (see the interaction phenotypes section below).

To start curation of phenotypes, pick a paper that has been flagged but not curated. Once a paper has been chosen, log in to the phenotype OA and enter the WBPaperID into the paper field to retrieve the paper, which is accessible through the term info window of the OA. Run a search on the paper to make sure it hasn't already been curated, in case it was missed during the curation status cgi update.

The WBPaperID field is an autocomplete field, so entering only the number of the paper should bring up a drop-down list from which you can select the paper.

When ready to start curating, click 'New' in the menu of the lower table. This will erase the paper ID you entered, but it will reset the form with a valid pgid and auto-populate your name in the curator field. Re-enter the WBPaperID.

All object fields except strain should be autocomplete drop-down lists. The files used to populate these fields are in an obo-like format, in that information attached to each object shows up in the term info box when the object is selected. Keeping the files updated from acedb and showing this information in the term info box helps during curation: it verifies the identity of the object being curated and saves the curator from having to look up and verify the information manually. Although these files are not technically 'obo files', the term 'obo file' is used here for any flat file that contains a list of terms with accompanying information for display in the term info window. This is in contrast to other flat files that contain only a simple list of terms.
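
To illustrate the difference between an obo-like file and a simple term list, here is a minimal sketch of reading term stanzas into a lookup table for a term info display. The exact layout of the real files is not given on this page, so the `[Term]` stanza layout with `tag: value` lines is an assumption borrowed from the standard OBO format, and the IDs and tags in the sample are placeholders.

```python
def parse_obo_like(text):
    """Map each term id to a dict of its tag/value lines.

    Assumes '[Term]' starts a stanza, a blank line ends it, and every
    other line inside a stanza has the form 'tag: value'.
    """
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = {}
        elif not line:
            current = None
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current[tag] = value
            if tag == "id":
                terms[value] = current
    return terms


# Placeholder data, not real WormBase objects.
sample = """\
[Term]
id: EXAMPLE:0001
name: placeholder allele
gene: placeholder-1

[Term]
id: EXAMPLE:0002
name: another placeholder
"""

terms = parse_obo_like(sample)
for tag, value in terms["EXAMPLE:0001"].items():
    print(f"{tag}: {value}")
```

A simple term list, by contrast, would carry only the identifiers, with nothing to show in the term info window.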

Using the Phenotype OA

Extensive documentation for the phenotype OA and dumpers can be found on the Phenotypes pages.

Interaction phenotypes

This curation duty assigns phenotypes based on interactions between and among alleles and genes whose function has been altered through overexpression (transgenes) or through knockdown (RNAi). This curation uses the Interaction OA and assigns a genetic interaction tag along with a phenotype to these gene interactions. This curation duty is divided among different curators based on the elements in the interaction. That is, the RNAi curator assigns genetic interaction phenotypes to any interaction involving an RNAi object, which needs to be created by the RNAi curator. Allele-allele or allele-transgene interactions are curated by the allele phenotype curator.
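
The division of labor described above can be written as a small rule. This is a sketch only: the type labels ("RNAi", "allele", "transgene") and the function name are illustrative and are not taken from the Interaction OA, and combinations the text does not mention are reported as unassigned rather than guessed.

```python
def interaction_curator(participant_types):
    """Return which curator handles an interaction, per the rules above."""
    types = set(participant_types)
    # Any interaction involving an RNAi object goes to the RNAi curator.
    if "RNAi" in types:
        return "RNAi curator"
    # Allele-allele and allele-transgene interactions go to the allele curator.
    if "allele" in types and types <= {"allele", "transgene"}:
        return "allele phenotype curator"
    # Anything else is not covered by the text.
    return "unassigned"


print(interaction_curator(["allele", "RNAi"]))
print(interaction_curator(["allele", "transgene"]))
```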