Difference between revisions of "Overview"

From WormBaseWiki
Revision as of 05:50, 16 May 2011

Intro

Phenotype curation at WormBase entails assigning a phenotype term from the C. elegans Phenotype Ontology to:

  • variations
  • transgenes
  • strains and
  • rearrangements

Identifying papers for phenotype curation

All papers flagged for phenotype curation are posted on the curation status cgi (http://tazendra.caltech.edu/~postgres/cgi-bin/curation_status.cgi?action=fnc&field=newmutant). The highlighting of each WBPaperID on the page reflects the source of the data type flag and is used to prioritize papers for curation. Papers are flagged for phenotype curation through a few different pipelines:

  • Manual first pass (blue) - from 200? until 2009, one curator manually flagged every paper that was uploaded by WormBase. These papers were flagged for containing mutant analysis data, and many remain as a backlog of work that still needs to be tackled. As of 2011, roughly 800 papers that were flagged by this route have still not been curated.
  • Support Vector Machine (SVM) algorithm (blue) - an automated flagging pipeline, started at the end of 2009, that flags papers every couple of weeks. The algorithm was tuned specifically to rank papers as no, low, medium, or high with respect to containing curatable phenotype data.
  • Author first pass (green) - with the inception of automated methods to flag papers, an author first pass form was introduced as a safety net for catching false negative papers. The form is sent out weekly (on Thursdays) to the first e-mail address that can be found in a paper, if an e-mail hasn't been sent already. For the most part this targets papers that have recently been added to our local database. If no e-mail address can be found in the paper, an author may verify the paper as theirs at a later date, and the author first pass form is then sent to the newly verified e-mail address. This form is a simpler version of the curator first pass form used during manual curation.

Once a paper has been entered in the OA and attached to an object-phenotype connection, it should disappear from the curation status form.
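
The weekly author first pass selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual WormBase script: the Paper record, its field names and the papers_to_email function are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

THURSDAY = 3  # date.weekday() counts Monday as 0, so Thursday is 3


@dataclass
class Paper:
    """Hypothetical record; the field names are illustrative only."""
    wbpaper_id: str
    first_email: Optional[str] = None     # first e-mail address found in the paper
    verified_email: Optional[str] = None  # address of an author who later verified the paper
    form_sent: bool = False               # True once the author first pass form has gone out


def papers_to_email(papers: List[Paper], today: date) -> List[Tuple[str, str]]:
    """Return (WBPaperID, address) pairs that would receive the form today."""
    if today.weekday() != THURSDAY:
        return []  # the form only goes out on Thursdays
    recipients = []
    for paper in papers:
        if paper.form_sent:
            continue  # an e-mail has been sent already
        # Prefer the address found in the paper; fall back to a verified author address.
        address = paper.first_email or paper.verified_email
        if address:
            recipients.append((paper.wbpaper_id, address))
    return recipients
```

A paper with no address at all is simply skipped until an author verifies it, which mirrors the fallback described in the author first pass item.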

Note: papers in red have been identified as containing genes that do not yet have any phenotype information attached to them, through either allele analysis or RNAi analysis (see priority gene determination below). Hot pink marks papers that are both red and green (author first passed); these can be viewed as the highest priority.
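
The colour scheme can be summarised in a small sketch. The function and the flag names are hypothetical. The text only states that red combined with green gives hot pink, so the precedence used for the remaining combinations (red over green over blue) is an assumption of this sketch and may not match the cgi.

```python
from typing import Iterable, Optional


def highlight_colour(sources: Iterable[str], has_priority_gene: bool) -> Optional[str]:
    """Pick a highlight colour for a paper.

    sources: flagging pipelines that flagged the paper; the names
             "manual_first_pass", "svm" and "author_first_pass" are hypothetical.
    has_priority_gene: True if the paper contains a gene that has no
             phenotype information attached yet.
    """
    flagged_by = set(sources)
    author_flagged = "author_first_pass" in flagged_by
    if has_priority_gene and author_flagged:
        return "hot pink"  # red and green coincide: highest priority
    if has_priority_gene:
        return "red"       # assumed to take precedence over blue
    if author_flagged:
        return "green"     # assumed to take precedence over blue
    if flagged_by & {"manual_first_pass", "svm"}:
        return "blue"
    return None            # paper is not flagged at all
```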

Curating phenotype

This curation duty is specific to single mutants and does not cover phenotypes due to interactions (see interaction phenotypes section below).

To start curation of phenotypes, I usually open the curation status cgi here, and take a look at the papers that need to be done. Once a paper has been chosen, log in to the phenotype OA and enter the WBPaperID into the paper field to retrieve the paper, which is accessible through the term info window of the OA. Run a search on the paper to make sure it hasn't already been curated - just in case it was missed during the curation status cgi update.

The WBPaperID field is an autocomplete field, so entering only the number of the paper should bring up a drop-down list from which you can select the paper.

When ready to start curating, click 'New' in the menu of the lower table. This will erase the paper ID you entered, but it will reset the form with a valid pgid and auto-populate your name in the curator field. Re-enter the WBPaperID.

All object fields except strain should be autocomplete drop-down lists. The files used to populate these fields are in an obo-like format, in that each object has information attached to it that shows up in the term info box when the object is selected. Keeping these files updated from acedb and showing this information in the term info box helps during curation: it verifies the identity of the object being curated and saves the curator from having to look up and verify the information manually. Although these are not technically 'obo files', the term obo file is used here for any flat file that contains a list of terms with accompanying information for display in the term info window. This is in contrast to other flat files that contain only a simple list of terms.
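
To illustrate how such a file could back the term info box, here is a minimal sketch that reads OBO-style [Term] stanzas into a lookup table. It assumes the files follow the usual OBO tag-value layout (id:, name: and so on). The WormBase files are only described as obo-like, so the real layout may differ, and the sample stanzas, the EXAMPLE identifiers and the function name are made up for this example.

```python
from typing import Dict, List

# Made-up sample data in OBO tag-value style (not taken from a real WormBase file).
SAMPLE = """\
format-version: 1.2

[Term]
id: EXAMPLE:0000001
name: example term
synonym: "sample term" EXACT []

[Term]
id: EXAMPLE:0000002
name: another example term
is_a: EXAMPLE:0000001
"""


def parse_obo_terms(text: str) -> Dict[str, Dict[str, List[str]]]:
    """Map each term id to its tag -> list of values, as shown in a term info box."""
    terms: Dict[str, Dict[str, List[str]]] = {}
    current = None  # stanza being filled; None while in the header or a non-Term stanza
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A bracketed line opens a new stanza; only [Term] stanzas are kept.
            current = {} if line == "[Term]" else None
            continue
        if not line or line.startswith("!") or current is None or ": " not in line:
            continue  # skip blanks, comments, header lines and malformed lines
        tag, value = line.split(": ", 1)
        current.setdefault(tag, []).append(value)
        if tag == "id":
            terms[value] = current  # register the stanza under its identifier
    return terms
```

Looking up parse_obo_terms(SAMPLE)["EXAMPLE:0000001"] returns all tags recorded for that term, which is the kind of information a curator would check before accepting an autocomplete suggestion.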

obo files for the phenotype OA

The following fields use an obo file. Where available, the name, source and generating script of the obo file are noted.

  • Pub field -> paper.obo
  • Person field
  • Variation
  • Transgene
  • Rearrangement
  • Caused by -> WBGene
  • Phenotype -> phenotype.obo
  • Molecule -> molecule.obo
  • Anatomy
  • Life stage
  • Child of -> phenotype.obo
  • Laboratory evidence
  • Entity
  • Quality


Interaction phenotypes

This curation duty assigns phenotypes based on experiments that test interactions between and among alleles and genes whose function has been altered through overexpression (transgenes) or knockdown (RNAi). It uses the Interaction OA and assigns a genetic interaction tag along with a phenotype to these gene interactions. The duty is divided among curators based on the elements in the interaction. That is, the RNAi curator assigns genetic interaction phenotypes to any interaction involving an RNAi object, which needs to be created by the RNAi curator. Allele-allele or allele-transgene interactions are curated by the allele phenotype curator.
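
The division of labour above amounts to a simple routing rule, sketched here. The function name and the element type labels ("allele", "transgene", "rnai") are hypothetical, and combinations the text does not mention are deliberately left unassigned rather than guessed.

```python
from typing import Iterable, Optional


def responsible_curator(element_types: Iterable[str]) -> Optional[str]:
    """Return the curator responsible for an interaction, or None if unspecified.

    element_types: the kinds of objects in the interaction, using the
    hypothetical labels "allele", "transgene" and "rnai".
    """
    kinds = set(element_types)
    if "rnai" in kinds:
        # Any interaction involving an RNAi object goes to the RNAi curator.
        return "RNAi curator"
    if "allele" in kinds and kinds <= {"allele", "transgene"}:
        # Allele-allele and allele-transgene interactions.
        return "allele phenotype curator"
    return None  # combination not covered by the description above
```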