Difference between revisions of "Overview"
Revision as of 22:28, 23 October 2014
Phenotype curation at WormBase entails the assignment of a phenotype term from the C. elegans Phenotype Ontology to
- strains and
Sources for allele phenotype curation
- papers - see below
- unsolicited submission through webform: firstname.lastname@example.org
- solicited submission through author first pass form: email@example.com
- National Bioresource Project of Japan (NBP): sent quarterly as a text file from Mary Ann, and processed by script into postgres
Identifying papers for phenotype curation
All papers flagged for phenotype curation can be queried through the curation status cgi (curation_status.cgi), which is described in detail on the Curation_progress wiki. You can also use this form to check whether a particular paper, or a list of papers, has been flagged for a given datatype. Use the form to filter on all papers flagged for phenotype but not curated, which will take you to the list of uncurated papers with phenotype data.
Papers are flagged for phenotype curation through different pipelines (NOTE: the phenotype flag is called 'newmutant'):
- Manual first pass - from 200? until 2009, one curator manually flagged every paper uploaded by WormBase for any of over 30 datatypes. Papers flagged in this way as containing mutant analysis data form a backlog of work that still needs to be tackled. As of 2011, roughly 800 or more of these papers had not been curated.
- Support Vector Machine (SVM) algorithm - an automated flagging pipeline, started at the end of 2009, that typically runs every couple of weeks. The algorithm was tuned specifically to rank papers as no, low, medium, or high with respect to containing curatable phenotype data. A link to each run's results is sent out to all involved curators; however, you can also access all SVM results.
The results are listed as below. '.concat' refers to the fact that the paper and its supplementary materials were scanned together.
WBPaper00044453.concat high 24220508
WBPaper00044456.concat high 24223195
WBPaper00044457.concat low 24223727
WBPaper00044458.concat medium 24223821
WBPaper00044468.concat low 24225442
WBPaper00044470.concat high 24231678
WBPaper00044471.concat high 24231804
WBPaper00044482.concat high 24243022
WBPaper00044484.concat high 24244204
WBPaper00044486.concat high 24244732
WBPaper00044487.concat high 24244862
WBPaper00044491.concat low 24252776
WBPaper00044495.concat high 24239117
WBPaper00044501.concat high 24258276
WBPaper00044502.concat high 24260022
WBPaper00044504.concat high 24260346
WBPaper00044506.concat high 24262006
WBPaper00044509.concat high 22403392
- Author first pass - when automated flagging was introduced, an author first pass form was set up as a safety net for catching false negative papers. The form is sent out weekly (on Thursdays) to the first e-mail address that can be found in a paper, provided an e-mail hasn't been sent for that paper already. For the most part this targets papers that have recently been added to our local database. If no e-mail address can be found in the paper, but an author later verifies the paper as theirs, the author first pass form is sent to the newly verified e-mail address at that time. This form is a simpler version of the curator first pass form used during manual curation.
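The SVM result listing shown above (lines such as `WBPaper00044453.concat high 24220508`) can be read with a short script when you want to pick out the higher-ranked papers first. The following is only a minimal sketch, not the pipeline's own code: the function and field names are hypothetical, and treating the third column as a PubMed ID is an assumption based on how the numbers look.

```python
from typing import List, NamedTuple

# Rank labels named in the text, ordered from least to most likely
# to contain curatable phenotype data.
RANKS = ("no", "low", "medium", "high")


class SvmResult(NamedTuple):
    paper_id: str                  # e.g. "WBPaper00044453"
    scanned_with_supplement: bool  # True when the name ends in ".concat"
    rank: str                      # one of RANKS
    pmid: str                      # third column; ASSUMED to be a PubMed ID


def parse_svm_results(text: str) -> List[SvmResult]:
    """Parse whitespace-separated result lines; skip anything malformed."""
    results = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[1] not in RANKS:
            continue
        name, rank, pmid = fields
        paper_id, _, suffix = name.partition(".")
        results.append(SvmResult(paper_id, suffix == "concat", rank, pmid))
    return results


def papers_at_or_above(results: List[SvmResult], minimum: str = "medium") -> List[str]:
    """Return paper IDs whose rank is at least `minimum`, in input order."""
    threshold = RANKS.index(minimum)
    return [r.paper_id for r in results if RANKS.index(r.rank) >= threshold]
```

For example, `papers_at_or_above(parse_svm_results(listing), "high")` would give only the papers ranked high, which may be a reasonable place to start when working through a new batch.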
Once a paper has been entered in the OA and attached to an object-phenotype connection, it should disappear from the curation status form.
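The "flagged but not curated" filter amounts to a set difference: papers carrying the 'newmutant' flag, minus papers already attached to an object-phenotype connection. The sketch below only illustrates that logic. It does not reflect how curation_status.cgi is implemented, and the function name and data structures are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def flagged_but_uncurated(
    flags_by_paper: Dict[str, Set[str]],
    curated_papers: Iterable[str],
    datatype: str = "newmutant",  # name of the phenotype flag, per the text
) -> List[str]:
    """Return sorted paper IDs flagged for `datatype` that are not yet curated."""
    curated = set(curated_papers)
    return sorted(
        paper
        for paper, flags in flags_by_paper.items()
        if datatype in flags and paper not in curated
    )
```

Once a paper is added to the curated set, it drops out of the returned list, which mirrors the behaviour described for the curation status form.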
This curation duty is specific to single mutants and does not cover phenotypes due to interactions (see interaction phenotypes section below).
To start curation of phenotypes, pick a paper that has been flagged but not curated. Once a paper has been chosen, log in to the phenotype OA and enter the WBPaperID into the paper field to retrieve the paper, which is accessible through the term info window of the OA. Run a search on the paper to make sure it hasn't already been curated, just in case it was missed during the curation status cgi update.
The WBPaperID field is an autocomplete field, so entering only the number of the paper should bring up a drop down list from which you can select the paper.
When ready to start curating, click 'New' in the menu of the lower table. This will erase the paper id you entered, but will reset the form with a valid pgid and auto-populate your name in the curator field. Re-enter the WBPaperID.
All object fields, except strain, should be autocomplete drop down lists. The files used to populate these fields are in an obo-like format, in that information is attached to each object and shows up in the term info box when the object is selected. Keeping the files updated from acedb and showing this information in the term info box helps during curation: it verifies the identity of the object being curated and saves the curator from having to look up and verify the information manually. Although these files are not technically 'obo files', that name is used here for any flat file that contains a list of terms with accompanying information for display in the term info window. This is in contrast to other flat files that only contain a simple list of terms.
Using the Phenotype OA
Extensive documentation for the phenotype OA and its dumpers can be found on the Phenotypes pages.
This curation duty assigns phenotypes based on interactions between and among alleles and genes whose function has been altered through overexpression (transgenes) or through knockdown (RNAi). This curation uses the Interaction OA and assigns a genetic interaction tag along with a phenotype to these gene interactions. This curation duty is divided among different curators based on the elements in the interaction. That is, the RNAi curator assigns genetic interaction phenotypes to any interaction involving an RNAi object, which needs to be created by the RNAi curator. Allele-allele or allele-transgene interactions are curated by the allele phenotype curator.
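The division of labour described above can be written down as a small decision rule. This is only a sketch of the rule as stated here, with hypothetical names. Combinations the text does not mention (for example transgene-transgene) deliberately return `None` rather than guessing an owner.

```python
from typing import Iterable, Optional

RNAI_CURATOR = "RNAi curator"
ALLELE_CURATOR = "allele phenotype curator"


def responsible_curator(element_types: Iterable[str]) -> Optional[str]:
    """Return who curates an interaction, given the types of its elements.

    `element_types` holds labels such as "allele", "transgene" or "RNAi"
    (hypothetical labels chosen for this sketch).
    """
    kinds = {kind.lower() for kind in element_types}
    # Any interaction involving an RNAi object goes to the RNAi curator.
    if "rnai" in kinds:
        return RNAI_CURATOR
    # Allele-allele and allele-transgene interactions go to the
    # allele phenotype curator.
    if "allele" in kinds and kinds <= {"allele", "transgene"}:
        return ALLELE_CURATOR
    # Not covered by the rule as written.
    return None
```

For instance, `responsible_curator(["allele", "RNAi"])` returns the RNAi curator, because the RNAi check takes precedence over the allele check.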