Phenotype curation at WormBase entails the assignment of a phenotype term from the C. elegans Phenotype Ontology to
- strains and
Identifying papers for phenotype curation
All papers flagged for phenotype curation are posted on the curation status CGI page. The highlighting of the WBPaperIDs on that page reflects how each paper was flagged and is used to prioritize papers for curation. Papers are flagged for phenotype curation through a few different pipelines:
- Manual first pass (blue) - from 200? until 2009, one curator manually flagged every paper that was uploaded by WormBase. These papers were flagged for containing mutant analysis data rather than phenotypic analysis of any of the other currently curated phenotype data types, and they remain as a backlog of work that still needs to be tackled. As of 2011, roughly 800 papers flagged by this route had still not been curated.
- Support Vector Machine (SVM) algorithm (blue) - an automated flagging pipeline, started at the end of 2009, that flags papers every couple of weeks. The algorithm was tuned specifically to rank papers as no, low, medium, or high with respect to containing curatable phenotype data.
- Author first pass (green) - with the inception of automated methods to flag papers, an author first pass form was introduced as a safety net for catching false negative papers. This form is sent out weekly (on Thursdays) to the first e-mail address that can be found in each paper, provided an e-mail hasn't already been sent for that paper. For the most part this targets papers that have recently been added to our local database. If no e-mail address can be found in a paper but an author later verifies the paper as theirs, the author first pass form is sent at that time to the newly verified e-mail address. This form is simpler than the curator first pass form used during manual curation.
Note: papers in red on the page have been identified as containing genes that do not yet have any phenotype information attached to them, through either allele analysis or RNAi analysis (see priority gene determination below). Hot pink marks papers that are both red and green (author first passed); these can be viewed as the highest priority.
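The highlighting scheme described above can be sketched as a small lookup. This is only an illustrative sketch, not the code behind the curation status page: the `PaperFlags` class, its field names, and the `highlight_color` function are hypothetical. The text states only that hot pink (red plus green) is the highest priority, so the precedence of red and green over blue for papers flagged by more than one route is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PaperFlags:
    """Hypothetical record of how a paper was flagged (names are assumptions)."""
    manual_first_pass: bool = False   # flagged by the curator first pass
    svm: bool = False                 # flagged by the SVM pipeline
    author_first_pass: bool = False   # flagged via the author first pass form
    priority_gene: bool = False       # contains a gene with no phenotype data yet


def highlight_color(flags: PaperFlags) -> Optional[str]:
    """Return the highlight color for a paper, or None if it is not flagged."""
    # Red and green together are shown as hot pink (highest priority).
    if flags.priority_gene and flags.author_first_pass:
        return "hot pink"
    if flags.priority_gene:
        return "red"
    if flags.author_first_pass:
        return "green"
    # Assumption: blue applies only when no red or green flag is present.
    if flags.manual_first_pass or flags.svm:
        return "blue"
    return None


if __name__ == "__main__":
    paper = PaperFlags(svm=True, author_first_pass=True, priority_gene=True)
    print(highlight_color(paper))
```

Checking the combined red and green case first keeps the highest-priority color from being masked by either of the single-flag colors.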