November 25, 2009 - Sequence Curation Flags
Revision as of 12:20, 25 November 2009
Call information
4:30pm GMT | 11:30am EST | 10:30am CST | 8:30am PST
US: 1-877-384-2311, +1-480-629-1629
UK: 0800-358-3475, +44-207-154-0025
Canada: 1-866-243-1291
participant access code: 822114
Location: wherever you are
Participants
http://tazendra.caltech.edu/~postgres/cgi-bin/curator_first_pass.cgi
What type of data is going into each flag?
Pipelines and options for flagging
Data may come into WormBase via various pipelines (e.g. GenBank, Knockout Consortia, user submissions), but for flagging data in published papers, these are the current pipelines:
- Curated papers are the best training set. Flagged papers can be used, if flagging was generally consistent.
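- Curation status form (http://tazendra.caltech.edu/~postgres/cgi-bin/curation_status.cgi) has lists, but they are not completely up to date.
- How to evaluate the results (http://caprica.caltech.edu/celegans/svm_results/) of the SVMs:
  - Precision - Of the returned positives, how many are true positives?
  - Recall - Of all the true positives, how many were returned? (Need to look at the negatives for the particular data type to find the ones that were missed.)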
- Is there any way for curators to see a list of features for each SVM? May help with understanding false positives.
- pattern matching
- category searches
- Curators need to tell Juancarlos they'd like to receive emails when authors flag a data type.
- Caltech needs to supply list of papers flagged since September 2009.
- Stats on return rates as of November 12, 2009 (supplied by Juancarlos):
  - Since Sept 1st, we have sent out 195 requests, and gotten back 72 results (36.9%).
  - Since Oct 1st, we have sent out 147 requests, and gotten back 52 results (35.3%).
  - Since Nov 1st, we have sent out 18 requests, and gotten back 7 results (38.9%).
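The return rates quoted above are simply results received divided by requests sent. A minimal sketch of that arithmetic follows; the function name `return_rate` is an illustrative choice, not part of any Caltech script, and the counts are copied from the stats above.

```python
def return_rate(sent: int, returned: int) -> float:
    """Percentage of author requests that came back with a result."""
    if sent <= 0:
        raise ValueError("sent must be a positive number of requests")
    return 100.0 * returned / sent


# (period start, requests sent, results returned), copied from the stats above
periods = [
    ("Sept 1st", 195, 72),
    ("Oct 1st", 147, 52),
    ("Nov 1st", 18, 7),
]

for start, sent, returned in periods:
    rate = return_rate(sent, returned)
    print(f"Since {start}: {returned}/{sent} = {rate:.1f}%")
```

With round-to-nearest, 52/147 comes out at about 35.4% rather than the 35.3% quoted, so either that percentage or one of its counts may be slightly off in the supplied figures.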
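The precision and recall questions in the SVM evaluation notes above can be made concrete with a small calculation. This is a sketch only: it assumes each paper is identified by an ID string and that a curator-verified set of true positives exists for the data type. The paper IDs and the function name `precision_recall` are hypothetical and not taken from the actual SVM results.

```python
from typing import Set, Tuple


def precision_recall(returned: Set[str], true_positives: Set[str]) -> Tuple[float, float]:
    """Compare papers returned by a classifier with curator-verified positives.

    precision: of the returned papers, the fraction that are truly positive
    recall:    of the truly positive papers, the fraction that were returned
    """
    hits = len(returned & true_positives)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(true_positives) if true_positives else 0.0
    return precision, recall


# Hypothetical example: the SVM returns four papers, curators confirm three
# papers in the corpus really contain the data type, and two overlap.
svm_returned = {"paper_A", "paper_B", "paper_C", "paper_D"}
curated_true = {"paper_B", "paper_C", "paper_E"}

precision, recall = precision_recall(svm_returned, curated_true)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this made-up case, `paper_E` is a true positive the classifier missed. It only shows up if someone checks the papers the SVM labelled negative, which is why recall needs a look at the negatives.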
Flag name | Flag information | Number of papers flagged manually (from curation status form) | Flag email (from first pass form) | Getting author flags? | Current approach | Curator(s) | Comments | Current pipeline sufficient?
---|---|---|---|---|---|---|---|---
gene symbol | newly cloned gene or new name for previously known gene | 342 | genenames, vanauken | no-vanauken | SVM (see comments) | Kimberly, Mary Ann? | Currently being combined with seqchange. Could possibly employ secondary screen with categories. |
mapping data | genetic mapping data | 194 | genenames | | | | |
sequence features | regulatory sequence features, including promoters, enhancers, and elements in mRNA | 248 | worm-bug, stlouis, xiaodong (xdwang) | | | | |
mass spectrometry | mass spec analysis | 65 | gw3, worm-bug | | Textpresso categories | Ruihua, Gary? | |
structure correction | gene structure corrections (see comments) | 333 | worm-ticket, worm-bug | | tried SVM | Gary, Paul Davis | Ideally divided into four categories: a change in a gene's structure, the addition of an isoform, a change to one of the SL1/SL2 or polyA site features, a sequence correction in the N2 reference genome |
sequence change | sequence of mutant alleles | 981 | genenames | | SVM | | |
new SNPs | new polymorphisms | 50 | tbieri | | | | |
new mutant - alleles | new phenotypes for new or existing (?) alleles | 1372 | | | | Erich, Gary, Jolene | |