First-pass schedule, instructions, automation
Genetics author first pass form
Link to the project page for GENETICS author data submission
First-Pass Rotation
First-Pass Curator | Two-week period | Landmarks |
Karen | 2/16-3/1 | |
Ranjana | 3/2-3/15 | |
Raymond | 3/16-3/29 | |
Jolene | 3/30-4/12 | New forms active |
Gary | 4/13-4/26 | Author forms sent out |
Xiaodong | 4/27-5/10 | Author feedback coming in |
Erich | 5/11-5/24 | |
Wen | 5/25-6/7 | |
Kimberly | 6/8-6/21 | |
Automation progress
Textpresso's Automation Explanation and Schedule http://goldturtle.caltech.edu/wcat/
First-Pass a paper
Access papers on the WBPaperEditor page
Pick a paper and access the curation form
- Go to http://tazendra.caltech.edu/~postgres/cgi-bin/wbpaper_editor.cgi
- Choose your name
- Scroll down the page and select "Not Curated Plus Textpresso!" The bodies of these papers have passed through Textpresso, so every data type found by the automated pipeline should already be reported in the appropriate fields.
- Scroll/Page down to a paper and select "Curate!"
Alternatively
- Access the curation.cgi from the WBPaperID page itself
- Select the WBPaperID from left column to take you to the paper page-- ONLY SELECT PAPERS FROM WBPaper0030000 AND LATER
- Select first-pass curate
Note: the paper pdf can be accessed from the paper page along with supplemental materials.
Either action takes you to the curation.cgi (SEE BELOW)
New firstpass curation.cgi
Instructions for curators for the new form
Textpresso/Author/Curator interim form
First-pass data types explained
Link to an explanation of the data types as they existed on the old curator firstpass form
Adding new gene paper connections
You can add gene paper connections through the WBPaper editor page.
You can search directly for the paper on this page, or you can access the paper by hitting "DISPLAY ALL!" and choosing the paper from the left column link. You will be taken to a summary page for the paper. At the bottom of the page, confirm or add new genes associated with that paper.
Other Tasks
Who | When | Task | Goal |
Jolene | current | with Ruihua: create curation textpresso webpage workspace | automating newmutant sentence identification/extraction for phenotype curation. |
Wen | DONE | make foreign language tag for Journals/Articles | Clear FP list of non-English language articles, which can't be curated |
Andrei | DONE | Analyze author fill-out form | To provide a summary of what works and what does not, and to seek improvements to the procedure. |
Andrei | DONE | Work out the correspondence of fields between the author and curator forms. | |
Juancarlos | DONE | Waiting for consensus on what to put in fields for author and curator for FP. |
Raymond suggests: Have a first-pass form for curators that shows the author's submission for the curator's approval (e.g. a tick), so that the approved information is sent (along with whatever information the curator puts in) for data extraction. If the first-pass curator disapproves of the author's input (by not ticking), the author's input will not be processed further or sent, but it will nevertheless be stored as is in the database. For the next phase, results from the automated first pass via Textpresso could be treated the same way as the author's. The ultimate goal is to maximize the number of fields that need no curator approval. Sources of first-pass curation should be clearly distinguished by a person ID (Textpresso will be assigned one). Juancarlos is okay with this, but while I'm leaning towards assigning the author response to a PersonID, if this is not going to be used for evidence anywhere, the corresponding email is possibly the more accurate evidence, since the receiver may pass it on to someone who is not the WBPerson that email is assigned to. In that case Textpresso wouldn't need a person ID. I'm still leaning toward using IDs, though; I'm just not sure it reflects the right things if we ever want that in WB or something like that. Juancarlos also needs to know how curators want to enter data. For any given Paper-Field, would curators want to be able to enter unlimited entries, and mark them invalid in order to delete them? Would you prefer the current system, where there's only a single box where everyone mushes in all data? Would you care about the history of deleted things? | |
Juancarlos | DONE | Resolve duplicates | There are many papers on the firstpass list that are already firstpassed. Most of these papers are duplicates and have two WBPaperID assignments. Is there a way to resolve this? |
Juancarlos | DONE | Assign Publication Type 'Review' to all papers that are annotated as 'review' in the Comments section of the FP curation form | Remove these papers from the corpus that textpresso searches for data type patterns. |
Arun | DONE | Increase scans for transgene objects to occur daily and change pipeline to run on new papers only | |
Juancarlos | DONE | Sort papers on the first pass checkout list by whether or not they have been passed through Textpresso | Curators can now focus on papers whose data fields have already been filled in by Textpresso. |
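Raymond's suggestion in the table above (store every submission, but forward only what a curator has ticked, plus the curator's own entries) can be sketched as follows. This is only an illustration of the proposed rule, not the form's actual implementation: the class name, field names, and source labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FirstPassEntry:
    """One piece of first-pass data for a paper field (hypothetical structure)."""
    paper_id: str
    field_name: str
    value: str
    source_kind: str        # assumed labels: "author", "textpresso", or "curator"
    source_id: str          # person ID (or email) identifying the source
    approved: bool = False  # the curator's tick; irrelevant for curator entries


def store_and_forward(entries):
    """Return (stored, forwarded) lists following the proposed rule.

    Everything is stored as is. Only curator entries and entries a
    curator has approved are forwarded for data extraction.
    """
    stored = list(entries)
    forwarded = [
        e for e in entries
        if e.source_kind == "curator" or e.approved
    ]
    return stored, forwarded
```

Keeping the source on each entry is what would allow author, Textpresso, and curator input to be told apart later, whichever identifier (person ID or email) is eventually chosen.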
Unassigned tasks and comments that need more discussion
- We do have a record of data type curation for each paper. Is there some way of combining the first pass curation with the curation status form?
- The curation status form gets flagged data from the first pass form (I don't think I understand the question). -- Juancarlos
- How does the false positive work?
- The words "false positive" get appended (or prepended, I forget and can't find an example) to the text field. -- Juancarlos
- Curator preference for FP remarks: can people deal with not getting detailed notes about their data type and where it exists in the paper, or should this be a mandatory part of first-pass?
- Discrepancy between FP papers and total papers curated. For some data types, curators get to the paper before the FP curator, so it would be good to know that a curator has already touched it. Is there a way to mark the paper in the FP list as curated for a specific data type but not others?
"I have curated some papers that did not go through first pass, mostly Expr_pattern papers published before 2001. Is it possible that you check current WormBase, find all the papers with Expr_pattern data and flag these papers in postgres as Expr_pattern "yes"?" WC "Jolene asked this question last week when she firstpassed an old paper and found that the Expr_patterns she flagged were already curated." JC
"Sure, please scp a .ace dump with the Expr_pattern and Papers into the acedb account on tazendra and I'll mark those as "yes" (the papers themselves won't show as already being curated, but will have that data, like the old system worked with textpresso transgene)" JC
"Would it be possible to do this for all other data types that have been curated from papers that have not gone through first-pass? There are a number of papers that were curated for phenotypes but were not first-passed." KY
"Yes. Likewise generate a .ace file with the data you want, scp it to the acedb account on tazendra, and let me know which postgres tables to populate with that data." JC
- I thought on the checkout section they showed as "RNAi only" or something like that. -- Juancarlos
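As a rough illustration of the .ace exchange discussed above, the sketch below collects the paper IDs attached to objects of one class in a .ace dump, which is the list that would then be flagged in postgres. It assumes a simple layout (objects separated by blank lines, a `Class : "name"` header line, and papers attached with a `Reference` tag); the function name, the tag name, and the sample IDs are illustrative assumptions, and the postgres update itself is left out because the target tables are not named here.

```python
import re


def papers_for_class(ace_text, object_class="Expr_pattern", paper_tag="Reference"):
    """Return the sorted, de-duplicated paper IDs attached to one object class.

    Assumes blank-line-separated objects whose first line looks like
    Class : "name" and whose papers appear as: Reference "WBPaper...".
    """
    papers = set()
    for block in re.split(r"\n\s*\n", ace_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        header = re.match(r'^(\w+)\s*:\s*"([^"]+)"', lines[0])
        if not header or header.group(1) != object_class:
            continue  # skip objects of other classes
        for ln in lines[1:]:
            tag = re.match(r'^(\w+)\s+"([^"]+)"', ln)
            if tag and tag.group(1) == paper_tag:
                papers.add(tag.group(2))
    return sorted(papers)


# Made-up sample data, for illustration only.
SAMPLE_ACE = '''
Expr_pattern : "Expr1"
Reference "WBPaper00001111"

Expr_pattern : "Expr2"
Reference "WBPaper00002222"
Reference "WBPaper00001111"

Gene : "WBGene00000001"
Reference "WBPaper00003333"
'''
```

Calling `papers_for_class(SAMPLE_ACE)` yields the papers to mark as "yes" for Expr_pattern; passing a different `object_class` would cover the other data types raised in the discussion, provided their dumps follow the same assumed layout.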
FP curator comments for St. Louis and Sanger structure correction data type
kjy 19:43, 9 February 2009 (EST)