Difference between revisions of "WormBase-Caltech Weekly Calls"
From WormBaseWiki
Revision as of 16:23, 20 October 2011
2011 Meetings
October 6, 2011
SVM
- SVM Manuscript in process of resubmission
- Messy procedures in place; need to be cleaned up
- Yuling able to get pipeline running from beginning to the end
- May take longer than previously expected to get up and running
- FlyBase has code running for SVM
- Yuling should contact FlyBase to make sure they're OK, once he has WormBase SVM working
- SGD would like to incorporate SVM into their database
- Yuri would work on SGD SVM, ultimately
Physical interaction model
- Close to done
- Once model approved, work on:
- Converting BioGRID file to the ACE model
- Convert existing YH data into new model
Image curation
- Need to establish legal process for acquiring images
- Contact Caltech Office of the General Counsel for approval
WORM Paper
- Karen working on it
- Need a clearer/new plan
Expression pattern display
- Showing composite images with the option to see greater detail
- Separate out detail images according to Qualifier tags (Certain, Uncertain, Partial)
- Render oblique angles of cells/tissues to show 3D features/morphology of anatomy objects and, possibly, animations
- "Real" images should always take precedence over "virtual" images
- May talk to John Murray or Bill Mohler for confocal images of embryos
Worm Method (Worm Book)
- Any new web pages for C. elegans researchers, send to Raymond
- URL-finder for neuroscience (Yuling)
Textpresso
- Working on Fly Textpresso site (and others)
- Rose Oughtred would like Wnt Textpresso tools
- PDF tools (URL extractor)
- SVM re-working
- Arun working on GSA editor
Grants
- Need to think about next grant in a couple of months (winter); due in October 2012? 30 pages, short?
- Focus on writing papers (anatomy function, WORM, virtual worm, etc.)
- Focus on human-relevance
- Need Gantt charts and quarterly statements of progress
- Stats
- Where are we with each data type
- Status of website
- Automation
- What is different in the field?
- How is data changing?
October 13, 2011
WORM Publication
- Looks good overall
- Change "library" of papers to "index"
- Putative deadline of November 15th
- Put in more details about the new website?
- Figures? Screenshots from new website? Gene page features?
- Save other (detailed) content for future papers
Paper pipeline
- Need to fix some PubMed IDs
- Over 900 articles in Postgres do not have a PubMed ID; Why? How to fix?
- Could some aspects of the pipeline be accomplished by a non-PhD?
- Supplemental hires? Students?
- What tasks would need to be done?
- Can these tasks be scripted/automated? Community input?
SVM
- Almost all scripts working
- Output of SVM slightly different from Ruihua's results
- Need to resolve the differences
- Data inconsistencies
- Need to spot check the papers to see what the major problems might be
- Yuling ~90% confident in his results
- Get feedback from relevant/interested curator(s)
- One-day CPU/computation time (~30 hours for 70 papers with 9 models)
NAR Publication Accepted
Genetic Interaction Curation
- Sharing curated genetic interactions with BioGRID
- What tools will each database use and how will we share data?
- Both BioGRID and WormBase will use the IMS tool at BioGRID for physical interactions
- WormBase will use the Ontology Annotator for genetic interactions
- BioGRID may still use IMS for genetic interactions, but the format will have to be parsed to populate Postgres
- BioGRID-curated genetic interactions can be flagged as such for later review
Physical Interaction model
- Kimberly and Paul Davis discussing
- XREF questions
- Paul would like to minimize the number of XREFs in the model
- Which XREFs can we remove?
Elsevier Linking
- Establishing pipeline for linking ScienceDirect papers to WormBase and vice versa
October 20, 2011
Data Submission Working Group
- One representative from each site
- Raymond from Caltech
- Have a discussion for every type
- How are we doing?
- Are we up to date?
- Do we want unpublished data?
- Quality control
- Data type priority?
- Groups/people that have a lot of data of a particular type; acquire their data
- How can we facilitate data submission from these groups/individuals?
- Ultimately like to train these users/submitters to use curator tools
- Form-filling as part of publication process?
- What data types are we missing? Wiki Pages
- qPCR?
- Nanostring?
- Single molecule studies, absolute quantities?
- SAGE?
- 3C (Chromosome Conformation Capture), 4C, 5C
- Metabolome?
- Pathways/Processes?
- Drug/disease interactions?
- Infections?
- Examples of C. elegans as a model?
- Expression data
- How do we annotate a presence/absence call on expression of a given gene?
- RPKM cutoff?
- Number of molecules?
- Case-by-case thresholds
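
To make the RPKM-cutoff question above concrete, here is a minimal sketch of a presence/absence call. It uses the standard RPKM definition (reads per kilobase of transcript per million mapped reads). The function names, the default cutoff of 1.0, and the per-gene override table are hypothetical placeholders for discussion, not a WormBase policy or an existing curation tool.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def call_expression(gene, rpkm_value, default_cutoff=1.0, per_gene_cutoffs=None):
    """Return "present" or "absent" for one gene.

    default_cutoff is an arbitrary illustrative value.
    per_gene_cutoffs is a hypothetical dict of case-by-case thresholds
    that override the default for specific genes.
    """
    cutoff = (per_gene_cutoffs or {}).get(gene, default_cutoff)
    return "present" if rpkm_value >= cutoff else "absent"


# Made-up counts for two made-up genes in a library of 10 million mapped reads.
total_reads = 10_000_000
gene_a = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=total_reads)
gene_b = rpkm(read_count=5, gene_length_bp=2500, total_mapped_reads=total_reads)

print(call_expression("gene_a", gene_a))
print(call_expression("gene_b", gene_b))
# A case-by-case threshold can flip the call for a single gene.
print(call_expression("gene_b", gene_b, per_gene_cutoffs={"gene_b": 0.1}))
```

A single global cutoff is simple to apply, while the override table shows how case-by-case thresholds could coexist with it. A molecule-count criterion would need a different input (absolute quantities rather than normalized read counts).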