WormBase-Caltech Weekly Calls May 2011
May 5, 2011
- Using current citace release (WS226) for Progress Report
- Daniela did not actually curate ~7000 pictures this month ;)
- Script to fetch pictures automatically
Snehal starts next week
- working on phenotype curation
- 90 genes >= 10 publications
- Of these 90, 54 with no annotated phenotype
- 20,000 genes with RNAi data
- Concise Descriptions? May be appropriate
- Prioritize by genes that currently have no description but have new papers
- Picture curation?
- Learning an ontology may be difficult, but she can stick to familiar material
- Karen, Raymond, Chris will talk to Todd
- Lineage browser for ontologies?
- Cell lineage browser? Capture multi-parentage issues
Ranjana - Preplanning WormBase side meetings at IWM?
Periodically sending someone to different sites
- E.g. Someone from Caltech goes to Hinxton, etc.
Practice Talks before International Worm Meeting
- on Monday before meeting
May 12, 2011
Example of Bandana for International Worm Meeting
- Looks good!
GO Consortium meeting next week
- Paul, Michael, Ranjana, and Kimberly going
- Discussion about tools (Juancarlos?)
- Section in meeting agenda to discuss curation
- Textpresso-based curation
- Ontology Annotator
- Demo by Kimberly or Ranjana?
- Relatively trivial (depending on situation) to set up a new OA for a new site (Juancarlos)
- Discuss what's been proposed in the grant
- What does the consortium want? Brainstorming/Fantasizing
- People are likely partial to their own tools
- Make use of the best parts of everyone's tools
- Are others using flat/text files or web-based/java-based tools? Mostly web-based
- Phenote? Funding? Bio-Portal e-mail?
- PAINT tool (Phylogenetic Annotation and INference Tool) (http://wiki.geneontology.org/index.php/PAINT)
- Paul has received some reports, but not all, yet
- Started receiving permission from Elsevier!!
- Blanket-permission for 11 journals
- Publicly display list of cooperating journals online
Talks at the International Worm Meeting
- Paul S listed for a Plenary talk during Workshop
- Two WormBase Plenary talks
- Concise descriptions
- Kimberly helping
- Will get into phenotypes soon
Juancarlos waiting to hear back from some people about OA
- People need more time to test
- Karen has encountered some issues; needs to reproduce
- Ranjana - loads slowly?
- Mangolassi issues?
- Group met on Tuesday
- Made headway on genetic interaction organization
- Add "Additive" type term and "No Apparent Interaction", etc.
- Kimberly & Juancarlos working on
- Documented on Wiki
- CGIs still being used?
- Remove outdated or unused CGIs?
Rewriting script from Jack Chen
- Neuron connection search
- Inroad to working on more web-interface topics
Adding interaction datasets into WormMart (Xiaodong and Ruihua)
Gene interaction dumping script (Raymond/Todd)
- Karen doing queries at CGC
- How many genes represented in CGC strain list?
- Protein-coding genes with alleles in strains?
- 12,000 genes that are NOT represented in CGC
- Discrepancy wrt alleles; negative in AQL query, but present on site
- Strains with LOTS of alleles; how to handle?
- Polymorphisms vs. alleles
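The coverage questions above (how many genes are represented in the CGC strain list, and how many are not) come down to set operations once strains, alleles, and genes are exported. A minimal sketch, assuming the data has already been dumped into plain Python mappings; the function name, argument names, and identifiers are hypothetical and do not reflect the actual AQL query or CGC data format:

```python
def cgc_gene_coverage(strain_alleles, allele_gene, all_genes):
    """Split all_genes into those represented in strains and those that are not.

    strain_alleles: dict of strain name -> iterable of allele names (assumed export)
    allele_gene:    dict of allele name -> gene ID (assumed export)
    all_genes:      iterable of gene IDs to check (e.g. protein-coding genes)
    """
    all_genes = set(all_genes)
    represented = set()
    for alleles in strain_alleles.values():
        for allele in alleles:
            gene = allele_gene.get(allele)  # alleles with no known gene are skipped
            if gene in all_genes:
                represented.add(gene)
    missing = all_genes - represented
    return represented, missing


if __name__ == "__main__":
    # Placeholder data, for illustration only.
    strains = {"strain1": ["allele1", "allele2"], "strain2": ["allele3"]}
    alleles = {"allele1": "geneA", "allele2": "geneB"}
    found, absent = cgc_gene_coverage(strains, alleles, ["geneA", "geneB", "geneC"])
    print(len(found), "represented;", len(absent), "not represented")
```

Strains carrying many alleles are handled naturally here, since each gene is counted once no matter how many strains or alleles point to it. Polymorphisms would need to be filtered out of the allele mapping beforehand if they should not count.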
Aldrin Montana working over the summer
- We invited him to visit
- We'll come up with projects
- Web team fully involved in web site
- Help with minor GO issues (Ranjana)
- Requires knowledge of Perl
- Non-C. elegans species
- InterPro-to-GO mapping
- Export pipelines for other species?
- Maybe small project; requires more communication than coding?
- Raymond could help provide projects
- Would be nice to use PAINT across the nematodes for annotation
New Web site (Todd)
- Web team wrapping up various projects
- Polishing user interface
- Abby working on user interface
- Todd working on production environment
- Norie working on code that interfaces with database
- Release beta version at International Worm Meeting
- Working with Kevin Howe to standardize formats of files used during build
- Can now host new data for new species
May 19, 2011
Login System for people to submit and modify (WBPerson) personal information
- How to validate that someone registering for the site is the actual person?
- Use e-mail address, although some people don't have e-mail addresses; snail mail?
Poster Session Help Desk for International Worm Meeting
- Wen - yes, all set
- People will bring their own laptops
- Wen will make a Wiki page for people to sign up for sessions
- June 16th/17th OK for people
- We could start making a list of possible topics for Aldrin to work on
Clone Model Update
- Working with current clone model to test adding plasmid objects
- Works OK, but will probably want a couple of new tags
- One idea is to have a tag that links nematode-specific sequence in the plasmid to genomic regions and GBrowse viewer
- Test plasmid objects and ideas for genome alignment sent to Paul Davis, waiting to hear back
Inter-Release Patch files?
- Still need to work out, but should wait until the new website has stabilized
May 26, 2011
Genetic Interactions (Interactions Group)
- Made headway on genetic interactions organization and nomenclature
- We have previously talked about separating physical, genetic, predicted, and regulatory interactions into different models
- This will need more time and work
- We can immediately add the new genetic interaction type tags to the interaction model
- New types include: Enhancement/Suppression, Complex, No Apparent Interaction/Additive, Asynthetic
- BioGRID will use their own curation strategy and pipeline, but may eventually incorporate our curation approach
Concise Descriptions (Snehal)
- Averaging 9 genes/week
- Prioritized on new publications
- Also working on FlyBase GSA markup (e.g. false negatives)
Picture Curation (Daniela)
- Have authorization from ~50% of Elsevier journals
- Still need Developmental Biology access (many C. elegans papers)
Textpresso for Cell Function (Raymond)
- Have scheme for decent precision (still accumulating numbers)
- Review papers are, by default, tagged as primary literature and can affect the numbers
- Textpresso needs to add a review tagging step (earlier in the pipeline)
- Search done with keywords like "mosaic", anatomy terms, site of action, etc.
- Many anatomy terms are not very specific (e.g. E, AB, MS, etc.)
- Can give Michael new category list criteria for Textpresso (cleaner than current from postgres table pap_primary_data)
- Can have inclusion & exclusion criteria
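The inclusion and exclusion criteria mentioned above could be applied as a simple keyword filter. A minimal sketch, assuming whole-word, case-insensitive matching on plain text; this is not how Textpresso categories are actually defined, and the function name and term lists are hypothetical:

```python
import re


def matches_criteria(text, include, exclude):
    """Return True if text has at least one include term and no exclude term.

    Terms are matched as whole words or phrases, ignoring case.
    """
    def has(term):
        pattern = r"\b" + re.escape(term) + r"\b"
        return re.search(pattern, text, re.IGNORECASE) is not None

    return any(has(t) for t in include) and not any(has(t) for t in exclude)


if __name__ == "__main__":
    sentence = "Mosaic analysis indicates the site of action is the hypodermis."
    print(matches_criteria(sentence, ["mosaic", "site of action"], ["review"]))
```

Very short anatomy terms such as E, AB, or MS would match too often under this scheme (a lone "e" in "e.g." already counts as a whole word), which is one reason they are poor search terms on their own.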
User Interface Query Tool (Raymond)
Human Disease Curation (Ranjana)
- C. elegans as a model for human disease
- Textpresso script identifies C. elegans papers with relevance to human disease
- Where should this information go? Curation status form?
- Right now, everything is simply written to a text file that Arun has set up
- Connect Arun's script to Curation Status form?
- OMIM tags
- Accession/evidence tags?
- Apoptosis working group - identify apoptosis genes in various model organisms
- What should we do about it?
- Michael Paulini will change the data model for homology groups
- Old data comes from citace; we need to reformat the data if we want to keep it
- Remove old KOG data entirely and replace with EggNOG data?
- Any new data coming from Caltech? No
- Transfer data to Hinxton
Author Flag (First-Pass) Forms
- Who is taking care of/in charge of this (curators)?
- Do curators look at them? Yes
- Gary prioritizes RNAi papers based on them; Xiaodong and Daniela use them as well
- Should we remove outdated/unused data types?
- We get the author form results before we have the PDF
- Check is in place now so author is not e-mailed until we have the PDF
- Have authors send us the flags and the PDF?
- Karen will coordinate a cleanup process
- Group needs to participate in evaluating the automation pipeline
- Retraining SVM
- SVM performs most poorly on Expression Patterns and Gene Regulation
- RNAi SVM performs best
- Subtract papers that are already curated from the paper pool
- Let Ruihua know if there are any thoughts as to how to better train the SVM
- First-pass forms identify some papers that are missed by SVM (maybe help?)
- BioGRID interest? Can we use SVM on abstract only? Mostly use entire paper; could try abstracts
- Hard to get access to full text for all papers
- Good to encourage all databases/groups to get access to full text
- Abstract SVM depends on what is acceptable Recall/Precision values
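To make the recall/precision discussion above concrete, the two values can be computed by comparing the papers the SVM flags against the papers curators judge relevant, after removing already-curated papers from the pool. A minimal sketch, assuming papers are tracked by ID in plain Python sets; the function names and IDs are hypothetical and are not part of the existing pipeline:

```python
def uncurated_pool(all_papers, curated):
    """Subtract papers that are already curated from the paper pool."""
    return set(all_papers) - set(curated)


def precision_recall(flagged, relevant):
    """Compute (precision, recall) for a set of flagged paper IDs.

    flagged:  IDs the classifier marked as positive
    relevant: IDs that curators consider true positives
    """
    flagged, relevant = set(flagged), set(relevant)
    true_positives = len(flagged & relevant)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    pool = uncurated_pool(["p1", "p2", "p3", "p4", "p5", "p6"], ["p6"])
    flagged = {"p1", "p2", "p3", "p4"} & pool
    relevant = {"p2", "p3", "p5"} & pool
    p, r = precision_recall(flagged, relevant)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Running the same comparison for an abstract-only classifier and a full-text classifier on the same paper pool would show whether the abstract-only values fall within an acceptable range.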
GSA for FlyBase (Karen/Snehal)
- GSA markup from FlyBase is not as good as it could be
- Snehal may focus more on GSA for the time being
International Worm Meeting WormBase Bandanas
- Which to choose?