WormBase-Caltech Weekly Calls May 2011

From WormBaseWiki

May 5, 2011

Citace upload

  • Using current citace release (WS226) for Progress Report


Pictures

  • Daniela did not actually curate ~7000 pictures this month ;)
  • Script to fetch pictures automatically


Snehal starts next week

  • working on phenotype curation
    • 90 genes with >= 10 publications
    • Of these 90, 54 with no annotated phenotype
    • 20,000 genes with RNAi data
  • Concise Descriptions? May be appropriate
    • Prioritize by genes that currently have no description but have new papers
  • Picture curation?
  • Learning an ontology may be difficult, but she can stick to familiar material


Aldrin

  • Karen, Raymond, Chris will talk to Todd
  • Lineage browser for ontologies?
  • Cell lineage browser? Capture multi-parentage issues


Ranjana - Preplanning WormBase side meetings at IWM?


Periodically sending someone to different sites

  • E.g. Someone from Caltech goes to Hinxton, etc.


Practice Talks before International Worm Meeting

  • on Monday before meeting


May 12, 2011

Example of Bandana for International Worm Meeting

  • Looks good!


GO Consortium meeting next week

  • Paul, Michael, Ranjana, and Kimberly going
  • Discussion about tools (Juancarlos?)
  • Section in meeting agenda to discuss curation
    • Textpresso-based curation
    • Ontology Annotator
    • Demo by Kimberly or Ranjana?
    • Relatively trivial (depending on situation) to set up a new OA for a new site (Juancarlos)
    • Discuss what's been proposed in the grant
    • What does the consortium want? Brainstorming/Fantasizing
    • People are likely partial to their own tools
    • Make use of the best parts of everyone's tools
    • Are others using flat/text files or web-based/java-based tools? Mostly web-based
    • Phenote? Funding? Bio-Portal e-mail?
    • PAINT tool (Phylogenetic Annotation and INference Tool) (http://wiki.geneontology.org/index.php/PAINT)


Progress Report

  • Paul has received some reports, but not all, yet


Picture curation

  • Started receiving permission from Elsevier!!
  • Blanket-permission for 11 journals
  • Publicly display list of cooperating journals online


Talks at the International Worm Meeting

  • Paul S listed for a Plenary talk during Workshop
  • Two WormBase Plenary talks


Snehal

  • Concise descriptions
  • Kimberly helping
  • Will get into phenotypes soon


Juancarlos waiting to hear back from some people about OA

  • People need more time to test
  • Karen has encountered some issues; needs to reproduce
  • Ranjana - loads slowly?
  • Mangolassi issues?


Genetic Interactions

  • Group met on Tuesday
  • Made headway on genetic interaction organization
  • Add "Additive" type term and "No Apparent Interaction", etc.


Paper pipeline

  • Kimberly & Juancarlos working on it
  • Documented on Wiki
  • CGIs still being used?
  • Remove outdated or unused CGIs?


Rewriting script from Jack Chen

  • Neuron connection search
  • Inroad to working on more web-interface topics


Adding interaction datasets into WormMart (Xiaodong and Ruihua)


Gene interaction dumping script (Raymond/Todd)


Strain Display

  • Karen doing queries at CGC
  • How many genes represented in CGC strain list?
  • Protein-coding genes with alleles in strains?
  • 12,000 genes that are NOT represented in CGC
  • Discrepancy with respect to alleles; negative in AQL query, but present on site
  • Strains with LOTS of alleles; how to handle?
  • Polymorphisms vs. alleles


Aldrin Montana working over the summer

  • We invited him to visit
  • We'll come up with projects
  • Web team fully involved in web site
  • Help with minor GO issues (Ranjana)
    • Requires knowledge of Perl
    • Non-C. elegans species
    • InterPro-to-GO mapping
    • Export pipelines for other species?
    • Maybe small project; requires more communication than coding?
  • Raymond could help provide projects
  • Would be nice to use PAINT across the nematodes for annotation


New Web site (Todd)

  • Web team wrapping up various projects
  • Polishing user interface
  • Abby working on user interface
  • Todd working on production environment
  • Norie working on code that interfaces with database
  • Release beta version at International Worm Meeting
  • Working with Kevin Howe to standardize formats of files used during build
  • Can now host new data for new species


May 19, 2011

Login System for people to submit and modify (WBPerson) personal information

  • How to validate that someone registering for the site is the actual person?
    • Use e-mail address, although some people don't have e-mail addresses; snail mail?


Poster Session Help Desk for International Worm Meeting

  • Wen: yes, all set
  • People will bring their own laptops
  • Wen will make a Wiki page for people to sign up for sessions


Aldrin visiting

  • June 16th/17th OK for people
  • We could start making a list of possible topics for Aldrin to work on


Clone Model Update

  • Working with current clone model to test adding plasmid objects
    • Works OK, but will probably want a couple of new tags
    • One idea is to have a tag that links nematode-specific sequence in the plasmid to genomic regions and GBrowse viewer
  • Test plasmid objects and ideas for genome alignment sent to Paul Davis, waiting to hear back


Inter-Release Patch files?

  • Still need to work out, but should wait until the new website has stabilized


May 26, 2011

Genetic Interactions (Interactions Group)

  • Made headway on genetic interactions organization and nomenclature
  • We have previously talked about separating physical, genetic, predicted, and regulatory interactions into different models
    • This will need more time and work
  • We can immediately add the new genetic interaction type tags to the interaction model
  • New types include: Enhancement/Suppression, Complex, No Apparent Interaction/Additive, Asynthetic
  • BioGRID will use their own curation strategy and pipeline, but may eventually incorporate our curation approach


Concise Descriptions (Snehal)

  • Averaging 9 genes/week
  • Prioritized on new publications
  • Also working on FlyBase GSA markup (e.g. false negatives)


Picture Curation (Daniela)

  • Have authorization from ~50% of Elsevier journals
  • Still need Developmental Biology access (many C. elegans papers)


Textpresso for Cell Function (Raymond)

  • Have scheme for decent precision (still accumulating numbers)
  • Recall?
  • Review papers are, by default, tagged as primary literature and can affect the numbers
  • Textpresso needs to add a review tagging step (earlier in the pipeline)
  • Search done with keywords like "mosaic", anatomy terms, site of action, etc.
  • Many anatomy terms are not very specific (e.g. E, AB, MS, etc.)
  • Can give Michael new category list criteria for Textpresso (cleaner than current from postgres table pap_primary_data)
    • Can have inclusion & exclusion criteria
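
A minimal sketch of how inclusion and exclusion criteria might be applied to paper text. This is an illustration only, not Textpresso's actual category mechanism: the function name, the term lists, and the sample papers are all hypothetical.

```python
import re


def matches_criteria(text, include_terms, exclude_terms):
    """Return True if text has at least one inclusion term and no exclusion term."""

    def contains(term):
        # Whole-word, case-insensitive match for a single term or phrase
        pattern = r"\b" + re.escape(term) + r"\b"
        return re.search(pattern, text, re.IGNORECASE) is not None

    has_included = any(contains(term) for term in include_terms)
    has_excluded = any(contains(term) for term in exclude_terms)
    return has_included and not has_excluded


# Hypothetical criteria and paper snippets, for illustration only
include_terms = ["mosaic", "site of action"]
exclude_terms = ["review"]

papers = {
    "paperA": "Mosaic analysis shows the gene acts in the hypodermis.",
    "paperB": "This review summarizes mosaic analysis in C. elegans.",
    "paperC": "We describe a new RNAi feeding protocol.",
}

selected = sorted(
    paper_id
    for paper_id, text in papers.items()
    if matches_criteria(text, include_terms, exclude_terms)
)
```

Short anatomy terms such as E, AB, or MS would likely need stricter, case-sensitive matching than this sketch uses.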


User Interface Query Tool (Raymond)


Human Disease Curation (Ranjana)

  • C. elegans as a model for human disease
  • Textpresso script identifies C. elegans papers with relevance to human disease
  • Where should this information go? Curation status form?
  • Right now, everything is simply writing to a text file that Arun has set up
  • Connect Arun's script to Curation Status form?
  • OMIM tags
  • Accession/evidence tags?
  • Apoptosis working group - identify apoptosis genes in various model organisms


KOG data

  • What should we do about it?
  • EggNOG?
  • Michael Paulini will change the data model for homology groups
  • Old data comes from citace; we need to reformat the data if we want to keep it
  • Remove old KOG data entirely and replace with EggNOG data?
  • Any new data coming from Caltech? No
  • Transfer data to Hinxton


Author Flag (First-Pass) Forms

  • Who is taking care of/in charge of this (curators)?
  • Do curators look at them? Yes
  • Gary prioritizes RNAi papers based on them, Xiaodong uses it, Daniela uses it
  • Should we remove outdated/unused data types?
  • We get the author form results before we have the PDF
    • Check is in place now so author is not e-mailed until we have the PDF
    • Have authors send us the flags and the PDF?
  • Karen will coordinate a clean up process
  • Group needs to participate in evaluating the automation pipeline


SVM performance

  • Retraining SVM
  • SVM performs most poorly on Expression Patterns and Gene Regulation
  • RNAi SVM performs best
  • Subtract papers that are already curated from the paper pool
  • Let Ruihua know if there are any thoughts as to how to better train the SVM
  • First-pass forms identify some papers that are missed by SVM (maybe help?)
  • BioGRID interest? Can we use SVM on abstract only? Mostly use entire paper; could try abstracts
    • Hard to get access to full text for all papers
    • Good to encourage all databases/groups to get access to full text
    • Abstract SVM depends on what is acceptable Recall/Precision values
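
As a rough illustration of the recall/precision trade-off discussed above, the following sketch computes both values for a set of flagged papers after subtracting papers that are already curated. It is not the actual SVM evaluation code; the function name and the paper IDs are hypothetical.

```python
def precision_recall(flagged, relevant, already_curated=frozenset()):
    """Return (precision, recall) for flagged paper IDs against known relevant ones."""
    # Subtract papers that are already curated from the paper pool
    flagged = set(flagged) - set(already_curated)
    relevant = set(relevant) - set(already_curated)

    true_positives = flagged & relevant
    # Precision: fraction of flagged papers that are truly relevant
    precision = len(true_positives) / len(flagged) if flagged else 0.0
    # Recall: fraction of relevant papers that were flagged
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical paper IDs, for illustration only
flagged = {"p1", "p2", "p3", "p4"}   # papers the classifier marked positive
relevant = {"p2", "p3", "p5", "p6"}  # papers a curator judged relevant
curated = {"p1", "p6"}               # papers already curated

precision, recall = precision_recall(flagged, relevant, curated)
```

Running the same calculation on abstract-only and full-text predictions would give a direct comparison against whatever recall/precision values a group considers acceptable.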


GSA for FlyBase (Karen/Snehal)

  • GSA markup from FlyBase is not as good as it could be
  • Snehal may focus more on GSA for the time being


International Worm Meeting WormBase Bandanas

  • Which to choose?