WormBase-Caltech Weekly Calls April 2012

From WormBaseWiki

April 12, 2012

RNAi OA

  • OA almost ready to go live
  • Testing now with test curation
  • Should go live next week for official curation


New Website

  • Most problems are being fixed in a timely manner
  • Curators can now edit links and add custom widgets
  • Issues (tracked on GitHub) being dealt with quickly


BioCurator Meeting

  • Good meeting, bigger than before
  • Common themes: data standards, and how to educate users about database materials and how to use them (and think critically)
  • How can MODs work better with journals and PubMed to solve the 'triage' problem?
    • Streamlining the paper acquisition/curation process
    • MODs should ask NLM to take the burden of retrieving PDFs
    • Get lawyers involved to make PDFs available?
    • Publishers tend to be lax on text mining rules, maybe will evolve into an easier process
  • Maybe write a grant for research project as a proof-of-principle that triage can be done in an effective/efficient manner
  • May ask ISB (Int Society Biocurators) for help with this
  • Sequence and protein curation: tools, databases (topic-specific; pathways, cancer, etc.)
  • GeneWiki for human gene annotation
    • One page for each gene; already have ~10,000 articles
    • ~Dozen editors, credibility of authors checked (?)
    • Reasonably satisfied with coverage of human disease genes
  • Whole-genome sequencing of individuals
    • Newly identified genetic disorder
    • VAST instead of BLAST
  • Tool to identify primers from papers and map them to the genome automatically
  • InterMine discussed
    • Comparable to WormMart
    • Object-oriented database
    • Performs similarly to WormBase
    • Many pre-canned queries
    • Advanced search Query-builder available
    • MODs switched over to InterMine from BioMart
    • WormMart: Will Spooner tried to provide queries that are more natural
    • We can work to build an interface on top of InterMine, etc.
    • Todd has made progress with getting InterMine for WormBase
  • Lots of specialized talks, which reduced productivity (compared to the BioCreative meeting)
  • Curators explaining their curation pipeline
  • Textpresso still popular ;)
    • Six out of seven MODs using Textpresso
    • Discussed text mining in particular applications (e.g. CCC)
    • Textpresso is the only tool using full text for mining
    • Pete from FlyBase: SVM results are deteriorating (similar to WormBase)
      • Start training from scratch; hopefully get better recall/precision numbers
  • Natural language processing on figure legends/captions
    • Tries to find text in the body that relates to figure
    • Possible collaboration with Textpresso
  • NLP research group in Germany
    • 'Actor', 'agent' etc. and relationships (RDF triples)
  • Doug Howe (ZFIN): zebrafish corpus is small enough that it doesn't need Textpresso
  • Julio Collado-Vides: Textpresso for E. coli fell apart, but they are trying to get it back together


Paul will meet someone from Elsevier

  • Image curation/rights issues


Genetic Interaction ontology

  • SGD on board with ontology so far; performing trial curation
  • FlyBase interested in using as well; will meet with Chris and Rose in May to discuss


April 19, 2012

Interaction object displays on WormBase website

  • Chris and Maher will sort out on GitHub
  • Chris will map data from old tags to new tags and suggest display changes for new data types where necessary
  • One issue to deal with is complex objects that have multiple Interaction_types (and were intended to be separate objects)


Interaction model and intragenic suppression

  • We need to make some modifications to the new Interaction model if we want to accommodate intragenic suppression (or other intragenic) events
  • Proposed change is to:
    1. Make each allele a separate object
    2. Move the Variation (and Transgene) tag out of the Interactor_info hash and into the main Interaction model under the Interactor tag
    3. Add a Cis_intragenic_suppression and a Trans_intragenic_suppression tag under the Interaction_type tag (perhaps also intragenic_enhancement?)
  • With these changes:
    • Each variation (and transgene) can be listed as an interactor with Interactor_info indicating Affected, Effector, or Non_directional
    • Genes associated with intragenic, interacting variations will display (in Cytoscape view) as interacting with themselves via a Genetic Interaction
    • Mary Ann can then indicate/curate the flanking sequences for each allele
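
A minimal sketch of how the proposed change could look as a data structure, written in Python rather than as an ACEDB model. The class names, field names, allele names, and gene name are hypothetical and for illustration only; only the tag names (Cis_intragenic_suppression, Affected, Effector, Non_directional) come from the notes above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Role(Enum):
    """Interactor_info values named in the notes."""
    AFFECTED = "Affected"
    EFFECTOR = "Effector"
    NON_DIRECTIONAL = "Non_directional"


@dataclass(frozen=True)
class Interactor:
    variation: str  # each allele is its own object (hypothetical names below)
    gene: str       # gene the variation belongs to
    role: Role


@dataclass
class Interaction:
    interaction_type: str
    interactors: List[Interactor] = field(default_factory=list)

    def gene_edges(self) -> List[Tuple[str, str]]:
        """Return (effector_gene, affected_gene) pairs for a network view."""
        effectors = [i.gene for i in self.interactors if i.role is Role.EFFECTOR]
        affected = [i.gene for i in self.interactors if i.role is Role.AFFECTED]
        return [(e, a) for e in effectors for a in affected]


# Two alleles of the same (made-up) gene, one suppressing the other.
example = Interaction(
    interaction_type="Cis_intragenic_suppression",
    interactors=[
        Interactor("allele_1", "gene-1", Role.AFFECTED),
        Interactor("allele_2", "gene-1", Role.EFFECTOR),
    ],
)
edges = example.gene_edges()
```

Because both variations map to the same gene, the only edge produced is a self-loop on that gene, which matches the expectation that such genes display as interacting with themselves.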


Life_stage objects still dump as names, not IDs

  • This is because ACEDB only handles names, not IDs
  • Daniela is in charge of this class; we can discuss with her when she's back
  • We likely want to change to a system where we use only IDs in .ACE objects
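
One possible way to move to IDs would be a post-processing pass over the dumped .ACE text. The sketch below assumes that Life_stage appears as a tag line followed by a quoted name; the mapping table and its ID strings are placeholders, not real Life_stage IDs, and the function name is hypothetical.

```python
import re

# Placeholder mapping; the real name-to-ID table would come from the curators.
NAME_TO_ID = {"L1 larva": "ID_L1", "adult": "ID_ADULT"}

# Matches tag lines such as:  Life_stage "L1 larva"
# (object header lines of the form  Life_stage : "..."  are left untouched)
LIFE_STAGE_LINE = re.compile(r'^(\s*Life_stage\s+)"([^"]+)"(.*)$')


def names_to_ids(ace_text, name_to_id):
    """Replace Life_stage names with IDs; also return the names with no ID."""
    converted_lines = []
    unmapped = []
    for line in ace_text.splitlines():
        match = LIFE_STAGE_LINE.match(line)
        if match:
            name = match.group(2)
            if name in name_to_id:
                line = f'{match.group(1)}"{name_to_id[name]}"{match.group(3)}'
            else:
                unmapped.append(name)  # keep the line as-is and report it
        converted_lines.append(line)
    return "\n".join(converted_lines), unmapped


converted, unmapped = names_to_ids(
    'Life_stage "L1 larva"\nLife_stage "dauer"', NAME_TO_ID
)
```

Names without a mapping are left unchanged and returned separately so they can be reviewed by hand rather than silently dropped.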


URL Constructors for GSA markup

  • Todd has taken care of much of the issue of URL construction for GSA marked-up papers
  • Karen will send Todd examples of Anatomy_term/Anatomy_name links that need to be checked
  • GSA papers will need to be rechecked to ensure that all links are working


Network outages

  • Various office network ports are non-functional as of yesterday
  • IMSS/Network admins aware of issue and working on it


Interaction and Gene_regulation objects for next upload

  • Conversion scripts will need to be run again to convert objects to new model format
  • Chris will look into whether or not the mapping files (needed to update Gene_regulation objects) will need to be updated for the newest data
  • Xiaodong will dump Gene_regulation objects out of the OA using the old dumping script


April 26, 2012

Meeting with Elsevier rep

  • Elsevier getting more open to text-mining
  • People build apps and then put them on the ScienceDirect site (e.g. TAIR app)
  • Wanted a couple sentences on what we want from text-mining
  • GO consortium would like text-mining for triage of new papers
  • 'Climate is better now'


Yeast-two-hybrid data issues

  • Lots of redundancies, bogus objects, many objects per bait/target (Sequences, CDSs, genes, etc.)
  • Provenance of data isn't clear
  • Should mv PCR products be mapped to genes at each build?
  • May want to start from scratch and collect YH data from Vidal and Walhout labs
  • Check if BioGRID is curating this data already


Next WormBase grant due in 6 months

  • 30 pages
  • Need to figure out what we want to do in next 5 years; how we want to organize
  • Combine SAB meeting and grant writing?
  • New page types lagged behind due to updating of web site: e.g. Process pages
  • What is reasonable/realistic for what new content can get online?


Curation wish-list on Wiki (Ranjana)

  • Many papers on new topics coming out
  • Drug-screening, drug interaction
  • Infection, parasitism


Anatomy links from WormAtlas broken

  • Links need to be fixed/cleaned up
  • Going forward, may need some sort of DOI system (links that stay stable indefinitely)
  • An issue of GSA markup as well
  • Published links will never change; will need to accommodate


Ontology searches

  • Trying to adapt AmiGO to use our .OBO files
  • National Center for Biomedical Ontology uses Protégé instead of OBO-Edit
  • Consider adopting Protégé/OWL files? Conversion could be trivial
  • Parent-child relationships file for C. elegans cell lineage; need to accommodate indeterminacy
  • Use synonym assignment to handle different possible outcomes/identities?


Elbrus has reached capacity limit

  • Broke RNAi curation pipeline
  • Useful bits of code on elbrus
    • Data submission forms (RNAi data)
    • Microarray query tool (broken/toss)
  • Should put (working) code on GitHub repository


User datamining demands

  • We need to accommodate users' requests for data
  • Fix WormMart/incorporate InterMine
  • Bring back Batch Gene query
  • Custom query building (by curators) based on user requests?
  • Look at help desk e-mails and determine what users want
  • Pre-canned queries?
  • AcePerl scripts could perform batch gene queries