WormBase-Caltech Weekly Calls November 2011

November 3, 2011

Gene ID mapping problems (as discussed in site-wide call)

  • Does CIT need to worry about consistency of gene IDs over database builds?
  • Hyper-linking entities in papers could be an issue
  • Acquire an ID for every gene for every new genome?
  • When marking up a paper, separate genes into two types: stable vs. unstable
    • Stable genes map to stable ID
    • Unstable genes map with a version number
  • Should WormBase communicate/work with DCC personnel on data wrangling during the post-DCC era?
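
The stable/unstable split above can be sketched as a small data structure. This is only an illustration of the idea, not an agreed design: the `GeneRef` class, the `mark_gene` helper, the placeholder IDs, and the choice of a release number as the "version number" are all assumptions that do not come from the call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GeneRef:
    """A gene mention in a marked-up paper (hypothetical structure)."""
    gene_id: str
    version: Optional[int] = None  # None means the gene is considered stable

    def label(self) -> str:
        # Stable genes link by bare ID; unstable genes carry a version suffix.
        if self.version is None:
            return self.gene_id
        return f"{self.gene_id}.{self.version}"


def mark_gene(gene_id: str, is_stable: bool, release: int) -> GeneRef:
    """Assumption: the version of an unstable gene is the release it was seen in."""
    return GeneRef(gene_id) if is_stable else GeneRef(gene_id, release)


if __name__ == "__main__":
    # Placeholder IDs and release number, for illustration only.
    print(mark_gene("GENE_A", True, 229).label())
    print(mark_gene("GENE_B", False, 229).label())
```

With this shape, a hyperlink to a stable gene stays the same across builds, while a link to an unstable gene records which version of the gene model the paper referred to.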


Updating Predicted Gene Interactions

  • Wei wei asked how to update with WormBase
  • WormBase predicted interaction objects can be updated simply via link to Gene Orienteer
  • If we want a live interaction browser, we would need to pull the data into the build process
  • Do we want a single object per pair of genes? One object per instance of evidence (per paper)?


November 10, 2011

Worm Publication is under review


NAR paper is published - has DOI

  • Williams et al. has to be added to the bibliography


Citace, Development Release

  • Cron jobs/scripts were used to update the database, but the process now requires manual manipulation
  • Manual versus automated updating of database
  • Need a plan for the future
  • Migration to EBI is affecting the process; will things change again?
  • Where will WS release files exist?


SVM

  • Collecting feedback from curators
  • Curators need to review results


Motifs in WormBase

  • Xiaodong working on it


Interaction Model

  • Discuss with Todd the integrated model of 4 types (physical, genetic, predicted, regulatory)
  • Change WBID names (WBInteraction versus WBPhysicalInteraction, WBGeneticInteraction, WBRegulatoryInteraction, WBPredictedInteraction)
  • Put together series of example data objects; send to Todd/Paul to test out
  • Need Interaction_type tag to distinguish the 4 basic types
  • Types will need to be parsed/extracted after the build for FTP download etc.
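
As a rough sketch of the post-build extraction step, interaction objects in a single integrated class could be split on the Interaction_type tag before writing per-type download files. The dictionary representation of an object, the tag values, and the `split_by_type` helper are assumptions made for illustration; the actual model was still under discussion.

```python
from collections import defaultdict

# Assumed tag values for the four basic types named in the notes.
INTERACTION_TYPES = ("Physical", "Genetic", "Predicted", "Regulatory")


def split_by_type(interactions):
    """Group interaction objects by their Interaction_type tag.

    Each object is assumed to be a dict with an "Interaction_type" key.
    """
    groups = defaultdict(list)
    for obj in interactions:
        itype = obj["Interaction_type"]
        if itype not in INTERACTION_TYPES:
            raise ValueError(f"unknown Interaction_type: {itype!r}")
        groups[itype].append(obj)
    return dict(groups)


if __name__ == "__main__":
    # Made-up example objects; names are placeholders, not real WBIDs.
    objects = [
        {"name": "Interaction_A", "Interaction_type": "Physical"},
        {"name": "Interaction_B", "Interaction_type": "Genetic"},
        {"name": "Interaction_C", "Interaction_type": "Physical"},
    ]
    for itype, objs in split_by_type(objects).items():
        print(itype, [o["name"] for o in objs])
```

Keeping one object class with a type tag means the per-type files for FTP are just a filter over the same data, rather than four separately maintained classes.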


Human Disease Relevance Descriptions

  • Started going in from last build
  • Ranjana will discuss with Web Team
  • Disease-paper connections


November 17, 2011

Handling 50-100+ genomes

  • Annotation, gene finding, storage, handling
  • ~1 week to build all genome browsers and BLAT/BLAST servers
  • Homology run takes longer; from Compara pipeline
  • Species-by-species build processes?
  • Look up whether the genome is in ACEDB; if not, look up gene models in GFF databases
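
The ACEDB-first lookup with a GFF fallback might look roughly like this. Plain dictionaries stand in for both data sources here; the `find_gene_models` function, its arguments, and the species names are hypothetical and do not reflect any existing WormBase code.

```python
def find_gene_models(species, acedb_genomes, gff_databases):
    """Return (source, gene_models) for a species, preferring ACEDB.

    Both arguments are assumed to be mappings from species name to gene
    models; real lookups would query ACEDB and the GFF databases instead.
    """
    if species in acedb_genomes:
        return ("ACEDB", acedb_genomes[species])
    if species in gff_databases:
        return ("GFF", gff_databases[species])
    return None  # genome not available in either source


if __name__ == "__main__":
    # Placeholder data for illustration only.
    acedb = {"species_1": ["model_1a", "model_1b"]}
    gff = {"species_1": ["model_1a"], "species_2": ["model_2a"]}
    print(find_gene_models("species_1", acedb, gff))
    print(find_gene_models("species_2", acedb, gff))
    print(find_gene_models("species_3", acedb, gff))
```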


Beta site

  • What is needed:
    • Need eyes to look at current pages
    • Suggestions as to what is needed on individual pages
  • Currently not adding a lot of new features
  • Data models browsable/searchable?
  • If we provide AQL, need to provide data models
  • Intermine will eventually make AQL obsolete(?)
  • Run AQL queries from a local ACEDB instance for better performance (and fewer timeouts)
  • Timeout limit is currently set to 2 minutes for AQL queries


Expression pattern curation

  • Confidence flags
  • Ex: expressed in HSNs, in HSNL
  • Annotate to parent term unless explicitly (and confidently) stated observed in child term/object
  • Consistency of curation?
  • Discussion between Raymond, Wen, Daniela, etc.
  • Want list of ALL genes expressed in a cell (with some cutoff)
  • Also want list of genes expressed ONLY in that cell and not others
  • New technology generating paradigm shift; how to best represent data
  • Categorize data sets (based on methods)
  • Use evidence codes (text?) to categorize (ECO?)
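
The two requested gene lists (all genes expressed in a cell, and genes expressed only in that cell) can be sketched as simple filters. The data layout (gene mapped to per-cell expression levels), the numeric cutoff, and the function names are assumptions for illustration; how confidence flags or evidence codes would feed into the cutoff was left open.

```python
def genes_expressed_in(expression, cell, cutoff):
    """All genes whose level in `cell` meets the cutoff."""
    return {
        gene
        for gene, cells in expression.items()
        if cells.get(cell, 0.0) >= cutoff
    }


def genes_expressed_only_in(expression, cell, cutoff):
    """Genes that meet the cutoff in `cell` and in no other cell."""
    result = set()
    for gene, cells in expression.items():
        expressed = {c for c, level in cells.items() if level >= cutoff}
        if expressed == {cell}:
            result.add(gene)
    return result


if __name__ == "__main__":
    # Made-up levels; gene names are placeholders.
    expression = {
        "gene_a": {"HSNL": 5.0, "HSNR": 0.1},
        "gene_b": {"HSNL": 3.0, "HSNR": 4.0},
        "gene_c": {"HSNR": 2.0},
    }
    print(sorted(genes_expressed_in(expression, "HSNL", 1.0)))
    print(sorted(genes_expressed_only_in(expression, "HSNL", 1.0)))
```

The "only in that cell" list depends heavily on the cutoff: a gene with weak expression elsewhere drops in or out of the list as the threshold moves, which is one reason to settle on confidence categories first.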


ncRNA transcripts

  • To pull out ncRNA genes and GO terms (Sarah Burge)
  • Need to exclude protein-coding genes from query


GO Meeting (Kimberly)

  • A lot of changes to be put in place
  • Infrastructure, annotation changes - we'll need to sort out what this means for WormBase
  • Annotation: "Annotation extension" column to add additional info
  • Make explicit annotations for gene products
  • Next meeting in February at Stanford (focused annotation meeting)
    • Work out specifications for common annotation framework


WormMart status

  • Are we switching over to BioMart-run WormMart?
  • OICR crew working on new data-loading tool
  • Update still pending...
  • Check modmine (http://intermine.modencode.org/)


Reviews in on SVM paper

  • Maybe done in 3 weeks
  • Reviewer wanted more metrics