WormBase-Caltech Weekly Calls October 2012

From WormBaseWiki

October 4, 2012

FlyBase SAB topics

  • Genome Space - can integrate data in different formats
    • Cloud-based data integration
    • File conversion done automatically - no need to write scripts
  • FlyBase talking about joining Genome Space?


Protein-to-GO Tool

  • Ranjana and Kimberly performing data checks for testing
  • If file can get to Rachel and Tony, maybe can discuss this weekend (at GO meeting)


GO Annotator Tool

  • Kimberly looking over
  • James will make a couple slides to introduce tool and provide demo
    • 1-minute demo: can have file in XML/HTML format, can highlight, annotate a sentence, and save annotation as a link
      • Demo: simpler = better
      • Prepare for live feedback and discussion


Curation Status/Statistics Tool

  • How many papers have given data types?
  • How many papers have been curated, how many not?
  • How many objects/connections do we have of a given type?
  • How many objects per paper (average, distribution?)?
  • Estimated number of objects/papers that exist vs. how many we have curated?
  • Do we care about types of flagging? Curator first pass, author first pass, etc? Yes
    • All flagging types should be shown; combined and individual statistics would be useful
  • Interactive form vs. static page showing curation/flagging statistics
  • Tracking negatives (true vs false)
  • Data types in OA vs not (microarray, Protein-to-GO output)
    • Microarray data - Wen can write script to generate stats for microarray curation
  • Curation stats per paper (Raymond): write comments in a remark field; traceable, transparent, available
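
The questions above (papers flagged vs. curated, objects per type, objects per paper) could be answered from a simple tally. This is only a minimal sketch: the row layout, the `flagged` mapping, the function name `curation_stats`, and all IDs are hypothetical placeholders, not the actual Postgres/OA schema.

```python
from collections import Counter
from statistics import mean

# Hypothetical curated rows: (paper_id, data_type, object_id).
curated = [
    ("paperA", "rnai", "obj1"),
    ("paperA", "rnai", "obj2"),
    ("paperB", "rnai", "obj3"),
    ("paperB", "expression", "obj4"),
]

# Hypothetical flagging results: data type -> papers flagged for it
# (all flagging methods combined; per-method sets would work the same way).
flagged = {
    "rnai": {"paperA", "paperB", "paperC"},
    "expression": {"paperB"},
}


def curation_stats(curated, flagged, data_type):
    """Summarize flagged vs. curated papers and objects per paper for one data type."""
    per_paper = Counter(paper for paper, dtype, _ in curated if dtype == data_type)
    flagged_papers = flagged.get(data_type, set())
    curated_papers = set(per_paper)
    return {
        "papers_flagged": len(flagged_papers),
        "papers_curated": len(curated_papers),
        "papers_flagged_not_curated": len(flagged_papers - curated_papers),
        "objects": sum(per_paper.values()),
        "mean_objects_per_paper": mean(per_paper.values()) if per_paper else 0.0,
        # Distribution: objects-per-paper value -> number of papers with that value.
        "distribution": Counter(per_paper.values()),
    }


stats = curation_stats(curated, flagged, "rnai")
```

The same function could back either a static page (run once per data type) or an interactive form (run on request).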


Process pages

  • GPML - may be able to automatically map Postgres data to GPML
  • WikiPathways has color, formatting, labels, etc. so we can define types/views of different relationships (should follow standards)
  • AWC-ON/AWC-OFF sample page: only single connection type; can be many different types, as we define it
  • "Too many arrows" editorial?


Physical interaction curation

  • On interaction OA (Tab2) physical interactions; 'Colocalizes', etc.
  • Need to establish our data exchange with BioGRID; get BioGRID data in Postgres/OA (daily cronjob?)
  • Interaction model was modeled to be compatible with BioGRID's data
  • Revive curator first pass for physical interactions? To tag papers (in the meantime)


October 11, 2012

Molecules

  • Changing to WBMolIDs
  • OA dumpers need to be modified


GO Meeting

  • How to make GO more expressive/inclusive
  • How can GO represent more of the biology
  • GO Extensions: LEGO
  • Deeper annotations can help develop the "big picture"
  • Phenotype-2-GO mapping: What is the purpose? How does it work?
    • Broad vs specific term mapping; how good is the mapping?


CSHace merge with CITace?

  • PCR_product info discrepancies
  • CITace data was overwriting CSHace data, but this has been changed
  • 8 classes of data in CSHace
  • CSHace data had never been read into CITace minus
  • Proposal: read CSHace data into ACEDB and then read in CITace minus data (which would overwrite CSHace data with CITace data where there is a conflict)
  • Can we assume that all CITace data is better/newer/more-up-to-date than CSHace data?
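
The proposed load order amounts to "last write wins". The sketch below illustrates that behavior and also records each conflict, which may help with the open question of whether CITace data is always the better value. It assumes, purely for illustration, that each data set can be flattened to a mapping from (object, tag) to value; `merge_in_load_order` and the sample entries are hypothetical and do not reflect how ACEDB actually reads .ace files.

```python
def merge_in_load_order(cshace, citace_minus):
    """Load CSHace first, then CITace minus; the later load wins on conflict.

    Returns the merged data plus a dict of conflicts so overwritten
    CSHace values can be reviewed rather than silently lost.
    """
    merged = dict(cshace)
    conflicts = {}
    for key, value in citace_minus.items():
        if key in merged and merged[key] != value:
            conflicts[key] = (merged[key], value)  # (CSHace value, CITace value)
        merged[key] = value
    return merged, conflicts


# Hypothetical sample data keyed by (object, tag).
cshace = {
    ("pcr_product_1", "Remark"): "value from CSHace",
    ("pcr_product_2", "Remark"): "only in CSHace",
}
citace_minus = {
    ("pcr_product_1", "Remark"): "value from CITace",
    ("pcr_product_3", "Remark"): "only in CITace",
}

merged, conflicts = merge_in_load_order(cshace, citace_minus)
```

Entries present only in CSHace survive the merge; only conflicting keys take the CITace value.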


Consensus Sequence tag in PWM model

  • OK to add, but leave as 'Text' not '?Text'
  • DNA_text field instead? Xiaodong will run by EBI
  • This data is denormalized from a PWM if both are provided
  • Should only be used if no PWM is provided
  • Perhaps this should not be added
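
To illustrate why a consensus sequence is redundant when a matrix is present, here is a minimal sketch that derives one from a position frequency matrix. It assumes columns are count lists in A, C, G, T order and simply takes the most frequent base at each position; a real consensus would likely use IUPAC ambiguity codes. The function name and the sample matrix are hypothetical.

```python
BASES = "ACGT"


def consensus_from_pfm(pfm):
    """Return the most frequent base at each position.

    pfm is a list of columns, each a list of counts in A, C, G, T order.
    Ties are resolved in favor of the base that comes first in BASES.
    """
    return "".join(
        BASES[max(range(len(BASES)), key=lambda i: column[i])] for column in pfm
    )


# Hypothetical 4-position matrix.
pfm = [
    [8, 1, 0, 1],
    [0, 0, 10, 0],
    [1, 7, 1, 1],
    [2, 2, 2, 4],
]

consensus = consensus_from_pfm(pfm)
```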


October 25, 2012

Curation Stats Table


WS235 Upload next Thursday (Nov 1)


Microarrays

  • 24 new datasets
  • Five new species
  • We want to perform cluster analysis for end users (SPELL itself not capable)


PWM/PFM data

  • ~5000 PFM/PWM data objects in .ace files from Stormo data set
  • How should the motif finder tool work? (without using 1000s of GBrowse tracks?)
  • Single GBrowse track that consolidates all PFMs/PWMs; one track with all 'binding sites' above threshold?
  • Another tool to look for, for example, co-occurrence?
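
One way to picture the single consolidated track is to scan a sequence with every matrix and keep only hits above a score threshold, merged into one sorted feature list. The sketch below rests on several assumptions: log-odds scoring against a uniform background, a pseudocount of 1, forward strand only, and 0-based half-open coordinates. All function and motif names are hypothetical, and none of this is tied to GBrowse itself.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert count columns (A, C, G, T order) to log2-odds scores."""
    pwm = []
    for column in pfm:
        total = sum(column) + len(BASES) * pseudocount
        pwm.append({
            base: math.log2(((count + pseudocount) / total) / background)
            for base, count in zip(BASES, column)
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Yield (start, site, score) for forward-strand windows scoring >= threshold."""
    width = len(pwm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other symbols
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            yield start, window, score


def consolidated_track(sequence, motifs, threshold):
    """Merge hits from all motifs into one list of (start, end, motif, score)."""
    features = []
    for name, pfm in motifs.items():
        for start, site, score in scan(sequence, pfm_to_pwm(pfm), threshold):
            features.append((start, start + len(site), name, round(score, 2)))
    return sorted(features)


# Hypothetical motif that strongly prefers "ACG".
motifs = {"motif_1": [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0]]}
features = consolidated_track("TTACGTT", motifs, threshold=3.0)
```

Because every feature carries its motif name and position, a co-occurrence tool could work from the same list by looking for features from different motifs that fall within a chosen distance of each other.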


Disease info

  • Adding 'Disease_relevant_info' tag in Gene model
  • Human ortholog pulled from OMIM by Hinxton/EBI


Ranjana will do a blog item every Thursday

  • Doing series heading, e.g. "Developments to GO"
    • Updates could fall under same headings to organize blog posts


New RNAi clones from Ahringer lab

  • Working with Kevin Howe as to how the new PCR_product and Oligo objects should be created and stored
  • Ultimately would like (for all RNAi clones) to have a one-to-one mapping of best RNAi clones for a gene
  • Gene pages could then display "THE" RNAi clone for a gene per library (e.g. Vidal or Ahringer), along with resource address/location
  • If multiple clones per gene, indicate which clone is unique to that gene
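
The per-gene clone display could be prototyped as below. This sketch assumes a mapping from each clone to the set of genes it targets, and treats "unique" as "targets exactly one gene". The notes do not define what makes a clone "best", so the selection rule here (prefer a gene-specific clone, then sort by name) is only a placeholder. All names are hypothetical; running it once per library would give one clone per gene per library.

```python
from collections import defaultdict


def clones_by_gene(clone_targets):
    """Map each gene to its clones as (clone, is_unique_to_gene) pairs."""
    by_gene = defaultdict(list)
    for clone, genes in sorted(clone_targets.items()):
        is_unique = len(genes) == 1  # clone targets this gene and no other
        for gene in sorted(genes):
            by_gene[gene].append((clone, is_unique))
    return dict(by_gene)


def pick_clone(candidates):
    """Placeholder rule: prefer a gene-specific clone, then the first by name."""
    return min(candidates, key=lambda c: (not c[1], c[0]))[0]


# Hypothetical targets for a single library.
clone_targets = {
    "cloneA": {"gene1"},
    "cloneB": {"gene1", "gene2"},
    "cloneC": {"gene2", "gene3"},
}

by_gene = clones_by_gene(clone_targets)
best = {gene: pick_clone(candidates) for gene, candidates in by_gene.items()}
```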


Protein-2-GO Tool

  • Gene-product-information file
  • Working with other MODs


Transgenes up to date


Legacy phenotypes

  • Still have ~800 legacy phenotypes to add through phenotype curation


Process pages

  • Karen writing summaries for processes
  • People can help cull the list and/or help write summaries
  • Look at Processes in the Process OA by querying for "Karen" as a curator
  • Connecting to relevant papers (reviews, research articles, WormBook chapters/abstracts)
  • Linking to WikiPathways, Reactome?
  • What should be editable by users/broader community?