WormBase-Caltech Weekly Calls

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings


2014 Meetings

January

February

March

April

May

June

July


August 7, 2014

  • Topic Curation
    • Working out pipeline for curation
    • Collecting model/pathway diagrams from papers and reviews: should we make ?Picture objects for these?
      • Do we want to display these published diagrams on ?Paper pages and/or ?Topic pages
      • We need to determine copyrights/accessibility to images
    • Would be good to automate identification of Review articles (or "non-primary" in general) in the Topic OA
      • This could be done at the bulk import step
      • Would require a new (possibly read-only) field in the Topic OA to indicate "primary" or "non-primary"
    • Chris will look into whether "non-primary" articles are automatically excluded from the Curation Status Form
    • The "Curation Status Omit" toggle in the Topic OA may become obsolete once we have a field/column for primary/non-primary status
    • Current topic: Wnt signaling
    • We have a list of 282 papers from a PubMed search using "Wnt" as a search term (not a MeSH term) and C. elegans as a MeSH term
    • Finding all papers for a category/topic remains an ad hoc approach; how easy it is to find all relevant papers varies by topic
    • We can create a Textpresso category for topics, like "Wnt signaling"


  • WormBase Ontology Browser
    • Needs documenting
    • Once everything is clarified, will be pushed to the live site
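
A minimal sketch of how the "primary"/"non-primary" flag discussed under Topic Curation could be set at the bulk import step. It assumes each imported paper record carries a list of PubMed publication types. The record layout, the `primary_status` field name, and the set of types treated as non-primary are all hypothetical and are not part of the current Topic OA.

```python
# Publication types treated as "non-primary" (an assumed, adjustable list).
NON_PRIMARY_TYPES = {"review", "comment", "editorial", "news"}


def classify_paper(publication_types):
    """Return 'non-primary' if any publication type is in the assumed set."""
    normalized = {ptype.strip().lower() for ptype in publication_types}
    if normalized & NON_PRIMARY_TYPES:
        return "non-primary"
    return "primary"


def tag_records(records):
    """Return copies of the records with a hypothetical 'primary_status' field."""
    return [
        {**rec, "primary_status": classify_paper(rec.get("publication_types", []))}
        for rec in records
    ]


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    papers = [
        {"paper_id": "example-1", "publication_types": ["Journal Article"]},
        {"paper_id": "example-2", "publication_types": ["Journal Article", "Review"]},
    ]
    for rec in tag_records(papers):
        print(rec["paper_id"], rec["primary_status"])
```

A read-only column populated this way would let curators filter on status instead of toggling "Curation Status Omit" by hand.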

August 14, 2014

  • WormBase Ontology Browser (WOBr)
    • Should be ready but may need some additional testing before pushing to staging site
    • Juancarlos will push to staging during meeting
    • Curators should test this afternoon (on staging) and report any issues before 5pm
  • Database call this morning (and every Thursday that doesn't have a site-wide call)
    • Thomas Down testing Datomic (costs money) and Neo4J (very slow)
    • May look again at DynamoDB
    • No DB has been officially chosen
    • Will set up Amazon AWS for collaboration
    • Expand smallace into a bigger mediumace
    • Run the demo off one database for the advisory board meeting
  • SAB
    • Paul Sternberg back next week; will settle travel plans then
    • What advisors are attending?
    • What feedback do we want to get from advisors?
    • How to show curation/DB progress? What stats/numbers to show?
    • We want big picture feedback from biologist advisors; what's useful to the community? What should we prioritize?
    • Karen: Perhaps a question to ask is "What are the main questions they are trying to answer when they go to the website?" When they explore a gene or protein function, what do they want to see, and how? I don't think we are missing information so much as lacking integration of the information at the model level; for example, variation phenotype effects linked to altered protein domain function
  • Belated apologies from Mary Ann - clash of evening events.

August 21, 2014

  • Generating a Site-Map for WormBase
    • Use a crawler to generate? Output would need to be made human-readable
    • We could use the legacy site as a site map
  • Citace upload report modifications
    • Wiki page here: Citace_upload_report
    • Goals of this report
      • Summary of the uploaded data classes/objects - this summary should be blind to requests from curators and should alert Caltech to missing data classes or severe changes in numbers of objects within preexisting data classes
      • Summary of curation work - this summary should be curator driven; in some cases it will require a more involved AcePerl query to get at the actual annotations rather than a straight count of data class objects.
    • Can we get a comparison for those data that are curated through postgres? It would be very helpful to be able to compare the changes in postgres with the changes in the Citace upload.
    • Can we automate the generation of this report to make it easier to change and track?
    • Whether the report is generated manually or automatically, when there is a model change or a data class is added or removed, the responsible curator needs to inform Wen so the report can be modified to match.
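
A minimal sketch of the first report goal above: comparing per-class object counts between two uploads and flagging missing classes or severe changes. It assumes the counts are already available as simple class-to-count mappings. The 20% threshold, the function name, and the example numbers are illustrative assumptions, not figures from an actual Citace upload.

```python
def compare_class_counts(previous, current, threshold=0.2):
    """Return (class, message) alerts for missing, new, or sharply changed classes.

    previous/current map data class names to object counts; threshold is the
    assumed fractional change that counts as "severe".
    """
    alerts = []
    for cls in sorted(set(previous) | set(current)):
        old = previous.get(cls)
        new = current.get(cls)
        if new is None:
            alerts.append((cls, "missing from current upload"))
        elif old is None:
            alerts.append((cls, "new class"))
        elif abs(new - old) / max(old, 1) > threshold:
            alerts.append((cls, f"count changed {old} -> {new}"))
    return alerts


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    last_upload = {"Gene": 100, "RNAi": 50, "Picture": 10}
    this_upload = {"Gene": 101, "RNAi": 20, "Topic": 5}
    for cls, message in compare_class_counts(last_upload, this_upload):
        print(f"{cls}: {message}")
```

The same comparison could in principle be run against counts taken from postgres, which would give the postgres-versus-upload view requested above.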

August 28, 2014

  • Curation Statistics for SAB
    • Curators should send Chris all stats for their respective data types: total papers curated, total backlog, false positives
    • Can we ignore SVM results for certain data types?
    • Can we include Textpresso search results for relevant data types?
    • Curators that use a Textpresso pipeline: Karen, Ranjana, Xiaodong, (Daniela?), Mary Ann
    • Can we get detailed web usage statistics on particular datatypes?
    • We want to articulate our priorities to the SAB; get feedback
    • RNAi curation could get up to speed in 5 years if we have two FTEs on RNAi curation
    • Are there certain genes that have less phenotype coverage that we should prioritize?
  • Database migration call
    • Candidates: MongoDB, CouchDB, Neo4J, Datomic
    • Neo4J likely ruled out because of slow performance
    • Will compare performance of Datomic vs. Postgres and ACEDB, etc.
    • Datomic has good history tracking
    • Thomas Down has experience with Datomic
    • Probably won't go with a relational database
    • We should use the Gene page (webpage) as a demo/example of what we want to try to emulate
    • Adam (from Lincoln's group) working with OrientDB (graph database)
  • Citace Upload Report
    • Classes/datatypes missing from Citace Upload Report
    • Karen started a Wiki page to capture this info: Citace_upload_report
    • Curators should take a look and make sure it is properly filled out for their data types
    • Columns are present in the table to make requests for certain numbers in Citace Upload Report and/or the Build Report
    • Wen will take requests for queries to Citace etc. to add data to report
  • UniProt linking to and from WormBase
  • Paul S going to NIH for data science meeting next week
  • FlyBase pushing human disease curation
  • LEGO backend updates
    • Still need to get some backend logistics sorted out
    • Communication between Michael Muller and Chris Mungall
    • OA-like interface for Noctua?