Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
[[WormBase-Caltech_Weekly_Calls_January_2011|January]]
 
  
 
[[WormBase-Caltech_Weekly_Calls_February_2011|February]]
== February 2, 2012 ==
 
 
 
 
 
EPIC data into WormBase
 
*Daniela spoke to John Murray
 
*Need to modify model and display for data
 
*3D movies as ?Movie objects
 
 
 
 
 
Endrov
 
*Tom Burglin and student Johan Henriksson developed Endrov.net (http://www.endrov.net)
 
*May be able to incorporate Endrov visualizations of Blender model and cell lineage into WormBase website?
 
*Need to talk to Todd and Web team
 
 
 
 
 
Elsevier legal issues
 
*ScienceDirect website links to and from WormBase website
 
*Daniela still working with contacts at MGI (Mouse Genome Informatics, The Jackson Laboratory) & Elsevier
 
*MGI already has established links with Elsevier
 
 
 
 
 
New ?Interaction model
 
*Update old objects and speak to web team about new model before officially incorporating new ?Interaction model
 
 
 
 
 
 
 
== February 9, 2012 ==
 
 
 
Interaction model
 
*Added some XREFs (in Interactor tag) and Chromatin_IP to Detection method
 
*Co-Immunoprecipitation would be captured under Detection method "Affinity Capture Western"
 
*Worked out changes needed for old *.ACE files to fit new model; will give to Juancarlos to script
 
 
 
 
 
Transgene
 
*Daniela, Juancarlos, and Karen imported Extrachromosomal array transgenes into the Transgene OA
 
*Extrachromosomal array transgenes for which authors have not provided a name will be named Expr####_Ex
 
*Maybe we will name according to paper e.g. "WBPaper########_Ex####"
 
*Daniela and Karen will discuss cost-benefit of objectifying Ex transgenes
 
*Rather than determine transgene sequence (or partial sequence), curators will (as they have been for Expr_pattern) add free-text describing the sequence
 
**This includes: primer sets, restriction digest sites, etc.
 
**Continue to place this info in the Reporter_gene tag
 
**Maybe add an additional free-text field (Sequence_info tag?)
 
 
 
 
 
Displaying curator names on new website
 
*Should we display curators' names on their curated objects?
 
*Currently only on concise description; why not do all objects?
 
*Keep curator name info only internal?
 
*Prevent curator names from being dumped for build/release?
 
*Should include 'Date-last-updated' Evidence dump
 
*Curator confirmed note could be placed in Tree View for all data types
 
 
 
 
 
Incremental Updates
 
*Should we perform incremental updates?
 
*Update as frequently as possible?
 
*Web display (of updates) served from Postgres/Tazendra?
 
*Serve RESTful widget from Caltech?
 
*Caltech in agreement about pushing forward with incremental updates
 
 
 
== February 16, 2012 ==
 
 
 
 
 
Gene product annotation for variation
 
*Variation affects gene product: absent, dysfunctional, isoform-specific effects
 
*How is this best captured?
 
*Report as a phenotype? "RNA expression variant", "protein expression variant", etc...
 
*Capture as a gene_regulation event? Soon will be interaction object...
 
*Sequence ontology?
 
*Captured in ?Variation?
 
*Ask Hinxton/Mary Ann Tuli
 
 
 
 
 
Interaction model
 
*Old objects updated OK and read into ACEDB without problems
 
*Need to discuss updates with Web team
 
*Will send all updated *.ACE files with old Gene_regulation, Interaction, and YH objects
 
*Add two zeros to the Interaction IDs in postgres OA tables once all current interaction objects uploaded to the Interaction OA
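The ID change in the last bullet can be sketched as follows. This is a minimal illustration, not the script used on the OA tables: it assumes an Interaction ID is a text prefix followed by a run of digits (the example value `WBInteraction0000123` is hypothetical) and that the two zeros go at the front of the numeric part. Neither detail is stated in these notes.

```python
import re

# Assumed ID layout (not confirmed by the notes): letters, then digits,
# for example "WBInteraction0000123".
ID_PATTERN = re.compile(r"^(?P<prefix>[A-Za-z_]+)(?P<digits>\d+)$")


def add_two_zeros(interaction_id: str) -> str:
    """Return the ID with two extra zeros at the front of its numeric part."""
    match = ID_PATTERN.match(interaction_id)
    if match is None:
        raise ValueError(f"Unexpected Interaction ID format: {interaction_id!r}")
    return f"{match['prefix']}00{match['digits']}"


if __name__ == "__main__":
    # Hypothetical example ID, used only to show the transformation.
    print(add_two_zeros("WBInteraction0000123"))
```

Raising an error on IDs that do not match the assumed layout keeps a bulk update from silently rewriting values that were already in another format.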
 
 
 
 
 
Transgene naming
 
*New extrachromosomal arrays will get "WBPaper###_Ex###" type of ID
 
 
 
 
 
SPELL Issues
 
*Problems coming up as data size increasing
 
*SPELL server at Amazon running at lowest paid instance ($1000/yr)
 
*4GB memory needed (more than a 32-bit machine can provide) for loading data
 
*64-bit machine will cost $4000 per year
 
*Run SPELL on canopus? Yes, but we would have to take care of sys-admin issues
 
*$500/quarter for IMSS machine maintenance
 
*Talk to Matt Hibbs? Maybe need to optimize the process at that step
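The memory and cost figures above can be checked with a little arithmetic. The sketch below is only illustrative: the dollar amounts are the ones quoted in the bullets, the dictionary labels are our own wording, and the $500/quarter IMSS figure is simply multiplied by four to put it on a yearly basis.

```python
BYTES_PER_GIB = 2**30


def address_space_gib(pointer_bits: int) -> float:
    """Total addressable memory, in GiB, for a given pointer width."""
    return 2**pointer_bits / BYTES_PER_GIB


# Yearly costs in USD, taken from the figures quoted in the notes.
annual_costs_usd = {
    "Amazon, lowest paid instance (current)": 1000,
    "Amazon, 64-bit instance": 4000,
    "IMSS machine maintenance ($500/quarter)": 500 * 4,
}

if __name__ == "__main__":
    # A 32-bit address space tops out at 4 GiB in total, and a single
    # process usually gets less, so a 4GB data load does not fit.
    print(f"32-bit address space: {address_space_gib(32):.0f} GiB")
    for option, cost in annual_costs_usd.items():
        print(f"{option}: ${cost}/year")
```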
 
 
 
 
 
BioCreative/BioCurator meetings
 
*Kimberly working with DictyBase
 
*Kimberly will give 2 talks (CCC and molecular function automated curation)
 
*WormBase workflow; Kimberly will discuss with individual curators in March
 
*Arun submitted abstract
 
 
 
 
 
GO Consortium meeting in a week and a half
 
*Paul, Michael (Muller) and Kimberly going
 
 
 
 
 
Human diseases will be objectified
 
 
 
 
 
 
 
== February 23, 2012 ==
 
 
 
 
 
BioCurator Meeting
 
*Karen, Arun, Kimberly, Michael and Yuling are going
 
*QCFast poster accepted for presentation
 
 
 
 
 
GO Consortium meeting this weekend
 
*Things to bring up at meeting?
 
*Responses to questionnaires
 
 
 
 
 
Purchase OA Domain name?
 
*Move to a more formal link aside from mangolassi
 
*wormbase domain name?
 
*Cost? ~$11/year
 
*"Ontology Annotator" may not be best name
 
*"Curatool" and "Biocuratool" available; the "curinator"? Curate at your "curinal"? ;)
 
*Currently only a static site
 
*GO Consortium may want to use the tool
 
 
 
 
 
GO Upload
 
*Going back and forth on deciding frequency of upload
 
*Currently back to two-month uploads
 
*We can certainly change frequency
 
*Two-month cycle for upload (in sync with WormBase) is too long a cycle for GO curation
 
*Change GO curation upload to once per month (twice per month?)
 
*Things will change when curating through a GO curation interface
 
 
 
 
 
SPELL server on Amazon
 
*1) Existing paid service doesn't seem to be stable
 
**Looking at log file did not reveal anything
 
**No reply from Matt Hibbs
 
*2) Amazon installation cannot be updated to WS229 because the dataset demands more memory
 
**Raymond tried to install a 64-bit machine (free); wouldn't work
 
*We're stuck without more technical support (wrt Amazon service)
 
*Host ourselves (on canopus?) until OICR takes over? (When OICR is ~done with Beta site)
 
*Host through IMSS?
 
*WormMart is the only function of caprica; maybe setup SPELL on caprica for time being (next couple of months)
 
*Will Gary Williams et al add more RNA-Seq data?
 
*Athena (8GB memory?)
 
*Instead of virtual machines, have one machine that does everything
 
*SPELL usage? Not very much, but several consistent users (300-900 queries per month)
 
*Farm out to Matt Hibbs? Matt not serving data
 
*Who is in charge of SPELL at SGD? How does SGD feel about SPELL? Ask Mike Cherry
 
*Problem with hosting at OICR if SPELL needs tinkering...
 
*Can Wen access and manipulate if at OICR? Not easy
 
*SPELL has LOTS of data (millions of lines)
 
*Will try to run locally for short term; meanwhile look for more resilient plan
 
*Kimberly can ask Cara Delinsky (sp?) about SPELL
 
 
 
 
 
RNAi parsing script
 
*Wen would like to work on
 
*Updated more than a year ago
 
*Should be able to parse interaction data directly into OA
 
*Would like to handle new variations and transgenes
 
*Script should look in Postgres tables
 
*Still need to deal with DNA sequence text mapping to genome/genes
 
*Elbrus a very old machine; very slow
 
*Install ace-server on tazendra?
 
*Ideal: OA takes/handles bulk of data; just run a script on the side to handle mapping DNA sequence to the genome
 
*We will build the RNAi OA
 
 
 
 
 
Transgene naming strategy
 
 
 
*We will stick to the Expr1234_Ex naming as naming after WBPaper was giving problems (i.e. dealing with 430 objects with no Paper attached and other minor issues).
 
*For a complete record of the process check wiki: http://wiki.wormbase.org/index.php/Expression_Pattern#Exporting_Reporter_Gene_description_from_Expr_pattern_OA_to_Transgene_OA
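The Expr1234_Ex convention can be illustrated with a short sketch. It assumes the name is just the Expr_pattern number wrapped in the `Expr` prefix and `_Ex` suffix, with no zero padding; the function names are hypothetical and are not taken from the actual export script described on the wiki page above.

```python
import re
from typing import Optional

# Pattern for names of the form "Expr1234_Ex".
EX_NAME = re.compile(r"^Expr(\d+)_Ex$")


def ex_transgene_name(expr_number: int) -> str:
    """Build the placeholder name for an unnamed extrachromosomal array."""
    if expr_number < 1:
        raise ValueError("Expr_pattern number must be positive")
    return f"Expr{expr_number}_Ex"


def expr_number_from_name(name: str) -> Optional[int]:
    """Recover the Expr_pattern number, or None if the name has another form."""
    match = EX_NAME.match(name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    name = ex_transgene_name(1234)
    print(name, expr_number_from_name(name))
```

Because the name is derived only from the Expr_pattern number, it can be generated for every object, including those with no paper attached.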
 
 
 
  
  

Revision as of 17:48, 1 March 2012

2009 Meetings

2011 Meetings


2012 Meetings

January

February


March 1, 2012

GO Meeting

  • Focused on annotation pipelines; improving efficiency/effectiveness
  • How to make GO annotations more 'expressive'
  • GO would like to move towards more expressive statements
  • Example: If a gene is involved in a function or process, where in the cell does this take place
  • Common Annotation Framework
  • Current/future members of the GO network can annotate using the same version of GO, same tools and standards
  • Quality controls checks: e.g. do you have all the fields necessary to make an annotation
  • GO hopes to centralize all of the data handling, formatting
  • LEGO - Logical Extensions of GO
  • We should pilot how we want to handle this; similar to how concise descriptions are constructed
  • WormBase curates phenotypes, pathways, etc.
  • Defining useful relationships to curate/annotate: Cross-products with defined relations
  • Pilot: Take subdomains, pathways, try extended version of curation on these
  • How do we capture that fly eye development is relevant to human biology?
    • Humans don't have compound eyes - not the point
    • The pathways are the same or similar; EGF signaling
  • WormBase Process curation could really benefit from GO's adoption of this strategy
  • Need to consider what the "right" way to approach this issue is; need good pilot
  • Where is the value? How do we focus on this?
  • Another annotation pipeline: Phylogenetic Annotation and INference Tool (PAINT)
    • How best to make these inferences?
    • What kind of inferences can you make about organismal- or organ-specific processes?
      • Uberon has framework for interspecies anatomical comparisons
    • PAINT tool for nematodes?


Upload for WS231

  • Interaction file upload took several hours
  • Check if virtual memory is being used
  • Likely culprit is the extra data and XREFs in the Interactor_info hash
  • Can objectify the Interactor_info to be a tag in the main ?Interaction model
  • We should warn EBI/Hinxton about this


WormBase Curator Interview next Thursday


Migration of Reporter_gene object annotations from Expr_pattern OA to Transgene OA

  • Everything seems OK


SPELL

  • For papers with fewer than three experiments, statistics calculations cause slow-down and run into memory limitations
  • Now can bypass this problem
  • We are now operating SPELL on our local machines
  • Do Amazon instances function/behave differently than local server?
    • Need to compare; find benefits & drawbacks
  • Use Amazon server as a dynamic name server
  • Users shouldn't notice a difference
  • We won't need to ask Todd for anything; we can fix it ourselves


GO Meeting breakout session

  • Software architecture for upcoming GO expansion (CAT - Common Annotation Tool)
  • How does Textpresso integrate?
  • What kind of annotation would GO expect Textpresso to do?
  • User will be able to do guided text mining operations
    • Example: regular expressions, then HMM, then export to CAT
  • No foreseeable roadblocks
  • Maybe standardize all of the text mining types and methods behind them
  • Develop paper-viewer? Apart from CAT, text mining flow? Separate module