Difference between revisions of "WormBase-Caltech Weekly Calls"
From WormBaseWiki
Revision as of 17:37, 1 March 2012
2012 Meetings
February 2, 2012
EPIC data into WormBase
- Daniela spoke to John Murray
- Need to modify model and display for data
- 3D movies as ?Movie objects
Endrov
- Tom Burglin and student Johan Henriksson developed Endrov.net (http://www.endrov.net)
- May be able to incorporate Endrov visualizations of Blender model and cell lineage into WormBase website?
- Need to talk to Todd and Web team
Elsevier legal issues
- ScienceDirect website links to and from the WormBase website
- Daniela still working with contacts at MGI (Mouse Genome Informatics, The Jackson Laboratory) & Elsevier
- MGI already has established links with Elsevier
New ?Interaction model
- Update old objects and speak to web team about new model before officially incorporating new ?Interaction model
February 9, 2012
Interaction model
- Added some XREFs (in Interactor tag) and Chromatin_IP to Detection method
- Co-Immunoprecipitation would be captured under Detection method "Affinity Capture Western"
- Worked out changes needed for old *.ACE files to fit new model; will give to Juancarlos to script
Transgene
- Daniela, Juancarlos, and Karen imported Extrachromosomal array transgenes into the Transgene OA
- Extrachromosomal array transgenes for which authors have not provided a name will be named Expr####_Ex
- Maybe we will name according to paper e.g. "WBPaper########_Ex####"
- Daniela and Karen will discuss cost-benefit of objectifying Ex transgenes
- Rather than determine transgene sequence (or partial sequence), curators will (as they have been for Expr_pattern) add free-text describing the sequence
- This includes: primer sets, restriction digest sites, etc.
- Continue to place this info in the Reporter_gene tag
- Maybe add an additional free-text field (Sequence_info tag?)
Displaying curator names on new website
- Should we display curators' names on their curated objects?
- Currently only on concise description; why not do all objects?
- Keep curator name info only internal?
- Prevent curator names from being dumped for build/release?
- Should include 'Date-last-updated' Evidence dump
- Curator confirmed note could be placed in Tree View for all data types
Incremental Updates
- Should we perform incremental updates?
- Update as frequently as possible?
- Web display (of updates) served from Postgres/Tazendra?
- Serve RESTful widget from Caltech?
- Caltech in agreement about pushing forward with incremental updates
February 16, 2012
Gene product annotation for variation
- Variation affects gene product: absent, dysfunctional, isoform-specific effects
- How is this best captured?
- Report as a phenotype? "RNA expression variant", "protein expression variant", etc...
- Capture as a gene_regulation event? Soon will be interaction object...
- Sequence ontology?
- Captured in ?Variation?
- Ask Hinxton/Mary Ann Tuli
Interaction model
- Old objects updated OK and read into ACEDB without problems
- Need to discuss updates with Web team
- Will send all updated *.ACE files with old Gene_regulation, Interaction, and YH objects
- Add two zeros to the Interaction IDs in postgres OA tables once all current interaction objects uploaded to the Interaction OA
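A minimal sketch of the ID change in the last item above. It assumes Interaction IDs have the form "WBInteraction" followed by digits and that the two zeros are added as leading zeros of the numeric part; the function name and the ID format are assumptions for illustration, not taken from the OA code.

```python
import re

# Assumed ID shape: the literal prefix "WBInteraction" followed by digits.
_INTERACTION_ID = re.compile(r"^(WBInteraction)(\d+)$")


def pad_interaction_id(interaction_id: str, extra_zeros: int = 2) -> str:
    """Return the ID with `extra_zeros` leading zeros added to its numeric part."""
    match = _INTERACTION_ID.match(interaction_id)
    if match is None:
        raise ValueError(f"Unexpected Interaction ID format: {interaction_id!r}")
    prefix, digits = match.groups()
    return f"{prefix}{'0' * extra_zeros}{digits}"


if __name__ == "__main__":
    print(pad_interaction_id("WBInteraction0000123"))
```

Rejecting IDs that do not match the assumed pattern keeps a bulk update from silently rewriting values that are already in the new format or malformed.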
Transgene naming
- New extrachromosomal arrays will get "WBPaper###_Ex###" type of ID
SPELL Issues
- Problems are coming up as data size increases
- SPELL server at Amazon running at lowest paid instance ($1000/yr)
- 4GB memory needed (more than a 32-bit machine can provide) for loading data
- 64-bit machine will cost $4000 per year
- Run SPELL on canopus? Yes, but we would have to take care of sys-admin issues
- $500/quarter for IMSS machine maintenance
- Talk to Matt Hibbs? Maybe need to optimize the process at that step
BioCreative/BioCurator meetings
- Kimberly working with DictyBase
- Kimberly will give 2 talks (CCC and molecular function automated curation)
- WormBase workflow; Kimberly will discuss with individual curators in March
- Arun submitted abstract
GO Consortium meeting in a week and a half
- Paul, Michael (Muller) and Kimberly going
Human diseases will be objectified
February 23, 2012
BioCurator Meeting
- Karen, Arun, Kimberly, Michael and Yuling are going
- QCFast poster accepted for presentation
GO Consortium meeting this weekend
- Things to bring up at meeting?
- Responses to questionnaires
Purchase OA Domain name?
- Move to a more formal link aside from mangolassi
- wormbase domain name?
- Cost? ~$11/year
- "Ontology Annotator" may not be best name
- "Curatool" and "Biocuratool" available; the "curinator"? Curate at your "curinal"? ;)
- Currently only a static site
- GO Consortium may want to use the tool
GO Upload
- Going back and forth on deciding frequency of upload
- Currently back to two-month uploads
- We can certainly change frequency
- Two-month cycle for upload (in sync with WormBase) is too long a cycle for GO curation
- Change GO curation upload to once per month (twice per month?)
- Things will change when curating through a GO curation interface
SPELL server on Amazon
- 1) Existing paid service doesn't seem to be stable
- Looking at log file did not reveal anything
- No reply from Matt Hibbs
- 2) Amazon installation cannot be updated to WS229 because the dataset demands more memory
- Raymond tried to install a 64-bit machine (free); wouldn't work
- We're stuck without more technical support (wrt Amazon service)
- Host ourselves (on canopus?) until OICR will take over? (When OICR is ~done with Beta site)
- Host through IMSS?
- WormMart is the only function of caprica; maybe set up SPELL on caprica for the time being (next couple of months)
- Will Gary Williams et al add more RNA-Seq data?
- Athena (8GB memory?)
- Instead of virtual machines, have one machine that does everything
- SPELL usage? Not very much, but several consistent users (300-900 queries per month)
- Farm out to Matt Hibbs? Matt not serving data
- Who is in charge of SPELL at SGD? How does SGD feel about SPELL? Ask Mike Cherry
- Problem with hosting at OICR if SPELL needs tinkering...
- Can Wen access and manipulate if at OICR? Not easy
- SPELL has LOTS of data (millions of lines)
- Will try to run locally for short term; meanwhile look for more resilient plan
- Kimberly can ask Cara Delinsky (sp?) about SPELL
RNAi parsing script
- Wen would like to work on this
- Last updated more than a year ago
- Should be able to parse interaction data directly into OA
- Would like to handle new variations and transgenes
- Script should look in Postgres tables
- Still need to deal with DNA sequence text mapping to genome/genes
- Elbrus is a very old machine; very slow
- Install ace-server on tazendra?
- Ideal: OA takes/handles bulk of data; just run a script on the side to handle mapping DNA sequence to the genome
- We will build the RNAi OA
Transgene naming strategy
- We will stick to the Expr1234_Ex naming, as naming after WBPaper was causing problems (e.g. dealing with 430 objects with no Paper attached, and other minor issues).
- For a complete record of the process, see the wiki: http://wiki.wormbase.org/index.php/Expression_Pattern#Exporting_Reporter_Gene_description_from_Expr_pattern_OA_to_Transgene_OA
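A minimal sketch of the Expr1234_Ex convention, assuming the name is formed by appending "_Ex" to an Expr_pattern ID of the form "Expr" plus digits. The function name and the validation pattern are assumptions for illustration; the actual migration scripts are documented on the wiki page linked above.

```python
import re

# Assumed Expr_pattern ID shape: "Expr" followed by one or more digits.
_EXPR_ID = re.compile(r"^Expr\d+$")


def ex_transgene_name(expr_pattern_id: str) -> str:
    """Build an extrachromosomal-array transgene name from an Expr_pattern ID."""
    if _EXPR_ID.match(expr_pattern_id) is None:
        raise ValueError(f"Unexpected Expr_pattern ID: {expr_pattern_id!r}")
    return f"{expr_pattern_id}_Ex"


if __name__ == "__main__":
    print(ex_transgene_name("Expr1234"))
```

Because the name depends only on the Expr_pattern ID, objects with no Paper attached can still be named, which is the problem the WBPaper-based scheme ran into.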
March 1, 2012
GO Meeting
- Focused on annotation pipelines; improving efficiency/effectiveness
- How to make GO annotations more 'expressive'
- GO would like to move towards more expressive statements
- Example: If a gene is involved in a function or process, where in the cell does this take place
- Common Annotation Framework
- Current/future members of the GO network can annotate using the same version of GO, same tools and standards
- Quality control checks: e.g. do you have all the fields necessary to make an annotation
- GO hopes to centralize all of the data handling, formatting
- LEGO - Logical Extensions of GO
- We should pilot how we want to handle this; similar to how concise descriptions are constructed
- WormBase curates phenotypes, pathways, etc.
- Defining useful relationships to curate/annotate: Cross-products with defined relations
- Pilot: Take subdomains, pathways, try extended version of curation on these
- How do we capture that fly eye development is relevant to human biology?
  - Humans don't have compound eyes - not the point
  - The pathways are the same or similar; EGF signaling
- WormBase Process curation could really benefit from GO's adoption of this strategy
- Need to consider what the "right" way to approach this issue is; need a good pilot
- Where is the value? How do we focus on this?
- Another annotation pipeline: Phylogenetic Annotation and INference Tool (PAINT)
  - How best to make these inferences?
  - What kind of inferences can you make about organismal- or organ-specific processes?
    - Uberon has framework for interspecies anatomical comparisons
  - PAINT tool for nematodes?
Upload for WS231
- Interaction file upload took several hours
- Check if virtual memory is being used
- Likely culprit is the extra data and XREFs in the Interactor_info hash
- Can objectify the Interactor_info to be a tag in the main ?Interaction model
- We should warn EBI/Hinxton about this
WormBase Curator Interview next Thursday
Migration of Reporter_gene object annotations from Expr_pattern OA to Transgene OA
- Everything seems OK
SPELL
- For papers with fewer than three experiments, statistics calculations cause slow-downs and run into memory limitations
- Now can bypass this problem
- We are now operating SPELL on our local machines
- Do Amazon instances function/behave differently than the local server?
  - Need to compare; find benefits & drawbacks
- Use Amazon server as a dynamic name server
- Users shouldn't notice a difference
- We won't need to ask Todd for anything; we can fix it ourselves
GO Meeting breakout session
- Software architecture for upcoming GO expansion (CAT - Common Annotation Tool)
- How does Textpresso integrate?
- What kind of annotation would GO expect Textpresso to do?
- User will be able to do guided text mining operations
  - Example: regular expressions, then HMM, then export to CAT
- No foreseeable roadblocks
- Maybe standardize all of the text mining types and methods behind them
- Develop paper-viewer? Apart from CAT, text mining flow? Separate module