Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
(588 intermediate revisions by 9 users not shown)
Line 17: Line 17:
 
[[WormBase-Caltech_Weekly_Calls_2017|2017 Meetings]]
 
[[WormBase-Caltech_Weekly_Calls_2017|2017 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
  
GoToMeeting link: https://www.gotomeet.me/wormbase1
+
[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]
  
  
= 2018 Meetings =
+
GoToMeeting link: https://www.gotomeet.me/wormbase1
 
 
[[WormBase-Caltech_Weekly_Calls_January_2018|January]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_February_2018|February]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_March_2018|March]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_April_2018|April]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_May_2018|May]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_June_2018|June]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_July_2018|July]]
 
 
 
  
== August 2, 2018 ==
+
= 2020 Meetings =
  
=== AFP ===
+
[[WormBase-Caltech_Weekly_Calls_January_2020|January]]
  
* The AFP pipeline is currently emailing authors from Karen's e-mail address
 
* Use same e-mail account Chris is using for phenotype community curation requests or create a new account for AFP (gmail)
 
* Can use outreach@wormbase.org for consistency
 
* May use the PMID in the subject line so e-mails will not be all in the same thread
 
* Todd and Chris have email credentials
 
** Chris will send to Valerio, Juancarlos, Daniela, and Kimberly
 
* Let Valerio and Juancarlos know what pipelines use AFP before they modify
 
* Do curators still want to receive emails when authors flag their data type?
 
** We will leave the alert emails as is for now
 
  
 +
== February 6, 2020 ==
  
== August 9, 2018 ==
+
=== Worcester Area Worm Meeting talk ===
 +
* Confirmed for December 2020 or February 2021
  
=== AFP ===
+
=== Alaska software ===
* Mei Zhen, an SAB member, suggested that we include disease models in the AFP form.
+
* Code developed and maintained by Joseph, but not a long-term solution
* The AFP group will work with Ranjana to incorporate it. Ranjana will prepare a mock by next week.
+
* Raymond and Eduardo talked about taking it over
* We will then decide about using the existing afp_humdis tables or creating new ones.
+
* Why have a web application vs. a command-line application?
 +
** Wanted to make it easy, but also to capture meta data for WB
 +
* Should/will find out from Joseph about how hard it is to maintain the software
 +
* Maybe it could be taken over by Alliance, as RNA-Seq/Microarray meta data are getting harmonized
 +
* Expression working group working with Brian Oliver to have GEO take in more structured meta data
 +
* Array Express tried requiring more structured meta data, but authors stopped submitting
 +
* May be possible to build a form that collects meta data while submitting to GEO in parallel
  
=== Tazendra ===
 
  
* Shall we move tazendra.caltech.edu to the cloud? Either WormBase cloud or Caltech cloud?
+
== February 13, 2020 ==
  
 +
=== Alliance Literature Group ===
 +
* Held first meeting on Monday, February 10th
 +
* Regular meetings will be on Tuesdays at 10am/1pm/6pm
 +
* Representatives from each group will give a brief overview of their literature pipelines before the group gets into details about deliverables
 +
* Question about centralized paper repository; group needs guidance from Alliance PIs on how to proceed
  
 +
=== ?Genotype class model ===
 +
* [https://docs.google.com/document/d/19hP9r6BpPW3FSAeC_67FNyNq58NGp4eaXBT42Ch3gDE/edit?pli=1#bookmark=id.7r3e8pg19rd8 Proposal]
 +
* Can aim to implement for WS277 but may have to wait until WS278
  
== August 16, 2018 ==
+
=== Genotype OA ===
 +
* Will put documentation [[Genotype|here]]
  
=== Tazendra ===
+
=== WB All-Hands Meeting ===
* Moving to cloud? To avoid local hardware issues?
+
* [https://doodle.com/poll/7f65p4ba6d88ztzt Doodle poll]
* Need to discuss with Juancarlos and Paul S.
+
* Any thoughts at this point?  Still need to discuss with Hinxton, Toronto.
* Need to consider logistics; put all of Tazendra functionality on cloud? Keep some things local?
 
** Postgres in cloud; forms local? Paper pipeline?
 
** Will consult with Textpresso
 
  
=== ICBO 2018 recap ===
 
* POTATO workshop (Phenotype Ontologies Traversing All The Organisms)
 
** Will work towards generating standardized logical definitions using Dead Simple OWL Design Patterns (DOSDP)
 
*** <Quality> and inheres_in some <Entity> (and has_modifier some <Mod>)
 
*** Exercise: Reconciling logical definitions for apparently equivalent phenotype terms across ontologies (e.g. MP vs. HP)
 
** Can use Protege to edit the OWL ontology and ROBOT for automating generation of many terms and logical definitions in parallel
 
** Will try to align WPO to UPheno as best as we can; will depend (at least in part) heavily on alignment with Uberon for anatomy
 
** Some Uberon alignment challenges: e.g. Fruit fly "tibia" and human "tibia"; human "tibia" parent is "bone" but fly "tibia" is not a bone
 
** Will participate in Phenotype Ontology Developer's call, every 2 weeks on Tuesdays (9am Pacific, 12pm East coast, 5pm UK)
 
*** Next meeting September 4, 2018
 
** Crash course in Protege, ROBOT, Ontology Development Kit, using GitHub to help develop OWL ontologies
 
** PATO needs work
 
** Questions that arose:
 
*** What should the scope of an ontology term be? Context? Life stage? Conditions? Treatment?
 
*** Being wary of ontology term count explosion; what's the right balance?
 
*** When defining phenotype terms, should the cause be included or only the observation? Maybe causes as a subclass (and assuming the observation includes assessment of cause)
 
** Some distinction between human phenotype terms and model organism terms: phenotype of individual vs. population
 
* Xenbase is trying to develop a phenotype ontology (spoke with Troy Pell, developer)
 
** Asked about WPO and how we curate
 
* Lots of plant talks
 
* Many talks on performing quality checks on ontology development and ontology re-use
 
* Domain Informational Vocabulary Extraction (DIVE) tool
 
** Entity recognition/extraction
 
** Working with two plant journals
 
** Tries to identify co-occurrence patterns of words
 
** Web interface and curation tool
 
* Semantic similarity tools and evaluation of them
 
  
=== WormBase Phenotype Ontology working group ===
+
== February 20, 2020 ==
* Chris will send around Doodle poll
 
* Goal is to discuss creation of logical definitions and alignment of phenotypes for Alliance
 
  
 +
=== Genotype ===
 +
* We will equate superficially similar/identical genotypes for now
 +
* What if labs sequence strains later and find out more?
 +
* Labs will have to report strains and their sequence and we back-curate accordingly
  
== August 23, 2018 ==
+
=== VC2010 assembly genes ===
 +
* WormMine now returning double the gene count for C. elegans genes because of incorporation of newest VC2010 assembly
 +
* How to best handle these "extra" genes?
 +
* We could make different species entries that specify the assembly version
  
=== Alliance tables ===
 
* Filtering/sorting priorities
 
* Open question about which tables on the Alliance website should be prioritized for acquiring sorting and filtering functionality
 
  
=== Worm Phenotype Ontology working group ===
+
== February 27, 2020 ==
* Gary S., Karen, Kimberly, and Chris have responded to [https://doodle.com/poll/xzkxet8sb57enver#table Doodle poll]
 
* Looks like 12pm Pacific (3pm Eastern) on Thursdays is the time that works for everyone
 
** May start late on days when WB CIT meeting goes past 12pm Pacific
 
** May want to start a bit past 12pm to allow west coasters to get lunch, etc.?
 
* Goals:
 
** Work on logical definitions for WPO terms
 
** Consider any restructuring of WPO that would facilitate ontology alignment with other MODs and UPheno
 
** Could we eventually create a phenotype annotation tool (and term requester) that allows modular expressions of a phenotype observation to lookup existing terms or create new terms with logical definitions based on those modular elements?
 
  
=== Alliance anatomy ===
+
=== WB Project meeting ===
* Data quartermasters and expression working group are looking to get updated anatomy-Uberon mappings
+
* Early May (~May 7th, 8th)
* How frequent are data updates at the Alliance? Seems to be every ~2 months
+
* Discussion topics:
* Anatomy-Uberon mappings will affect phenotype ontology alignments
+
** Creating more transparency/communication in the Alliance DQM data submission process for QC
 +
*** Can there be a landing page for all uploaded files to check?
 +
*** JSON or TSV, etc.?
  
=== Automated Gene Descriptions ===
+
=== Genotype ===
*Ranjana and Valerio working to finish the new pipeline for automated descriptions for WB, will aim to finish them for the next upload, WS268.
+
* [[Genotype|Genotype Wiki page]] updated
* Working on one of the last data types--tissue expression; will use the anatomy ontology to perform logical trimming for the annotation set of cell/anatomy types (for each gene) including neurons (as opposed to using a file for neuronal term groupings taken from Oliver Hobert paper in the old pipeline)
+
* Will start building Genotype OA
* Currently playing around with thresholds to see how the sentences look, will feedback any ontology related issues to Raymond
 
* Working on incorporating feedback from users for information-poor genes (defined as genes with no human orthology and no GO annotations). Will include other types of information suggested by Users such as human ortholog function and protein domains, etc.
 
* When no other data is available, will include expression cluster data.  Users have complained that they don't find this data useful as it's non-specific and from large scale studies, so will give it the lowest priority for inclusion.
 
*Suggestion to exclude the writing and storing of the thousands of automated descriptions to the Postgres database; there is really no advantage in them being in Postgres. 
 
* At the time of generation of the automated descriptions the related .ace files can also be generated, though we will need to include the 6000+ manual descriptions that live in Postgres. So we will need to rethink this part a bit, though skipping Postgres will reduce the number of manual steps in the pipeline and Postgres will have less data that needs to be uploaded and downloaded from future cloud storage.
 
  
== August 30, 2018 ==
+
=== Outreach inbox ===
 +
* 65 unread messages pertaining to AFP, some dating back to June. Can we clean these up?
 +
* Many are bounced email messages
 +
* Do we want a common place to track bad/bouncing email addresses? We need to distinguish addresses that bounce for policy/SPAM reasons versus those that are likely outdated or no longer in use.
  
=== EPIC dataset in the Alliance import ===
+
=== New nameserver issues ===
 +
* Many variations and strains getting submitted to new nameserver are not making it into the OA/Postgres
 +
* Should be getting dumped nightly to Postgres, but something seems to be wrong
 +
* Ranjana added a list of strains last week, but still doesn't see them in the OA
 +
* Daniela has been using the old CGI to add in new variation objects to Postgres
 +
* These objects are coming up when querying the nameserver, but not showing up in OA
  
* The EPIC dataset has fine-grained anatomy-life stage annotations (e.g. single cell per minute)
+
=== GO CAM/Noctua modeling ===
* This generates thousands of annotations per gene (up to 30,000) for the 127 genes analyzed in the study
+
* Should be major improvements in Noctua in place after recent hackathon
* How should we deal with this? In WB we do not display anatomy/life stage pairings; instead we display one list for anatomy terms and one for life stages. We also display the EPIC study in a separate panel on the gene page so that it does not ‘dilute’ small-scale annotations (a concern raised by Oliver H. at the time of the import). The EPIC dataset shows inferred expression: for each of these cells, the peak expression is the maximal reporter intensity observed in that cell or any of its ancestors; this has the effect of transposing earlier expression forward in time to the terminal set of cells. In WormBase we have a fairly clear disclaimer in the Expression description, but the Alliance is not yet porting descriptions, so the data can be misleading there.
+
* Would be good to make a big push on curating models; e.g. signalling pathways that act somewhat differently in different cell types (in same organism) and in different organisms (e.g. Wnt signaling)
  
https://www.wormbase.org/species/c_elegans/gene/WBGene00015143#1--10
+
=== SObA ===
 +
* Raymond working on writing paper
 +
* While reviewing pipeline, Raymond encountered some issues
 +
* Seth suggested using Docker
 +
* Raymond wants to go through and explain the development pipeline from the beginning
  
*Possible solutions:
+
=== Alaska ===
** 1. Throw away the pairing information. E.g. for WBGene00020093, there are 10713 paired annotations from expression pattern Expr10421. On WormBase, we have a panel for Expr10421 (on the page for WBGene00020093) that shows 413 life-stage associations and 112 anatomy associations, with no pairing information.
+
* Stalled; no word from Joseph
*** This approach will still give big tables (annotations in the hundreds) for the analyzed genes and the dilution problem will still be there.
 
*** This can be implemented for 2.1 as changing the code for 2.0 is fairly involved -Kevin to do upload tomorrow
 
** 2. Assign a high-level life stage term (embryo) to the EPIC expression patterns for the Alliance import so they will be discoverable on the Alliance website and will be hyperlinked to the detailed WormBase records
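
Option 1 above (dropping the pairing information) can be illustrated with a minimal sketch. It assumes each annotation is available as an (anatomy term, life stage) pair; the function name and the term labels below are hypothetical placeholders, not actual EPIC or WormBase data.

```python
def collapse_pairs(pairs):
    """Collapse (anatomy, life_stage) pairs into two independent sorted lists.

    The pairing is discarded on purpose: each distinct anatomy term and each
    distinct life stage is kept once, regardless of how they were combined.
    """
    anatomy_terms = sorted({anatomy for anatomy, _ in pairs})
    life_stages = sorted({stage for _, stage in pairs})
    return anatomy_terms, life_stages


# Hypothetical paired annotations for one expression pattern.
example_pairs = [
    ("cell_A", "stage_1"),
    ("cell_A", "stage_2"),
    ("cell_B", "stage_2"),
    ("cell_B", "stage_2"),  # duplicate pair, counted once after collapsing
]

anatomy_terms, life_stages = collapse_pairs(example_pairs)
```

This is how thousands of pairs can reduce to a few hundred separate associations, although the resulting lists can no longer say which cell was annotated at which stage.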
 

Latest revision as of 20:01, 27 February 2020

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings


GoToMeeting link: https://www.gotomeet.me/wormbase1

2020 Meetings

January


February 6, 2020

Worcester Area Worm Meeting talk

  • Confirmed for December 2020 or February 2021

Alaska software

  • Code developed and maintained by Joseph, but not a long-term solution
  • Raymond and Eduardo talked about taking it over
  • Why have a web application vs. a command-line application?
    • Wanted to make it easy, but also to capture meta data for WB
  • Should/will find out from Joseph about how hard it is to maintain the software
  • Maybe it could be taken over by Alliance, as RNA-Seq/Microarray meta data are getting harmonized
  • Expression working group working with Brian Oliver to have GEO take in more structured meta data
  • Array Express tried requiring more structured meta data, but authors stopped submitting
  • May be possible to build a form that collects meta data while submitting to GEO in parallel


February 13, 2020

Alliance Literature Group

  • Held first meeting on Monday, February 10th
  • Regular meetings will be on Tuesdays at 10am/1pm/6pm
  • Representatives from each group will give a brief overview of their literature pipelines before the group gets into details about deliverables
  • Question about centralized paper repository; group needs guidance from Alliance PIs on how to proceed

?Genotype class model

  • Proposal
  • Can aim to implement for WS277 but may have to wait until WS278

Genotype OA

  • Will put documentation here

WB All-Hands Meeting

  • Doodle poll
  • Any thoughts at this point? Still need to discuss with Hinxton, Toronto.


February 20, 2020

Genotype

  • We will equate superficially similar/identical genotypes for now
  • What if labs sequence strains later and find out more?
  • Labs will have to report strains and their sequence and we back-curate accordingly

VC2010 assembly genes

  • WormMine now returning double the gene count for C. elegans genes because of incorporation of newest VC2010 assembly
  • How to best handle these "extra" genes?
  • We could make different species entries that specify the assembly version


February 27, 2020

WB Project meeting

  • Early May (~May 7th, 8th)
  • Discussion topics:
    • Creating more transparency/communication in the Alliance DQM data submission process for QC
      • Can there be a landing page for all uploaded files to check?
      • JSON or TSV, etc.?

Genotype

Outreach inbox

  • 65 unread messages pertaining to AFP, some dating back to June. Can we clean these up?
  • Many are bounced email messages
  • Do we want a common place to track bad/bouncing email addresses? We need to distinguish addresses that bounce for policy/SPAM reasons versus those that are likely outdated or no longer in use.
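
One possible way to separate policy/SPAM bounces from outdated addresses is to look at the enhanced SMTP status code (RFC 3463) that many bounce messages include in their diagnostic text. The sketch below is only an illustration under that assumption: the function name and category labels are hypothetical, and some bounce messages carry no such code at all.

```python
import re

# Enhanced status codes have the form class.subject.detail, e.g. "5.1.1".
STATUS_RE = re.compile(r"\b([245])\.(\d{1,3})\.(\d{1,3})\b")


def classify_bounce(diagnostic: str) -> str:
    """Roughly classify a bounce from the diagnostic text of the message."""
    match = STATUS_RE.search(diagnostic)
    if match is None:
        return "unknown"  # no enhanced status code found
    status_class, subject = match.group(1), match.group(2)
    if status_class == "4":
        return "transient"  # temporary failure; the address may still be valid
    if status_class == "5" and subject == "1":
        return "bad_address"  # addressing problem; likely outdated or unused
    if status_class == "5" and subject == "7":
        return "policy"  # security/policy rejection, e.g. SPAM filtering
    return "other"
```

Addresses classified as "bad_address" would be candidates for a shared list of outdated contacts, while "policy" bounces point to a problem with how the message was sent rather than with the address.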

New nameserver issues

  • Many variations and strains getting submitted to new nameserver are not making it into the OA/Postgres
  • Should be getting dumped nightly to Postgres, but something seems to be wrong
  • Ranjana added a list of strains last week, but still doesn't see them in the OA
  • Daniela has been using the old CGI to add in new variation objects to Postgres
  • These objects are coming up when querying the nameserver, but not showing up in OA
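
A simple way to narrow down the problem is to compare the identifiers known to the nameserver with those present in Postgres and list the ones that did not arrive. The sketch below assumes both sets of identifiers have already been exported to plain lists; how they are retrieved is not covered in these notes, and the function name and example names are hypothetical.

```python
def missing_from_postgres(nameserver_ids, postgres_ids):
    """Return identifiers present in the nameserver but absent from Postgres."""
    return sorted(set(nameserver_ids) - set(postgres_ids))


# Hypothetical example: three objects in the nameserver, one reached Postgres.
nameserver_ids = ["strainA", "strainB", "strainC"]
postgres_ids = ["strainB"]

missing = missing_from_postgres(nameserver_ids, postgres_ids)
```

Running such a comparison after each nightly dump would show whether the gap is growing, and which objects are affected.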

GO CAM/Noctua modeling

  • Should be major improvements in Noctua in place after recent hackathon
  • Would be good to make a big push on curating models; e.g. signalling pathways that act somewhat differently in different cell types (in same organism) and in different organisms (e.g. Wnt signaling)

SObA

  • Raymond working on writing paper
  • While reviewing pipeline, Raymond encountered some issues
  • Seth suggested using Docker
  • Raymond wants to go through and explain the development pipeline from the beginning

Alaska

  • Stalled; no word from Joseph