Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
(831 intermediate revisions by 12 users not shown)
Line 17: Line 17:
 
[[WormBase-Caltech_Weekly_Calls_2017|2017 Meetings]]
 
  
 +
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
  
GoToMeeting link: https://www.gotomeet.me/wormbase1
+
[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]
  
  
= 2018 Meetings =
 
  
[[WormBase-Caltech_Weekly_Calls_January_2018|January]]
 
  
[[WormBase-Caltech_Weekly_Calls_February_2018|February]]
+
= 2020 Meetings =
  
[[WormBase-Caltech_Weekly_Calls_March_2018|March]]
+
[[WormBase-Caltech_Weekly_Calls_January_2020|January]]
  
[[WormBase-Caltech_Weekly_Calls_April_2018|April]]
+
[[WormBase-Caltech_Weekly_Calls_February_2020|February]]
  
[[WormBase-Caltech_Weekly_Calls_May_2018|May]]
+
[[WormBase-Caltech_Weekly_Calls_March_2020|March]]
  
[[WormBase-Caltech_Weekly_Calls_June_2018|June]]
+
[[WormBase-Caltech_Weekly_Calls_April_2020|April]]
  
[[WormBase-Caltech_Weekly_Calls_July_2018|July]]
+
[[WormBase-Caltech_Weekly_Calls_May_2020|May]]
  
[[WormBase-Caltech_Weekly_Calls_August_2018|August]]
+
[[WormBase-Caltech_Weekly_Calls_June_2020|June]]
  
[[WormBase-Caltech_Weekly_Calls_September_2018|September]]
+
[[WormBase-Caltech_Weekly_Calls_July_2020|July]]
  
 +
[[WormBase-Caltech_Weekly_Calls_August_2020|August]]
  
== October 4, 2018 ==
+
[[WormBase-Caltech_Weekly_Calls_September_2020|September]]
  
=== SimpleMine ===
 
* Automated descriptions will be removed from Postgres/OA
 
* SimpleMine needs to be updated to change where it pulls the automated descriptions from
 
* Will add expression cluster and automated description columns to the output (in addition to concise description)
 
* Added RNAseq FPKM download function for 9 species: http://mangolassi.caltech.edu/~azurebrd/cgi-bin/forms/fpkmmine.cgi
 
* Added SimpleMine-like topic search: http://tazendra.caltech.edu/%7Eazurebrd/cgi-bin/forms/spellmine.cgi
 
* Should put the new tools under the WormBase Tools menu
 
  
 +
== October 1, 2020 ==
  
== October 11, 2018 ==
+
=== Gene association file formats on FTP ===
 +
* For example, current production release ONTOLOGY directory: ftp://ftp.wormbase.org/pub/wormbase/releases/current-production-release/ONTOLOGY/
 +
* Our association files have format "*.wb"; is this useful or necessary?
 +
* Other than referring to GAF in the header, it isn't clear to users what the columns refer to or what the column headers should be
 +
* We could add a README file and/or convert to the new [https://github.com/geneontology/geneontology.github.io/blob/issue-go-annotation-2917-gaf-2_2-doc/_docs/go-annotation-file-gaf-format-22.md GAF 2.2 format] which would have a more expressive file header and possibly column headers(?)
 +
** File headers could possibly link to the format specification page
  
=== Ready for new round of phenotype requests ===
+
=== Phenotype association file idiosyncrasy ===
* Some users are getting confused about the name & email prepopulation based on IP address
+
* As we've discussed previously, there is an oddity in how the phenotype association file we provide lists (or doesn't list) references
** May want to stop autopopulating name and email or autopopulate based on email recipient only (encode in URL sent in email)
+
* According to the GAF spec, column 6 is for reference and is required, whereas column 8 is "With (or) From" and is optional
** Could we use cookies? Possibly, but may only help if a computer is shared but the browser isn't
+
* When we have a reference, the WBPaper ID is provided in column 6 and the WBVar ID or RNAi ID is provided in column 8
* Current autocomplete expects exact match to person primary name; e.g. "Scott Emmons" will not match the official name "Scott Wilson Emmons"
+
* However, when we have no reference (personal communication, e.g. from NBP allele submissions), the WBVar ID is instead put in column 6 (because we need something there), and column 8 is blank.
** Maybe we could improve search matching; algorithm from Cecilia/Juancarlos? Elastic search by Valerio?
+
** This results in (1) column 6 having a mix of paper/reference IDs (good) and WBVar IDs (not good) and (2) WBVar IDs split between column 6 and 8; thus making it tedious to parse this file
** Can we capture incomplete sessions? We may be able to learn from them. May be flooded by robot visits? Is it worth going through all the logs/sessions? Info is there if we want to look at it.
+
* Proposed solution: Can we come up with some type of reference object ID to associate to the personal communications (or any annotations currently lacking a formal reference)?
* Will go ahead and send emails for only new set of papers (won't resend requests for papers that had emails sent in June/July)
+
* With the proposed solution, we can always have a reference ID in column 6 (the intended purpose of the column) and WBVar IDs for alleles can always remain consistently in column 8
* Maybe go back to papers that already had a request sent at the 6 month time point
+
* Proposal is to put WBPerson IDs in column 6 for personal communications. Chris & Karen will check if this will work.
* Include other papers in need of curation at bottom of email; possibly, would it turn off users?
 
  
=== Worm Phenotype Ontology ===
+
=== Server space in Chen Building ===
* WPO has a new home on GitHub
+
* It looks like we will not have a dedicated space for server computers.
** https://github.com/obophenotype/c-elegans-phenotype-ontology
 
* Edits should only be made to the edit file
 
** https://github.com/obophenotype/c-elegans-phenotype-ontology/blob/master/src/ontology/wbphenotype-edit.owl
 
* Anyone interested in contributing to the WPO should contact Chris for update pipeline info
 
* Need to make sure that all users of the WPO have the updated link information
 
  
=== Provide provenance in query tools ===
 
* Prompted by user question/request
 
* Specifically in WOBr, Anatomy pages
 
* WOBr provides genes annotated to term; should provide provenance of each gene and its annotations
 
* Expression pattern and expression cluster gene lists (in context of Anatomy WOBr); want to provide provenance for this data
 
* Provenance = an object ID, like "Expr1234" or "WBPaper00032062:age_regulated_genes" with link to relevant page
 
  
=== WOBr disease associations ===
+
== October 8, 2020 ==
* Ranjana wondering if WOBr is using updated disease-gene associations
 
* Gene association file (for disease) being generated by script; likely need to update where the data is coming from
 
* Ranjana will discuss with Raymond and Kevin
 
  
=== New WormMine superuser ===
+
=== Webinar Announcement ===
* Now all template queries are owned by a new superuser
+
* Here is the live registration site: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/webinar.cgi
* If people are interested in adding or editing templates talk to Chris or Paulo for superuser access
+
* Caltech Zoom allows up to 300 attendees.
* We were running into login, template ownership, and consistency issues
 
  
=== Next upload Nov 16 ===
+
=== Descriptions from GO-CAM models ===
* WS269 citace upload Tuesday, November 13
+
* One suggestion for the Alliance is to create a description based on a GO-CAM model
* Can we add upload dates to Google calendar for WormBase?
+
* Could also micropublish some descriptions (semi-automated?)
 +
* Can make curators authors of micropublications for GO-CAM models/pathways
  
 +
=== Transcription Factors in WormBase ===
 +
* WormBase has a ?Transcription_factor class that is currently underutilized
 +
* Chris spoke with Gary Williams about the status as he has done much of the work on the class
 +
* Because transcription factors can often be complexes, it was decided to create the ?Transcription_factor class rather than simply an extension of tags to the existing ?Gene class
 +
* The class seems reasonably complete; it's important to note that some TFs are general transcription factors, not necessarily gene-specific or sequence-specific DNA-binding TFs; it will be good to make that distinction clear to users
 +
* Chris has compiled a [https://docs.google.com/spreadsheets/d/1KdmvybWDWHXdlJwZgfleL4xHDoyPoYR13WAUcERF82g/edit?usp=sharing Google sheet] to assess the class before Gary W. leaves WB in the next couple of weeks
 +
* The Google sheet has several tabs/worksheets, including one for the ACEDB data model (and notes about usage of tags), a summary table of associated genes, bound sequence features, existence of other protein-DNA binding data, etc.
 +
* It would be good to make TF binding info (per gene and globally) more accessible to our users, maybe via a new widget on gene pages (e.g. list incoming, regulating TFs and, for TF genes themselves, list potential target genes)
  
== October 18, 2018 ==
+
== October 15, 2020 ==
  
=== Upload for WS269 ===
+
=== BioGRID data sharing ===
* To Hinxton Nov 16
+
* Rose from BioGRID proposed that BioGRID curate high-throughput C. elegans interaction datasets, capturing confidence scores when available, and making those annotations available to WormBase for regular ingest
* Citace upload to Wen Nov 13
+
* Will need to consider a few points:
 +
** BioGRID doesn't curate protein-DNA interactions
 +
** We don't yet know the turn-around timeline for BioGRID curation of worm datasets; WB may be able to curate them much sooner
 +
* Chris and Jae will work with Rose et al. to coordinate HTP curation
  
=== Data provenance in WOBr tools ===
+
=== Enriched genes ===
* Juancarlos and Raymond have been working on this
+
* Some genes are considered "enriched" for an expression cluster data set even if the enrichment was in comparison to another cell or tissue (not whole animal)
* Awaiting pull request
+
* We should reconsider the ?Expression_cluster model to make sure we can appropriately model and communicate enrichment or subtypes thereof
* Can test: juancarlos.wormbase.org
 
** Go to WOBr and test anatomy ontology
 
** Now WOBr gene count results show data objects from which the associations come (Expr_pattern and Expression_cluster objects)
 
  
=== New SPELL server ===
 
* New server on Amazon (modifying server SGD uses)
 
* Raymond, Wen, and Todd are working on this
 
* Currently only have an SGD mirror running
 
* Wen will swap the data later today
 
* WormBase header link (to WormBase) or only link to Alliance site? We want a unified site for Alliance
 
* Each MOD would still support their own server for their data (MOD-specific grants support each server, for now)
 
  
=== Alliance expression data ===
+
== October 22, 2020 ==
* Anatomy-LifeStage pair required for Alliance expression annotations
 
* Since many expression pattern annotations don't have both, the missing entity would default to ontology root term
 
* Need to link anatomy root term to Uberon for ribbon display; root term annotations fall under the "Other" category in the ribbon slim
 
* Create "Anatomical_part" term to serve as the default "Other"/root term?
 
* All life-stage-only annotations will fall into anatomy "Other" and flood the list; should these be filtered out?
 
  
 +
=== CHEBI ===
 +
* Karen spoke to CHEBI personnel on Tuesday
 +
* CHEBI only has ~2 curators to create new entities
 +
* CHEBI had submitted a proposal to establish pipelines to process requests from MODs
 +
* Chemical Translation Service (CTS)
 +
* OxO = https://www.ebi.ac.uk/spot/oxo/search
  
== October 25, 2018 ==
+
=== Training Webinar ===
 
+
* Scheduled for tomorrow at 1pm Pacific/4pm Eastern
=== WormBase SPELL on Amazon Web Service ===
 
* http://34.224.93.60/ is running WormBase SPELL on WS267, based on the SPELL code supported by SGD. It is more stable and faster than the current Caltech server.
 
* Waiting for Todd to respond if we can use this site as the official server for WormBase SPELL.
 
* Also need to get an instance (preferably also from WormBase AWS) as a development site for WormBase SPELL.
 
* Caltech will focus on generating new data instead of SPELL code development.
 
 
 
=== Linking annotation evidence to Anatomy Ontology Browser ===
 
Each gene expression annotation shown on WOBr is linked to the object so that users can more easily examine the evidence.
 
[https://juancarlos.wormbase.org/tools/ontology_browser/show_genes?focusTermName=neuron&focusTermId=WBbt:0003681 Example]
 

Latest revision as of 18:40, 22 October 2020

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings



2020 Meetings

January

February

March

April

May

June

July

August

September


October 1, 2020

Gene association file formats on FTP

  • For example, current production release ONTOLOGY directory: ftp://ftp.wormbase.org/pub/wormbase/releases/current-production-release/ONTOLOGY/
  • Our association files have format "*.wb"; is this useful or necessary?
  • Other than referring to GAF in the header, it isn't clear to users what the columns refer to or what the column headers should be
  • We could add a README file and/or convert to the new GAF 2.2 format which would have a more expressive file header and possibly column headers(?)
    • File headers could possibly link to the format specification page
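
Until a README or column headers exist, a user can label the columns themselves. The sketch below is a minimal illustration, assuming the "*.wb" association files follow the standard 17-column, tab-separated GAF layout with "!" header lines; that assumption, the function name read_association_rows, and the sample row are all hypothetical and have not been checked against the actual files.

```python
import io

# Column names from the GAF 2.x specification, in order (columns 1-17).
# In a non-GO association file the "GO ID" column would presumably hold
# the ontology term ID (assumption).
GAF_COLUMNS = [
    "DB", "DB Object ID", "DB Object Symbol", "Qualifier", "GO ID",
    "DB:Reference", "Evidence Code", "With (or) From", "Aspect",
    "DB Object Name", "DB Object Synonym", "DB Object Type", "Taxon",
    "Date", "Assigned By", "Annotation Extension", "Gene Product Form ID",
]


def read_association_rows(handle):
    """Yield one dict per annotation row, keyed by GAF column name.

    Lines starting with "!" (file header) and blank lines are skipped.
    Rows with fewer than 17 fields simply yield fewer keys.
    """
    for line in handle:
        if line.startswith("!") or not line.strip():
            continue
        values = line.rstrip("\n").split("\t")
        yield dict(zip(GAF_COLUMNS, values))


# Hypothetical example input: one header line and one made-up row.
sample = "!gaf-version: 2.2\n" + "\t".join(
    ["WB", "WBGene00000001", "aap-1", "", "WBPhenotype:0000001",
     "WB_REF:WBPaper00000001", "IMP", "WB:WBVar00000001", "P",
     "", "", "gene", "taxon:6239", "20201001", "WB", "", ""]
) + "\n"

rows = list(read_association_rows(io.StringIO(sample)))
```

Here rows[0]["DB:Reference"] gives the column 6 value by name instead of by position, which is the kind of clarity a documented header would provide.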

Phenotype association file idiosyncrasy

  • As we've discussed previously, there is an oddity in how the phenotype association file we provide lists (or doesn't list) references
  • According to the GAF spec, column 6 is for reference and is required, whereas column 8 is "With (or) From" and is optional
  • When we have a reference, the WBPaper ID is provided in column 6 and the WBVar ID or RNAi ID is provided in column 8
  • However, when we have no reference (personal communication, e.g. from NBP allele submissions), the WBVar ID is instead put in column 6 (because we need something there), and column 8 is blank.
    • This results in (1) column 6 having a mix of paper/reference IDs (good) and WBVar IDs (not good) and (2) WBVar IDs split between column 6 and 8; thus making it tedious to parse this file
  • Proposed solution: Can we come up with some type of reference object ID to associate to the personal communications (or any annotations currently lacking a formal reference)?
  • With the proposed solution, we can always have a reference ID in column 6 (the intended purpose of the column) and WBVar IDs for alleles can always remain consistently in column 8
  • Proposal is to put WBPerson IDs in column 6 for personal communications. Chris & Karen will check if this will work.
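
To show why the current layout is tedious to parse, here is a minimal sketch of the workaround a user needs today: it looks in both column 6 and column 8 for variation or RNAi IDs. The ID patterns (WBPaper, WBVar, or WBRNAi followed by digits), the function name split_reference_and_source, and the sample rows are assumptions for illustration, not taken from the actual file.

```python
import re

REF_COL = 5        # GAF column 6 (0-based index 5): reference
WITH_FROM_COL = 7  # GAF column 8 (0-based index 7): With (or) From

# Assumed ID shapes; any database prefix before the ID is ignored.
PAPER_RE = re.compile(r"WBPaper\d+")
SOURCE_RE = re.compile(r"WB(?:Var|RNAi)\d+")


def split_reference_and_source(line):
    """Return (paper_ids, source_ids) for one tab-separated annotation row.

    source_ids collects WBVar/WBRNAi IDs from column 8 and also from
    column 6, where they land when the annotation has no paper.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) <= WITH_FROM_COL:
        raise ValueError("row has fewer than 8 columns")
    papers = PAPER_RE.findall(cols[REF_COL])
    misplaced = SOURCE_RE.findall(cols[REF_COL])
    with_from = SOURCE_RE.findall(cols[WITH_FROM_COL])
    return papers, misplaced + with_from


# Hypothetical rows: one with a paper, one personal communication.
with_paper = "\t".join(
    ["WB", "WBGene00000001", "aap-1", "", "WBPhenotype:0000001",
     "WB_REF:WBPaper00000001", "IMP", "WB:WBVar00000001"])
personal_comm = "\t".join(
    ["WB", "WBGene00000001", "aap-1", "", "WBPhenotype:0000001",
     "WB:WBVar00000002", "IMP", ""])

result_paper = split_reference_and_source(with_paper)
result_comm = split_reference_and_source(personal_comm)
```

If column 6 always held a reference-type ID, as proposed, the misplaced lookup would be unnecessary and each column could be read on its own.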

Server space in Chen Building

  • It looks like we will not have a dedicated space for server computers.


October 8, 2020

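Webinar Announcement

  • Here is the live registration site: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/webinar.cgi
  • Caltech Zoom allows up to 300 attendees.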

Descriptions from GO-CAM models

  • One suggestion for the Alliance is to create a description based on a GO-CAM model
  • Could also micropublish some descriptions (semi-automated?)
  • Can make curators authors of micropublications for GO-CAM models/pathways

Transcription Factors in WormBase

  • WormBase has a ?Transcription_factor class that is currently underutilized
  • Chris spoke with Gary Williams about the status as he has done much of the work on the class
  • Because transcription factors can often be complexes, it was decided to create the ?Transcription_factor class rather than simply an extension of tags to the existing ?Gene class
  • The class seems reasonably complete; it's important to note that some TFs are general transcription factors, not necessarily gene-specific or sequence-specific DNA-binding TFs; it will be good to make that distinction clear to users
  • Chris has compiled a Google sheet to assess the class before Gary W. leaves WB in the next couple of weeks
  • The Google sheet has several tabs/worksheets, including one for the ACEDB data model (and notes about usage of tags), a summary table of associated genes, bound sequence features, existence of other protein-DNA binding data, etc.
  • It would be good to make TF binding info (per gene and globally) more accessible to our users, maybe via a new widget on gene pages (e.g. list incoming, regulating TFs and, for TF genes themselves, list potential target genes)

October 15, 2020

BioGRID data sharing

  • Rose from BioGRID proposed that BioGRID curate high-throughput C. elegans interaction datasets, capturing confidence scores when available, and making those annotations available to WormBase for regular ingest
  • Will need to consider a few points:
    • BioGRID doesn't curate protein-DNA interactions
    • We don't yet know the turn-around timeline for BioGRID curation of worm datasets; WB may be able to curate them much sooner
  • Chris and Jae will work with Rose et al. to coordinate HTP curation

Enriched genes

  • Some genes are considered "enriched" for an expression cluster data set even if the enrichment was in comparison to another cell or tissue (not whole animal)
  • We should reconsider the ?Expression_cluster model to make sure we can appropriately model and communicate enrichment or subtypes thereof


October 22, 2020

CHEBI

  • Karen spoke to CHEBI personnel on Tuesday
  • CHEBI only has ~2 curators to create new entities
  • CHEBI had submitted a proposal to establish pipelines to process requests from MODs
  • Chemical Translation Service (CTS)
  • OxO = https://www.ebi.ac.uk/spot/oxo/search

Training Webinar

  • Scheduled for tomorrow at 1pm Pacific/4pm Eastern