Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
(92 intermediate revisions by 6 users not shown)
Line 22: Line 22:
  
 
[[WormBase-Caltech_Weekly_Calls_2020|2020 Meetings]]
 
[[WormBase-Caltech_Weekly_Calls_2020|2020 Meetings]]
 
  
 
= 2021 Meetings =
 
= 2021 Meetings =
Line 28: Line 27:
 
[[WormBase-Caltech_Weekly_Calls_January_2021|January]]
 
[[WormBase-Caltech_Weekly_Calls_January_2021|January]]
  
 +
[[WormBase-Caltech_Weekly_Calls_February_2021|February]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_March_2021|March]]
 +
 +
 +
== April 1, 2021 ==
 +
 +
=== Antibodies ===
 +
* Alignment of the antibody class to Alliance:
 +
** Propose to move possible_pseudonym (192) and Other_animal (37) to remarks. Those tags are not currently used for curation.
 +
*** Other animal is sometimes used for older annotations, e.g. when authors say that the antibodies were raised in both rats and rabbits. Standard practice would create 2 records, one for the rat antibody and one for the rabbit.
 +
*** Possible pseudonym was used when a curator was not able to unambiguously assign a previous antibody to a record (we have an Other name -synonym- tag to capture unambiguous ones). When moving to remarks we can keep a controlled vocabulary for easy future parsing, e.g. “possible_pseudonym:”
 +
** Antigen field: currently separated into Protein, peptide, and other_antigen (e.g. homogenate of early C. elegans embryos, sperm). Propose to use just one antigen field to capture antigen info.
 +
 +
All changes proposed above were approved by the group
 +
 +
=== textpresso-dev clean up ===
 +
* Michael has asked curators to assess what they have on textpresso-dev as it will not be around forever :-(
 +
* Is it okay to transfer data and files we want to keep to tazendra, and then to our own individual machines?
 +
* Direct access may be possible via Caltech VPN
 +
* Do we want to move content to AWS? May be complicated; it is still easy and cheap to maintain local file systems/machines
 +
 +
=== Braun servers ===
 +
* 3 servers stored in Braun server room; is there a new contact person for accessing these servers?
 +
* Mike Miranda replacement just getting settled; Paul will find out who is managing the server room and let Raymond know
  
== Feb 4th, 2021 ==
+
=== Citace upload ===
===How the "duplicate" function works in OAs with respect to object IDs (Ranjana and Juancarlos)===
+
* Next Friday, April 9th, by end of the day
*A word of caution: when you duplicate a row in an OA with Object IDs (e.g., WBGenotype00000014), note that the Object ID is duplicated as well and does not advance to the next ID like the PGID does
+
* Wen will contact Paul Davis for the frozen WS280 models file
*If you do use the "duplicate" function, remember to manually change the Object ID
 
* We can implement checks to make sure distinct annotations/objects don't share the same ID
 
  
=== GAF Wiki and headers ===
 
* Any more comments about the Wiki page and the proposal? https://wiki.wormbase.org/index.php/WormBase_gene_association_file
 
  
=== Missing references in expression GAFs ===
+
== April 8, 2021 ==
* ~300 missing from anatomy association file and ~45 missing from development association file
 
* Daniela looking into missing refs; many are personal communications or very old papers
 
* Will change ?Expr_pattern model to possibly remove ?Author reference and add in a ?Person reference instead
 
** 399 objects in WS279 reference an author; Daniela will take a look
 
* Would be good to have some reference for those objects in the GAF file on the FTP site; could use WBPerson when ready
 
  
 +
=== Braun server outage ===
 +
* Raymond fixed; now Spica, wobr and wobr2 are back up
  
== Feb 11th, 2021 ==
+
=== Textpresso API ===
=== Alliance Literature Paper Tags ===
+
* Was down yesterday affecting WormiCloud; Michael has fixed
*What do we definitely want to transfer to the Alliance?
+
* Valerio will learn how to manage the API for the future
*Alliance literature group [https://docs.google.com/spreadsheets/d/1d3Y73x1BFiARkbxrvPPX2tCh5rFeBQRoBMHOcaXmijA/edit#gid=1866989939 spreadsheet]
 
*Current flags vs legacy flags
 
*Can we map everything to the proposed hierarchy or do we need to add some more classes?
 
  
=== Personal communications in Expr_pattern ===
+
=== Grant opportunities ===
* 27 objects missing reference (personal communications)
+
* Possibilities to apply for supplements
** Even if we capture the WBPerson in the Person tag, how are we submitting these to Alliance? The evidence required by the expression JSON spec https://github.com/alliance-genome/agr_schemas/blob/master/ingest/expression/wildtypeExpressionModelAnnotation.json (and other specs) must be a publication, as defined by the publicationRef.json https://github.com/alliance-genome/agr_schemas/blob/master/ingest/publicationRef.json. If there's no PMID for a publication listed as evidence, a MOD ID will suffice for the "publicationId" but we have no WBPaperID created for such  objects.
+
* May 15th deadline
** One way to solve this: Daniela can go over the list and see if the initial personal communication resulted in a publication later on. For example, Expr181 (expression of cpl-1 in hypodermis and pharynx) was communicated via email by Sarwar Hashmi in 2000, and Expr450 (expression of cpl-1 in hypodermis, intestine) was communicated by Britton in 2001. The pattern was published in 2002 by Hashmi and Britton (WBPaper00005099). Daniela can then associate WBPaper00005099 with Expr181 and Expr450.
+
* Druggable genome project
** The solution above still does not work for all cases. An example is the lad-2 personal communication from Oliver Hobert (2002), later published by Lishia Chen (2008). Removing Oliver’s personal communication would remove evidence of data provenance from the Hobert lab, unless Oliver published this in a paper that eluded our flagging system (e.g. flagged SVM negative).
+
** Pharos: https://pharos.nih.gov/
** Daniela can go over the entire list and contact the authors for such cases.
+
** could we contribute?
** Are personal communications used in other classes?
+
* Visualization, tools, etc.
 +
* Automated person descriptions?
 +
* Automated descriptions for proteins, ion channels, druggable targets, etc.?
  
=== Author data in Expr_pattern ===
+
=== New WS280 ONTOLOGY FTP directory ===
* 399 Expression objects have the author tag populated. Most of them were submitted even before Wen started working on Expr_pattern.
+
* Changes requested here: https://github.com/WormBase/website/issues/7900
** Out of 399 objects, we have 32 for which the authors partially match. One example is Expr60, which has Bauer as an extra author in the .ace file. Bauer is not listed as an author in the paper.
+
* Here's the FTP URL: ftp://ftp.wormbase.org/pub/wormbase/releases/WS280/ONTOLOGY/
** Should we keep the author info and store it in the Person tag? Even if we do, how are we submitting these to Alliance?
+
* Known issues (Chris will report):
 +
** Ontology files are provided as ".gaf" in addition to ".obo"; we need to remove the ".gaf" OBO files
 +
** Some files are duplicated and/or have inappropriate file extensions
  
=== Date tag in Expr_pattern ===
+
=== Odd characters in Postgres ===
* The date tag seems to be populated for objects that have authors (above), probably to capture when the submission occurred.
+
* Daniela and Juancarlos discovered some errors with respect to special characters pasted into the OA
* In addition, Date is populated for a large-scale submission from Ian Hope (2006-03), later published.
+
* Daniela would like to automatically pull in micropublication text (e.g. figure captions) into Postgres
* We can still keep this info as is for WB, but what are we going to do for the Alliance submission? The tag was last used in 2006 for the Hope study; before that it was used in the 1990s (1990, 1998).
+
* We would need an automated way to convert special characters, like the degree symbol °, into HTML entities such as &amp;deg;
 +
* Juancarlos and Valerio will look into possibly switching from a Perl module to a Python module to handle special characters

Latest revision as of 19:05, 8 April 2021

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings

2020 Meetings

2021 Meetings

January

February

March


April 1, 2021

Antibodies

  • Alignment of the antibody class to Alliance:
    • Propose to move possible_pseudonym (192) and Other_animal (37) to remarks. Those tags are not currently used for curation.
      • Other animal is sometimes used for older annotations, e.g. authors say that the antibodies were raised both in rats and rabbits. Standard practice would create 2 records, one for the rat antibody and one for the rabbit.
      • Possible pseudonym was used when a curator was not able to unambiguously assign a previous antibody to a record (we have an Other name -synonym- tag to capture unambiguous ones). When moving to remarks we can keep a controlled vocabulary for easy future parsing, e.g. “possible_pseudonym:”
    • Antigen field: currently separated into Protein, peptide, and other_antigen (e.g. homogenate of early C. elegans embryos, sperm). Propose to use just one antigen field to capture antigen info.

All changes proposed above were approved by the group
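
As a rough illustration of the "easy future parsing" point above, a remark that starts with a controlled prefix could be split back into a tag and a value. This is only a sketch: the "possible_pseudonym:" prefix comes from the notes, while the "other_animal:" prefix, the function name, and the example remark text are hypothetical.

```python
import re

# Controlled prefixes that a remark may start with.
# "possible_pseudonym" is the prefix suggested in the notes;
# "other_animal" is a hypothetical analogue for the Other_animal tag.
PREFIX_RE = re.compile(r"^(possible_pseudonym|other_animal):\s*(.+)$")


def parse_remark(remark):
    """Return (tag, value) if the remark uses a controlled prefix, else None."""
    match = PREFIX_RE.match(remark.strip())
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    # Hypothetical remark text, for illustration only.
    print(parse_remark("possible_pseudonym: anti-UNC-54"))
    print(parse_remark("Antibody was affinity purified."))
```

Free-text remarks without a controlled prefix are simply returned as None, so they can stay mixed in with the tagged ones.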

textpresso-dev clean up

  • Michael has asked curators to assess what they have on textpresso-dev as it will not be around forever :-(
  • Is it okay to transfer data and files we want to keep to tazendra, and then to our own individual machines?
  • Direct access may be possible via Caltech VPN
  • Do we want to move content to AWS? May be complicated; it is still easy and cheap to maintain local file systems/machines

Braun servers

  • 3 servers stored in Braun server room; is there a new contact person for accessing these servers?
  • Mike Miranda replacement just getting settled; Paul will find out who is managing the server room and let Raymond know

Citace upload

  • Next Friday, April 9th, by end of the day
  • Wen will contact Paul Davis for the frozen WS280 models file


April 8, 2021

Braun server outage

  • Raymond fixed; now Spica, wobr and wobr2 are back up

Textpresso API

  • Was down yesterday affecting WormiCloud; Michael has fixed
  • Valerio will learn how to manage the API for the future

Grant opportunities

  • Possibilities to apply for supplements
  • May 15th deadline
  • Druggable genome project
    • Pharos: https://pharos.nih.gov/
    • Could we contribute?
  • Visualization, tools, etc.
  • Automated person descriptions?
  • Automated descriptions for proteins, ion channels, druggable targets, etc.?

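New WS280 ONTOLOGY FTP directory

  • Changes requested here: https://github.com/WormBase/website/issues/7900
  • Here's the FTP URL: ftp://ftp.wormbase.org/pub/wormbase/releases/WS280/ONTOLOGY/
  • Known issues (Chris will report):
    • Ontology files are provided as ".gaf" in addition to ".obo"; we need to remove the ".gaf" OBO files
    • Some files are duplicated and/or have inappropriate file extensions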

Odd characters in Postgres

  • Daniela and Juancarlos discovered some errors with respect to special characters pasted into the OA
  • Daniela would like to automatically pull in micropublication text (e.g. figure captions) into Postgres
  • We would need an automated way to convert special characters, like the degree symbol °, into HTML entities such as &amp;deg;
  • Juancarlos and Valerio will look into possibly switching from a Perl module to a Python module to handle special characters
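
A minimal sketch of what such a conversion could look like in Python, using only the standard library's html.entities table. This is not the module the group chose (none is named in the notes); the function name is hypothetical, and the sketch assumes the input is already a correctly decoded Unicode string.

```python
from html.entities import codepoint2name


def to_html_entities(text):
    """Replace non-ASCII characters with HTML entities.

    Named entities (e.g. &deg;) are used where the standard table has one;
    any other non-ASCII character becomes a numeric reference (e.g. &#10003;).
    ASCII characters, including & < >, are left untouched.
    """
    pieces = []
    for char in text:
        codepoint = ord(char)
        if codepoint < 128:
            pieces.append(char)
        elif codepoint in codepoint2name:
            pieces.append("&" + codepoint2name[codepoint] + ";")
        else:
            pieces.append("&#" + str(codepoint) + ";")
    return "".join(pieces)


if __name__ == "__main__":
    print(to_html_entities("Worms were grown at 20°C"))
```

Leaving ASCII untouched means text that already contains entities is not double-escaped, but it also means a bare & or < in pasted text would pass through unchanged.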