Revision as of 17:55, 13 May 2021

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2021 Meetings

@@ Line 31: / Line 31: @@
 [[WormBase-Caltech_Weekly_Calls_March_2021|March]]
+[[WormBase-Caltech_Weekly_Calls_April_2021|April]]
-== April 1, 2021 ==
-=== Antibodies ===
-* Alignment of the antibody class to Alliance:
-** Propose to move possible_pseudonym (192) and Other_animal (37) to remarks. Those tags are not currently used for curation.
-*** Other animal is sometimes used for older annotations, e.g. authors say that the antibodies were raised in both rats and rabbits. Standard practice would create 2 records, one for the rat antibody and one for the rabbit.
-*** Possible pseudonym was used when a curator was not able to unambiguously assign a previous antibody to a record (we have an Other name -synonym- tag to capture unambiguous ones). When moving to remarks we can keep a controlled vocabulary for easy future parsing, e.g. “possible_pseudonym:”
-** Antigen field: currently separated into Protein, peptide, and other_antigen (e.g. homogenate of early C. elegans embryos, sperm). Propose to use just one antigen field to capture antigen info.
-All changes proposed above were approved by the group
-=== textpresso-dev clean up ===
-* Michael has asked curators to assess what they have on textpresso-dev as it will not be around forever :-(
-* Is it okay to transfer data and files we want to keep to tazendra, and then to our own individual machines?
-* Direct access may be possible via Caltech VPN
-* Do we want to move content to AWS? May be complicated; it is still easy and cheap to maintain local file systems/machines
-=== Braun servers ===
-* 3 servers stored in Braun server room; is there a new contact person for accessing these servers?
-* Mike Miranda replacement just getting settled; Paul will find out who is managing the server room and let Raymond know
-=== Citace upload ===
-* Next Friday, April 9th, by end of the day
-* Wen will contact Paul Davis for the frozen WS280 models file
-== April 8, 2021 ==
-=== Braun server outage ===
-* Raymond fixed; now Spica, wobr and wobr2 are back up
-=== Textpresso API ===
-* Was down yesterday affecting WormiCloud; Michael has fixed
-* Valerio will learn how to manage the API for the future
-=== Grant opportunities ===
-* Possibilities to apply for supplements
-* May 15th deadline
-* Druggable genome project
-** Pharos: https://pharos.nih.gov/
-** could we contribute?
-* Visualization, tools, etc.
-* Automated person descriptions?
-* Automated descriptions for proteins, ion channels, druggable targets, etc.?
-=== New WS280 ONTOLOGY FTP directory ===
-* Changes requested here: https://github.com/WormBase/website/issues/7900
-* Here's the FTP URL: ftp://ftp.wormbase.org/pub/wormbase/releases/WS280/ONTOLOGY/
-* Known issues (Chris will report):
-** Ontology files are provided as ".gaf" in addition to ".obo"; we need to remove the ".gaf" OBO files
-** Some files are duplicated and/or have inappropriate file extensions
-=== Odd characters in Postgres ===
-* Daniela and Juancarlos discovered some errors with respect to special characters pasted into the OA
-* Daniela would like to automatically pull in micropublication text (e.g. figure captions) into Postgres
-* We would need an automated way to convert special characters, like the degree symbol °, into HTML entities such as &deg;
-* Juancarlos and Valerio will look into possibly switching from a Perl module to a Python module to handle special characters
-== April 15, 2021 ==
-=== Special characters in Postgres/OA ===
-* Juancarlos working on/proposing a plan to store UTF-8 characters in Postgres and the OA which would then get converted, at dumping, to HTML entities (e.g. &alpha;) for the ACE files
-* There is still a bit of cleanup needed to fix or remove special characters (not necessarily UTF-8) that apparently got munged upon copy/pasting into the OA in the past
-* Note: copy/paste from a PDF often works fine, but sometimes does not work as expected so manual intervention would be needed (e.g. entering Greek characters by hand in UTF-8 format)
-* Would copy/pasting from HTML be better than PDF?
-* For Person curation it would be good to be able to faithfully store and display appropriate foreign characters (e.g. Chinese characters, Danish characters, etc.)
-* Mangolassi script called "get_summary_characters.pl" located here: /home/postgres/work/pgpopulation/grg_generegulation/20200618_summary_characters
-** Juancarlos will modify script to take a data type code as an argument on the command line and return all Postgres tables (and their respective PGIDs) that have special characters, e.g.
-*** $ ./get_summary_characters.pl exp
-*** $ ./get_summary_characters.pl int
-*** $ ./get_summary_characters.pl grg
-** or could pass just the datatype + field (postgres table). e.g.
-*** $ ./get_summary_characters.pl pic_description
-** Juancarlos will email everyone once it's ready. It's ready; email sent. The script is at /home/postgres/work/pgpopulation/oa_general/20210411_unicode_html/get_summary_characters.pl. Symlink it into your own directory and run it from there; it creates its output files in the directory you run it from.
-* Action items:
-** Juancarlos will update the "get_summary_characters.pl" script as described above
-** Curators should use the "get_summary_characters.pl" to look for (potentially) bad characters in their OAs/Postgres tables
-** Need to perform bulk (automated) replacement of existing HTML entities with their corresponding UTF-8 characters
-** Curators will need to work with Juancarlos for each OA to modify the dumper
-** Juancarlos will write (or append to existing) Postgres/OA dumping scripts to:
-*** 1) Convert UTF-8 characters to HTML entities in ACE files
-*** 2) Convert special quote and hyphen characters into simple versions that don't need special handling
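The conversions listed in the action items above could look roughly like the following. This is only a minimal Python sketch, not the actual dumper code: the function names and the punctuation table are hypothetical, and it assumes that named HTML 4 entities (with numeric references as a fallback) are acceptable in the ACE files.

```python
import html
from html.entities import codepoint2name

# Hypothetical table for step 2: typographic quotes and dashes -> plain ASCII.
SIMPLE_PUNCTUATION = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2010": "-",  # hyphen
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
}
_PUNCTUATION_TABLE = str.maketrans(SIMPLE_PUNCTUATION)


def simplify_punctuation(text: str) -> str:
    """Replace special quote and hyphen characters with simple ASCII versions."""
    return text.translate(_PUNCTUATION_TABLE)


def utf8_to_entities(text: str) -> str:
    """Convert non-ASCII characters to HTML entities; ASCII is left untouched."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &alpha;
        else:
            out.append(f"&#{cp};")  # numeric fallback, e.g. for Chinese characters
    return "".join(out)


def entities_to_utf8(text: str) -> str:
    """Bulk clean-up direction: turn stored HTML entities back into characters."""
    return html.unescape(text)


def prepare_for_ace(text: str) -> str:
    """Dump-time pipeline: simplify punctuation first, then encode the rest."""
    return utf8_to_entities(simplify_punctuation(text))
```

For example, `prepare_for_ace("25°C")` gives `25&deg;C`. Note that `html.unescape` also decodes `&amp;`, `&lt;` and `&gt;`, so a real bulk replacement might need to leave those alone.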
-=== CeNGEN pictures ===
-* Model change went in to accommodate images from the CeNGEN project
-* Want gene page images for CeNGEN data; have the specifications for such images been worked out? Maybe not yet
-* Raymond and Daniela will work with data producers to acquire images when ready
-=== Supplement opportunities ===
-* Money available for software development to "harden" existing software
-* Might be possible to make Eduardo's single cell analysis tools more sustainable
-* Could make WormiCloud adapted to Alliance?
-* Put Noctua on more stable production footing? (GO cannot apply as they are in final year of existing grant)
-=== Student project for Textpresso ===
-* Create tool to allow user to submit text and return a list of similar papers
-* Use cases:
-** curator wants an alert to find papers similar to what they've curated
-** look for potential reviewers of a paper based on similar text content
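As a rough illustration of the proposed tool, the sketch below ranks papers by TF-IDF cosine similarity to a submitted text. The notes do not specify a method, so the choice of TF-IDF, the function names, and the paper IDs are all assumptions made for illustration; this is not how Textpresso works.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build a TF-IDF vector per document; docs maps paper ID -> text."""
    counts = {pid: Counter(tokenize(text)) for pid, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        pid: {t: c * idf[t] for t, c in term_counts.items()}
        for pid, term_counts in counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_papers(query, docs, top_n=5):
    """Return up to top_n (paper ID, score) pairs, most similar first."""
    vectors, idf = tfidf_vectors(docs)
    # Terms that appear in no document carry no weight and are dropped.
    query_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

Both use cases would call `similar_papers` with a curated paper's text (or a manuscript under review) as the query and the corpus as `docs`.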
-== April 22, 2021 ==
-=== LinkML hackathon ===
-* Need to consider who works on what and how to coordinate
-* Need to follow good Git practice
-** Merge main branch into local branch before merging back into main branch to make sure everything works
-* How will we best handle AceDB hash structures? likely use something like Mark QT demonstrated
-** Do we have any/many hash-within-hash structures? #Molecular_change is used as a hash and tags within that model all reference the #Evidence hash
-* GO annotation extensions offer an interesting challenge
-=== IWM workshop ===
-* Need to submit a workshop schedule (who speaks about what and when) by next Thursday April 29th
-* An initial idea was to promote data in ACEDB that may be underutilized or many users may be unaware of
-** An example might be transcription factor data: the ?Transcription_factor class and the modENCODE TF data
-** Single cell data and tools? CeNGEN, Eduardo's single cell tools
-** RNA-Seq FPKM values for genes and related data; Wen will write script to pull out FPKM values from SRA data and send to Magdalena
-* In addition to WB data types, we will cover Alliance, AFP, and community curation
-* Google doc for workshop here: https://docs.google.com/document/d/1H9ARhBRMKBNuOhjyxVQ_1o6cysvpppI7uA-TJrO_UZ4/edit?usp=sharing
-=== WB Progress Report ===
-* Due April 30th
-* There will be two documents: progress and plans
-* Place text in the appropriate places (don't write as a single integrated unit)
-* Paul S will put together a Google doc
-* We CAN include Alliance harmonization efforts
-* 2020 Progress report: https://docs.google.com/document/d/1f3ettnkvwoKKiaAA4TSrpSQPEF7FmVVn6u2UdflA_So/edit?usp=sharing
-* Last year milestone was WS276; we will compare to WS280
-* Google "WormBase Grants" folder: https://drive.google.com/drive/folders/1p8x9tEOfZ4DQvTcPSdNR5-JoPJu--ZAu?usp=sharing
-* 2021 Progress Report document here: https://docs.google.com/document/d/13E9k5JvDpUN4kWnrTm4M2iphnAJSTpk02ZiGl8O6bM4/edit?usp=sharing
-== April 29, 2021 ==
-=== IWM Workshop Schedule ===
-* Schedule format due today (April 29th)
-* [https://docs.google.com/document/d/1H9ARhBRMKBNuOhjyxVQ_1o6cysvpppI7uA-TJrO_UZ4/edit#bookmark=id.jrjo4xhfnh7b Tentative schedule here]
-* Format proposal is four 15-minute talks followed by 30 minutes of open discussion / Q&A
-* Still need someone to speak (~15 minutes) about the Alliance
-=== WB Progress Report ===
-* 2021 documents in [https://drive.google.com/drive/folders/1p8x9tEOfZ4DQvTcPSdNR5-JoPJu--ZAu?usp=sharing this Google Drive folder]
-* Note: there is one [https://docs.google.com/document/d/13E9k5JvDpUN4kWnrTm4M2iphnAJSTpk02ZiGl8O6bM4/edit?usp=sharing 2021 "Progress" document] and a second (separate) [https://docs.google.com/document/d/1j0HkCwuimK6DD-ui1tAkYMNpLRhxR9xb1FdSDZXFXCI/edit?usp=sharing "Future Plans" document]
-* Existing future plans text has been moved to the "Future Plans" document
-=== OpenBiosystems RNAi clone IDs ===
-* User looking to map Open Biosystems RNAi clone names to WB clone names
-* We may need to get a mapping file from Open Biosystems
-=== FPKM data ===
-* Wen has produced a csv file of FPKM values; can generate as part of the SPELL pipeline
-* May be better to generate at Hinxton
-=== OA Dumpers ===
-* Daniela and Juancarlos have been working on the Picture OA and Expr OA dumpers
-* Inconsistencies have accumulated for all OA dumpers as each has been made separately
-* Juancarlos is working on a generalized, modular way to handle dumping
-* Should we handle historical genes in the same way across OAs?
-** Sure, but we need the "Historical_gene" tag in the respective ACEDB model
-** Decision: we will continue to only dump historical genes for specific OAs, with a plan to maybe make consistent across OAs in the future
-* Could we retroactively deal with paper-gene connections? We could possibly look in Postgres history tables to see which genes had been replaced previously (by Kimberly)
-=== Gene name ambiguities ===
-* Jae noticed that some gene names associated with multiple WBGene IDs (e.g. one public name is the same as another gene's other name) have the same references attached
-* May require updating the paper-gene connections for some of these
-* One example is the cep-1 gene: it is associated with 3 different WBGene IDs, which share papers in the reference widget.
-=== NIH Supplement for AI readiness ===
-* Could we set up curation for neural circuits using a knowledge graph (e.g. GO-CAM)?
-** Maybe we could convert the anatomy function model to LinkML -> OWL statements?
-** Maybe set up a graphical curation interface?
-* Transcriptional regulation
-** Would be good to establish a common model (for the Alliance?)
-** CeNGEN project produced lots of predictions of TF binding sites based on single-cell expression data; Eduardo: these models should be able to be regenerated each time new data sets are published, but this requires greater integration in a central, sustainable resource
-* Paul S can send a link for the supplement
-=== Variant First Pass Pipeline ===
-* Valerio: Are there any existing pipelines to make allele-paper and/or strain-paper associations?
-* Not sure, should ask Karen

Difference between revisions of "WormBase-Caltech Weekly Calls"
