Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
Line 19: Line 19:
 
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
 
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]
  
GoToMeeting link: https://www.gotomeet.me/wormbase1
+
[[WormBase-Caltech_Weekly_Calls_2020|2020 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2021|2021 Meetings]]
  
= 2019 Meetings =
+
[[WormBase-Caltech_Weekly_Calls_2022|2022 Meetings]]
  
[[WormBase-Caltech_Weekly_Calls_January_2019|January]]
+
[[WormBase-Caltech_Weekly_Calls_2023|2023 Meetings]]
  
[[WormBase-Caltech_Weekly_Calls_February_2019|February]]
 
  
[[WormBase-Caltech_Weekly_Calls_March_2019|March]]
+
==March 14, 2024==
  
[[WormBase-Caltech_Weekly_Calls_April_2019|April]]
+
=== TAGC debrief ===
  
[[WormBase-Caltech_Weekly_Calls_May_2019|May]]
+
==February 22, 2024==
  
[[WormBase-Caltech_Weekly_Calls_June_2019|June]]
+
===NER with LLMs===
  
[[WormBase-Caltech_Weekly_Calls_July_2019|July]]
+
* Wrote scripts and configured an LLM for Named Entity Recognition. Trained an LLM on gene names and diseases. Works well so far (F1 ~ 98%, Accuracy ~ 99.9%)
  
[[WormBase-Caltech_Weekly_Calls_August_2019|August]]
+
* Is this similar to the FlyBase system? Recording of presentation  https://drive.google.com/drive/folders/1S4kZidL7gvBH6SjF4IQujyReVVRf2cOK
  
[[WormBase-Caltech_Weekly_Calls_September_2019|September]]
+
* Textpresso server is kaput. Services need to be transferred onto Alliance servers.
  
 +
* There are features on Textpresso, such as link to PDF, that are desirable to curators but should be blocked from public access.
  
== October 3, 2019 ==
+
* Alliance curation status form development needs use cases. ref https://wiki.wormbase.org/index.php/WormBase-Caltech_Weekly_Calls#February_15.2C_2024
  
=== SObA comparison graphs ===
 
* Raymond and Juancarlos have worked on a SObA-graph based comparison tool to compare two genes for ontology-based annotations
 
* [http://wobr2.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=Gene+Pair+to+SObA+Graph Prototype 1]
 
** [http://wobr2.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=annotSummaryCytoscape&filterForLcaFlag=1&filterLongestFlag=1&showControlsFlag=0&datatype=phenotype&geneOneValue=lin-3%20(Caenorhabditis%20elegans,%20WB:WBGene00002992,%20-,%20F36H1.4)&autocompleteValue=let-23%20(Caenorhabditis%20elegans,%20WB:WBGene00002299,%20-,%20ZK1067.1 Example comparison between lin-3 and let-23]
 
* [http://wobr1.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=Gene+Pair+to+SObA+Graph Prototype 2]
 
** [http://wobr1.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=annotSummaryCytoscape&filterForLcaFlag=1&filterLongestFlag=1&showControlsFlag=0&datatype=phenotype&geneOneValue=lin-3%20(Caenorhabditis%20elegans,%20WB:WBGene00002992,%20-,%20F36H1.4)&autocompleteValue=let-23%20(Caenorhabditis%20elegans,%20WB:WBGene00002299,%20-,%20ZK1067.1) Example comparison between lin-3 and let-23]
 
* What information does a user most care about?
 
# What terms (nodes) are annotated to gene 1 and what terms to gene 2
 
# For a given term, what is the relative number of annotations between gene 1 and gene 2.
 
# For a given node, what is the relative number of annotations each gene has to the total annotations of that gene.
 
* # 3 is actually what we applied to size the nodes in the single-gene version of SObA. Thus, not surprisingly, I think it is important.
 
* Generally people like Prototype 2 as a default view; we could possibly have a toggle to see the other view
 
* In either case users need a good legend and/or documentation
 
* Jae, it would be good if a user could specifically highlight nodes specific to each gene and gray-out or de-emphasize the common nodes
 
  
=== Germ line discussion ===
 
* Currently, the anatomy ontology has "germ line" as a type of "Cell" and a type of "Tissue", and "germ cell" as a type of "germ line"
 
* Chris would like to (1) remove "germ line" from under "Cell" and leave it under "Tissue" and (2) move "germ cell" out from under "germ line" and place directly under "Cell"
 
** [https://github.com/obophenotype/c-elegans-gross-anatomy-ontology/pull/23 Made pull request]
 
* Chris will update pull request to include a change to move "germline precursor cell" out from under "germ line" and place it under "Cell" (done)
 
  
=== Script to remove blank entries from Postgres ===
+
==February 15, 2024==
* Chris stumbled across several entries in the OA that were blank (empty strings) or consisted of only whitespace, some of which were causing errors upon upload to ACEDB
 
* Juancarlos has written a script to look for all such entries; 66 tables have them on sandbox (likely same on live OA)
 
* Does anyone object to removing these entries throughout Postgres?
 
* Juancarlos will remove all the empty fields identified by his script
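A minimal sketch of the kind of check described above, not Juancarlos's actual script. It assumes each table holds its values in a single text column; the table and column names in the usage note are hypothetical.

```python
import re

# Table and column names cannot be passed as bound parameters,
# so they are validated against a conservative identifier pattern.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def is_blank(value):
    """True for empty or whitespace-only strings; None (SQL NULL) is not counted."""
    return value is not None and value.strip() == ""

def blank_entry_query(table, column):
    """Build a SELECT that finds empty or whitespace-only values in one text column."""
    if not (IDENTIFIER_RE.match(table) and IDENTIFIER_RE.match(column)):
        raise ValueError("unexpected identifier")
    # Postgres regex match: the whole value consists of zero or more whitespace characters.
    return f"SELECT * FROM {table} WHERE {column} ~ '^\\s*$';"
```

For example, `blank_entry_query("example_table", "example_column")` produces a query that could be reviewed by hand before any DELETE is run against the same condition.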
 
  
 +
=== Literature Migration to the Alliance ABC ===
 +
==== Use Cases for Searches and Validation in the ABC (or, what are your common actions in the curation status form)? ====
 +
===== Find papers with a high confidence NN classification for a given topic that have also been flagged positive by an author in a community curation pipeline and that haven’t been curated yet for that topic =====
 +
*Facet for topic
 +
*Facet for automatic assertion
 +
**neural network method
 +
*Facet for confidence level
 +
**High
 +
*Facet for manual assertion
 +
**author assertion
 +
***ACKnowledge method
 +
**professional biocurator assertion
 +
***curation tools method - NULL
  
== October 10, 2019 ==
+
===== Manually validate paper - topic flags without curating =====
 +
*Facet for topic
 +
*Facet for manual assertion
 +
**professional biocurator assertion
 +
***ABC - no data
  
=== Biocuration 2020 ===
+
===== View all topic and entity flags for a given paper and validate, if needed =====
* Held in Bar Harbor, Maine (organized by JAX, including MGI's Sue Bello and Cindy Smith)
+
* Search ABC with paper identifier
* Dates: Sunday May 17th to Wednesday May 20th, 2020
+
* Migrate to Topic and Entity Editor
* Will have 3rd POTATO workshop
+
* View all associated data
* [https://www.jax.org/education-and-learning/education-calendar/2020/05-may/biocuration-2020-conference Meeting website]
+
* Manually validate flags, if needed
* Key Dates
 
** October 31, 2019 - Paper Submission Deadline
 
** January 24, 2020 - Abstract  and Workshop Submission Deadline
 
** March 6, 2020 - Notification of Acceptance
 
** April 6, 2020 - Early Bird Registration Ends
 
** May 8, 2020 - Registration Deadline
 
* Academic ISB Member, early bird registration fee is $250
 
* Author First Pass form paper, submitting to Database, biocuration issue (managed by biocuration group); authors have an opportunity to present at Biocuration conference
 
  
=== ICBO 2020 ===
+
=== PDF Storage ===
* International Conference on Biomedical Ontologies
+
* At the Alliance PDFs will be stored in Amazon s3
* [https://icbo2020.inf.unibz.it/ Meeting website]
+
* We are not planning to formally store back-up copies elsewhere
* Held in Bozen-Bolzano, Italy
+
* Is this okay with everyone?
* 16 - 19 September 2020
 
  
=== SObA comparison tool ===
+
==February 8, 2024==
* [http://wobr2.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=Gene+Pair+to+SObA+Graph Prototype #1] updated
+
* TAGC
 +
** Prominent announcement on the Alliance home page?
  
=== Textpresso derived paper connections ===
+
* Fixed login on dockerized system (dev). Can everybody test their forms?
* For example for strains and constructs, maybe anatomy terms?
 
* May want to flag Textpresso predictions (as opposed to manually connected)
 
* Couple of options:
 
** 1) At time of build, populate the papers (in ACEDB/Datomic) into a 'Putative_reference' tag and display in a distinct 'Putative references' widget
 
** 2) Not part of database build, but make associations live (using RESTful API to link out to Textpresso and submit search with URL) using Textpresso with links to Textpresso and Textpresso results, giving users chance to see context of matches in sentences at the Textpresso site
 
*** A link to Textpresso could be done regardless of other approaches; low-hanging fruit?
 
*** Do a diff so that Textpresso pulls up only additional papers (not already associated)?
 
** 3) Could populate WB page with connections made through a Textpresso API call (could cache results? maybe, but might as well choose 1st option?)
 
* Transgene pipeline:
 
** Arun wrote script, matching transgene names (using regex; Is and Si transgenes) to papers, automatically populate OA
 
** Another script, captures Ex transgenes as well, automatically connects to construct objects
 
** WB only displays verified papers; unverified (predicted) associations are not dumped
 
* Could integrate author verification as part of AFP pipeline, even for older papers? Would we want to re-request AFP results for authors that have already replied in the past? Probably not
 
* Could embed AFP predictions in WB display with link to AFP form for authors (and others?) to verify, via logged-in users? Or via a validation token sent via email?
 
* Chris will make GitHub ticket to ask WB web team to add a link to Textpresso search from References widget on respective page; will require a Textpresso URL constructor
 
* Can apply to: genes, transgenes, constructs, strains, alleles, AFP-vetted entities
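The regex matching in the transgene pipeline could look roughly like the following. This is a sketch under assumptions, not Arun's script: it assumes transgene names consist of a one- to three-letter lowercase lab prefix, then Is, Si, or Ex, then a number (as in "zcIs4" or "qIs48").

```python
import re

# Assumed naming pattern: lab prefix + Is / Si / Ex + number.
TRANSGENE_RE = re.compile(r"\b([a-z]{1,3})(Is|Si|Ex)(\d+)\b")

def find_transgenes(text, include_ex=True):
    """Return distinct transgene names found in text, in order of first appearance."""
    seen = []
    for match in TRANSGENE_RE.finditer(text):
        # Optionally skip Ex transgenes, mirroring the first script (Is and Si only).
        if match.group(2) == "Ex" and not include_ex:
            continue
        name = match.group(0)
        if name not in seen:
            seen.append(name)
    return seen
```

Matches found this way would still be predictions; per the notes above, only verified paper associations are displayed.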
 
  
 +
==February 1, 2024==
 +
* Paul will ask Natalia to take care of pending reimbursements
 +
* Dockerized system slow pages (OA and FPKMMine). Will monitor these pages in the future. Will look for timeouts in the nginx logs.
  
== October 17, 2019 ==
+
==January 25, 2024==
  
=== Alliance All Hands Face-to-Face ===
+
=== Curator Info on Curation Forms ===
* Flights: has everyone already booked? No, not yet
+
* Saving curator info using cookies in dockerized forms. Can we deploy to prod?
* Any coordination of flights from Pasadena/LA?
 
** Ranjana and Valerio got a direct flight from Burbank to Boston on Sunday for premeetings
 
  
=== SObA Comparison Tool ===
+
=== ACKnowledge Author Request - WBPaper00066091 ===
* http://wobr2.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=Gene+Pair+to+SObA+Graph
+
* I am more than willing to assist; however, the task exceeds the capabilities of the normal flagging process.
* Prototype discussed last week, updated with feedback from prior discussions
 
* Would this be a stand-alone tool discoverable under the Tools menu?
 
** Possibly; could be a gene page widget, but may be out of place
 
** Stand-alone tool probably makes more sense
 
* Life stage graph doesn't specify expression pattern vs. expression cluster; pretty much only expression patterns (no clusters)
 
  
=== SObA ===
+
* The paper conducts an analysis of natural variations within 48 wild isolates. To enhance the reliability of the variant set, I utilized the latest variant calling methods along with a custom filtering approach. The resulting dataset comprises 1,957,683 unique variants identified using Clair3. Additionally, Sniffles2 was used to identify indels of >30 bp, which numbered in the thousands to tens of thousands for most wild isolates. It is worth noting that variants identified with Sniffles2 have less reliable nucleotide positions in the genome.
* Raymond intending to share progress on SObA at December Alliance All-Hands Face-to-Face
 
* For example, share GO SObA graph for other species
 
* Will need to be dependent on a SOLR server with all species data
 
** Raymond has run into problems trying to setup his own SOLR server
 
** Raymond asked Seth Carbon if we could use the GO server, but he prefers that we not
 
** Appear to be software versioning issues, possible memory issues
 
  
=== GO meeting ===
+
* I am reaching out to inquire whether WormBase would be interested in incorporating this dataset. An argument in favor is the higher quality of my data. However, I am mindful of the potential substantial effort involved for WormBase, and it is unclear whether this aligns with your priorities.
* Kimberly can give update on recent updates to GO from the recent GO meeting
 
* Slides are shared online
 
  
=== "all stages Ce" life stage ===
+
* Should WormBase decide to use my variant data set, I am more than willing to offer my assistance.
* Currently used to annotate that RNA was collected from, or a gene was observed to be expressed during, all C. elegans life stages
 
* "all stages Ce" is currently the root node of the C. elegans branch, but needs to change to generic "C. elegans life stage"
 
* Should we:
 
** 1) Create a "C. elegans life span" or "C. elegans life cycle" term to represent the entire life span and annotate to that?
 
*** Does this mean that, for example, a gene is expressed at some point during the life cycle or consistently throughout the entire life span?
 
** 2) Annotate instead to, for example, "embryo Ce", "larva Ce", and "adult Ce" individually
 
* Note: authors are often vague in their descriptions simply saying "during all stages" possibly stating a beginning and end of the range
 
* Wen: not comfortable making a decision right now; want to discuss with Gary Williams and with other MOD members about how to handle large scale expression data
 
* Daniela: will look through existing (old and new) expression pattern annotations made to "all stages Ce" to see if it would be reasonable to annotate each case individually to "embryo Ce", "larva Ce", and "adult Ce" individually
 
  
=== Gene class missing description ===
+
=== Update on NN Classification via the Alliance ===
* The gene class "aatf" has no description, so in the aatf-1 gene page Overview widget, the gene has empty parentheses next to the gene name where there should be a description of what "aatf" stands for (coming from the ?Gene_class description)
+
* Use of primary/not primary/not designated flag to filter papers
* Jae or Ranjana will create a ticket and assign it to someone at Hinxton
+
* Secondary filter on papers with at least C. elegans as species
 +
* Finalize sources (i.e. evidence) for entity and topic tags on papers
 +
* Next NN classification scheduled for ~March
  
 +
* We decided to process all papers (even non-elegans species) and have filters on species after processing.
 +
* NNC html pages will show NNC values together with species.
 +
* Show all C. elegans papers first and other species in a separate bin.
  
== October 24, 2019 ==
+
=== Travel Reimbursements ===
 +
* Still waiting on October travel reimbursement (Kimberly)
 +
* Still waiting on September and October travel reimbursements (Wen)
  
=== Textpresso links from WB page References widgets ===
+
=== UniProt ===
* Discussed with Sibyl and Adam on last week's Textpresso call
+
* Jae found some genes without UniProt IDs; the genes are present in UniProt but lack WBGene IDs.
* Sibyl working on mockups for including Textpresso derived paper associations in the References widget
+
* Wen reached out to Stavros and Chris to investigate the WormBase and AGR angles.
* Had originally considered for the following classes
+
* Stavros escalated the issue at the Hinxton standup.
** Strain
+
* Mark checked the build scripts and WS291 results, then contacted UniProt; he is working with them to figure this out.
** Gene
 
** Variation
 
** Transgene
 
** Construct
 
** Anatomy_term
 
* Sibyl asks: could we put this link in all References widgets in WB?
 
** Classes with References widgets include:
 
*** Antibody
 
*** Clone
 
*** Construct
 
*** Expression cluster
 
*** Expression pattern
 
*** Gene
 
*** Interaction
 
*** Life stage
 
*** Rearrangement
 
*** RNAi
 
*** Strain
 
*** Transgene
 
*** Variation
 
*** Analysis
 
*** Molecule
 
*** Process
 
** Any Textpresso search on internal WB identifiers, like WBInteraction000###### or WBRNAi00######, or on long complicated names is certainly meaningless, so the following classes are probably ruled out:
 
*** Antibody
 
*** Expression cluster
 
*** Expression pattern
 
*** Interaction
 
*** RNAi
 
*** Analysis
 
** Otherwise it could probably be beneficial to include the additional classes, searching on public name:
 
*** Clone
 
*** Life stage
 
*** Rearrangement
 
*** Molecule
 
*** Process
 
* Created Google Doc summary [https://docs.google.com/document/d/19y5wNIHLmBRm4z7Rz6NG-NQYlZbtobXuQcH6dz0s8Rg/edit?usp=sharing here]
 
* So the current list of candidate classes is:
 
** Strain
 
** Gene
 
** Variation
 
** Transgene
 
** Construct
 
** Anatomy_term (no References widget currently)
 
** Clone
 
** Life stage
 
** Rearrangement
 
** Molecule
 
** Process
 
* Eventually, we may want to incorporate papers found by Textpresso directly into the References widget
 
** We would want predicted associations to be made explicitly so
 
** We could populate the database, but there are reservations about doing this
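A Textpresso URL constructor for the References widget might be sketched as below. The endpoint and the `keyword` parameter name are hypothetical placeholders, since the actual Textpresso search URL format was not specified in the discussion; only the list of candidate classes comes from the notes above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the real Textpresso search URL.
TEXTPRESSO_SEARCH_BASE = "https://textpresso.example.org/search"

# Candidate classes from the list above, searched on public name.
CANDIDATE_CLASSES = {
    "Strain", "Gene", "Variation", "Transgene", "Construct", "Anatomy_term",
    "Clone", "Life stage", "Rearrangement", "Molecule", "Process",
}

def textpresso_link(wb_class, public_name):
    """Return a search URL for a candidate class, or None for classes ruled out
    (for example those identified only by internal WB identifiers)."""
    if wb_class not in CANDIDATE_CLASSES or not public_name.strip():
        return None
    # urlencode escapes spaces and punctuation in the public name.
    return TEXTPRESSO_SEARCH_BASE + "?" + urlencode({"keyword": public_name})
```

Keeping the class list in one place would make it easy to add or drop classes as the candidate list changes.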
 
  
 +
==January 18, 2024==
 +
* OA showing different names highlighted when logging in the OA, now fixed on staging
  
=== Updates from GOC Meeting, Berkeley ===
 
* Status of MOD Imports into Noctua
 
** Starting with WormBase and MGI GO annotations into GO-CAM
 
** GPAD/GPI is standard import format
 
* GO-CAM Specifications and Validation
 
* Noctua Form
 
* Pathways2GO
 
* Priorities for next 6 months:
 
** Create gene-centric Noctua views
 
** Create pathway-centric Noctua views
 
  
 +
==January 11, 2024==
 +
* Duplicate function in OA was not working when using special characters. Valerio debugged it, and it is now fixed.
 +
** Curators should make sure that, when pasting special characters, the duplicate function works
 +
* OA shows different names highlighted when logging in; Valerio will debug and check what IP address he sees
 +
** If you want to bookmark an OA url for your datatype and user, log on once, and bookmark that page (separately for prod and dev)
 +
* Chris tested the phenotype form on staging and production, and the data are still going to tazendra
 +
** Chris will check with Paulo. Once it is resolved we need to take everything that is on tazendra and put it on the cloud with different PGIDs
 +
** Raymond: simply set up forwarding at our end?
 +
* AI working group: Valerio is setting up a new OpenAI account with a paid membership for ChatGPT4. We can also use Microsoft Edge Copilot (temporary?)
 +
* Chris is getting ready to deploy a 7.0.0 public release on February 7th. Carol wanted to push out monthly releases. This release will include WS291; the next several releases will also use WS291 until WS292 is available.
 +
* Valerio would like to use an alliancegenome.org email address for the OpenAI account
 +
* New alliance drive: https://drive.google.com/drive/folders/0AFkMHZOEQxolUk9PVA
 +
** Note: please move shared files that you own to the new Alliance Google Drive. Chris Mungall sent instructions; see the video and SOP here: https://agr-jira.atlassian.net/browse/SCRUM-925?focusedCommentId=40674
 +
* Alliance logo and 50 word description for TAGC: Wen will talk to the outreach WG
 +
* Name server. Manuel working on this, Daniela and Karen will reach out to him and let him know that down the road micropublication would like to use the name server API to generate IDs in bulk
 +
* Karen asking about some erroneous IDs used in the name server. Stavros says that this is not a big deal because the "reason" is not populating the name server
 +
* It would be good to be able to have a form to capture additional fields for strains and alleles (see meeting minutes August 31st 2023. https://wiki.wormbase.org/index.php/WormBase-Caltech_Weekly_Calls_2023#August_31st.2C_2023). This may happen after Manuel is done with the authentication.
 +
* Michael: primary flag with Alliance. Kimberly talked about this with the blue team. They will start bringing that over for all papers and fix the remaining 271 items later.
  
== October 31, 2019 ==
+
==January 4, 2024==
 
+
* ACKnowledge pipeline help desk question:
=== ?Genotype class ===
+
** Help Desk: Question about Author Curation to Knowledgebase (Zeng Wanxin) [Thu 12/14/2023 5:48 AM]
* [https://docs.google.com/document/d/19hP9r6BpPW3FSAeC_67FNyNq58NGp4eaXBT42Ch3gDE/edit?usp=sharing Working data model document]
+
* Citace upload, current deadline: Tuesday January 9th
* Several classes have a "Genotype" tag with text entry
+
** All processes (dumps, etc.) will happen on the cloud machine
** Strain
+
** Curators need to deposit their files in the appropriate locations for Wen
** 2_point_data
+
* Micropublication pipeline
** Pos_neg_data
+
** Ticketing system confusion
** Multi_pt_data
+
** Karen and Kimberly paper ID pipeline; may need sorting out of logistics
** RNAi
 
** Phenotype_info
 
** Mass_spec_experiment (no data as of WS273)
 
** Condition
 
* Collecting all genotype text entries yields ~33,000 unique entries, with many different forms:
 
** Species entries, like "Acrobeloides butschlii wild isolate" or "C. briggsae"
 
** Strain entries, like "BA17[fem-1(hc-17)]" or "BB21" or "BL1[pK08F4.7::K08F4.7::GFP; rol-6(+)]"
 
** Anonymous transgenes, like "BEC-1::GFP" or "CAM-1-GFP" or "Ex[Pnpr-9::unc-103(gf)]"
 
** Complex constructs, like "C56C10.9(gk5253[loxP + Pmyo-2::GFP::unc-54 3' UTR + Prps-27::neoR::unc-54 3' UTR + loxP]) II"
 
** Text descriptions, like "Control" or "WT" or "Control worms fed on HT115 containing the L4440 vector without insert" or "N.A."
 
** Bacterial genotypes, like "E. coli [argA, lysA, mcrA, mcrB, IN(rrnD-rrnE)1, lambda-, rcn14::Tn10(DE3 lysogen::lavUV5 promoter -T7 polymerase]"
 
** Including balancers, like "F26H9.8(ok2510) I/hT2 [bli-4(e937) let-?(q782) qIs48] (I;III)"
 
** Reference to parent strain, like "Parent strain is AG359"
 
** Referring to RNAi, like "Pglr-1::wrm-1(RNAi)" or "Phsp-6::gfp; phb-1(RNAi)"
 
** Referring to apparent null or loss of function alleles, like "Phsp-4::GFP(zcIs4); daf-2(-)"
 

Latest revision as of 18:18, 14 March 2024

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings

2020 Meetings

2021 Meetings

2022 Meetings

2023 Meetings


March 14, 2024

TAGC debrief

February 22, 2024

NER with LLMs

  • Wrote scripts and configured an LLM for Named Entity Recognition. Trained an LLM on gene names and diseases. Works well so far (F1 ~ 98%, Accuracy ~ 99.9%)
  • Textpresso server is kaput. Services need to be transferred onto Alliance servers.
  • There are features on Textpresso, such as link to PDF, that are desirable to curators but should be blocked from public access.
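The gap between the two figures reported for the NER model is expected when most tokens are not entities: true negatives raise accuracy, while F1 ignores them. A small worked example follows, using made-up counts chosen only to illustrate the arithmetic (they are not the actual evaluation results).

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts.
    Assumes at least one predicted and one actual positive (no zero division guard)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Illustrative counts only: 1,000 entity tokens among 100,000 tokens in total.
example = classification_metrics(tp=980, fp=20, fn=20, tn=98980)
```

With these counts F1 is 0.98 while accuracy is 0.9996, because the 98,980 correctly ignored tokens count toward accuracy only.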


February 15, 2024

Literature Migration to the Alliance ABC

Use Cases for Searches and Validation in the ABC (or, what are your common actions in the curation status form)?

Find papers with a high confidence NN classification for a given topic that have also been flagged positive by an author in a community curation pipeline and that haven’t been curated yet for that topic
  • Facet for topic
  • Facet for automatic assertion
    • neural network method
  • Facet for confidence level
    • High
  • Facet for manual assertion
    • author assertion
      • ACKnowledge method
    • professional biocurator assertion
      • curation tools method - NULL
Manually validate paper - topic flags without curating
  • Facet for topic
  • Facet for manual assertion
    • professional biocurator assertion
      • ABC - no data
View all topic and entity flags for a given paper and validate, if needed
  • Search ABC with paper identifier
  • Migrate to Topic and Entity Editor
  • View all associated data
  • Manually validate flags, if needed

PDF Storage

  • At the Alliance PDFs will be stored in Amazon s3
  • We are not planning to formally store back-up copies elsewhere
  • Is this okay with everyone?

February 8, 2024

  • TAGC
    • Prominent announcement on the Alliance home page?
  • Fixed login on dockerized system (dev). Can everybody test their forms?

February 1, 2024

  • Paul will ask Natalia to take care of pending reimbursements
  • Dockerized system slow pages (OA and FPKMMine). Will monitor these pages in the future. Will look for timeouts in the nginx logs.
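One way to look for timeouts in the nginx logs is sketched below. It assumes the dockerized forms sit behind nginx as a reverse proxy, in which case slow backends show up in the error log as "upstream timed out" entries; the request paths in any sample data are hypothetical.

```python
import re
from collections import Counter

# nginx logs this phrase in its error log when a proxied backend responds too slowly.
TIMEOUT_MARKER = "upstream timed out"
# Error-log entries include the original request line, e.g. request: "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'request: "[A-Z]+ (\S+)')

def count_timeouts(lines):
    """Count upstream timeouts per request path from nginx error-log lines."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_MARKER not in line:
            continue
        match = REQUEST_RE.search(line)
        counts[match.group(1) if match else "(unknown)"] += 1
    return counts
```

Calling `count_timeouts(open(path))` on an error log and then `.most_common()` on the result would list the slowest pages first; the log path depends on the deployment.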

January 25, 2024

Curator Info on Curation Forms

  • Saving curator info using cookies in dockerized forms. Can we deploy to prod?

ACKnowledge Author Request - WBPaper00066091

  • I am more than willing to assist; however, the task exceeds the capabilities of the normal flagging process.
  • The paper conducts an analysis of natural variations within 48 wild isolates. To enhance the reliability of the variant set, I utilized the latest variant calling methods along with a custom filtering approach. The resulting dataset comprises 1,957,683 unique variants identified using Clair3. Additionally, Sniffles2 was used to identify indels of >30 bp, which numbered in the thousands to tens of thousands for most wild isolates. It is worth noting that variants identified with Sniffles2 have less reliable nucleotide positions in the genome.
  • I am reaching out to inquire whether WormBase would be interested in incorporating this dataset. An argument in favor is the higher quality of my data. However, I am mindful of the potential substantial effort involved for WormBase, and it is unclear whether this aligns with your priorities.
  • Should WormBase decide to use my variant data set, I am more than willing to offer my assistance.

Update on NN Classification via the Alliance

  • Use of primary/not primary/not designated flag to filter papers
  • Secondary filter on papers with at least C. elegans as species
  • Finalize sources (i.e. evidence) for entity and topic tags on papers
  • Next NN classification scheduled for ~March
  • We decided to process all papers (even non-elegans species) and have filters on species after processing.
  • NNC html pages will show NNC values together with species.
  • Show all C. elegans papers first and other species in a separate bin.

Travel Reimbursements

  • Still waiting on October travel reimbursement (Kimberly)
  • Still waiting on September and October travel reimbursements (Wen)

UniProt

  • Jae found some genes without UniProt IDs; the genes are present in UniProt but lack WBGene IDs.
  • Wen reached out to Stavros and Chris to investigate the WormBase and AGR angles.
  • Stavros escalated the issue at the Hinxton standup.
  • Mark checked the build scripts and WS291 results, then contacted UniProt; he is working with them to figure this out.

January 18, 2024

  • OA showing different names highlighted when logging in the OA, now fixed on staging


January 11, 2024

  • Duplicate function in OA was not working when using special characters. Valerio debugged it, and it is now fixed.
    • Curators should make sure that, when pasting special characters, the duplicate function works
  • OA shows different names highlighted when logging in; Valerio will debug and check what IP address he sees
    • If you want to bookmark an OA url for your datatype and user, log on once, and bookmark that page (separately for prod and dev)
  • Chris tested the phenotype form on staging and production, and the data are still going to tazendra
    • Chris will check with Paulo. Once it is resolved we need to take everything that is on tazendra and put it on the cloud with different PGIDs
    • Raymond: simply set up forwarding at our end?
  • AI working group: Valerio is setting up a new OpenAI account with a paid membership for ChatGPT4. We can also use Microsoft Edge Copilot (temporary?)
  • Chris is getting ready to deploy a 7.0.0 public release on February 7th. Carol wanted to push out monthly releases. This release will include WS291; the next several releases will also use WS291 until WS292 is available.
  • Valerio would like to use an alliancegenome.org email address for the OpenAI account
  • New alliance drive: https://drive.google.com/drive/folders/0AFkMHZOEQxolUk9PVA
  • Alliance logo and 50 word description for TAGC: Wen will talk to the outreach WG
  • Name server. Manuel working on this, Daniela and Karen will reach out to him and let him know that down the road micropublication would like to use the name server API to generate IDs in bulk
  • Karen asking about some erroneous IDs used in the name server. Stavros says that this is not a big deal because the "reason" is not populating the name server
  • It would be good to be able to have a form to capture additional fields for strains and alleles (see meeting minutes August 31st 2023. https://wiki.wormbase.org/index.php/WormBase-Caltech_Weekly_Calls_2023#August_31st.2C_2023). This may happen after Manuel is done with the authentication.
  • Michael: primary flag with Alliance. Kimberly talked about this with the blue team. They will start bringing that over for all papers and fix the remaining 271 items later.

January 4, 2024

  • ACKnowledge pipeline help desk question:
    • Help Desk: Question about Author Curation to Knowledgebase (Zeng Wanxin) [Thu 12/14/2023 5:48 AM]
  • Citace upload, current deadline: Tuesday January 9th
    • All processes (dumps, etc.) will happen on the cloud machine
    • Curators need to deposit their files in the appropriate locations for Wen
  • Micropublication pipeline
    • Ticketing system confusion
    • Karen and Kimberly paper ID pipeline; may need sorting out of logistics