WBConfCall 2020.02.20-Agenda and Minutes



Project Meeting

  • Plan is to have Caltech host
  • Doodle poll for April and May dates
  • How many days do we need?
  • Will also host a virtual SAB meeting
  • AI: Start a draft agenda - flesh out by mid-March
    • Link to draft agenda sent out to staff email; please start adding topics

Help Desk

  • Estimate of number of C. elegans protein-coding genes with no function
    • What criteria would give us a reasonable estimate of that number?
      • No phenotype, interaction, GO, expression? Anything else?
      • Answering this kind of question requires us to have "complete" curation for certain data types. Are we going to keep this goal with the move into AGR? If "complete" curation of all data types is not a practical goal, which data types should be targeted for "complete" curation?
  • Interesting discussion on what criteria could be used to define a set of C. elegans protein-coding genes with no known function
  • The automated description pipeline currently uses several types of data to write descriptions:
    • Orthology
    • GO
    • Expression
    • Disease
  • Genes lacking an automated description might constitute a reasonable set to start from, but since phenotype data is not included in the descriptions, there may be false negatives in the gene description set
  • The Alliance pipeline generates reports that might be helpful for estimating the number of genes with no function, based on the lack of an automated description.
  • Differential gene expression under a certain condition was generally not thought to be a sufficient criterion to say a gene has a function
  • IEA GO annotations are predictors of function, but not sufficient evidence to say a gene definitely does enable that function (or participate in a process)
  • Analysis of the literature could also be informative as it may get at how many genes haven't even been studied. TextpressoCentral searches could help get this data.
    • Even with literature searches, though, we'd need to have some sense of the context in which a gene is mentioned.
  • A matrix approach might be particularly helpful here. Given a set of criteria that might inform function, how many does each C. elegans gene meet?
  • This raises the issue of overall annotation coverage in WormBase and how we could prioritize curation better to fill in knowledge gaps.
  • Paul S. also mentioned the Pharos project (https://pharos.nih.gov/), which is trying to "illuminate the uncharacterized and/or poorly annotated portion of the DG (Druggable Genome)" and thus also grapples with the issue of determining functions for understudied proteins.
  • AI: Write back to the user with our thoughts and see if we can get a better idea of how they would define 'no known function'.
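
The matrix approach discussed above could be sketched roughly as follows. This is only an illustration: the criteria names, the gene identifiers, and the evidence sets are hypothetical placeholders, not WormBase data or an existing pipeline. In line with the discussion, IEA-only GO annotations and differential expression are assumed not to count toward function.

```python
# Hypothetical criteria that might inform function (assumed names, not a WormBase schema).
CRITERIA = ["phenotype", "interaction", "manual_go", "expression", "orthology", "disease"]


def score_genes(evidence):
    """Return {gene: number of CRITERIA met}.

    `evidence` maps a gene identifier to the set of evidence types
    recorded for that gene. Evidence types not listed in CRITERIA
    (e.g. IEA-only GO) are ignored.
    """
    return {
        gene: sum(1 for criterion in CRITERIA if criterion in met)
        for gene, met in evidence.items()
    }


def genes_with_no_evidence(evidence):
    """Return a sorted list of genes that meet none of the criteria."""
    return sorted(gene for gene, n in score_genes(evidence).items() if n == 0)


# Made-up placeholder data, for illustration only.
example = {
    "gene-A": {"phenotype", "manual_go", "expression"},
    "gene-B": {"iea_go"},  # IEA-only GO: predicted, so not counted here
    "gene-C": set(),
}

scores = score_genes(example)
unannotated = genes_with_no_evidence(example)
```

With a table like this, the threshold for "no known function" becomes an explicit choice (zero criteria met, or fewer than some number), which may help when asking the user how they would define it.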

  • I was directed here from a link on the CGC
    • Do we need to take any action to fix the broken Strain links from CGC to WormBase? (notify CGC or point the old URLs to the current strain URLs ...)
    • This may be an isolated case. KVA contacted the CGC to let them know and ask if they need any help from us.
    • Ongoing correspondence with CGC; Sibyl added suggestions for links to the GitHub ticket.

Website Widgets

  • Any specific issues to discuss here?
  • New widgets: Protein page motif_details and overview widgets, GO page Associations widget, and Gene page Homology widget
  • AI: WormBase staff should check the new widgets and give feedback to Adam and Sibyl on the appropriate GitHub tickets.


  • Orthology updates: downstream effects
    • It seems that the overall best solution to the stale orthology data is to see if the Alliance can update DIOPT orthology calls on a more frequent basis.
    • Manual intervention would not really be feasible.
  • AI: Jae will write to the Alliance orthology working group email list to ask about this.
  • Valerio will present on the WB literature acquisition and flagging pipelines on Tuesday, the 25th, at the Alliance Literature Working Group
    • Double-checking with Jim K. and Stacia about meeting next week.

?Genotype class

  • Genotype Wiki page updated
  • Strain associations, maternal/paternal genotypes and zygosity have been removed from the proposal for now
  • Caltech will mint new ?Genotype IDs for now, but eventually this may come under the purview of the name server.
  • A controlled vocabulary for genotype would be nice to have in the future, but for now curators will follow the C. elegans nomenclature guidelines for populating that field.
  • It might be a good idea to create and populate an Other_name or Synonym tag to capture what are believed to be the same genotypes that might not have been written the same way in publications.
  • The information in the ?Genotype class is additive for WB, but there may be some changes to data displays that currently refer to genotypes, e.g. Disease.