Difference between revisions of "WBConfCall 2014.06.05-Agenda and Minutes"


Revision as of 09:50, 5 June 2014

Agenda

New Staff Member Introduction

Sibyl Gao, joining the webdev team at OICR. Email: sibyl@wormbase.org

nurf-1 Gene structure

In C. elegans, nurf-1 has a complex gene structure composed of two main regions. Many isoforms are apparent at this locus. Some terminate halfway through the locus, some start in the second half, and some span both halves.

In C. briggsae, C. japonica and C. remanei the homologous region to Cel-nurf-1 is composed of two completely separate genes, according to our normal gene curation standards.

In Drosophila and human, the homologous region is a single complex gene producing many different isoforms, some of which span the two halves, as in C. elegans.

We have a user who is requesting help in naming this locus in C. briggsae for a paper.

There are other examples of complex gene loci that have been annotated in a variety of ways. Often our hand has been forced, as authors have imposed their view of how these should be named before any discussion of the implications. This has resulted in loci annotated in one of three ways:

Category 1) A single gene locus - single CGC name and single WBGene ID with non-overlapping isoforms.

Category 2) Two completely different genes - two CGC names and two WBGene IDs

Category 3) Two genes with shared Isoform naming - two CGC names and two WBGene IDs
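
The three categories above can be read as a small decision rule on the number of CGC names, the number of WBGene IDs, and whether the genes share isoform naming. The following is a minimal sketch of that reading only; the `Locus` class, its field names and the `annotation_category` function are hypothetical and do not correspond to any WormBase schema or code. WBGene IDs are represented as a count because no IDs are listed in these minutes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Locus:
    """Hypothetical summary of how a complex locus is annotated."""
    cgc_names: Tuple[str, ...]    # CGC names used at the locus
    wbgene_count: int             # number of distinct WBGene IDs at the locus
    shared_isoform_naming: bool   # True if both genes share one isoform series


def annotation_category(locus: Locus) -> int:
    """Return 1, 2 or 3 following the categories listed in the minutes."""
    n_names = len(locus.cgc_names)
    if n_names == 1 and locus.wbgene_count == 1:
        return 1  # single gene locus with non-overlapping isoforms
    if n_names == 2 and locus.wbgene_count == 2:
        # Two genes: category 3 if isoform naming is shared, otherwise 2.
        return 3 if locus.shared_isoform_naming else 2
    raise ValueError("locus does not fit categories 1-3")


# Examples taken from the reference list in these minutes.
nurf_1 = Locus(("nurf-1",), 1, False)
maf_mgl = Locus(("maf-1", "mgl-2"), 2, False)
lev_eat = Locus(("lev-10", "eat-18"), 2, True)

for locus in (nurf_1, maf_mgl, lev_eat):
    print(locus.cgc_names, annotation_category(locus))
```

In this sketch, a locus with a single CGC name shared between two WBGene IDs (as in option 1 under "Options") matches none of the three categories and raises an error, which is one way of seeing why that option needs discussion.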

Options:

1) Author and Tim: Propose a single "locus name", Cbr-nurf-1. A single CGC name shared between two different genes would work in the database and preserve the current naming, but could potentially cause problems for the website. Is this the case?

2) Referee: Proposed nurf-1.1 and nurf-1.2 (a scheme usually reserved for paralogous genes)

3) Precedent: lin-15A/B. Two distinct genes with no shared naming (nurf-1A, nurf-1B) - shows that there is something different in species X compared to species Y

4) Could merge the genes into a single gene with non-overlapping transcript populations. This simplifies things for the website.

How should we represent and name these sorts of loci in the future?

Specific examples of unusual loci for reference:


* Non overlapping transcript populations with a bridging population single gene (like the C. elegans nurf-1)
Category 1) A single gene locus - single CGC name and single WBGene ID with non-overlapping isoforms.
-------------------------------
Unc-105
Hop-12
D2023.1
DH11.5
F11F1.1 - Unusual loci
F26H11.2 - nurf-1

Category 3) Two genes with shared Isoform naming - two CGC names and two WBGene IDs
* Complex transcript populations with some shared space.
Lev-10::eat-18
-------------------------------
Y105E8A.7 a/b/e = lev-10
Y105E8A.7 c/d = eat-18
Small amounts of out of frame coding overlap

Cha-1::unc-17
-------------------------------
ZC416.8a = unc-17
ZC416.8b = cha-1
No coding overlap, just shared transcriptional space in a UTR exon.


* Single transcript population giving rise to 2 gene products
-----------------------------------------------------------
Category 2) Two completely different genes - two CGC names and two WBGene IDs

** 2 genes curated
---------------
Maf-1::Mgl-2
F32D8.12::F32D8.11 - One appears to be in UTR of other.
Nola-3::C25A1.16 - Unusual as nola-3 is very similar to a yeast protein, is small, and would have been ignored in normal curation
Y53C12A.11::Y53C12A.6 - No coding overlap - Lots of ESTs displaying a single block containing the end and start.

W01A8.2::W01A8.8
-------------------------------
A non-coding transcript population overlaps 2 sets of coding transcripts. There is also some evidence for a retained intron, so this could be a single transcript that is processed to give 2 functional forms?

Category 1) A single gene locus - single CGC name and single WBGene ID with non-overlapping isoforms.
** 1 gene curated
--------------
Prx-10 - Lots of ESTs displaying a single block containing the end and start.

WormBase ParaSite

With ParaSite close to going into production, we should start discussing how to proceed. Topics might include:

  • How best to integrate it with the main WormBase site, and with the WormBase release cycle
  • The strategy for incorporation of parasitic genomes into WormBase from this point
  • Current parasitic species in WormBase - should we "move" them into ParaSite?
  • Curation for parasitic species?

This is a big discussion topic, so the intent for this call is to seed some of these discussions and come up with more concrete plans off-line or in future calls.