Agenda

modENCODE Data

LaDeana Hillier (Waterston Lab) produces much of the C. elegans modENCODE RNASeq data and analyses. She now has a large amount of data that the lab wishes to make available to the public. Normally, this data would be sent to the modENCODE DCC at UCSC for incorporation into the modENCODE website.

As the modENCODE DCC is overwhelmed with other work, she has proposed that WormBase make the data available to the public. She will still be submitting the data to the DCC.

The raw reads are submitted to the SRA, so she is proposing to send us only the results of processing the reads. These include 203 samples, each of which comprises wig coverage, SL, intron, polyA and transcript GFF files, together with expression coverage, SL, intron and polyA read files.

Providing this data to the public could be as simple as just providing a directory where we let all of this sit, or something more complicated.

Our position with providing primary data to users has always been that WormBase isn't a primary repository of data. We prefer to extract all our data from primary resources that can be referred to by other groups. We have been extracting RNAseq data from the SRA and then doing analyses of this data for all our species.

We would certainly like to take a look at this data, but we would not be making these files directly available to our users. We would instead be incorporating aspects of it like the SL sites into our genomic features and making an updated aggregate modENCODE CDS track.

Is this a reasonable position to take with this proposal?

Searching WB with Gene/Protein Names from Other Species

The issue of searching WB with a gene and/or protein name from another species has come up again.

We include some of these names in the Concise Descriptions when discussing sequence similarity, but this hasn't been done systematically or comprehensively (i.e., names from humans and all major MODs, all synonyms, etc.).

I know we've discussed this issue before, but from a user perspective, what do we think the intended search behavior(s) should be, and do we have all of the name/synonym information available to allow users to perform these types of searches effectively?

Do we need use cases to help address this issue?

--Kimberly

WBConfCall 2014.03.06-Agenda and Minutes

