Expression Cluster

From WormBaseWiki

Expression clusters are lists of genes that are differentially expressed in high-throughput expression studies, including microarray, tiling array, RNA sequencing, single-cell RNA sequencing, and proteomics studies. Expression clusters allow users to obtain function and tissue expression information for uncharacterized genes in multiple nematode species.

A WormBase curator extracts lists of differentially expressed genes from the supplementary files of publications. To bring these data into WormBase, every entity must be mapped to a WormBase ID: gene IDs, anatomy terms, life stages, and molecule IDs for chemicals used in the experiments. We also need to know how the authors processed the raw data to obtain the differentially expressed genes. It is critical that authors provide all the information needed to ensure that curators interpret their data correctly. Here are some examples of information that we expect to see in published papers:

Software Package (example: DESeq2 v 1.22.16)
Threshold (example: FDR < 0.05 and fold change > 2)
Version of WormBase genome used for mapping (example: WS288)
Life stage of animals studied (example: L4 larva)
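
The threshold item matters because the same cutoff can be read in more than one way. As a minimal sketch (the gene labels and values are made up, and `is_differentially_expressed` is a hypothetical helper, not part of DESeq2 or any WormBase tool), this is how "FDR < 0.05 and fold change > 2" might be applied, with a flag for whether down-regulated genes (fold change < 0.5) also count:

```python
import math

# Illustrative records only: (gene label, fold change, FDR).
results = [
    ("gene_a", 3.2, 0.001),
    ("gene_b", 1.4, 0.020),
    ("gene_c", 0.3, 0.004),
    ("gene_d", 5.0, 0.200),
]


def is_differentially_expressed(fold_change, fdr,
                                fdr_cutoff=0.05, fc_cutoff=2.0,
                                two_sided=True):
    """Apply an FDR cutoff and a fold-change cutoff to one gene.

    fold_change is a plain ratio (not log-transformed) and must be > 0.
    With two_sided=True, a ratio below 1 / fc_cutoff also passes.
    """
    if fdr >= fdr_cutoff:
        return False
    if two_sided:
        return abs(math.log2(fold_change)) > math.log2(fc_cutoff)
    return fold_change > fc_cutoff


# Same data, two readings of the same stated threshold.
both_directions = [g for g, fc, fdr in results
                   if is_differentially_expressed(fc, fdr)]
up_only = [g for g, fc, fdr in results
           if is_differentially_expressed(fc, fdr, two_sided=False)]

print(both_directions)
print(up_only)
```

The two lists differ for the same input, which is why a paper should state whether its fold-change cutoff applies in both directions and whether the reported values are ratios or log-transformed.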

Sample supplementary file for a differentially expressed gene list (this example comes from a single-cell RNA-seq study):

Cell Type  Anatomy name         Anatomy term                Gene ID         Gene Name   P value                avg_logFC     Adjusted p value
AVF        AVF                  WBbt:0003851                WBGene00017850  F27B10.1    0                      4.7088800284  0
AVF        AVF                  WBbt:0003851                WBGene00003752  nlp-14      0                      3.3727323703  0
AVF        AVF                  WBbt:0003851                WBGene00016114  flp-27      0                      2.710326377   0
Intestine  intestine            WBbt:0005772                WBGene00012615  dct-16      0                      3.2522967876  0
Intestine  intestine            WBbt:0005772                WBGene00015913  C17F4.7     0                      1.9188317409  0
Intestine  intestine            WBbt:0005772                WBGene00022497  Y119D3B.21  0                      1.876397092   0
I5         I5 neuron            WBbt:0004740                WBGene00022144  pghm-1      2.89711642079677E-116  0.7731186731  5.85565170971444E-112
I5         I5 neuron            WBbt:0004740                WBGene00011392  sbt-1       2.60560798327927E-113  0.7449088966  5.26645485580406E-109
I5         I5 neuron            WBbt:0004740                WBGene00010059  F54E4.3     1.45086508671097E-109  0.3023744639  2.9324885132602E-105
RMD_LR     RMDL, RMDR           WBbt:0005037, WBbt:0005033  WBGene00012760  Y41C4A.17   4.6219153647023E-60    0.8402976236  9.34181533513628E-56
RMD_LR     RMDL, RMDR           WBbt:0005037, WBbt:0005033  WBGene00022144  pghm-1      3.05192716357593E-53   0.6003905586  6.16855518301966E-49
RMD_LR     RMDL, RMDR           WBbt:0005037, WBbt:0005033  WBGene00012853  Y44A6D.2    8.59653038778404E-52   0.5616972229  1.73753072197891E-47
PHso       phasmid socket cell  WBbt:0008410                WBGene00009724  F45D3.4     1.05021281935678E-133  1.3380881797  2.12269015048392E-129
PHso       phasmid socket cell  WBbt:0008410                WBGene00018250  F40H3.2     7.33293398018014E-131  1.0461751837  1.48213261607401E-126
PHso       phasmid socket cell  WBbt:0008410                WBGene00017748  F23F1.7     4.4503443463399E-127   2.4747040271  8.9950359928222E-123
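
Before submitting a table like this, it can help to run a quick consistency check. The sketch below is not a WormBase tool; `find_problems` is a hypothetical helper, and the ID patterns (`WBGene` plus eight digits, `WBbt:` plus seven digits) are inferred from the sample rows above. The test rows reuse identifiers from the sample and contain two deliberate mistakes: a mis-capitalized prefix and an anatomy name paired with two different terms.

```python
import re
from collections import defaultdict

# Patterns inferred from the sample rows; treat them as assumptions.
GENE_ID = re.compile(r"^WBGene\d{8}$")
ANATOMY_ID = re.compile(r"^WBbt:\d{7}$")


def find_problems(rows):
    """Return a list of human-readable problems found in the rows.

    Each row is a dict with the keys "Anatomy name", "Anatomy term"
    (one ID or several separated by commas) and "Gene ID".
    """
    problems = []
    terms_by_name = defaultdict(set)
    for number, row in enumerate(rows, start=1):
        if not GENE_ID.match(row["Gene ID"]):
            problems.append(f"row {number}: unexpected gene ID {row['Gene ID']}")
        terms = tuple(t.strip() for t in row["Anatomy term"].split(","))
        for term in terms:
            if not ANATOMY_ID.match(term):
                problems.append(f"row {number}: unexpected anatomy term {term}")
        terms_by_name[row["Anatomy name"]].add(terms)
    # The same anatomy name should always carry the same term(s).
    for name, term_sets in terms_by_name.items():
        if len(term_sets) > 1:
            problems.append(f"'{name}' is paired with more than one set of terms")
    return problems


rows = [
    {"Anatomy name": "intestine", "Anatomy term": "WBbt:0005772",
     "Gene ID": "WBGene00012615"},
    {"Anatomy name": "intestine", "Anatomy term": "WBbt:0003851",  # mistake
     "Gene ID": "WBGene00015913"},
    {"Anatomy name": "I5 neuron", "Anatomy term": "Wbbt:0004740",  # mistake
     "Gene ID": "WBGene00022144"},
    {"Anatomy name": "RMDL, RMDR", "Anatomy term": "WBbt:0005037, WBbt:0005033",
     "Gene ID": "WBGene00012760"},
]

problems = find_problems(rows)
for problem in problems:
    print(problem)
```

A check like this only catches formatting and internal-consistency issues; it cannot tell whether an ID that looks valid actually refers to the intended gene or cell.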

To map your gene names (public names, sequence names, or UniProt IDs) to WormBase IDs, you can use WormBase SimpleMine: https://wormbase.org/tools/mine/simplemine.cgi

To map your cell names to WormBase anatomy terms, you can use this table: http://caltech.wormbase.org/pub/wormbase/spell_download/tables/AnatomyTable.csv
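
Once a lookup table has been downloaded, the mapping itself can be scripted. The sketch below is only an illustration: the column names `name` and `term_id` are assumptions, not the actual layout of AnatomyTable.csv or of SimpleMine output, so adjust them to match the file you have. `load_lookup` and `map_names` are hypothetical helpers, and the inline CSV text stands in for a downloaded file.

```python
import csv
import io


def load_lookup(csv_text, name_column, id_column):
    """Build a case-insensitive name -> ID dictionary from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[name_column].strip().lower(): row[id_column].strip()
            for row in reader}


def map_names(names, lookup):
    """Split names into those found in the lookup and those that are not."""
    mapped, unmapped = {}, []
    for name in names:
        key = name.strip().lower()
        if key in lookup:
            mapped[name] = lookup[key]
        else:
            unmapped.append(name)
    return mapped, unmapped


# Stand-in for a downloaded table; the column names are assumed.
table_text = (
    "name,term_id\n"
    "AVF,WBbt:0003851\n"
    "intestine,WBbt:0005772\n"
    "I5 neuron,WBbt:0004740\n"
)

lookup = load_lookup(table_text, name_column="name", id_column="term_id")
mapped, unmapped = map_names(["AVF", "Intestine", "PHso"], lookup)

print(mapped)
print(unmapped)
```

Names that end up in `unmapped`, such as a study-specific cluster label, need to be resolved by hand. Case-insensitive matching is a convenience assumed here; check that it does not merge distinct names in your data.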