Sequence Feature

From WormBaseWiki

Rules for marking up regions

  • If a region is necessary and sufficient to drive a reporter gene, then mark it as an 'enhancer' or 'silencer'.
  • If a region is both an enhancer and a silencer, then it should have the SO_term tags for both of these.
  • If mobility shift experiments or similar experimental evidence is available to assert that a short region is a TF binding site, then mark it as a TF_binding_site.
  • Similarity to a known binding motif is not evidence of being a TF_binding_site.
  • If there is no evidence for a TF binding site and it has an effect on expression when mutated or deleted, but is not sufficient to drive a reporter gene, then we cannot assert that it is an enhancer or a TF binding site. Mark it as an anonymous 'regulatory_region'.
  • If a region has the properties of being both a TF binding site and an enhancer then mark it up as two Features, one a TF_binding_site and one an enhancer.
  • If a region is asserted to be a promoter region in the paper, lies within 200bp (or thereabouts?) of the 5' end of the target gene, and is necessary and sufficient to drive a reporter gene, mark it as a promoter. If in doubt, consider marking it as an enhancer.
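
The rules above can be read as a small decision procedure. The following Python sketch is one possible encoding, not an official WormBase tool: the `RegionEvidence` class, its field names, and the `feature_methods` function are hypothetical, and the 200bp window is exposed as a parameter because the rule itself leaves the exact distance open. A promoter claim that falls outside the window is treated here as the "in doubt" case and falls back to enhancer, which is only one reading of that rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RegionEvidence:
    """Evidence a paper reports for one region (hypothetical field names)."""
    binding_assay: bool = False       # mobility shift or similar experiment
    motif_similarity: bool = False    # recorded, but never counted as evidence
    drives_reporter: bool = False     # necessary and sufficient for a reporter
    activates: bool = False           # behaves as an enhancer
    represses: bool = False           # behaves as a silencer
    affects_expression: bool = False  # effect seen when mutated or deleted
    asserted_promoter: bool = False   # the paper calls it a promoter
    bp_from_gene_5prime: Optional[int] = None


def feature_methods(ev: RegionEvidence, promoter_window_bp: int = 200) -> List[str]:
    """Return the Method values to curate, one Feature per value."""
    methods: List[str] = []
    # Only experimental evidence counts; motif_similarity is deliberately ignored.
    if ev.binding_assay:
        methods.append("TF_binding_site")
    if ev.drives_reporter:
        near_gene = (ev.bp_from_gene_5prime is not None
                     and ev.bp_from_gene_5prime <= promoter_window_bp)
        if ev.asserted_promoter and near_gene:
            methods.append("promoter")
        else:
            # Assumption: a promoter claim outside the window is "in doubt".
            in_doubt = ev.asserted_promoter and not ev.represses
            if ev.activates or in_doubt:
                methods.append("enhancer")
            if ev.represses:
                methods.append("silencer")
    elif ev.affects_expression and not ev.binding_assay:
        # Has an effect, but no binding evidence and not sufficient on its own.
        methods.append("regulatory_region")
    return methods


# A region with binding-assay evidence that also represses a reporter.
example = RegionEvidence(binding_assay=True, drives_reporter=True, represses=True)
print(feature_methods(example))
```

Each returned value corresponds to a separate Feature object, so a region that is both a binding site and a silencer yields two entries.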


Example for sequence feature curation

The example is from WBPaper00003631.


Feature : "egl-1_temp_1.1"
Sequence VF23B12L
Mapping_target VF23B12L
Flanking_sequences cagctcaattattaaattttattgggtattgttta cataaaattctattgtcccagatttaggatacatcg
DNA_text CTCCTAACCGGGTGGTC
Description "This is a TRA-1 binding site that represses egl-1."
Remark "This is the TF_binding_site for TRA-1 which silences egl-1. 
N.B. a 'silencer' Feature has also been made at this location to aid expression and interaction curation
[2013-07-23 gw3]"
Associated_with_gene WBGene00001170 // egl-1
Bound_by_product_of WBGene00006604 // tra-1
Transcription_factor WBTranscriptionFactor000029 // tra-1
Method  TF_binding_site
SO_term "SO:0000235" // TF_binding_site
Defined_by_paper WBPaper00003631
Public_name "TRA-1 binding site"

Feature : "egl-1_temp_1.2"
Sequence VF23B12L
Mapping_target VF23B12L
Flanking_sequences cagctcaattattaaattttattgggtattgttta cataaaattctattgtcccagatttaggatacatcg
DNA_text CTCCTAACCGGGTGGTC
Description "This is the silencer of egl-1, containing a single TF_binding_site bound by TRA-1."
Remark "Made this 'silencer' feature in addition to the TRA-1 TF_binding_site Feature to aid expression 
and interaction curation [2013-07-23 gw3]"
Associated_with_gene WBGene00001170 // egl-1
Method  silencer
SO_term "SO:0000625" // silencer
Defined_by_paper WBPaper00003631
Public_name "TRA-1 binding site silencer"

Most Expr_pattern and Interaction objects will be attached to the 'enhancer/silencer' Features rather than to the TF_binding_site Features.
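
The two records above differ in Method, SO_term and free text but share the same Sequence, Flanking_sequences and DNA_text. As a minimal sketch of keeping those shared fields in sync, the Python below writes both .ace paragraphs from one set of coordinates. The function names `ace_feature` and `paired_features` are hypothetical, only the two SO terms that appear in the example are included, and tags such as Description, Remark, Bound_by_product_of, Transcription_factor and Public_name are left for the curator to add by hand.

```python
from typing import Tuple

# SO accessions copied from the example records above.
SO_TERMS = {
    "TF_binding_site": "SO:0000235",
    "silencer": "SO:0000625",
}


def ace_feature(name: str, method: str, sequence: str,
                flanks: Tuple[str, str], dna_text: str,
                gene_id: str, paper_id: str) -> str:
    """Format one Feature paragraph using tags seen in the example."""
    left, right = flanks
    return "\n".join([
        f'Feature : "{name}"',
        f"Sequence {sequence}",
        f"Mapping_target {sequence}",
        f"Flanking_sequences {left} {right}",
        f"DNA_text {dna_text}",
        f"Associated_with_gene {gene_id}",
        f"Method {method}",
        f'SO_term "{SO_TERMS[method]}"',
        f"Defined_by_paper {paper_id}",
    ])


def paired_features(base_name: str, region_method: str, **shared) -> str:
    """Emit a TF_binding_site Feature plus a second Feature at the same location."""
    site = ace_feature(f"{base_name}.1", "TF_binding_site", **shared)
    region = ace_feature(f"{base_name}.2", region_method, **shared)
    # .ace paragraphs are separated by a blank line.
    return site + "\n\n" + region + "\n"


print(paired_features(
    "egl-1_temp_1", "silencer",
    sequence="VF23B12L",
    flanks=("cagctcaattattaaattttattgggtattgttta",
            "cataaaattctattgtcccagatttaggatacatcg"),
    dna_text="CTCCTAACCGGGTGGTC",
    gene_id="WBGene00001170",
    paper_id="WBPaper00003631",
))
```

Generating both paragraphs from one call means a later correction to the flanking sequences cannot leave the two Features pointing at different locations.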

Link to Gene Regulation/Regulatory interaction

A regulatory interaction object is generated and added to the Sequence feature object. In the specific example:

//Associated_with_interaction WBInteraction000520178

Link to Expression pattern

When and how do we link sequence features to Expression Pattern objects?

current pipeline

  • [Pegl-1::gfp] transcriptional fusion from WBPaper00003631.
  • Curator creates an Expression object for egl-1 in the pharynx and links it to the Pegl-1::gfp transgene.
  • Sequence curator creates a sequence feature for that object. We are not there yet, but we should aim for it.
  • In the sequence feature object there will be a link to the expression pattern object, for example:
Expr_pattern : "Expr11092"
Anatomy_term	"WBbt:0004757" Certain //HSNR
Anatomy_term	"WBbt:0004758" Certain //HSNL
Anatomy_term	"WBbt:0007850" Certain //male
Gene	"WBGene00001170"//egl-1
Pattern	"The egl-1 gene appears to be expressed in the HSNs in males, in which the HSNs normally undergo 
programmed cell death, but not in hermaphrodites, in which the HSNs normally survive."
Reference	"WBPaper00003631"
Reporter_gene	"[Pegl-1::gfp] transcriptional fusion. To construct Pegl-1::gfp, bases +174 to +5820 (5'-3') 
downstream of the stop codon of the egl-1 gene and bases -1914 to -837 (5'-3') upstream of the stop codon were
amplified with appropriate primers and cloned into the SpeI-ApaI (5'-3') and PstI-BamHI (5'-3') sites of 
vector pPD95.69, respectively (A. Fire et al., personal communication). --precise ends."

proposed pipeline

  • Immunohistochemistry and in situ hybridization
  • GFP reporter fusions


As of now, few Expression Patterns are linked to the Genome Browser; the Vancouver set is the only such data set. The ultimate goal is to map expression constructs to the genome browser whenever we can.