Sequence Feature
From WormBaseWiki
Rules for marking up regions
- If a region is necessary and sufficient to drive a reporter gene, then mark it as an 'enhancer' or 'silencer'.
- If a region is both an enhancer and a silencer, then it should have the SO_term tags for both of these.
- If mobility shift experiments or similar experimental evidence is available to assert that a short region is a TF binding site, then mark it as a TF_binding_site.
- Similarity to a known binding motif is not evidence of being a TF_binding_site.
- If there is no evidence that a region is a TF binding site, and mutating or deleting it affects expression but it is not sufficient to drive a reporter gene, then we cannot assert that it is an enhancer or a TF binding site. Mark it as an anonymous 'regulatory_region'.
- If a region has the properties of being both a TF binding site and an enhancer then mark it up as two Features, one a TF_binding_site and one an enhancer.
- If a region is asserted to be a promoter region in the paper, and it is within 200 bp (or thereabouts?) of the 5' end of the target gene, and it is necessary and sufficient to drive a reporter gene, mark it as a promoter. If in doubt, consider marking it as an enhancer.
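The rules above can be read as a small decision procedure. The following Python sketch is one possible encoding, not part of any WormBase tooling: the `RegionEvidence` class, its field names, and `classify_region` are hypothetical, and the 200 bp promoter window is taken from the approximate figure in the last rule. Each inner list is one Feature to create, holding the SO term name(s) it should carry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RegionEvidence:
    """Evidence reported for one region (hypothetical field names)."""
    drives_reporter: bool = False        # necessary and sufficient to drive a reporter gene
    activates: bool = False              # behaves as an enhancer
    represses: bool = False              # behaves as a silencer
    binding_experiment: bool = False     # e.g. mobility shift evidence
    motif_similarity: bool = False       # similarity to a known motif: not evidence, ignored
    affects_expression: bool = False     # effect on expression when mutated or deleted
    asserted_promoter: bool = False      # the paper calls it a promoter
    distance_to_gene_5prime_bp: Optional[int] = None


def classify_region(ev: RegionEvidence, promoter_window_bp: int = 200) -> List[List[str]]:
    """Return one list of SO term names per Feature that should be created."""
    features: List[List[str]] = []

    # Experimental binding evidence gives a TF_binding_site Feature of its own.
    if ev.binding_experiment:
        features.append(["TF_binding_site"])

    if ev.drives_reporter:
        near_gene = (
            ev.distance_to_gene_5prime_bp is not None
            and ev.distance_to_gene_5prime_bp <= promoter_window_bp
        )
        if ev.asserted_promoter and near_gene:
            features.append(["promoter"])
        else:
            # A region that is both enhancer and silencer is ONE Feature
            # carrying both SO terms.
            so_terms = []
            if ev.activates:
                so_terms.append("enhancer")
            if ev.represses:
                so_terms.append("silencer")
            if so_terms:
                features.append(so_terms)
    elif ev.affects_expression and not ev.binding_experiment:
        # Affects expression, but neither sufficient nor a proven binding site.
        features.append(["regulatory_region"])

    return features


# The TRA-1 site in egl-1: binding evidence plus silencer activity.
tra1_site = RegionEvidence(binding_experiment=True, drives_reporter=True, represses=True)
print(classify_region(tra1_site))  # [['TF_binding_site'], ['silencer']]
```

Note that `motif_similarity` is deliberately never consulted, mirroring the rule that similarity to a known binding motif is not evidence of a TF_binding_site.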
Example for sequence feature curation
The example is from WBPaper00003631.
Feature : "egl-1_temp_1.1"
Sequence VF23B12L
Mapping_target VF23B12L
Flanking_sequences cagctcaattattaaattttattgggtattgttta cataaaattctattgtcccagatttaggatacatcg
DNA_text CTCCTAACCGGGTGGTC
Description "This is a TRA-1 binding site that represses egl-1."
Remark "This is the TF_binding_site for TRA-1 which silences egl-1. N.B. a 'silencer' Feature has also been made at this location to aid expression and interaction curation [2013-07-23 gw3]"
Associated_with_gene WBGene00001170 // egl-1
Bound_by_product_of WBGene00006604 // tra-1
Transcription_factor WBTranscriptionFactor000029 // tra-1
Method TF_binding_site
SO_term "SO:0000235" // TF_binding_site
Defined_by_paper WBPaper00003631
Public_name "TRA-1 binding site"

Feature : "egl-1_temp_1.2"
Sequence VF23B12L
Mapping_target VF23B12L
Flanking_sequences cagctcaattattaaattttattgggtattgttta cataaaattctattgtcccagatttaggatacatcg
DNA_text CTCCTAACCGGGTGGTC
Description "This is the silencer of egl-1, containing a single TF_binding_site bound by TRA-1."
Remark "Made this 'silencer' feature in addition to the TRA-1 TF_binding_site Feature to aid expression and interaction curation [2013-07-23 gw3]"
Associated_with_gene WBGene00001170 // egl-1
Method silencer
SO_term "SO:0000625" // silencer
Defined_by_paper WBPaper00003631
Public_name "TRA-1 binding site silencer"
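To check a record like the ones above before loading it, a curator could read it into a simple tag/value structure. The sketch below is a minimal illustration only, assuming the record is laid out one tag per line with optional trailing `//` comments; `strip_comment` and `parse_ace_object` are hypothetical helper names, not part of AceDB or WormBase software, and the sample is an abbreviated copy of the first Feature.

```python
from typing import Dict, List, Tuple

# Abbreviated copy of the egl-1_temp_1.1 record, one tag per line.
SAMPLE = '''
Feature : "egl-1_temp_1.1"
Sequence VF23B12L
DNA_text CTCCTAACCGGGTGGTC
Associated_with_gene WBGene00001170 // egl-1
Bound_by_product_of WBGene00006604 // tra-1
Method TF_binding_site
SO_term "SO:0000235" // TF_binding_site
Defined_by_paper WBPaper00003631
'''


def strip_comment(line: str) -> str:
    """Remove a trailing // comment, ignoring any // inside double quotes."""
    in_quote = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quote = not in_quote
        elif not in_quote and line.startswith("//", i):
            return line[:i].rstrip()
    return line.rstrip()


def parse_ace_object(text: str) -> Tuple[str, str, Dict[str, List[str]]]:
    """Parse one object into (class, name, {tag: [values]})."""
    lines = [strip_comment(ln).strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln]

    # Header line looks like: Feature : "egl-1_temp_1.1"
    obj_class, _, obj_name = lines[0].partition(":")

    tags: Dict[str, List[str]] = {}
    for line in lines[1:]:
        tag, _, value = line.partition(" ")
        # A tag may repeat, so collect values in a list.
        tags.setdefault(tag, []).append(value.strip().strip('"'))
    return obj_class.strip(), obj_name.strip().strip('"'), tags


obj_class, obj_name, tags = parse_ace_object(SAMPLE)
print(obj_class, obj_name, tags["Method"], tags["SO_term"])
```

Values are kept as plain strings; multi-part values such as the two Flanking_sequences would need to be split further by the caller.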
Most Expr_pattern and Interaction objects will be attached to the 'enhancer/silencer' Features rather than to the TF_binding_site Features.
Link to Gene Regulation/Regulatory interaction
A regulatory interaction object is generated and added to the Sequence feature object. In the specific example:
//Associated_with_interaction WBInteraction000520178
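As an illustration of how such a link line might be generated, the Python sketch below builds a small .ace fragment that attaches a tag to an existing Feature. The helper `ace_link_patch` is hypothetical. The text does not state which of the two Features carries the interaction link; `egl-1_temp_1.2` is used here purely as an assumption, following the note that links usually go on the enhancer/silencer Feature.

```python
from typing import Optional


def ace_link_patch(feature_name: str, tag: str, target: str,
                   comment: Optional[str] = None) -> str:
    """Build a two-line .ace fragment adding one tag to an existing Feature."""
    line = f"{tag} {target}"
    if comment:
        line += f" // {comment}"
    return f'Feature : "{feature_name}"\n{line}\n'


# Feature name is an assumption; the interaction ID is the one quoted above.
patch = ace_link_patch(
    "egl-1_temp_1.2",
    "Associated_with_interaction",
    "WBInteraction000520178",
)
print(patch)
```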
Link to Expression pattern
When and how do we link sequence features to Expression Pattern objects?
Current pipeline
- The authors generated a transcriptional fusion. In a hypothetical example, the promoter of egl-1 (~2 kb upstream of the ATG) was fused to GFP and expression was observed in the pharynx.
- The curator creates an Expression object for egl-1 in the pharynx and links it to the pegl-1::GFP transgene.
- The sequence curator creates a sequence feature for that object. We are not there yet, but we should aim for it.
- The sequence feature object will contain a link to the Expression object.
Expr_pattern : "Expr11092"
Anatomy_term "WBbt:0004757" Certain //HSNR
Anatomy_term "WBbt:0004758" Certain //HSNL
Anatomy_term "WBbt:0007850" Certain //male
Gene "WBGene00001170" //egl-1
Pattern "The egl-1 gene appears to be expressed in the HSNs in males, in which the HSNs normally undergo programmed cell death, but not in hermaphrodites, in which the HSNs normally survive."
Reference "WBPaper00003631"
Reporter_gene "[Pegl-1::gfp] transcriptional fusion. To construct Pegl-1::gfp, bases +174 to +5820 (5'-3') downstream of the stop codon of the egl-1 gene and bases -1914 to -837 (5'-3') upstream of the stop codon were amplified with appropriate primers and cloned into the SpeI-ApaI (5'-3') and PstI-BamHI (5'-3') sites of vector pPD95.69, respectively (A. Fire et al., personal communication). --precise ends."
Proposed pipeline
- Immunohistochemistry and in situ hybridization
- GFP reporter fusions
As of now, few Expression Patterns are linked to the Genome Browser (the Vancouver set is the only such data set). The ultimate goal is to map expression constructs to the Genome Browser whenever we can.