Difference between revisions of "Construct"

From WormBaseWiki
(20 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
===Overview===
 
===Overview===
 
A construct is defined as any single contiguous stretch of engineered DNA sequence. Constructs will be curated as objects used in the generation of transgenes, engineered alleles, and expression markers used in analysis of expression patterns, gene regulation, and biological topics.
 
A construct is defined as any single contiguous stretch of engineered DNA sequence. Constructs will be curated as objects used in the generation of transgenes, engineered alleles, and expression markers used in analysis of expression patterns, gene regulation, and biological topics.
 
  
 
===Model===
 
===Model===
 
The official model page is [[WormBase_Model:Construct | here]] along with compensatory changes to related models and test data.
 
The official model page is [[WormBase_Model:Construct | here]] along with compensatory changes to related models and test data.
Constructs will be curation for the benefit of a number of other data classes: Variation, Transgene, Gene Regulation, Topic, and Expression Pattern.  
+
Constructs will be curated for the benefit of a number of other data classes: Variation, Transgene, Gene Regulation, Topic, and Expression Pattern.
  
 
===Construct OA===
 
===Construct OA===
Line 13: Line 12:
 
* assign all transgene objects a unique WBConstruct ID.
 
* assign all transgene objects a unique WBConstruct ID.
 
* move/rename/rededicate(?) many of the trp tables to be construct tables
 
* move/rename/rededicate(?) many of the trp tables to be construct tables
* reorganize the transgene OA to include construct curation - don't know if this is possible as both constructs and transgenes need to be assigned unique ids also any given transgene could have multiple constructs and any given construct could be used by multiple transgenes
+
* reorganize the transgene OA to include construct curation  
* add functionality to exprpat, genereg, and topic OAs to enter construct requests
+
* add functionality to exprpat, genereg, int and topic OAs to enter construct requests
 +
 
 +
===Connecting constructs to Transgenic Alleles===
 +
''NOTE: Transgenes are termed Transgenic Alleles by the Alliance''<br>
 +
WB has both construct and transgenic alleles. The Alliance Expression Pattern schema only allows Transgenic Alleles to be attached to a pattern; WB **constructs without an associated transgene** will therefore not be captured by the Alliance.  To simplify expression schema changes we will **give transgene IDs to all constructs that are not already associated with a transgene**, used in Expression
 +
 
 +
The Construct OA should allow the minting of WBTransgene IDs from within the OA based on an individual construct. <br>
 +
 
 +
The workflow would be:
 +
*Curator creates a Construct
 +
*Curator indicates (through a button or field value?) that a transgene ID is needed for that construct (a cron job to be triggered by flag?)
 +
*Postgres creates a transgene pgid, and populates trp tables as follows
 +
**cns_summary copied to trp_summary
 +
**cns_name copied to trp_construct
 +
**cns_paper copied to trp_paper
 +
**cns_curator copied to trp_curator
 +
**trp_name copied to cns_usedfortransgene
  
==Postgres script to populate cns tables===
+
 
Construct tables = cns_<br>
+
For the Expression OA (this has also been added to the [https://wiki.wormbase.org/index.php/Expression_Pattern#Populating_exp_transgene_based_on_exp_construct  Expression wiki])<br>
 +
For back population of trp tables with existing constructs in the expression tables for which there is no associated transgene:
 +
 
 +
* find all exp_constructs that are not associated with a trp_publicname
 +
* create transgene IDs for each exp_construct following the data porting as laid out below
 +
**Postgres creates a transgene pgid, and populates trp tables as follows
 +
***cns_summary copied to trp_summary
 +
***cns_name copied to trp_construct
 +
***cns_paper copied to trp_paper
 +
***cns_curator copied to trp_curator
 +
***trp_name copied to cns_usedfortransgene
 +
***trp_name copied to exp_transgene
 +
 
 +
- after  data clean up will need to suppress data [not dump] in the exp_construct <br>
 +
See <br>
 +
 
 +
 
 +
 
 +
*original notes 1/27/21
 +
Ask Juancarlos:
 +
- how many constructs used in expression do not have a transgene?
 +
- need to create transgene objects in bulk for that
 +
- need to remove the construct from the construct field and add the corresponding transgene in the transgene field
 +
- for curation purposes need to set up a mechanism to flag constructs that need transgeneID. A cronjob will create a transgene ID to attach to the
 +
- after data clean up will need to suppress the construct field on Expression OA
 +
 
 +
===Adding new Reporter types===
 +
the cns_reporter obo_ tables don't seem to get often updated (looking at timestamp values).
 +
When there is a new reporter just add it directly to the postgres test db through the command line (example for mNeonGreen):
 +
 
 +
* INSERT INTO obo_name_cnsreporter VALUES ('mNeonGreen', 'mNeonGreen');
 +
* INSERT INTO obo_data_cnsreporter VALUES ('mNeonGreen', 'id: mNeonGreen');<br>
 +
Will look like this
 +
 
 +
  > psql testdb
 +
  [testdb=> INSERT INTO obo_name_cnsreporter VALUES ('mNeonGreen','mNeonGreen');
 +
  [INSERT 0 1
 +
  [testdb=> INSERT INTO obo_data_cnsreporter VALUES ('mNeonGreen','id:mNeonGreen');
 +
  [INSERT 0 1
 +
  \q
 +
 
 +
 
 +
To enter multiple values at once :
 +
*INSERT INTO obo_name_cnsreporter VALUES ('mNeptune2.1', 'mNeptune2.1'), ('mNeptune2.2', 'mNeptune2.2'), ('mNeptune2.3', 'mNeptune2.3'), ('mNeptune2.4', 'mNeptune2.4');
 +
*INSERT INTO obo_data_cnsreporter VALUES ('mNeptune2.1', 'id:mNeptune2.1'), ('mNeptune2.2', 'id:mNeptune2.2'), ('mNeptune2.3', 'id:mNeptune2.3'), ('mNeptune2.4', 'id:mNeptune2.4');
 +
 
 +
If you think there is a mistake, query for it with:
 +
*SELECT * FROM obo_name_cnsreporter WHERE joinkey = 'mNepture2.5';<br>
 +
When you see that it's the only thing you want to delete, you can
 +
up-arrow and replace the SELECT *  with a  DELETE
 +
*DELETE FROM obo_name_cnsreporter WHERE joinkey = 'mNepture2.5';
 +
 
 +
===Postgres script to populate cns tables===
 +
Construct tables = cns_:<br>
 
/home/postgres/work/pgpopulation/cns_construct/20140605_newOa<br>
 
/home/postgres/work/pgpopulation/cns_construct/20140605_newOa<br>
transfer_trp_to_cst.pl<br>
+
script for data population of cns_ tables from trp_tables: transfer_trp_to_cst.pl<br>
  
 
Summary of transfer instructions:  
 
Summary of transfer instructions:  
Line 27: Line 95:
 
| align="center" style="background:#f0f0f0;"|'''OA label'''
 
| align="center" style="background:#f0f0f0;"|'''OA label'''
 
| align="center" style="background:#f0f0f0;"|'''data transfer'''
 
| align="center" style="background:#f0f0f0;"|'''data transfer'''
|-
 
|cns_id||pgid||
 
|-
 
|cns_timestamp||not a table||
 
 
|-
 
|-
 
|cns_curator||Curator||one-time copy from trp_curator
 
|cns_curator||Curator||one-time copy from trp_curator
Line 40: Line 104:
 
|cns_name||Name||WBCnstr: $cnstid = &pad8Zeros($newPgid)//assigned WBConstructID
 
|cns_name||Name||WBCnstr: $cnstid = &pad8Zeros($newPgid)//assigned WBConstructID
 
|-
 
|-
|cns_publicname||Public Name||extract Clone = /Clone: values from trp_constructionsummary and use to populated this field
+
|cns_publicname||Public Name||/Clone: values from trp_constructionsummary
 
|-
 
|-
|cns_othername||Other Name||
+
|cns_othername||Other Name||transfer Expr from trp_publicname or trp_synonym <br>In all cases where "Expr" is in trp_publicname, delete object from trp tables. <br>In cases where Expr is in trp_synonym and trp_publicname is blank, delete object from trp tables.
 
|-
 
|-
|cns_newtransgene||New Transgene||for production-  will send a daily alert to the transgene curator based on new entries.
+
|'''hide table'''  cns_newtransgene||New Transgene||for production
 
|-
 
|-
 
|cns_summary||Summary||one time copy from trp_summary
 
|cns_summary||Summary||one time copy from trp_summary
|-
 
|cns_merge||Merge||
 
 
|-
 
|-
 
|cns_drivenbygene||Driven By Gene||transfer all values from trp_driven_by_gene<br>delete trp_driven_by_gene
 
|cns_drivenbygene||Driven By Gene||transfer all values from trp_driven_by_gene<br>delete trp_driven_by_gene
Line 64: Line 126:
 
|cns_constructtype||ConstructType||transfer all values from reporter_type: Chimera, Domain_swap,Engineered mutation, Fusion, Complex (e.g., GFP fusion plus point mutations), Transcriptional_fusion, Translational_fusion, N-terminal_translational_fusion, C-terminal_translational_fusion, Internal_coding_fusion<br>transfer all values from trp_reporter_type <br>delete trp_reporter_type
 
|cns_constructtype||ConstructType||transfer all values from reporter_type: Chimera, Domain_swap,Engineered mutation, Fusion, Complex (e.g., GFP fusion plus point mutations), Transcriptional_fusion, Translational_fusion, N-terminal_translational_fusion, C-terminal_translational_fusion, Internal_coding_fusion<br>transfer all values from trp_reporter_type <br>delete trp_reporter_type
 
|-
 
|-
|cns_selectionmarker||SelectionMarker|| (for elements stitched into contiguous sequence, coinjected elements will get their own construct ID, these will be joined together to create transgene within the Transgene OA)
+
|cns_selectionmarker||SelectionMarker||
|-
 
|cns_threeutr||3 UTR|| create cns_threeutr table then move data from trp_threeutr <br>delete after move
 
|-
 
|cns_constructionsummary||ConstructionSummary||transfer values trp_driven_by_construct <br>delete trp_driven_by_construct; <br>transfer trp_rconstructionsummary for all trp_publicname lines that are not "Is" or are blank
 
|-
 
|cns_fwdprimer||FWDprimer||for mapping to genome, can include entire construct sequence
 
|-
 
|cns_revprimer||REVprimer||for mapping to genome, can include entire construct sequence
 
|-
 
|cns_dna||DNAText||for mapping to genome, can include entire construct sequence
 
 
|-
 
|-
|cns_feature||Feature||See [[Sequence_Feature | sequence feature wiki]], implementation would be like that for expression pattern and gene regulation
+
|cns_threeutr||3 UTR|| create cns_threeutr table, transfer from trp_threeutr <br>delete trp_threeutr
 
|-
 
|-
|cns_proposedfeature||ProposedSeqFeature||where trp_threeutr had a value, populate with "3'UTR"
+
|cns_constructionsummary||Construction Details||transfer values in trp_driven_by_construct <br>delete trp_driven_by_construct; <br>transfer trp_constructionsummary for all trp_publicname lines that are not "Is" or are blank; transfer all associated Expr in trp_publicname or trp_synonym to cns_othername
 
|-
 
|-
|cns_genewithfeature||FeatureGene||transfer from trp_threeutr
+
|cns_clone|| Clone|| copy from trp_clone NOTE: might have been deemed empty and deleted already, if not, delete trp_clone table
|-
 
|cns_clone||Clone||
 
 
|-
 
|-
 
|cns_laboratory||Laboratory||copy from trp_laboratory
 
|cns_laboratory||Laboratory||copy from trp_laboratory
Line 89: Line 139:
 
|-
 
|-
 
|}
 
|}
 +
 +
==Going live to do list==
 +
Make new tables and run scripts :
 +
 +
*create exp_construct + exp_variation tables for expression pattern :
 +
/home/postgres/work/pgpopulation/exp_exprpattern/20140606_exp_construct_variation/create_datatype_tables.pl
 +
 +
*create grg_construct table for gene regulation :
 +
/home/postgres/work/pgpopulation/grg_generegulation/20140606_grg_construct/create_datatype_tables.pl
 +
 +
*create int_construct table for interaction :
 +
/home/postgres/work/pgpopulation/interaction/20140710_int_construct/create_datatype_tables.pl
 +
 +
*create pro_construct table for process :
 +
/home/postgres/work/pgpopulation/pro_process/20140708_pro_construct/create_datatype_tables.pl
 +
 +
*create trp_variation trp_construct trp_coinjectionconstruct trp_integratedfrom tables for transgene :
 +
/home/postgres/work/pgpopulation/transgene/20140605_construct_tables/create_transgene_construct_tables.pl
 +
 +
*create all cns_ tables for construct objects :
 +
/home/postgres/work/pgpopulation/cns_construct/20140605_newOa/create_construct_tables.pl
 +
 +
 +
*transfer construct data from transgene trp_ to construct cns_ :
 +
/home/postgres/work/pgpopulation/cns_construct/20140605_newOa/transfer_trp_to_cns.pl
 +
 +
*populate construct field of exp_ grg_ and int_ tables based on transgene to construct mappings :
 +
/home/postgres/work/pgpopulation/cns_construct/20140710_transfer_construct_exp_grg_int/transfer_construct.pl
 +
 +
 +
 +
Sync dumpers at /home/postgres/work/citace_upload/
 +
*process/get_process_curation_ace.pm
 +
*transgene/get_transgene_ace.pm
 +
*expr_pattern/get_expr_pattern_ace.pm
 +
*interaction/get_interaction_ace.pm
 +
*gene_regulation/get_gene_regulation_ace.pm
 +
*whole new directory cns_construct/
 +
 +
Sync OA
 +
 +
==Dealing with duplicate lines==
 +
All Construct IDS should match PGIDs<br>
 +
Scripts in place to find nonmatching IDs<br>
 +
*Find lines where cns_name does not match PGID<br>
 +
 +
*script to assess what will be overwritten when data for cns_names are moved to the corresponding PGID<br>
 +
/home/postgres/work/pgpopulation/transgene/20200627_fix_joinkey_duplicate

Latest revision as of 17:05, 28 May 2021

This page documents the creation of the construct class and all tools needed for its curation.

Overview

A construct is defined as any single contiguous stretch of engineered DNA sequence. Constructs will be curated as objects used in the generation of transgenes, engineered alleles, and expression markers used in analysis of expression patterns, gene regulation, and biological topics.

Model

The official model page is here along with compensatory changes to related models and test data. Constructs will be curated for the benefit of a number of other data classes: Variation, Transgene, Gene Regulation, Topic, and Expression Pattern.

Construct OA

  • create construct tables (see the tables on the OA tables page)
  • assign all transgene objects a unique WBConstruct ID.
  • move/rename/rededicate(?) many of the trp tables to be construct tables
  • reorganize the transgene OA to include construct curation
  • add functionality to exprpat, genereg, int and topic OAs to enter construct requests

Connecting constructs to Transgenic Alleles

NOTE: Transgenes are termed Transgenic Alleles by the Alliance
WB has both constructs and transgenic alleles. The Alliance Expression Pattern schema only allows Transgenic Alleles to be attached to a pattern; WB **constructs without an associated transgene** will therefore not be captured by the Alliance. To simplify expression schema changes, we will **give transgene IDs to all constructs used in Expression that are not already associated with a transgene**.

The Construct OA should allow the minting of WBTransgene IDs from within the OA based on an individual construct.

The workflow would be:

  • Curator creates a Construct
  • Curator indicates (through a button or field value?) that a transgene ID is needed for that construct (a cron job to be triggered by flag?)
  • Postgres creates a transgene pgid, and populates trp tables as follows
    • cns_summary copied to trp_summary
    • cns_name copied to trp_construct
    • cns_paper copied to trp_paper
    • cns_curator copied to trp_curator
    • trp_name copied to cns_usedfortransgene
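
The copy step in the workflow above can be sketched as follows. This is only an illustration, not the production cron job: it uses Python's sqlite3 as a stand-in for Postgres, assumes every table has just two columns (joinkey, value) with joinkey holding the pgid, assumes the new transgene pgid is the current maximum plus one, and assumes a transgene ID format of "WBTransgene" plus eight zero-padded digits. The function names are hypothetical.

```python
import sqlite3

# Hypothetical simplified schema: every table is (joinkey, value), joinkey = pgid.
TABLES = [
    "cns_name", "cns_summary", "cns_paper", "cns_curator", "cns_usedfortransgene",
    "trp_name", "trp_summary", "trp_construct", "trp_paper", "trp_curator",
]

# Source cns_ table -> destination trp_ table, as listed in the workflow.
COPY_MAP = {
    "cns_summary": "trp_summary",
    "cns_name": "trp_construct",
    "cns_paper": "trp_paper",
    "cns_curator": "trp_curator",
}


def create_tables(conn):
    """Create the stand-in tables (assumed layout, not the real OA schema)."""
    for table in TABLES:
        conn.execute(f"CREATE TABLE {table} (joinkey TEXT, value TEXT)")


def mint_transgene(conn, cns_pgid):
    """Create transgene rows for one construct and return the new transgene ID."""
    # Assumption: the next transgene pgid is max(existing pgid) + 1.
    row = conn.execute(
        "SELECT COALESCE(MAX(CAST(joinkey AS INTEGER)), 0) + 1 FROM trp_name"
    ).fetchone()
    trp_pgid = str(row[0])
    trp_id = f"WBTransgene{int(trp_pgid):08d}"  # assumed ID format
    conn.execute("INSERT INTO trp_name VALUES (?, ?)", (trp_pgid, trp_id))

    # Copy each construct field to its transgene counterpart.
    for src, dst in COPY_MAP.items():
        conn.execute(
            f"INSERT INTO {dst} SELECT ?, value FROM {src} WHERE joinkey = ?",
            (trp_pgid, cns_pgid),
        )

    # trp_name is copied back onto the construct as cns_usedfortransgene.
    conn.execute(
        "INSERT INTO cns_usedfortransgene VALUES (?, ?)", (cns_pgid, trp_id)
    )
    conn.commit()
    return trp_id
```

Fields with no value on the construct (for example, no cns_paper row) are simply not copied, so the new transgene gets no row in the corresponding trp table.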


For the Expression OA (this has also been added to the Expression wiki):
To back-populate the trp tables with existing constructs in the expression tables that have no associated transgene:

  • find all exp_constructs that are not associated with a trp_publicname
  • create transgene IDs for each exp_construct following the data porting as laid out below
    • Postgres creates a transgene pgid, and populates trp tables as follows
      • cns_summary copied to trp_summary
      • cns_name copied to trp_construct
      • cns_paper copied to trp_paper
      • cns_curator copied to trp_curator
      • trp_name copied to cns_usedfortransgene
      • trp_name copied to exp_transgene
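
The first step of the back population (finding constructs used in expression that have no transgene) might look like the sketch below. It again uses sqlite3 as a stand-in for Postgres and assumes a simplified (joinkey, value) layout with one construct ID per exp_construct row. It also assumes that "not associated with a trp_publicname" means no transgene pgid both lists the construct in trp_construct and has a trp_publicname; that reading, and the function names, are assumptions.

```python
import sqlite3


def create_tables(conn):
    """Create stand-in tables (assumed layout, not the real OA schema)."""
    for table in ("exp_construct", "trp_construct", "trp_publicname"):
        conn.execute(f"CREATE TABLE {table} (joinkey TEXT, value TEXT)")


def constructs_needing_transgene(conn):
    """Return construct IDs in exp_construct with no named transgene attached."""
    rows = conn.execute(
        """
        SELECT DISTINCT e.value
        FROM exp_construct AS e
        WHERE NOT EXISTS (
            SELECT 1
            FROM trp_construct AS t
            JOIN trp_publicname AS p ON p.joinkey = t.joinkey
            WHERE t.value = e.value
        )
        ORDER BY e.value
        """
    ).fetchall()
    return [row[0] for row in rows]
```

Each ID returned would then go through the same cns-to-trp copy described above, with the new trp_name also written to exp_transgene.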

After data clean-up, the data in exp_construct will need to be suppressed (not dumped).


Original notes, 1/27/21

Ask Juancarlos:

  • how many constructs used in expression do not have a transgene?
  • need to create transgene objects in bulk for that
  • need to remove the construct from the construct field and add the corresponding transgene in the transgene field
  • for curation purposes need to set up a mechanism to flag constructs that need transgeneID. A cronjob will create a transgene ID to attach to the
  • after data clean up will need to suppress the construct field on Expression OA

Adding new Reporter types

The cns_reporter obo_ tables do not seem to be updated often (judging by their timestamp values). When there is a new reporter, add it directly to the postgres test db through the command line (example for mNeonGreen):

  • INSERT INTO obo_name_cnsreporter VALUES ('mNeonGreen', 'mNeonGreen');
  • INSERT INTO obo_data_cnsreporter VALUES ('mNeonGreen', 'id: mNeonGreen');

The session will look like this:

 > psql testdb
 testdb=> INSERT INTO obo_name_cnsreporter VALUES ('mNeonGreen','mNeonGreen');
 INSERT 0 1
 testdb=> INSERT INTO obo_data_cnsreporter VALUES ('mNeonGreen','id:mNeonGreen');
 INSERT 0 1
 \q


To enter multiple values at once:

  • INSERT INTO obo_name_cnsreporter VALUES ('mNeptune2.1', 'mNeptune2.1'), ('mNeptune2.2', 'mNeptune2.2'), ('mNeptune2.3', 'mNeptune2.3'), ('mNeptune2.4', 'mNeptune2.4');
  • INSERT INTO obo_data_cnsreporter VALUES ('mNeptune2.1', 'id:mNeptune2.1'), ('mNeptune2.2', 'id:mNeptune2.2'), ('mNeptune2.3', 'id:mNeptune2.3'), ('mNeptune2.4', 'id:mNeptune2.4');

If you think there is a mistake, query for it with:

  • SELECT * FROM obo_name_cnsreporter WHERE joinkey = 'mNepture2.5';

Once you have confirmed that the query returns only the row you want to delete, press the up-arrow to recall the command and replace SELECT * with DELETE:

  • DELETE FROM obo_name_cnsreporter WHERE joinkey = 'mNepture2.5';
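
The same add, check, and delete steps can be scripted. The sketch below is an illustration only: it uses sqlite3 as a stand-in for the Postgres test db, the second column names (name, data) are hypothetical, and deleting from obo_data_cnsreporter as well as obo_name_cnsreporter is an assumption (the commands above only show the name table).

```python
import sqlite3

OBO_TABLES = ("obo_name_cnsreporter", "obo_data_cnsreporter")


def create_obo_tables(conn):
    """Create stand-in obo tables; column names other than joinkey are assumed."""
    conn.execute("CREATE TABLE obo_name_cnsreporter (joinkey TEXT, name TEXT)")
    conn.execute("CREATE TABLE obo_data_cnsreporter (joinkey TEXT, data TEXT)")


def add_reporters(conn, names):
    """Insert each reporter into both obo tables, mirroring the INSERTs above."""
    conn.executemany(
        "INSERT INTO obo_name_cnsreporter VALUES (?, ?)", [(n, n) for n in names]
    )
    conn.executemany(
        "INSERT INTO obo_data_cnsreporter VALUES (?, ?)",
        [(n, f"id:{n}") for n in names],
    )
    conn.commit()


def delete_reporter(conn, name):
    """Delete a reporter only if the SELECT finds exactly one matching row."""
    matches = conn.execute(
        "SELECT * FROM obo_name_cnsreporter WHERE joinkey = ?", (name,)
    ).fetchall()
    if len(matches) != 1:
        return False  # nothing, or more than expected: do not delete
    for table in OBO_TABLES:  # assumption: both tables are cleaned up
        conn.execute(f"DELETE FROM {table} WHERE joinkey = ?", (name,))
    conn.commit()
    return True
```

Checking the SELECT result before running the DELETE is the scripted equivalent of looking at the query output before pressing up-arrow.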

Postgres script to populate cns tables

Construct tables = cns_:
/home/postgres/work/pgpopulation/cns_construct/20140605_newOa
script for data population of cns_ tables from trp_tables: transfer_trp_to_cst.pl

Summary of transfer instructions:

{|
| align="center" style="background:#f0f0f0;"|'''table'''
| align="center" style="background:#f0f0f0;"|'''OA label'''
| align="center" style="background:#f0f0f0;"|'''data transfer'''
|-
|cns_curator||Curator||one-time copy from trp_curator
|-
|cns_paper||Paper||one-time copy from trp_paper
|-
|cns_person||Person||one-time copy from trp_person
|-
|cns_name||Name||WBCnstr: $cnstid = &pad8Zeros($newPgid)//assigned WBConstructID
|-
|cns_publicname||Public Name||/Clone: values from trp_constructionsummary
|-
|cns_othername||Other Name||transfer Expr from trp_publicname or trp_synonym <br>In all cases where "Expr" is in trp_publicname, delete object from trp tables. <br>In cases where Expr is in trp_synonym and trp_publicname is blank, delete object from trp tables.
|-
|'''hide table''' cns_newtransgene||New Transgene||for production
|-
|cns_summary||Summary||one time copy from trp_summary
|-
|cns_drivenbygene||Driven By Gene||transfer all values from trp_driven_by_gene<br>delete trp_driven_by_gene
|-
|cns_gene||Gene||transfer all values from trp_gene<br>delete trp_gene
|-
|cns_reporter||Reporter||transfer all values from trp_reporter_product except when the value is equal to a purification_tag value<br>reporter values are: GFP, GFP(S65C), EGFP, pGFP(photoactivated GFP), YFP, EYFP, BFP, CFP, Cerulian, RFP, mRFP, tagRFP, mCherry, wCherry, tdTomato, mStrawberry, DsRed, DsRed2, Venus, YC2.1 (yellow cameleon), YC12.12 (yellow cameleon),YC3.60 (yellow cameleon), Yellow cameleon, Dendra, Dendra2, tdimer2(12)/dimer2, GCaMP, mkate2, Luciferase, LacI, LacO, LacZ<br>delete trp_reporter
|-
|cns_otherreporter||OtherReporter||transfer all values from trp_other_reporter<br>delete trp_other_reporter
|-
|cns_purificationtag||PurificationTag||Transfer all values from trp_reporter_product that equal any of the following values: His-tag, FLAG, HA-tag, MYC/c-myc, Stag, Histone H2B
|-
|cns_recombinationsite||RecombinationSite||LoxP, FRT
|-
|cns_constructtype||ConstructType||transfer all values from reporter_type: Chimera, Domain_swap,Engineered mutation, Fusion, Complex (e.g., GFP fusion plus point mutations), Transcriptional_fusion, Translational_fusion, N-terminal_translational_fusion, C-terminal_translational_fusion, Internal_coding_fusion<br>transfer all values from trp_reporter_type <br>delete trp_reporter_type
|-
|cns_selectionmarker||SelectionMarker||
|-
|cns_threeutr||3 UTR||create cns_threeutr table, transfer from trp_threeutr <br>delete trp_threeutr
|-
|cns_constructionsummary||Construction Details||transfer values in trp_driven_by_construct <br>delete trp_driven_by_construct; <br>transfer trp_constructionsummary for all trp_publicname lines that are not "Is" or are blank; transfer all associated Expr in trp_publicname or trp_synonym to cns_othername
|-
|cns_clone||Clone||copy from trp_clone NOTE: might have been deemed empty and deleted already, if not, delete trp_clone table
|-
|cns_laboratory||Laboratory||copy from trp_laboratory
|-
|cns_remark||Remark||transfer trp_remark for all trp_publicname lines that are not "Is" or are blank
|}
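
Two of the rules in the table above lend themselves to a short sketch: building the cns_name value from the new pgid, and deciding whether a trp_reporter_product value goes to cns_reporter or cns_purificationtag. The Python below is an illustration and not the logic of the actual Perl transfer script. It assumes the ID is "WBCnstr" followed by the pgid zero-padded to eight digits (as pad8Zeros suggests), and the function names are hypothetical.

```python
# Purification tag values as listed in the cns_purificationtag row of the table.
PURIFICATION_TAGS = {"His-tag", "FLAG", "HA-tag", "MYC/c-myc", "Stag", "Histone H2B"}


def pad8_zeros(pgid):
    """Zero-pad a pgid to eight digits (assumed behaviour of pad8Zeros)."""
    return f"{int(pgid):08d}"


def construct_id(pgid):
    """Build the cns_name value for a new pgid (assumed ID format)."""
    return "WBCnstr" + pad8_zeros(pgid)


def route_reporter_product(value):
    """Return the destination cns_ table for one trp_reporter_product value."""
    if value in PURIFICATION_TAGS:
        return "cns_purificationtag"
    return "cns_reporter"
```

For example, under these assumptions pgid 42 gives WBCnstr00000042, and a trp_reporter_product of FLAG is routed to cns_purificationtag while GFP is routed to cns_reporter.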

Going live to do list

Make new tables and run scripts :

  • create exp_construct + exp_variation tables for expression pattern :

/home/postgres/work/pgpopulation/exp_exprpattern/20140606_exp_construct_variation/create_datatype_tables.pl

  • create grg_construct table for gene regulation :

/home/postgres/work/pgpopulation/grg_generegulation/20140606_grg_construct/create_datatype_tables.pl

  • create int_construct table for interaction :

/home/postgres/work/pgpopulation/interaction/20140710_int_construct/create_datatype_tables.pl

  • create pro_construct table for process :

/home/postgres/work/pgpopulation/pro_process/20140708_pro_construct/create_datatype_tables.pl

  • create trp_variation trp_construct trp_coinjectionconstruct trp_integratedfrom tables for transgene :

/home/postgres/work/pgpopulation/transgene/20140605_construct_tables/create_transgene_construct_tables.pl

  • create all cns_ tables for construct objects :

/home/postgres/work/pgpopulation/cns_construct/20140605_newOa/create_construct_tables.pl


  • transfer construct data from transgene trp_ to construct cns_ :

/home/postgres/work/pgpopulation/cns_construct/20140605_newOa/transfer_trp_to_cns.pl

  • populate construct field of exp_ grg_ and int_ tables based on transgene to construct mappings :

/home/postgres/work/pgpopulation/cns_construct/20140710_transfer_construct_exp_grg_int/transfer_construct.pl


Sync dumpers at /home/postgres/work/citace_upload/

  • process/get_process_curation_ace.pm
  • transgene/get_transgene_ace.pm
  • expr_pattern/get_expr_pattern_ace.pm
  • interaction/get_interaction_ace.pm
  • gene_regulation/get_gene_regulation_ace.pm
  • whole new directory cns_construct/

Sync OA

Dealing with duplicate lines

All Construct IDs should match PGIDs.
Scripts are in place to find non-matching IDs:

  • Find lines where cns_name does not match PGID
  • script to assess what will be overwritten when data for cns_names are moved to the corresponding PGID

/home/postgres/work/pgpopulation/transgene/20200627_fix_joinkey_duplicate
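
The two checks above can be sketched as follows. This is not the content of the scripts in that directory; it is a minimal illustration that uses sqlite3 as a stand-in for Postgres, assumes cns_name is a (joinkey, value) table with joinkey holding the pgid, and assumes the ID format "WBCnstr" plus eight zero-padded digits. All function names are hypothetical.

```python
import sqlite3

PREFIX = "WBCnstr"


def expected_pgid(cns_name):
    """Return the pgid implied by a construct ID, or None if it is malformed."""
    digits = cns_name[len(PREFIX):]
    if not cns_name.startswith(PREFIX) or not digits.isdigit():
        return None
    return str(int(digits))  # 'WBCnstr00000042' -> '42'


def find_mismatches(conn):
    """Return (pgid, cns_name) pairs where the ID does not match the pgid."""
    rows = conn.execute(
        "SELECT joinkey, value FROM cns_name ORDER BY joinkey"
    ).fetchall()
    return [(jk, name) for jk, name in rows if expected_pgid(name) != jk]


def would_overwrite(conn):
    """Return (pgid, cns_name, target_pgid) for moves that hit an existing pgid."""
    existing = {
        row[0] for row in conn.execute("SELECT joinkey FROM cns_name").fetchall()
    }
    return [
        (jk, name, expected_pgid(name))
        for jk, name in find_mismatches(conn)
        if expected_pgid(name) in existing
    ]
```

Running would_overwrite before moving any data gives a list of target pgids that already hold a row and so need manual review.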