Construct

From WormBaseWiki

This page documents the creation of the construct class and all tools needed for its curation.

Overview

A construct is defined as any single contiguous stretch of engineered DNA sequence. Constructs will be curated as objects used in the generation of transgenes, engineered alleles, and expression markers used in analysis of expression patterns, gene regulation, and biological topics.

Model

The official model page is here along with compensatory changes to related models and test data. Constructs will be curated for the benefit of a number of other data classes: Variation, Transgene, Gene Regulation, Topic, and Expression Pattern.

Construct OA

  • Create construct tables (see tables on the OA tables page).
  • Assign all transgene objects a unique WBConstruct ID.
  • Move/rename/rededicate(?) many of the trp tables to be construct tables.
  • Reorganize the transgene OA to include construct curation.
  • Add functionality to the exprpat, genereg, int, and topic OAs to enter construct requests.

Connecting constructs to Transgenic Alleles

NOTE: Transgenes are termed Transgenic Alleles by the Alliance.
WB has both constructs and transgenic alleles. The Alliance Expression Pattern schema only allows Transgenic Alleles to be attached to a pattern; WB **constructs without an associated transgene** will therefore not be captured by the Alliance. To simplify expression schema changes, we will **give transgene IDs to all constructs used in Expression that are not already associated with a transgene**.

The Construct OA should allow the minting of WBTransgene IDs from within the OA based on an individual construct.

The workflow would be:

  • Curator creates a Construct
  • Curator indicates (through a button or field value?) that a transgene ID is needed for that construct (a cron job to be triggered by flag?)
  • Postgres creates a transgene pgid, and populates trp tables as follows
    • cns_summary copied to trp_summary
    • cns_name copied to trp_construct
    • cns_paper copied to trp_paper
    • cns_curator copied to trp_curator
    • trp_name copied to cns_usedfortransgene
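
The copy step in the workflow above can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for Postgres, every table is assumed to hold (joinkey, value) rows keyed by pgid, the new transgene pgid is passed in rather than allocated, and the WBTransgene ID is assumed to be the pgid padded to eight digits. The function names are hypothetical and do not come from the actual cron job or OA code.

```python
import sqlite3

# Source construct table -> destination transgene table, as listed above.
CNS_TO_TRP = {
    "cns_summary": "trp_summary",
    "cns_name": "trp_construct",
    "cns_paper": "trp_paper",
    "cns_curator": "trp_curator",
}


def make_tables(db):
    """Create the hypothetical (joinkey, value) tables used by this sketch."""
    tables = [*CNS_TO_TRP, *CNS_TO_TRP.values(), "trp_name", "cns_usedfortransgene"]
    for table in tables:
        db.execute(f"CREATE TABLE {table} (joinkey TEXT, value TEXT)")


def mint_transgene(db, cns_pgid, trp_pgid):
    """Copy construct fields to a new transgene pgid and link the two objects."""
    trp_name = f"WBTransgene{trp_pgid:08d}"  # assumed ID format
    db.execute("INSERT INTO trp_name VALUES (?, ?)", (str(trp_pgid), trp_name))
    for cns_table, trp_table in CNS_TO_TRP.items():
        # Copy every value stored for the construct pgid under the transgene pgid.
        db.execute(
            f"INSERT INTO {trp_table} SELECT ?, value FROM {cns_table} WHERE joinkey = ?",
            (str(trp_pgid), str(cns_pgid)),
        )
    # Record on the construct which transgene it is now used for.
    db.execute(
        "INSERT INTO cns_usedfortransgene VALUES (?, ?)", (str(cns_pgid), trp_name)
    )
    return trp_name
```

Keeping the table mapping in one dictionary means the list of copied fields can be checked against the bullet list at a glance.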


For the Expression OA (this has also been added to the Expression wiki):
For back-population of trp tables with existing constructs in the expression tables for which there is no associated transgene:

  • find all exp_constructs that are not associated with a trp_publicname
  • create transgene IDs for each exp_construct following the data porting as laid out below
    • Postgres creates a transgene pgid, and populates trp tables as follows
      • cns_summary copied to trp_summary
      • cns_name copied to trp_construct
      • cns_paper copied to trp_paper
      • cns_curator copied to trp_curator
      • trp_name copied to cns_usedfortransgene
      • trp_name copied to exp_transgene
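
The first back-population step (finding expression constructs with no transgene) might look like the sketch below. It assumes, for illustration only, that exp_construct and trp_construct each hold one construct ID per (joinkey, value) row and that a construct counts as "associated" when some trp_construct row names it. SQLite stands in for Postgres, and the function name is hypothetical.

```python
import sqlite3


def constructs_needing_transgene(db):
    """Return construct IDs used in expression that no transgene refers to."""
    rows = db.execute(
        """
        SELECT DISTINCT e.value
        FROM exp_construct AS e
        LEFT JOIN trp_construct AS t ON t.value = e.value
        WHERE t.value IS NULL      -- no transgene row names this construct
        ORDER BY e.value
        """
    )
    return [row[0] for row in rows]
```

Each returned construct would then go through the copy steps listed above, with the new trp_name also written to exp_transgene.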

After data clean up, the data in exp_construct will need to be suppressed (not dumped).
See


  • original notes 1/27/21

Ask Juancarlos:

  • how many constructs used in expression do not have a transgene?
  • need to create transgene objects in bulk for those
  • need to remove the construct from the construct field and add the corresponding transgene in the transgene field
  • for curation purposes, need to set up a mechanism to flag constructs that need a transgene ID; a cron job will create a transgene ID to attach to the construct
  • after data clean up, will need to suppress the construct field on the Expression OA

Adding new Reporter types

The cns_reporter obo_ tables do not seem to be updated often (judging by their timestamp values). When there is a new reporter, add it directly to the postgres test db through the command line (example for mNeonGreen):

  • INSERT INTO obo_name_cnsreporter VALUES ('mNeonGreen', 'mNeonGreen');
  • INSERT INTO obo_data_cnsreporter VALUES ('mNeonGreen', 'id: mNeonGreen');

Will look like this

 > psql testdb
 testdb=> INSERT INTO obo_name_cnsreporter VALUES ('mNeonGreen','mNeonGreen');
 INSERT 0 1
 testdb=> INSERT INTO obo_data_cnsreporter VALUES ('mNeonGreen','id:mNeonGreen');
 INSERT 0 1
 testdb=> \q


To enter multiple values at once :

  • INSERT INTO obo_name_cnsreporter VALUES ('mNeptune2.1', 'mNeptune2.1'), ('mNeptune2.2', 'mNeptune2.2'), ('mNeptune2.3', 'mNeptune2.3'), ('mNeptune2.4', 'mNeptune2.4');
  • INSERT INTO obo_data_cnsreporter VALUES ('mNeptune2.1', 'id:mNeptune2.1'), ('mNeptune2.2', 'id:mNeptune2.2'), ('mNeptune2.3', 'id:mNeptune2.3'), ('mNeptune2.4', 'id:mNeptune2.4');

If you think there is a mistake, query for it with:

  • SELECT * FROM obo_name_cnsreporter WHERE joinkey = 'mNepture2.5';

Once the query confirms that the row returned is the only thing you want to delete, you can up-arrow and replace the SELECT * with a DELETE:

  • DELETE FROM obo_name_cnsreporter WHERE joinkey = 'mNepture2.5';
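
The insert, check, and delete routine above could be scripted roughly as follows. This is a sketch rather than an existing tool: SQLite stands in for the Postgres test db, the two tables are assumed to have just a joinkey column and one value column, the 'id:' prefix follows the psql session shown above, and all function names are hypothetical.

```python
import sqlite3

REPORTER_TABLES = ("obo_name_cnsreporter", "obo_data_cnsreporter")


def make_reporter_tables(db):
    """Create hypothetical two-column versions of the obo_ reporter tables."""
    for table in REPORTER_TABLES:
        db.execute(f"CREATE TABLE {table} (joinkey TEXT, value TEXT)")


def add_reporters(db, names):
    """Insert each reporter into both tables, one row per name."""
    db.executemany(
        "INSERT INTO obo_name_cnsreporter VALUES (?, ?)", [(n, n) for n in names]
    )
    db.executemany(
        "INSERT INTO obo_data_cnsreporter VALUES (?, ?)",
        [(n, f"id:{n}") for n in names],
    )


def delete_if_single(db, joinkey, table="obo_name_cnsreporter"):
    """Delete the row only when the SELECT finds exactly one match."""
    if table not in REPORTER_TABLES:
        raise ValueError(f"unexpected table: {table}")
    (count,) = db.execute(
        f"SELECT COUNT(*) FROM {table} WHERE joinkey = ?", (joinkey,)
    ).fetchone()
    if count != 1:
        return False  # nothing, or more than expected: leave the table alone
    db.execute(f"DELETE FROM {table} WHERE joinkey = ?", (joinkey,))
    return True
```

Counting before deleting mirrors the manual habit of running the SELECT first, and bound parameters avoid quoting problems in reporter names.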

Postgres script to populate cns tables

Construct tables = cns_:
/home/postgres/work/pgpopulation/cns_construct/20140605_newOa
Script for data population of cns_ tables from trp_ tables: transfer_trp_to_cst.pl

Summary of transfer instructions:

| table | OA label | data transfer |
|---|---|---|
| cns_curator | Curator | one-time copy from trp_curator |
| cns_paper | Paper | one-time copy from trp_paper |
| cns_person | Person | one-time copy from trp_person |
| cns_name | Name | WBCnstr: $cnstid = &pad8Zeros($newPgid) // assigned WBConstructID |
| cns_publicname | Public Name /Clone | values from trp_constructionsummary |
| cns_othername | Other Name | transfer Expr from trp_publicname or trp_synonym. In all cases where "Expr" is in trp_publicname, delete object from trp tables. In cases where Expr is in trp_synonym and trp_publicname is blank, delete object from trp tables. |
| cns_newtransgene | New Transgene | hide table for production |
| cns_summary | Summary | one-time copy from trp_summary |
| cns_drivenbygene | Driven By Gene | transfer all values from trp_driven_by_gene; delete trp_driven_by_gene |
| cns_gene | Gene | transfer all values from trp_gene; delete trp_gene |
| cns_reporter | Reporter | transfer all values from trp_reporter_product except when the value is equal to a purification_tag value. Reporter values are: GFP, GFP(S65C), EGFP, pGFP(photoactivated GFP), YFP, EYFP, BFP, CFP, Cerulian, RFP, mRFP, tagRFP, mCherry, wCherry, tdTomato, mStrawberry, DsRed, DsRed2, Venus, YC2.1 (yellow cameleon), YC12.12 (yellow cameleon), YC3.60 (yellow cameleon), Yellow cameleon, Dendra, Dendra2, tdimer2(12)/dimer2, GCaMP, mkate2, Luciferase, LacI, LacO, LacZ. Delete trp_reporter. |
| cns_otherreporter | OtherReporter | transfer all values from trp_other_reporter; delete trp_other_reporter |
| cns_purificationtag | PurificationTag | transfer all values from trp_reporter_product that equal any of the following values: His-tag, FLAG, HA-tag, MYC/c-myc, Stag, Histone H2B |
| cns_recombinationsite | RecombinationSite | LoxP, FRT |
| cns_constructtype | ConstructType | transfer all values from reporter_type: Chimera, Domain_swap, Engineered mutation, Fusion, Complex (e.g., GFP fusion plus point mutations), Transcriptional_fusion, Translational_fusion, N-terminal_translational_fusion, C-terminal_translational_fusion, Internal_coding_fusion; transfer all values from trp_reporter_type; delete trp_reporter_type |
| cns_selectionmarker | SelectionMarker | |
| cns_threeutr | 3 UTR | create cns_threeutr table, transfer from trp_threeutr; delete trp_threeutr |
| cns_constructionsummary | Construction Details | transfer values in trp_driven_by_construct; delete trp_driven_by_construct; transfer trp_constructionsummary for all trp_publicname lines that are not "Is" or are blank; transfer all associated Expr in trp_publicname or trp_synonym to cns_othername |
| cns_clone | Clone | copy from trp_clone. NOTE: might have been deemed empty and deleted already; if not, delete trp_clone table |
| cns_laboratory | Laboratory | copy from trp_laboratory |
| cns_remark | Remark | transfer trp_remark for all trp_publicname lines that are not "Is" or are blank |
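
Two of the transfer rules above are easy to get subtly wrong, so here is a small sketch of them. It assumes that pad8Zeros left-pads the pgid with zeros to eight digits and that the purification tag strings are matched literally as listed. Both function names are hypothetical and are not taken from the Perl transfer script.

```python
# Purification tag values as listed in the transfer instructions (matched literally).
PURIFICATION_TAGS = {"His-tag", "FLAG", "HA-tag", "MYC/c-myc", "Stag", "Histone H2B"}


def construct_id(pgid: int) -> str:
    """Build a WBConstruct ID from a pgid, assuming eight-digit zero padding."""
    return f"WBCnstr{pgid:08d}"


def destination_table(reporter_product: str) -> str:
    """Route a trp_reporter_product value to the construct table it belongs in."""
    if reporter_product in PURIFICATION_TAGS:
        return "cns_purificationtag"
    return "cns_reporter"
```

For example, construct_id(42) gives "WBCnstr00000042", and destination_table("FLAG") gives "cns_purificationtag" while destination_table("GFP") gives "cns_reporter".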

Going live to do list

Make new tables and run scripts :

  • create exp_construct + exp_variation tables for expression pattern :

/home/postgres/work/pgpopulation/exp_exprpattern/20140606_exp_construct_variation/create_datatype_tables.pl

  • create grg_construct table for gene regulation :

/home/postgres/work/pgpopulation/grg_generegulation/20140606_grg_construct/create_datatype_tables.pl

  • create int_construct table for interaction :

/home/postgres/work/pgpopulation/interaction/20140710_int_construct/create_datatype_tables.pl

  • create pro_construct table for process :

/home/postgres/work/pgpopulation/pro_process/20140708_pro_construct/create_datatype_tables.pl

  • create trp_variation, trp_construct, trp_coinjectionconstruct, and trp_integratedfrom tables for transgene :

/home/postgres/work/pgpopulation/transgene/20140605_construct_tables/create_transgene_construct_tables.pl

  • create all cns_ tables for construct objects :

/home/postgres/work/pgpopulation/cns_construct/20140605_newOa/create_construct_tables.pl


  • transfer construct data from transgene trp_ to construct cns_ :

/home/postgres/work/pgpopulation/cns_construct/20140605_newOa/transfer_trp_to_cns.pl

  • populate construct field of exp_, grg_, and int_ tables based on transgene to construct mappings :

/home/postgres/work/pgpopulation/cns_construct/20140710_transfer_construct_exp_grg_int/transfer_construct.pl


Sync dumpers at /home/postgres/work/citace_upload/

  • process/get_process_curation_ace.pm
  • transgene/get_transgene_ace.pm
  • expr_pattern/get_expr_pattern_ace.pm
  • interaction/get_interaction_ace.pm
  • gene_regulation/get_gene_regulation_ace.pm
  • whole new directory cns_construct/

Sync OA

Dealing with duplicate lines

All Construct IDs should match PGIDs.
Scripts are in place to find nonmatching IDs:

  • Find lines where cns_name does not match PGID
  • script to assess what will be overwritten when data for cns_names are moved to the corresponding PGID

/home/postgres/work/pgpopulation/transgene/20200627_fix_joinkey_duplicate
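
The two checks described above might be sketched like this. This is not the script in that directory. It assumes cns_name rows are available as (joinkey, cns_name) pairs, that the numeric part of the WBCnstr ID should equal the pgid, and that "overwritten" means the target pgid already holds a row. All names are hypothetical.

```python
PREFIX = "WBCnstr"


def expected_pgid(cns_name):
    """Strip the prefix and leading zeros: 'WBCnstr00000042' -> '42'."""
    return str(int(cns_name[len(PREFIX):]))


def find_mismatches(rows):
    """Return (joinkey, cns_name) pairs whose ID number differs from the pgid."""
    return [(jk, name) for jk, name in rows if expected_pgid(name) != str(jk)]


def would_overwrite(rows):
    """List mismatched rows whose correct pgid is already occupied.

    Each result is (current joinkey, cns_name, target pgid).
    """
    occupied = {str(jk) for jk, _ in rows}
    return [
        (jk, name, expected_pgid(name))
        for jk, name in find_mismatches(rows)
        if expected_pgid(name) in occupied
    ]
```

Pass the rows as a list, since it is read twice. Running would_overwrite before moving any data shows which pgids need manual review first.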