Difference between revisions of "Gene Regulation"

From WormBaseWiki
Revision as of 03:52, 9 November 2010

Types of fields Juancarlos can implement:

  • text : text
  • bigtext : like longtext, but makes the text box expand when you click in it so you can see everything you've written
  • dropdown : few values
  • ontology : controlled vocabulary (tell me where they come from)
  • multiontology / multidropdown : (allows multiple values)
  • toggle : on / off, yes/no etc.


Gene_regulation OA

OA fields

  • Pgdbid - postgres database ID, entered automatically
  • Reference - WBPaperID paper ontology - will you ever need to enter more than one paper? No
  • Name (free text) - this gets appended to the reference to form the object name, usually a WBPaperID followed by the regulated gene name, e.g. WBPaper00036472_dod-3.a
  • Summary (long text) - most of the time copied and pasted directly from the paper, with slight modifications by the curator. Do you actually want a big text box, i.e., something that expands to show all the info when you click in the box? Yes, that's what I want. Then I should say 'Big Text'.
  • Antibody Info - antibody multiontology. Fill the field with an existing antibody, and inform the curator (myself) to curate antibody objects that do not exist yet. Otherwise you could have this work the way Molecule works: when you need a new antibody, you go to the Antibody OA and create it, then update the ontology so you can use the new antibody right away. Yes, I would like to do so. Autocomplete on abp_ antibody table -> name. In term information show -> original publication, location. .ace -> Antibody_info
  • Antibody Remark (free text). .ace -> Antibody. If the data exactly matches abp_name then put it in Antibody Info.
  • Reporter_gene (free text) - reporter gene construct, e.g. [daf-16::gfp], translational fusion. Is this just the transgene summary?
    Yes.
  • Transgene - transgene multiontology. Fill the field with an existing transgene (use app_tempname, or whatever it gets split into for transgenes). Need a way to inform the transgene curator (Karen) when a transgene does not exist yet.


I'm not sure what you need for the following method fields. Would it work to have one field called Method with a dropdown list, or would you like a toggle (yes/no, with a default of no) for each of these values? I thought about a dropdown list for method first, but sometimes I will need to comment in some fields, e.g. for 'RT_PCR' I will need to write 'semi-quantitative'. So a toggle with yes/no will not be enough, right?

  • In_situ (toggle text) - don't have to write anything in the field, but will need method (In_situ) shown.
  • Northern (toggle text) - don't have to write anything in the field, but will need method (Northern) shown.
  • Western (toggle text) - don't have to write anything in the field, but will need method (Western) shown.
  • RT_PCR (toggle text) - don't have to write anything in the field, but will need method (RT_PCR) shown.
  • Other_method (toggle text)


  • Type - multidropdown list with 'Change_of_localization' and 'Change_of_level'
  • Regulation_level - multidropdown list with 'Transcriptional', 'Post_transcriptional' and 'Post_translational'
  • Allele - variation multiontology, and a way to inform the relevant curator about non-existing alleles. The workflow will be that you go to the NS, create the variation, update the postgres variation list so the new VarID is on it (this is done by a script you will run by pressing a button), then you can enter the ID from the list.


  • RNAi - text with pipes. Need a way to inform the RNAi curators (Chris and Gary) when a new RNAi ID is required. There needs to be a way for them to come in later and fill in the field. I'm not sure how Juancarlos will deal with this.


  • Trans_regulator_gene - (multiontology on gene) I will enter gene by public/sequence name, please autocomplete with WBGeneID
  • Molecule_regulator - (multiontology like app_molecule) Karen's molecule ontology, and a way to inform her of non-existing molecules. You will be able to add new molecules or synonyms to molecules and update the postgres list.
  • Trans_regulator_seq - multiontology of sequence objects off of gin_sequence
  • Other_regulator - free text separated by pipes for multiple values, but if any match mop_publicname, put them in Molecule_regulator.
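
The Other_regulator rule above (pipe-separated free text, with any value that matches mop_publicname moved to Molecule_regulator) could be sketched roughly as follows. This is only an illustration in Python, not the OA's actual code: the function name and the example molecule names are made up, and a real implementation would load the names from the mop_publicname table.

```python
def split_other_regulators(raw, molecule_names):
    """Split a pipe-separated Other_regulator value.

    Returns (molecules, others): values found in molecule_names are
    routed to Molecule_regulator, the rest stay in Other_regulator.
    """
    molecules, others = [], []
    for value in raw.split("|"):
        value = value.strip()
        if not value:
            continue  # ignore empty segments, e.g. from a trailing pipe
        if value in molecule_names:
            molecules.append(value)
        else:
            others.append(value)
    return molecules, others


# Made-up stand-ins for values that would come from mop_publicname.
known_molecules = {"levamisole", "serotonin"}
molecules, others = split_other_regulators("serotonin | heat shock|", known_molecules)
```

Here the match is exact after trimming whitespace; whether the real check should ignore case or synonyms is left open.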


  • Expr_pattern - (text until expr pattern OA, only one value) Since expr_pattern is not in postgres, I am not sure how to do this field. Currently, I look up the relevant expr_pattern and copy and paste it into my .ace file. For new expr_pattern objects, I have to wait for Wen to curate them first, then fill them in my .ace file. Multiontology like the gene picture OA's expr_pattern field. Based on obo_<name|data>_pic_exprpattern; needs to be based on the Expr_pattern OA when that's live.
  • No Dump - toggle to prevent dumping object if it should have expr_pattern but doesn't yet.
  • Trans_regulated_gene - I will enter gene by public/sequence name, please autocomplete with WBGeneID
  • Trans_regulated_seq multiontology of sequence objects off of gin_sequence
  • Other_regulated - free text separated by pipes for multiple values


  • Result - dropdown of Positive_regulate, Negative_regulate, Does_not_regulate
  • Anatomy_term - (multiontology app_anatomy) anatomy ontology
  • Life_stage - (multiontology app_lifestage - check the names) life_stage ontology
  • Subcellular_localization (toggle text) - wouldn't this be a cell component GO term? If so, it could use the GO cell component ontology. The model was designed to take free text.


  • Remark (long text) perhaps big text? yes

.ace template

different OA rows will have the same object, so dump like phenotype does, grouping all rows with the same name into a single .ace object.
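
As a rough illustration of that grouping step, here is a minimal Python sketch. It assumes each OA row is available as a dict keyed by postgres table name (a hypothetical representation, not how the dumper actually stores rows), and the second object name is invented for the example.

```python
def group_rows_by_name(rows):
    """Group OA rows that share a grg_name, keeping first-seen order."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["grg_name"], []).append(row)
    return grouped


# Hypothetical rows; only the first name comes from this page.
rows = [
    {"grg_name": "WBPaper00036472_dod-3.a", "grg_result": "Positive_regulate"},
    {"grg_name": "WBPaper00000001_example", "grg_result": "Does_not_regulate"},
    {"grg_name": "WBPaper00036472_dod-3.a", "grg_result": "Negative_regulate"},
]
grouped = group_rows_by_name(rows)
```

Each key of grouped would then become one Gene_regulation object, with the tags from all of its rows written under it.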

postgres table fields <> corresponding to .ace tag -X 11/05 added

//Template for Gene_regulation

  • Gene_regulation : "<grg_name>"
  • Summary "<grg_summary>"
  • Antibody "<grg_antibodyremark>"
  • Antibody_info "<grg_antibody>"
  • Reporter_gene "<grg_reportergene>"
  • Transgene "<grg_transgene>"
  • In_situ "<grg_insitu>" (toggle alone just writes In_situ; toggle + text, or text alone, writes In_situ "text")
  • Northern "<grg_northern>"
  • Western "<grg_western>"
  • RT_PCR "<grg_rtpcr>"
  • Other_method "<grg_othermethod>"
  • Allele "<grg_allele>"
  • RNAi "<grg_rnai>"
  • Molecule_regulator "<grg_moleculeregulator>"
  • Trans_regulator_seq "<grg_transregulatorseq>"
  • Other_regulator "<grg_otherregulator>"
  • Trans_regulator_gene "<grg_transregulator>"
  • Expr_pattern "<grg_exprpattern>"
  • Trans_regulated_gene "<grg_transregulated>"
  • Trans_regulated_seq "<grg_transregulatedseq>"
  • Other_regulated "<grg_otherregulated>"
  • Positive_regulate Anatomy_term "<grg_anat_term>"
  • Positive_regulate Life_stage "<grg_lifestage>"
  • Positive_regulate Subcellular_localization "<grg_subcellloc>"
  • Negative_regulate Anatomy_term "<grg_anat_term>"
  • Negative_regulate Life_stage "<grg_lifestage>"
  • Negative_regulate Subcellular_localization "<grg_subcellloc>"
  • Does_not_regulate Anatomy_term "<grg_anat_term>"
  • Does_not_regulate Life_stage "<grg_lifestage>"
  • Does_not_regulate Subcellular_localization "<grg_subcellloc>"
  • Type "<grg_type>"
  • Regulation_level "<grg_regulationlevel>"
  • Remark "<grg_remark>"
  • Reference "<grg_paper>"
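
Two parts of the template are not plain tag/value pairs: the method toggles, and the Anatomy_term, Life_stage and Subcellular_localization tags that sit under the chosen Result. A minimal Python sketch of how those lines might be built is below. The function names are hypothetical, the toggle behaviour follows one reading of the In_situ note, and the tab separator is an assumption taken from the RNAi example further down the page.

```python
def method_line(tag, toggled, text):
    """Build the line for a toggle-text method field such as In_situ.

    Text (with or without the toggle) writes the tag plus quoted text;
    the toggle alone writes the bare tag; otherwise nothing is written.
    """
    if text:
        return f'{tag}\t"{text}"'
    if toggled:
        return tag
    return None


def result_lines(result, anatomy_terms=(), life_stages=(), subcell_locs=()):
    """Build the lines nested under a Result value such as Positive_regulate."""
    lines = []
    for sub_tag, values in (
        ("Anatomy_term", anatomy_terms),
        ("Life_stage", life_stages),
        ("Subcellular_localization", subcell_locs),
    ):
        for value in values:
            lines.append(f'{result}\t{sub_tag}\t"{value}"')
    return lines


rt_pcr = method_line("RT_PCR", True, "semi-quantitative")
northern = method_line("Northern", True, "")
western = method_line("Western", False, "")
positive = result_lines("Positive_regulate", anatomy_terms=["WBbt:0005118"])
```

A row with Result set to Negative_regulate or Does_not_regulate would go through the same function, so only one of the three template groups is written for any given row.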

Populating postgres

On mangolassi at /home/postgres/work/pgpopulation/grg_generegulation/ create tables with create_grg_tables.pl, then populate the tables based on GR_OA.ace, CellAO.txt, obo_ and other postgres tables. Errors are at populate_pg.err (there are a few, check them out; need to fix the source file at /home/acedb/xiaodong/gene_regulation/GR_OA.ace). DONE on tazendra, live on 2010 11 08 -- J.

- Fixed errors from the populate_pg.err file in the GR_OA source file, except for two valid items, WBGene00003004 (lin-15, again) and WBbt:0005118. -X WBbt:0005118 is not in the list of valid Expr_pattern objects, see the autocomplete. -- J WBbt:0005118 is in the autocomplete list in Anatomy Term. -X Yes, but the tag in the .ace file is for Expr_pattern. -J Got you. I fixed the source file on mangolassi: /home/postgres/work/pgpopulation/grg_generegulation/GR_OA.ace. -X

- Ran ./use_package.pl on mangolassi at /home/acedb/xiaodong/gene_regulation: no errors in the 'err.out.20101102' file, and none of the errors from 'populate_pg.err' show up again in 'gene_regulation.ace.20101102' (except that WBGene00003004 and WBbt:0005118 are still there, since I did nothing to them in the GR_OA source file in the same directory). -X I don't understand this, what shows up again? What's wrong? -- J Nothing wrong, don't worry about this. -X

Testing OA

Not sure where things were with the OA before, now that expr_pattern is multiontology, at least test that.

I saw it. it looks good! thanks. -X

Antibody info was storing data from Antibody OA, but it was storing the postgres ID instead of the antibody name. Double check that it works in OA and dumper after changing test data. -J

see comments in next section-X

.ace dumper script

Main code is at /home/postgres/work/citace_upload/gene_regulation/ in get_gene_regulation_ace.pm and use_package.pl, symlinked to be run at /home/acedb/xiaodong/gene_regulation

I'm not sure antibody / antibody_info is dumping correctly, please find example (pgid) and tell me how it is, and how it should be. 162 has both, but you can make your own. -J

They are not dumped correctly. Take pgid 6 as an example: in the source file GR_OA.ace it is Antibody "To detect GLP-1, a mixture of anti-EGFL, anti-LNG, and anti-ANK polyclonal antibodies was used." This information was transferred into the OA in the Antibody Remark field, which is correct. However, when it is later dumped into the gene_regulation.ace.20111103 file, it is dumped as Antibody_info "To detect GLP-1, a mixture of anti-EGFL, anti-LNG, and anti-ANK polyclonal antibodies was used.", which is wrong. It should be under the Antibody tag again, as it is in the source file. Antibody in .ace and Antibody Remark in the OA are equivalent fields, both free text. Antibody_info in .ace and Antibody Info in the OA are the same field, which is an ontology.

This doesn't make sense to me. From above, the .ace output says :

  • Antibody ""
  • Antibody_info ""

Never mind. pgid 6 is correctly dumped in .ace now as I just checked.

There are currently 18 objects in the OA whose Antibody_info values were mistakenly mapped to the Antibody Remark field when they should be in the Antibody Info field. If you query 'Antibody_info' in the Antibody Remark field, you will get them (18 values). -X The .ace data looks like this: Antibody "Antibody_info: [cgc4906]:odr-7", so it goes into the Antibody_remark field because that's the tag in the .ace file. If you want that fixed you should fix those in the .ace file and let me know to repopulate it (you probably should, rather than wait until the OA is live, it's only 18 of them). -- J I have fixed the GR_OA.ace file on mangolassi: /home/postgres/work/pgpopulation/grg_generegulation/GR_OA.ace. You can repopulate it. I fixed the WBbt:0005118 problem in the source file too. -X

Please clarify which table should go in which tag. I've changed the Antibody Info field to use the tag Antibody_info. The Antibody Remark field now uses the Antibody tag. I don't think this is correct, but I don't understand what you want. Here's the mapping of OA fields to postgres table names:

  • Curator -> grg_curator
  • Reference -> grg_paper
  • Name -> grg_name
  • Summary -> grg_summary
  • Antibody Info -> grg_antibody
  • Antibody Remark -> grg_antibodyremark
  • Reporter Gene -> grg_reportergene
  • Transgene -> grg_transgene
  • In Situ -> grg_insitu
  • IS Text -> grg_insitu_text
  • Northern -> grg_northern
  • N Text -> grg_northern_text
  • Western -> grg_western
  • W Text -> grg_western_text
  • RT PCR -> grg_rtpcr
  • RP Text -> grg_rtpcr_text
  • Other Method -> grg_othermethod
  • OM Text -> grg_othermethod_text
  • Allele -> grg_allele
  • RNAi -> grg_rnai
  • Type -> grg_type
  • Regulation Level -> grg_regulationlevel
  • Trans Regulator Gene -> grg_transregulator
  • Molecule Regulator -> grg_moleculeregulator
  • Trans Regulator Seq -> grg_transregulatorseq
  • Other Regulator -> grg_otherregulator
  • Trans Regulated Gene -> grg_transregulated
  • Trans Regulated Seq -> grg_transregulatedseq
  • Other Regulated -> grg_otherregulated
  • Expression Pattern -> grg_exprpattern
  • NO DUMP -> grg_nodump
  • Result -> grg_result
  • Anatomy Term -> grg_anat_term
  • Life Stage -> grg_lifestage
  • Subcellular Localization -> grg_subcellloc
  • SCL Text -> grg_subcellloc_text
  • Remark -> grg_remark

Please keep track of this for when you need to refer to postgres tables. For example, when dumping the .ace you should write it up something like :

RNAi<tab>"<datafrom rnai table>"
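
To make the table-to-tag convention concrete, here is a small Python sketch that looks up the .ace tag for a postgres table and writes the tab-separated, quoted line. The lookup shows only a few pairs taken from the template on this page; the function name and the RNAi ID are hypothetical, and quote escaping is deliberately left out.

```python
# A few table -> tag pairs from the template; the full mapping would be longer.
TABLE_TO_TAG = {
    "grg_rnai": "RNAi",
    "grg_allele": "Allele",
    "grg_transgene": "Transgene",
    "grg_paper": "Reference",
}


def ace_line(table, value):
    """Return one .ace line: the tag, a tab, then the value in double quotes."""
    tag = TABLE_TO_TAG[table]
    return f'{tag}\t"{value}"'


line = ace_line("grg_rnai", "WBRNAi00000001")  # made-up ID for illustration
```

An unknown table name raises KeyError here, which seems preferable to silently writing a line with the wrong tag.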



Another example, pgid 319: in the GR_OA.ace source file it is Antibody "[cgc2045]:glp-1". This information was transferred into the OA in the Antibody Remark field, which is wrong. As stated on the wiki above, if Antibody in the source file exactly matches abp_name then it should be put in Antibody Info. However, when it is later dumped into the gene_regulation.ace.20111103 file, it is dumped correctly as Antibody_info "[cgc2045]:glp-1".

162 is transferred and dumped perfectly right. -X
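
The two kinds of mis-filed Antibody Remark values discussed above (a remark that is exactly an antibody name, and a remark that starts with 'Antibody_info') could be found with a check along these lines. This is a hedged Python sketch rather than anything from the real scripts: the function name is invented, pgid 1001 is a placeholder, and the set of names stands in for a lookup against abp_name.

```python
def audit_antibody_remarks(rows, antibody_names):
    """Return (pgid, reason) pairs for Antibody Remark values that look mis-filed."""
    flagged = []
    for pgid, remark in rows:
        if remark in antibody_names:
            # Exact match to a known antibody: belongs in Antibody Info.
            flagged.append((pgid, "exact antibody name"))
        elif remark.startswith("Antibody_info"):
            # Tag text ended up inside the free-text value.
            flagged.append((pgid, "Antibody_info prefix"))
    return flagged


rows = [
    (6, "To detect GLP-1, a mixture of anti-EGFL, anti-LNG, and anti-ANK "
        "polyclonal antibodies was used."),
    (319, "[cgc2045]:glp-1"),
    (1001, "Antibody_info: [cgc4906]:odr-7"),  # pgid 1001 is a placeholder
]
antibody_names = {"[cgc2045]:glp-1"}  # assumed to be a valid abp_name
flagged = audit_antibody_remarks(rows, antibody_names)
```

Genuine free-text remarks such as pgid 6 pass through untouched, so the output is just the short list of rows to move or fix in the source file.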


Other_method doubling should be fixed, double check -J

it is fixed. I checked in 'gene_regulation.ace.20101102' file. -X