Difference between revisions of "OA-phenotype"

Revision as of 01:25, 29 February 2012

package dump script

tazendra /home/acedb/work/allele_phenotype/use_package.pl

use_package.pl is a Perl script that uses a Perl module to generate an error file and two .ace files (two since May 2011): one for phenotypes and one for molecule-phenotype.

Constraints:

The script does not dump a record:
- when the value in Curation status is "down right disgusted"; this acts as our no-dump toggle
- when there is no value for phenotype; this stops non-curated NBP data from being dumped

It will dump if a phenotype is present, regardless of curator.
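
As a rough illustration only, the constraints above can be read as a single yes/no check per record. The sketch below is Python rather than the Perl of use_package.pl, and it is not taken from that script. The dictionary-per-record shape is an assumption; app_curation_status and app_term are the postgres tables that hold the curation status and the phenotype.

```python
NO_DUMP_STATUS = "down right disgusted"


def should_dump(record):
    """Return True when a record passes the two dump constraints.

    `record` is assumed to be a dict keyed by postgres table name.
    """
    # The curation status acts as the no-dump toggle.
    if record.get("app_curation_status") == NO_DUMP_STATUS:
        return False
    # No phenotype (app_term) means no dump, whoever the curator is.
    if not record.get("app_term"):
        return False
    return True
```

The curator is deliberately not consulted, matching the rule that a record with a phenotype is dumped regardless of curator.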

requested changes

5/17/2011

add a constraint - for rearrangement objects, dump all phenotype information except molecule values (molecule data should not be annotated for rearrangements so this is just a fail safe in case it does happen).

in addition to the varphene.ace output file, please output a separate file called mol_phene.ace, which contains:
- molecule variation phenotype
- molecule strain phenotype
- molecule transgene phenotype
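
One way to picture the 5/17/2011 request is a function that returns the rows destined for mol_phene.ace and returns nothing for rearrangements. This is a Python sketch under assumptions, not the script's code: the keys object_type and object_name and the list-of-molecules shape are hypothetical, while app_molecule and app_term are existing postgres tables.

```python
# Object types whose molecule data goes to mol_phene.ace.
MOL_PHENE_OBJECT_TYPES = {"variation", "strain", "transgene"}


def mol_phene_rows(record):
    """Return (molecule, object_type, object_name, phenotype) tuples.

    Rearrangements are not in MOL_PHENE_OBJECT_TYPES, so any molecule
    values accidentally annotated on them are dropped (the fail safe).
    """
    if record.get("object_type") not in MOL_PHENE_OBJECT_TYPES:
        return []
    molecules = record.get("app_molecule") or []
    return [
        (molecule, record["object_type"], record["object_name"], record.get("app_term"))
        for molecule in molecules
    ]
```

The rest of the phenotype information for a rearrangement would still go to varphene.ace; only the molecule values are withheld.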

2/2012

I can't tell if all 5 of these are new fields, or just some. They are all new fields.
If they're all fields, are they all at the bottom of tab2, or just the first 2 fields ? Yes, add them all to tab 2 bottom
Are you sure you want those field names, they're really long (which is fine by me, but will take up more space for you) -- J Good point, I shortened them

Add fields to OA - add to TAB2 at the bottom in the following order
- (NEW FIELD) rescued by - multi-ontology from transgene tables, autocomplete on transgene name
- (NEW FIELD) legacy info - data parsed from the legacy data: all entries with [celegans] from the file on tazendra and mangolassi at /home/acedb/work/allele_phenotype
  This file ? /home/acedb/work/allele_phenotype/legacy_information.txt There are no entries with "[celegans]". Do you mean lines where it says --;"[C.elegansII]-- ?
  yes
  What do you mean by parse ? Enter everything in each line into a new app_ OA line with its own pgid ? Or split on ";" and only enter stuff from the third column ? or something else ? -- J
  take everything in quotes starting with the third column where there is "[C.elegansII]"; in some cases there are more semicolons, and after the third column these need to be ignored (not treated as separators)
  - each entry gets its own pgid
  - add curation status (app_curation_status) of "down right disgusted" so lines that I have not touched do not get dumped.
  - make legacy data editable text or bigtext ? -- J bigtext
  - what other fields ? no app_name ? you for app_curator ? -- J sure, me for app_curator
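
The parsing rule agreed above (third column onward, only where the text carries the C.elegansII marker, later semicolons ignored) could look roughly like this. It is a Python sketch, not the loader that was actually written. The exact layout of legacy_information.txt is assumed from the discussion, and the key legacy_info is a hypothetical stand-in because the real table name is not given here.

```python
import re

# Accepts both spellings seen in the discussion: "[C.elegansII]" and "[C. elegansII]".
LEGACY_MARKER = re.compile(r"^\[C\.\s*elegansII\]")


def parse_legacy_line(line):
    """Return the third-column text of a legacy line, or None to skip it."""
    # Split on the first two semicolons only, so later semicolons stay
    # inside the third column instead of starting new columns.
    columns = line.rstrip("\n").split(";", 2)
    if len(columns) < 3:
        return None
    text = columns[2].strip()
    # Drop the surrounding double quotes if both are present.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        text = text[1:-1]
    if not LEGACY_MARKER.match(text):
        return None
    return text


def legacy_record(pgid, text, curator):
    """Build the fields for one new OA line; every entry gets its own pgid."""
    return {
        "pgid": pgid,
        "legacy_info": text,  # hypothetical key, real table name not given
        "app_curation_status": "down right disgusted",  # untouched lines are not dumped
        "app_curator": curator,
    }
```

Limiting the split to two separators is what keeps semicolons inside the legacy text from being read as extra columns.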

- (NEW FIELD) ES - drop down list, with values
  - "ES0_Impossible_to_score", "ES1_Very_difficult_to_score", "ES2_Difficult_to_score", "ES3_Easy_to_score"
- (NEW FIELD) ME - drop down list with values
  - "ME0_Mating_not_successful", "ME1_Mating_rarely_successful", "ME2_Mating_usually_successful", "ME3_Mating_always_successful"
- (NEW FIELD) HME - drop down list with values
  - "HME0_Mating_not_successful", "HME1_Mating_rarely_successful", "HME2_Mating_usually_successful", "HME3_Mating_always_successful"

CHANGES TO DUMP SCRIPT

mating_efficiency
- constrain lines with mating_efficiency values to be NOT NULL in app_curator, app_tempname (variation), app_person OR app_paper
  what does constrain mean ? check_data button on OA ? so not in dumping script ? I can't find stuff like this in the dumping script. Unless it's a new thing, but I thought this was in the check_data button. Did we ever wiki the dumping script ? -- J
  https://bitbucket.org/kyook/ky_wbprojects/wiki/use_package.pl We did go over the dumping script; there are some constraints (rules? you probably have another term for this) that I thought were employed, see the Bitbucket page.

- lines with mating_efficiency can be blank for phenotype (app_phenotype)
  Do you mean app_term ?
  yes, sorry
  This implies they can't be blank for other stuff. The script only does some stuff for pgids with data in app_term; does it already ever dump stuff when there isn't an app_term, and this new thing should join that, or is this the first time it will do that ? If we haven't gone over the dumping script, we should; this seems like a pretty big change -- J
  You are correct, the script already has rules in it to not dump data when there is no phenotype (app_term); however, lines with mating efficiency values (ME and/or HME) will need to escape that rule. I hope this is not a big change(!)
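
To make the two mating_efficiency items above concrete, here is a Python sketch of how the exception might sit next to the existing no-phenotype rule. It is not the dump script's code. The keys me and hme are hypothetical names for the new ME and HME fields, and the reading of the NOT NULL rule (curator and variation required, plus at least one of person or paper) is one interpretation of the request.

```python
NO_DUMP_STATUS = "down right disgusted"


def has_mating_efficiency(record):
    # "me" and "hme" are hypothetical keys for the new ME and HME fields.
    return bool(record.get("me") or record.get("hme"))


def mating_efficiency_complete(record):
    """Curator and variation are required, plus a person or a paper."""
    return bool(
        record.get("app_curator")
        and record.get("app_tempname")
        and (record.get("app_person") or record.get("app_paper"))
    )


def should_dump(record):
    if record.get("app_curation_status") == NO_DUMP_STATUS:
        return False
    # Lines with ME or HME escape the app_term rule but must be complete.
    if has_mating_efficiency(record):
        return mating_efficiency_complete(record)
    # Everything else still needs a phenotype in app_term.
    return bool(record.get("app_term"))
```

Whether an incomplete mating-efficiency line should be skipped silently or written to the error file is left open here.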

- .ace should look like
  the example is good -- J

Variation : "WBVar00266499"
Male	"ME2_Mating_usually_successful"	Curator_confirmed	"WBPerson712"
Male	"ME2_Mating_usually_successful"	Person_evidence	"WBPerson261"
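
For reference, the mating-efficiency example above can be produced by a few lines of string formatting. This Python sketch only mirrors the layout shown (tab-separated fields, quoted values); the function name and arguments are illustrative, and it covers the Male tag only, since no tag for HME appears in the example.

```python
def mating_efficiency_ace(variation, value, curator, persons):
    """Format one Variation entry with Male mating-efficiency lines."""
    lines = [f'Variation : "{variation}"']
    lines.append(f'Male\t"{value}"\tCurator_confirmed\t"{curator}"')
    for person in persons:
        lines.append(f'Male\t"{value}"\tPerson_evidence\t"{person}"')
    return "\n".join(lines)


print(mating_efficiency_ace(
    "WBVar00266499", "ME2_Mating_usually_successful", "WBPerson712", ["WBPerson261"]
))
```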

ease_of_scoring
- .ace should be like

Variation : "WBVar00266499"
Species	"Caenorhabditis_elegans"
Phenotype	"WBPhenotype:0000456"	Curator_confirmed	"WBPerson712"
Phenotype	"WBPhenotype:0000456"	Person_evidence	"WBPerson261"
Phenotype	"WBPhenotype:0000456"	Remark	"touch-insensitive"	Curator_confirmed	"WBPerson712"
Phenotype	"WBPhenotype:0000456"	Remark	"touch-insensitive"	Person_evidence	"WBPerson261"
Phenotype	"WBPhenotype:0000456"	Ease_of_scoring	"ES2_Difficult_to_score"	Curator_confirmed	"WBPerson712"
Phenotype	"WBPhenotype:0000456"	Ease_of_scoring	"ES2_Difficult_to_score"	Person_evidence	"WBPerson261"

rescued_by_transgene
- .ace should be like

Variation : "WBVar00266499"
Species	"Caenorhabditis_elegans"
Phenotype	"WBPhenotype:0000456"	Curator_confirmed	"WBPerson712"
Phenotype	"WBPhenotype:0000456"	Person_evidence	"WBPerson261"
Phenotype	"WBPhenotype:0000456"	Remark	"touch-insensitive"	Curator_confirmed	"WBPerson712"
Phenotype	"WBPhenotype:0000456"	Remark	"touch-insensitive"	Person_evidence	"WBPerson261"
Phenotype	"WBPhenotype:0000456"	Rescued_by_Transgene	"asIs432248"	Curator_confirmed	"WBPerson712"
Phenotype	"WBPhenotype:0000456"	Rescued_by_Transgene	"asIs432248"	Person_evidence	"WBPerson712"
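
The ease_of_scoring and rescued_by_transgene examples share one pattern: each sub-tag under a phenotype is written twice, once per evidence type. A Python sketch of that pattern follows; the function and its arguments are illustrative and not taken from the dump script, and the Species line is copied from the examples rather than looked up.

```python
def phenotype_ace(variation, phenotype, curator, person, extras):
    """Format a Variation entry with phenotype lines.

    `extras` is a list of (tag, value) pairs, for example
    ("Ease_of_scoring", "ES2_Difficult_to_score").
    """
    evidence = [("Curator_confirmed", curator), ("Person_evidence", person)]
    lines = [f'Variation : "{variation}"', 'Species\t"Caenorhabditis_elegans"']
    # The bare phenotype lines come first, one per evidence type.
    for ev_tag, ev_value in evidence:
        lines.append(f'Phenotype\t"{phenotype}"\t{ev_tag}\t"{ev_value}"')
    # Then each sub-tag is repeated for every evidence type.
    for tag, value in extras:
        for ev_tag, ev_value in evidence:
            lines.append(
                f'Phenotype\t"{phenotype}"\t{tag}\t"{value}"\t{ev_tag}\t"{ev_value}"'
            )
    return "\n".join(lines)
```

Calling it with extras of ("Remark", "touch-insensitive") and ("Ease_of_scoring", "ES2_Difficult_to_score") gives the same eight lines as the ease_of_scoring example.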

legacy data- needs to be mapped to corresponding gene (app_wbgene)
while the table app_wbgene still exists, it is 1) not in the OA, 2) does not have WBGene objects, it has data in what looks like a bad format : WBGene00003883 (osm-1 (WBGene00003883)) or WBGene00000058 (acr-19). I don't know where this came from, but it seems bad.
This is most likely from the Variation_gene.txt I create after each build so alleles can be mapped to genes.

If you meant that we should dump whatever is in legacy pgids to existing app_wbgene, I don't see how since the legacy pgids are new, and the app_wbgene pgids are old and you can't add data to them without going to postgres directly.
no, you are correct that is not what I meant.

If you know where this data came from, let me know if it's good. If it's not good, let's back it up and get rid of the table -- J
if it isn't being used then yes, by all means we should get rid of it. Can you get rid of it on mangolassi so we can see what happens? As it is an app_wbgene table it is only looked at by the phenotype OA, correct?
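
If the app_wbgene values quoted above ever need to be inspected before the table is backed up or dropped, the leading identifier can be pulled out with a small regular expression. This Python sketch assumes WBGene identifiers are "WBGene" followed by eight digits, as in the two quoted values; it is not part of the OA or the dump script.

```python
import re

# Assumes the eight-digit form seen in the quoted values.
WBGENE_ID = re.compile(r"WBGene\d{8}")


def wbgene_id(value):
    """Return the first WBGene identifier in a stored value, or None."""
    match = WBGENE_ID.search(value)
    return match.group(0) if match else None
```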

- .ace should be like

Gene :	"WBGene00003173"
Legacy_information	"[C.elegansII] u27amb : OA>30: e1494, e1852, u164amb etc. Cloned: 2 kb and 3 kb transcripts, transpliced to  SL1 and SL2, encode protein with Kunitztype serine protease inhibitor domains, EGF-like repeats.  mec-9:GFP expressed in touch  cells and PVD (3kb promoter) or more extensively (2kb promoter). [Chalfie and Au 1989\; Huang and Chalfie 1994\; TU]"
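
Note that the semicolons inside the legacy text in the example above are written as "\;". A Python sketch of formatting one Gene entry that way is below. It handles semicolons only, mirroring the example; whether other characters need escaping is outside this sketch, and the function name is illustrative.

```python
def legacy_ace(gene, text):
    """Format one Gene entry with its Legacy_information line."""
    # Write each semicolon as "\;" as in the example entry.
    escaped = text.replace(";", "\\;")
    return f'Gene :\t"{gene}"\nLegacy_information\t"{escaped}"'


print(legacy_ace("WBGene00003173", "[C.elegansII] u27amb. [Chalfie and Au 1989; TU]"))
```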

--kjy 23:33, 13 February 2012 (UTC)

Phenotype OA postgres tables

postgres tables
app_allele_status	app_finished_hst	app_not_hst	app_range_hst
app_allele_status_hst	app_func	app_obj_remark	app_range_start
app_anat_term	app_func_hst	app_obj_remark_hst	app_range_start_hst
app_anat_term_hst	app_genotype	app_paper	app_remark
app_caused_by	app_genotype_hst	app_paper_hst	app_rnai_brief
app_caused_by_hst	app_go_sug	app_paper_remark	app_rnai_brief_hst
app_caused_by_other	app_go_sug_hst	app_paper_remark_hst	app_species
app_caused_by_other_hst	app_haplo	app_pat_effect	app_species_hst
app_child_of	app_haplo_hst	app_pat_effect_hst	app_strain
app_child_of_hst	app_heat_degree	app_penetrance	app_strain_hst
app_cold_degree	app_heat_degree_hst	app_penetrance_hst	app_suggested
app_cold_degree_hst	app_heat_sens	app_percent	app_suggested_definition
app_cold_sens	app_heat_sens_hst	app_percent_hst	app_suggested_definition_hst
app_cold_sens_hst	app_intx_desc	app_person	app_suggested_hst
app_control_isolate	app_intx_desc_hst	app_person_hst	app_sug_ref
app_control_isolate_hst	app_laboratory	app_phenotype	app_sug_ref_hst
app_curation_status	app_laboratory_hst	app_phenotype_hst	app_temperature
app_curation_status_hst	app_lifestage	app_phen_remark	app_temperature_hst
app_curator	app_lifestage_hst	app_phen_remark_hst	app_tempname
app_curator_hst	app_mat_effect	app_preparation_hst	app_tempname_hst
app_delivered	app_mat_effect_hst	app_quality	app_term
app_delivered_hst	app_molecule	app_quality_hst	app_term_hst
app_entity	app_molecule_hst	app_quantity	app_treatment
app_entity_hst	app_nature	app_quantity_hst	app_treatment_hst
app_filereaddate	app_nature_hst	app_quantity_remark	app_type
app_filereaddate_hst	app_nbp	app_quantity_remark_hst	app_type_hst
app_finalname	app_nbp_hst	app_range_end	app_wbgene
app_finalname_hst	app_not	app_range_end_hst	app_wbgene_hst

@@ Line 19: / Line 19: @@
 ====2/2012====
-'''I can't tell if all 5 of these are new fields, or just some.  If they're all fields, are they all at the bottom of tab2, or just the first 2 fields ?  Are you sure you want those field names, they're really long (which is fine by me, but will take up more space for you) -- J'''
+'''I can't tell if all 5 of these are new fields, or just some.'''    They are all new fields. <br>
+'''If they're all fields, are they all at the bottom of tab2, or just the first 2 fields ?''' Yes, add them all to tab 2 bottom<br>
+'''Are you sure you want those field names, they're really long (which is fine by me, but will take up more space for you) -- J''' Good point, I shortened them<br>
 *Add fields to OA - add to TAB2 at the bottom in the following order
-**rescued_by_transgene - multi-ontology from transgene tables, autocomplete on transgene name
+**(NEW FIELD) rescued by - multi-ontology from transgene tables, autocomplete on transgene name
-**legacy information -
+**(NEW FIELD) legacy info - parsed data from legacy data- all entries with [celegans] from file on tazendra and mangolassi at /home/acedb/work/allele_phenotype '''This file ? /home/acedb/work/allele_phenotype/legacy_information.txt   There are no entries with "[celegans]"  Do you mean lines where it says --;"[C.elegansII]-- ?'''  yes<br>
-***parse data from legacy data- all entries with [celegans] from file on tazendra and mangolassi at /home/acedb/work/allele_phenotype '''This file ? /home/acedb/work/allele_phenotype/legacy_information.txt   There are no entries with "[celegans]"  Do you mean lines where it says --;"[C.elegansII]-- ?  What do you mean by parse ?  Enter everything in each line into a new app_ OA line with its own pgid ?  Or split on ";" and only enter stuff from the third column ?  or something else ? -- J'''
+'''What do you mean by parse ?  Enter everything in each line into a new app_ OA line with its own pgid ? Or split on ";" and only enter stuff from the third column ?  or something else ? -- J''' take everything in quotes starting with the third column where there is "[C. elegansII], in some cases there are more semi-colons, these will need to be ignored after the third column<br>
 ***each entry gets its own pgid
 ***add curation status (app_curation_status) of "down right disgusted" so lines that I have not touched do not get dumped.
-***make legacy data editable '''text or bigtext ? -- J'''
+***make legacy data editable '''text or bigtext ? -- J''' bigtext <br>
-***'''what other fields ?  no app_name ?  you for app_curator ? -- J'''
+***'''what other fields ?  no app_name ?  you for app_curator ? -- J''' sure, me for app_curator<br>
-*ease of scoring - drop down list, with values
+**(NEW FIELD) ES - drop down list, with values
-**"ES0_Impossible_to_score", "ES1_Very_difficult_to_score", "ES2_Difficult_to_score", "ES3_Easy_to_score"
+***"ES0_Impossible_to_score", "ES1_Very_difficult_to_score", "ES2_Difficult_to_score", "ES3_Easy_to_score"
-*male mating efficiency - drop down list with values
+**(NEW FIELD) ME - drop down list with values
-**"ME0_Mating_not_successful", "ME1_Mating_rarely_successful", "ME2_Mating_usually_successful", "ME3_Mating_always_successful"
+***"ME0_Mating_not_successful", "ME1_Mating_rarely_successful", "ME2_Mating_usually_successful", "ME3_Mating_always_successful"
-*hermaphrodite mating efficiency - drop down list with values
+**(NEW FIELD) HME - drop down list with values
-**"HME0_Mating_not_successful", "HME1_Mating_rarely_successful", "HME2_Mating_usually_successful", "HME3_Mating_always_successful"
+***"HME0_Mating_not_successful", "HME1_Mating_rarely_successful", "HME2_Mating_usually_successful", "HME3_Mating_always_successful"
 '''CHANGES TO DUMP SCRIPT'''
 *mating_efficiency
-**constrain lines with mating_efficiency values to be NOT NULL in app_curator, app_tempname (variation), app_person OR app_paper '''what does constrain mean ?  check_data button on OA ?  so not in dumping script ?  I can't find stuff like this in the dumping script.  Unless it's a new thing, but I thought this was in the check_Data button.  Did we ever wiki the dumping script ? -- J'''
+**constrain lines with mating_efficiency values to be NOT NULL in app_curator, app_tempname (variation), app_person OR app_paper '''what does constrain mean ?  check_data button on OA ?  so not in dumping script ?  I can't find stuff like this in the dumping script.  Unless it's a new thing, but I thought this was in the check_Data button.  Did we ever wiki the dumping script ? -- J'''     https://bitbucket.org/kyook/ky_wbprojects/wiki/use_package.pl we did go over the dumping script,  there are some constraints (rules? you probably have another term for this) that I thought were employed, see the bit bucket page.
-**lines with mating_efficiency can be blank for phenotype (app_phenotype) '''Do you mean app_term ?  This implies they can't be blank for other stuff, the script only does some stuff for pgids with data in app_term, does it already ever dump stuff when there isn't an app_term, and this new thing should join that, or is this the first time it will do that ?  If we haven't gone the dumping script, we should, this seems like a pretty big change -- J'''
+**lines with mating_efficiency can be blank for phenotype (app_phenotype) '''Do you mean app_term ?'''    yes, sorry<br>  '''This implies they can't be blank for other stuff, the script only does some stuff for pgids with data in app_term, does it already ever dump stuff when there isn't an app_term, and this new thing should join that, or is this the first time it will do that ?  If we haven't gone the dumping script, we should, this seems like a pretty big change -- J'''   You are correct the script already has rules in it to not dump data when there is no phenotype (app_term); however, lines with mating efficiency values (ME and or HME) will need to escape that rule. I hope this is not a big change(!)
 **.ace should look like '''the example is good -- J'''
   Variation : "WBVar00266499"
@@ Line 70: / Line 74: @@
   Phenotype	"WBPhenotype:0000456"	Rescued_by_Transgene	"asIs432248"	Person_evidence	"WBPerson712"
-*legacy data- needs to be mapped to corresponding gene (app_wbgene)<br> '''while the table app_wbgene still exists, it is 1) not in the OA, 2) does not have WBGene objects, it has data in what looks like a bad format : WBGene00003883 (osm-1 (WBGene00003883))  or   WBGene00000058 (acr-19).  I don't know where this came from, but it seems bad.  If you meant that we should dump whatever is in legacy pgids to existing app_wbgene, I don't see how since the legacy pgids are new, and the app_wbgene pgids are old and you can't add data to them without going to postgres directly.  If you know where this data came from, let me know if it's good.  If it's not good, let's back it up and get rid of the table -- J'''
+*legacy data- needs to be mapped to corresponding gene (app_wbgene)<br> '''while the table app_wbgene still exists, it is 1) not in the OA, 2) does not have WBGene objects, it has data in what looks like a bad format : WBGene00003883 (osm-1 (WBGene00003883))  or   WBGene00000058 (acr-19). I don't know where this came from, but it seems bad.'''     This is most likely from the Variation_gene.txt I create after each build so alleles can be mapped to genes.
+'''If you meant that we should dump whatever is in legacy pgids to existing app_wbgene, I don't see how since the legacy pgids are new, and the app_wbgene pgids are old and you can't add data to them without going to postgres directly.'''    no, you are correct that is not what I meant.
+'''If you know where this data came from, let me know if it's good.  If it's not good, let's back it up and get rid of the table -- J'''    if it isn't being used then yes, by all means we should get rid of it.  Can you get rid of it on mangolassi so we can see what happens?  As it is an app_wbgene table it is only looked at by the phenotype OA correct?
 **.ace should be like
   Gene :	"WBGene00003173"
