Populating the OA and New Dumping Script


Latest revision as of 22:19, 21 February 2015

1. Need to populate postgres (GO OA tables) with data from the Phenotype2GO file

2. File to parse into postgres is currently in this directory on mangolassi:

/home/postgres/work/pgpopulation/go/go_curation/20141106_kevin_godata/newGpaEntries

3. Table mapping columns in the newGpaEntries file to GO OA tables is here:

http://wiki.wormbase.org/index.php/Mapping_the_GAF_to_GO_OA_tables

Will need to add three values to the Qualifier drop-down menu, ‘enables’, ‘involved_in’, ‘part_of’

4. Once file is read into OA tables, then we’ll need a new dumping script.

5. Specs for new OA dumping script

Model for WS248:

http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/wormbase/wspec/models.wrm?revision=1.416&root=ensembl&view=markup

Can likely borrow a lot from the gpad parsing script on tazendra /home/acedb/kimberly/citace_upload/go/go_gpad_parser.pl

  • Annotation ID

Will need to determine what the starting ID should be.

Could look at the main .ace file (from gpad conversion - gp_annotation.ace) and start with +1 from the end.

Currently, that file is on tazendra here: /home/acedb/kimberly/citace_upload/go/gp_annotation.ace

GO_annotation : “00055495”
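One way to find the starting ID is sketched below. It assumes each object in gp_annotation.ace begins with a header line of the form GO_annotation : "00055495" and that IDs are zero-padded to eight digits, as in the example above. The function name `next_annotation_id` and the sample lines are hypothetical. The sketch takes the highest ID found rather than the last one in the file; the two agree when the file is in ascending order.

```python
import re

# Matches object headers such as: GO_annotation : "00055495"
# Curly quotes are accepted as well, since the wiki renders some that way.
HEADER_RE = re.compile(r'^GO_annotation\s*:\s*["“](\d+)["”]')


def next_annotation_id(ace_lines, width=8):
    """Return the highest GO_annotation ID found plus one, zero-padded."""
    highest = 0
    for line in ace_lines:
        match = HEADER_RE.match(line.strip())
        if match:
            highest = max(highest, int(match.group(1)))
    return str(highest + 1).zfill(width)


# Hypothetical excerpt standing in for gp_annotation.ace.
sample = [
    'GO_annotation : "00055494"',
    'Gene "WBGene00004362"',
    '',
    'GO_annotation : "00055495"',
    'GO_code "IMP"',
]
print(next_annotation_id(sample))
```

For the real file, the lines could be read with `open(path)` and passed to the same function.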


  • Gene

From gop_wbgene

Gene “WBGene00004362”


  • GO_term

From gop_goid

GO_term "GO:0005634"


  • GO_code

From gop_goinference

GO_code “IMP”


  • Annotation_relation

From gop_qualifier

Will need to add three values to the drop-down menu: ‘enables’, ‘involved_in’, ‘part_of’.

Then populate all ‘P’ with ‘involved_in’, all ‘F’ with ‘enables’, and all ‘C’ with ‘part_of’.

Can be a multi-value field, i.e., can have something like ‘NOT’ ‘enables’.

Annotation_relation “enables”

Annotation_relation “NOT”

Annotation_relation “enables”
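The default-relation rule could be sketched as follows. This assumes the aspect letter (P, F or C) is available for each row and that gop_qualifier has already been read into a list of strings; the function names and the rule "only add a default when none of the three relations is already present" are illustrative assumptions, not part of the spec.

```python
# Default relation for each GO aspect, as listed in the spec above.
ASPECT_TO_RELATION = {"P": "involved_in", "F": "enables", "C": "part_of"}


def with_default_relation(qualifiers, aspect):
    """Return the qualifier values, adding the aspect default if none is set."""
    values = list(qualifiers)
    defaults = set(ASPECT_TO_RELATION.values())
    if not any(value in defaults for value in values):
        values.append(ASPECT_TO_RELATION[aspect])
    return values


def annotation_relation_lines(qualifiers, aspect):
    """Format one Annotation_relation line per value (multi-value field)."""
    return ['Annotation_relation "{}"'.format(value)
            for value in with_default_relation(qualifiers, aspect)]


for line in annotation_relation_lines(["NOT"], "F"):
    print(line)
```

A row whose qualifier is only ‘NOT’ on a molecular function term would then produce both the ‘NOT’ and the ‘enables’ lines.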


  • Annotation_made_with

gop_with_wbgene

Interacting_gene “WBGene00000936”

Multiple entries

Interacting_gene “WBGene00000936”

Interacting_gene “WBGene00000937”


gop_with

Inferred_from_GO_term "GO:0035195"


gop_with_wbvariation

Variation “WBVar00278468”


gop_with_RNAi

RNAi_result “WBRNAi00035226”


gop_with_phenotype

Phenotype “WBPhenotype:0000059”
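The table-to-tag pairs above lend themselves to a lookup table. The sketch below assumes each OA row has been loaded into a dict that maps a table name to a list of values; that row shape and the name `made_with_lines` are hypothetical.

```python
# Postgres table -> .ace tag, as listed under Annotation_made_with.
WITH_TABLE_TO_TAG = {
    "gop_with_wbgene": "Interacting_gene",
    "gop_with": "Inferred_from_GO_term",
    "gop_with_wbvariation": "Variation",
    "gop_with_RNAi": "RNAi_result",
    "gop_with_phenotype": "Phenotype",
}


def made_with_lines(row):
    """Emit one tag line per value; multi-value fields give multiple lines."""
    lines = []
    for table, tag in WITH_TABLE_TO_TAG.items():
        for value in row.get(table, []):
            lines.append('{} "{}"'.format(tag, value))
    return lines


# Hypothetical row with two interacting genes and one variation.
row = {
    "gop_with_wbgene": ["WBGene00000936", "WBGene00000937"],
    "gop_with_wbvariation": ["WBVar00278468"],
}
for line in made_with_lines(row):
    print(line)
```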


  • Annotation_extension

gop_xrefto

Syntax: relation(entity)

e.g., has_regulation_target(WB:WBGene00002015)

e.g., happens_during(WBls:00234)

Tag used depends on what type of entity is in the parentheses


If entity = WBls:nnnnnn
Life_stage_relation “relation” “WBls:nnnnn”

If entity = WB:WBGenennnnnnnn
Gene_relation “relation” “WBGenennnnnnnn”

If entity = WBMol:nnnnnnn
Molecule_relation “relation” “WBMol:nnnnnnn”

If entity = WBbt:nnnnnnn
Anatomy_relation “relation” “WBbt:nnnnnnn”

If entity = GO:nnnnnnn
GO_term_relation “relation” “GO:nnnnnnn”


Note: When there are multiple annotation extensions separated by commas, they need to be split and each relation(entity) populated separately. On mangolassi see pgid 14233 in the GO OA for an example.

Example:

has_regulation_target(WB:WBGene00002015),happens_during(GO:0001666)

Gene_relation "has_regulation_target" "WBGene00002015"

GO_term_relation "happens_during" "GO:0001666"
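The split-and-dispatch step can be sketched as below. It assumes the gop_xrefto value is a comma-separated string of relation(entity) items and that entities never contain commas or parentheses. The names `extension_lines` and `PREFIX_RULES` are hypothetical; the prefix rules follow the list above, including dropping the "WB:" prefix for genes only.

```python
import re

# One relation(entity) item, e.g. happens_during(GO:0001666)
EXTENSION_RE = re.compile(r'^\s*([A-Za-z_]+)\(([^()]+)\)\s*$')

# (entity prefix, .ace tag, whether to drop the leading "WB:")
PREFIX_RULES = [
    ("WBls:", "Life_stage_relation", False),
    ("WB:WBGene", "Gene_relation", True),
    ("WBMol:", "Molecule_relation", False),
    ("WBbt:", "Anatomy_relation", False),
    ("GO:", "GO_term_relation", False),
]


def extension_lines(xrefto):
    """Split on commas and emit one tag line per relation(entity) item."""
    lines = []
    for part in xrefto.split(","):
        if not part.strip():
            continue
        match = EXTENSION_RE.match(part)
        if match is None:
            raise ValueError("not relation(entity): {!r}".format(part))
        relation, entity = match.groups()
        for prefix, tag, strip_wb in PREFIX_RULES:
            if entity.startswith(prefix):
                if strip_wb:
                    entity = entity[len("WB:"):]
                lines.append('{} "{}" "{}"'.format(tag, relation, entity))
                break
        else:
            # No rule matched: fail loudly rather than guess a tag.
            raise ValueError("unknown entity type: {!r}".format(entity))
    return lines


example = "has_regulation_target(WB:WBGene00002015),happens_during(GO:0001666)"
for line in extension_lines(example):
    print(line)
```

Raising on an unrecognized entity prefix is a design choice for the sketch; a dumping script might instead log the pgid and continue.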


  • gop_protein (don’t have any at the moment)

Annotation_isoform “UniProtKB:nnnnnn”


  • gop_paper

Reference “WBPaper00028482”


  • gop_accession

GO_reference “Gene Ontology Consortium” “GO_REF” “nnnnnnn”


  • Contributed_by “WormBase”

This will be a default value.


  • gop_lastupdate

Date_last_updated “YYYY-MM-DD”
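The last few lines of each object could be formatted as in the sketch below. Two things here are assumptions that the spec does not state: that gop_accession holds a value shaped like "GO_REF:nnnnnnn", and that gop_lastupdate holds a timestamp string beginning with YYYY-MM-DD. The function names and sample values are hypothetical.

```python
from datetime import datetime


def go_reference_line(accession):
    """Split an accession such as GO_REF:nnnnnnn into database and number."""
    database, _, number = accession.partition(":")
    return 'GO_reference "Gene Ontology Consortium" "{}" "{}"'.format(
        database, number)


def date_last_updated_line(timestamp):
    """Keep only the date part; strptime rejects a malformed date."""
    day = datetime.strptime(timestamp[:10], "%Y-%m-%d").date()
    return 'Date_last_updated "{}"'.format(day.isoformat())


# Contributed_by is a fixed default value according to the spec.
CONTRIBUTED_BY_LINE = 'Contributed_by "WormBase"'

# Hypothetical stored values.
print(go_reference_line("GO_REF:0000033"))
print(CONTRIBUTED_BY_LINE)
print(date_last_updated_line("2015-02-21 22:19:00"))
```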

