Populating the OA and New Dumping Script
Revision as of 22:17, 21 February 2015
1. Need to populate postgres (GO OA tables) with data from the Phenotype2GO file.
2. File to parse into postgres is currently in this directory on mangolassi:
/home/postgres/work/pgpopulation/go/go_curation/20141106_kevin_godata/newGpaEntries
3. Table mapping columns in the newGpaEntries file to GO OA tables is here:
http://wiki.wormbase.org/index.php/Mapping_the_GAF_to_GO_OA_tables
Will need to add three values to the Qualifier drop-down menu: ‘enables’, ‘involved_in’, ‘part_of’.
4. Once file is read into OA tables, then we’ll need a new dumping script.
5. Specs for new OA dumping script
Model for WS248:
Can likely borrow a lot from the gpad parsing script on tazendra: /home/acedb/kimberly/citace_upload/go/go_gpad_parser.pl
- Annotation ID
Will need to determine what the starting ID should be.
Could look at the main .ace file (from gpad conversion - gp_annotation.ace) and start with +1 from the end.
Currently, that file is on tazendra here: /home/acedb/kimberly/citace_upload/go/gp_annotation.ace
GO_annotation : “00055495”
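The "+1 from the end" idea could be sketched roughly as below. This is only an illustration, not the actual dumping script: it assumes the .ace file uses straight double quotes, that IDs are zero-padded to 8 digits as in the example above, and it takes the highest ID found rather than literally the last one in the file. The function name `next_annotation_id` is hypothetical.

```python
import re

# Matches object header lines such as: GO_annotation : "00055495"
# (assumes straight quotes in the real .ace file)
HEADER = re.compile(r'^GO_annotation\s*:\s*"(\d+)"')


def next_annotation_id(ace_lines, width=8):
    """Return the ID one past the highest GO_annotation ID seen, zero-padded."""
    highest = 0
    for line in ace_lines:
        match = HEADER.match(line.strip())
        if match:
            highest = max(highest, int(match.group(1)))
    return str(highest + 1).zfill(width)


# Small in-memory stand-in for gp_annotation.ace
sample = [
    'GO_annotation : "00055494"',
    'Gene "WBGene00004362"',
    '',
    'GO_annotation : "00055495"',
    'GO_term "GO:0005634"',
]
print(next_annotation_id(sample))
```

With a real file, the same function could be called on an open file handle, since it only iterates over lines.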
- Gene
From gop_wbgene
Gene “WBGene00004362”
- GO_term
From gop_goid
GO_term "GO:0005634"
- GO_code
From gop_goinference
GO_code “IMP”
- Annotation_relation
From gop_qualifier.
Will need to add three values to the drop-down menu: ‘enables’, ‘involved_in’, ‘part_of’.
Then populate all ‘P’ with ‘involved_in’, all ‘F’ with ‘enables’, and all ‘C’ with ‘part_of’.
Can be a multi-value field, i.e., can have something like ‘NOT’ ‘enables’.
Annotation_relation “enables”
Annotation_relation “NOT”
Annotation_relation “enables”
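The P/F/C rule and the multi-value case might look something like the following sketch. It assumes the aspect letter and any existing qualifier values (such as ‘NOT’) are already available as plain Python values; how they are actually stored in the OA tables is not specified here, and the names `ASPECT_TO_RELATION` and `annotation_relation_lines` are made up for illustration.

```python
# Default relation per ontology aspect, as described above
ASPECT_TO_RELATION = {"P": "involved_in", "F": "enables", "C": "part_of"}


def annotation_relation_lines(aspect, qualifiers=()):
    """Build Annotation_relation lines, keeping existing qualifiers such as NOT."""
    relation = ASPECT_TO_RELATION[aspect]
    values = list(qualifiers)
    # Add the aspect-derived relation only if it is not already present
    if relation not in values:
        values.append(relation)
    return ['Annotation_relation "%s"' % value for value in values]


print("\n".join(annotation_relation_lines("F")))
print("\n".join(annotation_relation_lines("F", ["NOT"])))
```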
- Annotation_made_with
gop_with_wbgene
Interacting_gene “WBGene00000936”
Multiple entries
Interacting_gene “WBGene00000936”
Interacting_gene “WBGene00000937”
gop_with
Inferred_from_GO_term “GO:0035195”
gop_with_wbvariation
Variation “WBVar00278468”
gop_with_RNAi
RNAi_result “WBRNAi00035226”
gop_with_phenotype
Phenotype “WBPhenotype:0000059”
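The table-to-tag mapping for Annotation_made_with could be expressed as a simple lookup, as in this sketch. It assumes each row has already been read into a dict keyed by table name with a list of values per table; the real storage format for multi-value fields is not given here, and `made_with_lines` is a hypothetical helper name. One tag per output line is also an assumption.

```python
# OA table -> .ace tag, following the list above
WITH_TABLE_TO_TAG = {
    "gop_with_wbgene": "Interacting_gene",
    "gop_with": "Inferred_from_GO_term",
    "gop_with_wbvariation": "Variation",
    "gop_with_RNAi": "RNAi_result",
    "gop_with_phenotype": "Phenotype",
}


def made_with_lines(row):
    """Emit one tag line per value for every populated 'with' table."""
    lines = []
    for table, tag in WITH_TABLE_TO_TAG.items():
        for value in row.get(table, []):
            lines.append('%s "%s"' % (tag, value))
    return lines


row = {
    "gop_with_wbgene": ["WBGene00000936", "WBGene00000937"],
    "gop_with_phenotype": ["WBPhenotype:0000059"],
}
print("\n".join(made_with_lines(row)))
```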
- Annotation_extension
gop_xrefto
Syntax: relation(entity)
e.g., has_regulation_target(WB:WBGene00002015)
e.g., happens_during(WBls:00234)
The tag used depends on what type of entity is in the parentheses:
If entity = WBls:nnnnnn
Life_stage_relation “relation” “WBls:nnnnnn”
If entity = WB:WBGenennnnnnnn
Gene_relation “relation” “WBGenennnnnnnn”
If entity = WBMol:nnnnnnn
Molecule_relation “relation” “WBMol:nnnnnnn”
If entity = WBbt:nnnnnnn
Anatomy_relation “relation” “WBbt:nnnnnnn”
If entity = GO:nnnnnnn
GO_term_relation “relation” “GO:nnnnnnn”
When there are multiple annotation extensions separated by commas, split them and populate each relation(entity) pair separately.
Example:
has_regulation_target(WB:WBGene00002015),happens_during(GO:0001666)
Gene_relation "has_regulation_target" "WBGene00002015"
GO_term_relation "happens_during" "GO:0001666"
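The split-and-map rules for annotation extensions can be sketched as follows. This is a minimal illustration, not the dumping script itself: it assumes entities never contain commas or parentheses, that output uses straight quotes, and that only the "WB:" prefix on genes is dropped (as in the example above). The names `tag_for_entity` and `extension_lines` are hypothetical.

```python
import re

# relation(entity), e.g. has_regulation_target(WB:WBGene00002015)
EXTENSION = re.compile(r'^\s*([^()\s]+)\(([^()]+)\)\s*$')

# Entity prefix -> tag; these entities keep their prefix in the output
PREFIX_TO_TAG = {
    "WBls:": "Life_stage_relation",
    "WBMol:": "Molecule_relation",
    "WBbt:": "Anatomy_relation",
    "GO:": "GO_term_relation",
}


def tag_for_entity(entity):
    """Return (tag, value) for an entity, based on its prefix."""
    if entity.startswith("WB:WBGene"):
        # Genes are written without the leading "WB:"
        return "Gene_relation", entity[len("WB:"):]
    for prefix, tag in PREFIX_TO_TAG.items():
        if entity.startswith(prefix):
            return tag, entity
    raise ValueError("Unrecognized entity prefix: %r" % entity)


def extension_lines(xrefto):
    """Split a comma-separated extension string into one tag line per pair."""
    lines = []
    for part in xrefto.split(","):
        if not part.strip():
            continue  # skip empty pieces, e.g. from a trailing comma
        match = EXTENSION.match(part)
        if not match:
            raise ValueError("Not in relation(entity) form: %r" % part)
        relation, entity = match.groups()
        tag, value = tag_for_entity(entity.strip())
        lines.append('%s "%s" "%s"' % (tag, relation, value))
    return lines


example = "has_regulation_target(WB:WBGene00002015),happens_during(GO:0001666)"
print("\n".join(extension_lines(example)))
```

Raising an error on unrecognized prefixes or malformed pairs is a design choice for the sketch, so that unexpected gop_xrefto values are noticed rather than silently dropped.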
- gop_protein (don’t have any at the moment)
Annotation_isoform “UniProtKB:nnnnnn”
- gop_paper
Reference “WBPaper00028482”
- gop_accession
GO_reference “Gene Ontology Consortium” “GO_REF” “nnnnnnn”
- Contributed_by “WormBase”
This will be a default value.
- gop_lastupdate
Date_last_updated “YYYY-MM-DD”
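Putting the single-value fields together, one dumped object might be assembled along these lines. This is a sketch under several assumptions: the row is a dict keyed by table name, gop_lastupdate is already a YYYY-MM-DD string, output uses straight quotes, and the multi-value and GO_reference fields are left out. `dump_annotation` is a hypothetical name.

```python
# Single-value OA tables and the tags they map to, per the spec above
SIMPLE_FIELDS = [
    ("gop_wbgene", "Gene"),
    ("gop_goid", "GO_term"),
    ("gop_goinference", "GO_code"),
    ("gop_paper", "Reference"),
]


def dump_annotation(annotation_id, row):
    """Return the text of one GO_annotation object for the simple fields."""
    lines = ['GO_annotation : "%s"' % annotation_id]
    for table, tag in SIMPLE_FIELDS:
        value = row.get(table)
        if value:
            lines.append('%s "%s"' % (tag, value))
    # Contributed_by is a default value on every annotation
    lines.append('Contributed_by "WormBase"')
    if row.get("gop_lastupdate"):
        lines.append('Date_last_updated "%s"' % row["gop_lastupdate"])
    return "\n".join(lines) + "\n"


record = {
    "gop_wbgene": "WBGene00004362",
    "gop_goid": "GO:0005634",
    "gop_goinference": "IMP",
    "gop_paper": "WBPaper00028482",
    "gop_lastupdate": "2015-02-21",
}
print(dump_annotation("00055496", record))
```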