Populating the OA and New Dumping Script


Revision as of 22:17, 21 February 2015

1. Need to populate postgres (GO OA tables) with data from the Phenotype2GO file.

2. File to parse into postgres is currently in this directory on mangolassi:

/home/postgres/work/pgpopulation/go/go_curation/20141106_kevin_godata/newGpaEntries

3. Table mapping columns in the newGpaEntries file to GO OA tables is here:

http://wiki.wormbase.org/index.php/Mapping_the_GAF_to_GO_OA_tables

Will need to add three values to the Qualifier drop-down menu: ‘enables’, ‘involved_in’, ‘part_of’

4. Once file is read into OA tables, then we’ll need a new dumping script.

5. Specs for new OA dumping script

Model for WS248:

http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/wormbase/wspec/models.wrm?revision=1.416&root=ensembl&view=markup

Can likely borrow a lot from the gpad parsing script on tazendra: /home/acedb/kimberly/citace_upload/go/go_gpad_parser.pl

  • Annotation ID

Will need to determine what the starting ID should be.

Could look at the main .ace file (from the gpad conversion, gp_annotation.ace) and start with +1 from the end.

Currently, that file is on tazendra here: /home/acedb/kimberly/citace_upload/go/gp_annotation.ace

GO_annotation : “00055495”
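A minimal sketch of how the starting ID could be derived, assuming gp_annotation.ace contains object header lines of the form shown above and that IDs are zero-padded to eight digits. The function name `next_annotation_id` is hypothetical, and the sketch takes the highest ID in the file rather than literally the last one, on the assumption that the two are the same.

```python
import re

# Matches object header lines such as: GO_annotation : "00055495"
# Straight and curly quotes are both accepted, since the wiki text uses both.
HEADER_RE = re.compile(r'^GO_annotation\s*:\s*["“](\d+)["”]')


def next_annotation_id(ace_lines, width=8):
    """Return the ID after the highest GO_annotation ID seen, zero-padded."""
    highest = 0
    for line in ace_lines:
        match = HEADER_RE.match(line.strip())
        if match:
            highest = max(highest, int(match.group(1)))
    return str(highest + 1).zfill(width)


# Illustrative input only; real input would be the lines of gp_annotation.ace.
sample = [
    'GO_annotation : "00055494"',
    'Gene "WBGene00004362"',
    '',
    'GO_annotation : "00055495"',
    'GO_term "GO:0005634"',
]
print(next_annotation_id(sample))
```

For the real file, the lines could be read with `open(path)` and passed in directly, since the function only needs an iterable of strings.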


  • Gene

From gop_wbgene

Gene “WBGene00004362”


  • GO_term

From gop_goid

GO_term "GO:0005634"


  • GO_code

From gop_goinference

GO_code “IMP”


  • Annotation_relation

From gop_qualifier

Will need to add three values to the drop-down menu: ‘enables’, ‘involved_in’, ‘part_of’

Then populate all ‘P’ with ‘involved_in’, all ‘F’ with ‘enables’, all ‘C’ with ‘part_of’

Can be a multi-value field, i.e., can have something like ‘NOT’ ‘enables’

Annotation_relation “enables”

Annotation_relation “NOT”

Annotation_relation “enables”
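The P/F/C mapping and the multi-value case can be sketched as follows. This is only an illustration: the function name, the `aspect` argument, and the `negated` flag are hypothetical, and how the aspect letter and the NOT qualifier are actually stored in the OA tables is not specified here.

```python
# Aspect letter -> default relation, as described above.
ASPECT_TO_RELATION = {"P": "involved_in", "F": "enables", "C": "part_of"}


def annotation_relation_lines(aspect, negated=False):
    """Build the Annotation_relation lines for one annotation.

    A negated annotation gets a "NOT" line followed by the relation line.
    """
    relation = ASPECT_TO_RELATION[aspect]
    values = (["NOT"] if negated else []) + [relation]
    return ['Annotation_relation "{}"'.format(value) for value in values]


for line in annotation_relation_lines("F", negated=True):
    print(line)
```

An unknown aspect letter raises `KeyError`, which seems preferable to silently dumping an annotation with no relation.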


  • Annotation_made_with

gop_with_wbgene

Interacting_gene “WBGene00000936”

Multiple entries

Interacting_gene “WBGene00000936” Interacting_gene “WBGene00000937”


gop_with

Inferred_from_GO_term “GO:0035195”


gop_with_wbvariation

Variation “WBVar00278468”


gop_with_RNAi

RNAi_result “WBRNAi00035226”


gop_with_phenotype

Phenotype “WBPhenotype:0000059”


  • Annotation_extension

gop_xrefto

Syntax: relation(entity)

e.g., has_regulation_target(WB:WBGene00002015)

e.g., happens_during(WBls:00234)

Tag used depends on what type of entity is in the parentheses


If entity = WBls:nnnnnnn, then Life_stage_relation “relation” “WBls:nnnnnnn”

If entity = WB:WBGenennnnnnnn, then Gene_relation “relation” “WBGenennnnnnnn”

If entity = WBMol:nnnnnnn, then Molecule_relation “relation” “WBMol:nnnnnnn”

If entity = WBbt:nnnnnnn, then Anatomy_relation “relation” “WBbt:nnnnnnn”

If entity = GO:nnnnnnn, then GO_term_relation “relation” “GO:nnnnnnn”

When there are multiple annotation extensions separated by commas, need to split them and populate each relation(entity) separately.

Example:

has_regulation_target(WB:WBGene00002015),happens_during(GO:0001666)

Gene_relation "has_regulation_target" "WBGene00002015"

GO_term_relation "happens_during" "GO:0001666"
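The splitting and tag selection described above can be sketched like this. It is a rough illustration rather than the dumping script itself: the function and constant names are hypothetical, and it assumes that entities never contain commas or parentheses and that only the gene entity drops its "WB:" prefix in the output.

```python
import re

# One extension looks like: relation(entity)
EXTENSION_RE = re.compile(r'^\s*(\w+)\(([^()]+)\)\s*$')

# Entity prefix -> .ace tag, following the rules listed above.
TAG_BY_PREFIX = {
    "WBls:": "Life_stage_relation",
    "WB:WBGene": "Gene_relation",
    "WBMol:": "Molecule_relation",
    "WBbt:": "Anatomy_relation",
    "GO:": "GO_term_relation",
}


def extension_lines(xrefto):
    """Split a gop_xrefto value on commas and emit one .ace line per extension."""
    lines = []
    for part in xrefto.split(","):
        match = EXTENSION_RE.match(part)
        if not match:
            raise ValueError("Unrecognised annotation extension: {!r}".format(part))
        relation, entity = match.groups()
        for prefix, tag in TAG_BY_PREFIX.items():
            if entity.startswith(prefix):
                break
        else:
            raise ValueError("Unrecognised entity type: {!r}".format(entity))
        if tag == "Gene_relation":
            entity = entity[len("WB:"):]  # WB:WBGene00002015 -> WBGene00002015
        lines.append('{} "{}" "{}"'.format(tag, relation, entity))
    return lines


example = "has_regulation_target(WB:WBGene00002015),happens_during(GO:0001666)"
for line in extension_lines(example):
    print(line)
```

Raising on anything unrecognised is a deliberate choice in this sketch, so that malformed extensions surface during the dump instead of being dropped quietly.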


  • gop_protein (don’t have any at the moment)

Annotation_isoform “UniProtKB:nnnnnn”


  • gop_paper

Reference “WBPaper00028482”


  • gop_accession

GO_reference “Gene Ontology Consortium” “GO_REF” “nnnnnnn”


  • Contributed_by “WormBase”

This will be a default value.


  • gop_lastupdate

Date_last_updated “YYYY-MM-DD”
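Putting the single-valued fields together, one dumped object might be assembled roughly as below. This is a sketch under several assumptions: `row` is a hypothetical dict keyed by OA table name, gop_lastupdate is assumed to already be in YYYY-MM-DD form, the tag order is arbitrary, and the sample values are simply the illustrative IDs used on this page (plus a made-up date), not a real annotation. Multi-valued tags such as Annotation_relation and the annotation extensions are left out.

```python
# OA table name -> .ace tag, for fields that hold one plain value.
SIMPLE_FIELDS = [
    ("gop_wbgene", "Gene"),
    ("gop_goid", "GO_term"),
    ("gop_goinference", "GO_code"),
    ("gop_paper", "Reference"),
]


def ace_record(annotation_id, row):
    """Return one GO_annotation object as .ace text, skipping empty fields."""
    lines = ['GO_annotation : "{}"'.format(annotation_id)]
    for column, tag in SIMPLE_FIELDS:
        value = row.get(column)
        if value:
            lines.append('{} "{}"'.format(tag, value))
    # Default value for every annotation.
    lines.append('Contributed_by "WormBase"')
    if row.get("gop_lastupdate"):
        lines.append('Date_last_updated "{}"'.format(row["gop_lastupdate"]))
    return "\n".join(lines) + "\n"


# Illustrative row only; real rows would come from the postgres OA tables.
row = {
    "gop_wbgene": "WBGene00004362",
    "gop_goid": "GO:0005634",
    "gop_goinference": "IMP",
    "gop_paper": "WBPaper00028482",
    "gop_lastupdate": "2015-02-21",
}
print(ace_record("00055495", row))
```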

