Revision as of 18:39, 4 November 2014

Initial Round of Entering Phenotype2GO-Based Annotations into Postgres

The goal is to generate a non-redundant set of Phenotype2GO (P2GO) annotations to enter into Postgres. Non-redundant means that these annotations do not overlap with any existing manual annotations.
Step 1: Grep the current WB GAF (from the build) for all annotations that have 'WBPhenotype' in Column 8 (With/From).
Step 2: Using the annotations retrieved from Step 1, create a two-column table that contains the WBGene ID found in Column 2 and the PMID value found in Column 6.
Step 3: Using the gp_association file from UniProt-GOA, map each Column 2 value (a UniProtKB ID) to a WBGene ID using the WB gp2protein file, and then replace the UniProtKB ID in Column 2 with the corresponding WBGene ID.
Step 4: Output a list of any UniProtKB IDs that don't map to a WBGene ID and update the gp2protein file.
Step 5: Repeat Step 3 with an updated gp2protein file, if needed.
Step 6: Create a second two-column table that contains the WBGene ID now found in Column 2 of the gp_association file and the PMID value found in Column 5 of the gp_association file.
Step 7: Compare the values in each of the two tables. Output two files from the grepped Phenotype2GO annotations: 1) a file containing those annotations that are redundant with the gp_association annotations, and 2) a file containing those annotations that are NOT present in the gp_association file.
Step 8: Upload the annotations in file #2 (non-redundant) to postgres gop_ OA tables.
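
Steps 1 through 7 can be sketched in Python roughly as follows. This is a minimal illustration, not the script actually used. The function names are hypothetical. It assumes tab-delimited files with '!' comment lines, a gp2protein layout of 'WB:WBGene...' in the first column and semicolon-separated 'UniProtKB:...' entries in the second, and that "redundant" means a shared (WBGene ID, PMID) pair, as the steps describe. The Postgres upload in Step 8 is not covered.

```python
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[str, str]  # (WBGene ID, PMID)


def pmids(reference_field: str) -> List[str]:
    """Return the PMID entries from a pipe-separated reference field."""
    return [ref for ref in reference_field.split("|") if ref.startswith("PMID:")]


def phenotype2go_rows(gaf_lines: Iterable[str]) -> List[List[str]]:
    """Step 1: keep GAF rows whose Column 8 (With/From) contains 'WBPhenotype'."""
    rows = []
    for line in gaf_lines:
        if line.startswith("!"):
            continue  # header or comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) > 7 and "WBPhenotype" in cols[7]:
            rows.append(cols)
    return rows


def load_gp2protein(lines: Iterable[str]) -> Dict[str, str]:
    """Build a UniProtKB accession -> WBGene ID lookup (assumed file layout)."""
    mapping: Dict[str, str] = {}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("!") or len(cols) < 2:
            continue
        gene_id = cols[0].split(":", 1)[-1]  # drop the 'WB:' prefix
        for protein in cols[1].split(";"):
            mapping[protein.split(":", 1)[-1]] = gene_id  # drop 'UniProtKB:'
    return mapping


def gpad_keys(gpad_lines: Iterable[str],
              mapping: Dict[str, str]) -> Tuple[Set[Key], Set[str]]:
    """Steps 3, 4 and 6: map Column 2 to WBGene IDs and collect (gene, PMID) keys.

    Also returns the UniProtKB IDs that have no WBGene ID in the mapping.
    """
    keys: Set[Key] = set()
    unmapped: Set[str] = set()
    for line in gpad_lines:
        if line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue
        gene_id = mapping.get(cols[1])
        if gene_id is None:
            unmapped.add(cols[1])
            continue
        keys.update((gene_id, pmid) for pmid in pmids(cols[4]))  # Column 5
    return keys, unmapped


def split_redundant(p2go_rows: List[List[str]],
                    keys: Set[Key]) -> Tuple[List[List[str]], List[List[str]]]:
    """Steps 2 and 7: split P2GO rows into redundant and non-redundant lists."""
    redundant, non_redundant = [], []
    for cols in p2go_rows:
        row_keys = {(cols[1], pmid) for pmid in pmids(cols[5])}  # Columns 2 and 6
        (redundant if row_keys & keys else non_redundant).append(cols)
    return redundant, non_redundant
```

If `gpad_keys` returns any unmapped IDs, the gp2protein file would be updated and the mapping rerun (Step 5) before the final split. Matching on gene and PMID only is deliberately coarse: a P2GO annotation counts as redundant even when the manual annotation from the same paper uses a different GO term.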

Change from the revision of 18:38, 4 November 2014 (both revisions by Vanaukenk): the closing link "Back to [[Gene Ontology]]" was replaced with "Back to [[20141022_-_Phenotype2GO_Pipeline]]".