Difference between revisions of "WBGene information and status pipeline"


Revision as of 19:00, 20 September 2013

Table Summarizing Current/Future Postgres Population

AceDB tag | Postgres table | Current - Nameserver nightly dump | Current - WS bimonthly release | Future - Geneace nightly dump | Future - WS bimonthly release | Use - Paper or meeting abstract gene connection | Use - OA data type curation | Use - Dumping scripts | Use - Protein2GO data conversion | Comment
WBGene identifier gin_wbgene Yes Yes Yes Yes Yes
CGC_name gin_locus Yes If it has this tag, gene is considered good Yes Yes Yes No
Other_name gin_synonyms No Yes Yes No Yes Yes No
Sequence_name gin_sequence (double check this) Yes No Yes Yes Yes No
Public_name gin_wbgene Yes (but only when no CGC_name or Sequence_name) If it has this tag, gene is considered good Don't need (Public_name also in Other_name - confirm this is always the case) Not if also in Other_name Not if also in Other_name Not if also in Other_name No I think we can now ignore the Public_name tag as long as there's always an Other_name value as well
Molecular_name gin_molname No Yes No Yes Yes No Maybe
Status gin_dead Yes Yes Yes Yes Yes Yes
Merged_into gin_history, gin_dead (confirm) No Yes Yes No Historical_gene tag?
Split_into gin_history No Yes Yes No Historical_gene tag?
Corresponding_transcript gin_sequence No Yes No Yes Confirm
Corresponding_CDS gin_sequence + gin_seqprot No Yes No Yes Confirm
Corresponding_protein gin_protein, gin_seqprotein (Need to check about this. -- it's gin_seqprot) No Yes No Yes Confirm Yes, but we'll need isoform data in WB
Species This could perhaps be used to populate a future species tag for papers, but this is not an immediate need. Other use cases?
Version_change Yes, to make sure we don't attach GO annotations to pseudogenes. One use case would be to know when genes change class, e.g. CDS -> Pseudogene. We may not need to actually store this in postgres, though.
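
The "AceDB tag" and "Postgres table" columns of the table above can be written down as a simple lookup. The following is a minimal sketch in Python, not the code used by the population scripts. The names `TAG_TO_TABLES`, `tables_for_tag` and `tags_for_table` are hypothetical. Entries the table still marks as "confirm" or "double check" are copied as listed, and the Corresponding_protein entry assumes the inline note that the table is gin_seqprot rather than gin_seqprotein.

```python
# Hypothetical lookup built from the "AceDB tag" and "Postgres table" columns
# of the summary table. Entries flagged "confirm" / "double check" in the
# table are copied as listed and are NOT verified here.
TAG_TO_TABLES = {
    "WBGene identifier": ["gin_wbgene"],
    "CGC_name": ["gin_locus"],
    "Other_name": ["gin_synonyms"],
    "Sequence_name": ["gin_sequence"],          # table says "double check this"
    "Public_name": ["gin_wbgene"],
    "Molecular_name": ["gin_molname"],
    "Status": ["gin_dead"],
    "Merged_into": ["gin_history", "gin_dead"],  # table says "confirm"
    "Split_into": ["gin_history"],
    "Corresponding_transcript": ["gin_sequence"],
    "Corresponding_CDS": ["gin_sequence", "gin_seqprot"],
    # Assumption: follows the inline note "it's gin_seqprot".
    "Corresponding_protein": ["gin_protein", "gin_seqprot"],
    "Species": [],          # no Postgres table listed
    "Version_change": [],   # no Postgres table listed
}


def tables_for_tag(tag):
    """Return the Postgres tables listed for an AceDB tag (empty list if none)."""
    return list(TAG_TO_TABLES.get(tag, []))


def tags_for_table(table):
    """Return, in sorted order, the AceDB tags that list the given table."""
    return sorted(tag for tag, tables in TAG_TO_TABLES.items() if table in tables)


if __name__ == "__main__":
    print(tables_for_tag("Corresponding_CDS"))
    print(tags_for_table("gin_sequence"))
```

Keeping the mapping in one place makes it easy to see which tables receive data from more than one tag, for example gin_sequence and gin_history.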

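The Public_name row asks for confirmation that the Public_name value is always also present as an Other_name. A check along the following lines could be run over dumped gene records before the Public_name tag is dropped. This is a sketch only: the record layout (a dict with `public_name` and `other_names` keys) and the function name `genes_missing_public_name` are assumptions made for illustration and do not reflect the format of any actual dump.

```python
def genes_missing_public_name(genes):
    """Return sorted IDs of genes whose Public_name is not among their Other_names.

    `genes` maps a gene ID to a dict with two hypothetical keys:
      "public_name": str or None
      "other_names": list of str
    """
    missing = []
    for gene_id, record in genes.items():
        public_name = record.get("public_name")
        if public_name is None:
            continue  # no Public_name, so nothing to compare
        if public_name not in record.get("other_names", []):
            missing.append(gene_id)
    return sorted(missing)


# Illustrative records only; the IDs and names are made up.
example_genes = {
    "WBGene_example_1": {"public_name": "name-a", "other_names": ["name-a", "name-b"]},
    "WBGene_example_2": {"public_name": "name-c", "other_names": []},
    "WBGene_example_3": {"public_name": None, "other_names": ["name-d"]},
}

if __name__ == "__main__":
    print(genes_missing_public_name(example_genes))
```

An empty result would support ignoring the Public_name tag; any IDs returned are genes for which the Public_name would be lost.
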
Current Scripts:

  1. /home/acedb/cron/populate_gin_locus.pl
  2. /home/acedb/cron/populate_gin.pl


New Scripts: