Difference between revisions of "WBGene information and status pipeline"
From WormBaseWiki
Revision as of 18:16, 17 October 2013
Table Summarizing Current/Future Postgres Population
AceDB tag | Postgres table | Current - Nameserver nightly dump | Current - WS bimonthly release | Future - Nameserver nightly dump | Future - Geneace nightly dump | Future - WS bimonthly release | Use - Paper or meeting abstract gene connection | Use - OA data type curation | Use - Dumping scripts (could be wrong, but I don't think any gin_ tables are used in dumping scripts, since we store WBGene IDs; gin_dead may be an exception if people want dead genes suppressed, flagged with some kind of error message, or mapped to Historical_gene) | Use - Protein2GO data conversion | Use - GSA Markup | Comment |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WBGene identifier | gin_wbgene | Yes | Yes | Yes | Yes | Yes | ||||||
CGC_name | gin_locus | Yes | If it has this tag, gene is considered good (What does 'good' mean?) | Yes | Yes | Yes | No | |||||
Other_name | gin_synonyms | No | Yes | Yes | No | Yes | Yes | No | ||||
Sequence_name | gin_seqname | Yes | No | Yes | Yes | Yes | No | |||||
Status | gin_dead | Yes | only if value is dead and species ~ elegans$ | Yes | only if value is dead | Yes | Yes | Yes | Yes | |||
Merged_into | gin_dead | No | Yes | Yes | No | Historical_gene tag? | ||||||
Split_into | gin_dead | No | Yes | Yes | No | Historical_gene tag? | ||||||
Corresponding_transcript | gin_sequence | No | Yes | No | Yes | Confirm | ||||||
Corresponding_CDS | gin_sequence + gin_seqprot | No | Yes | No | Yes | Confirm | ||||||
Corresponding_protein | gin_protein, gin_seqprot | No | Yes | No | Yes | Confirm | Yes, but we'll need isoform data in WB | |||||
Molecular_name | gin_molname | No | Yes | No | Yes | Yes | No | Maybe | ||||
Species | This could perhaps be used to populate a future species tag for papers, but this is not an immediate need. Other use cases? | |||||||||||
Version_change | No | Yes, to make sure we don't attach GO annotations to pseudogenes. | One use case would be to know when genes change class, e.g. CDS -> Pseudogene. We may not need to actually store this in postgres, though. | |||||||||
Public_name | gin_wbgene | Yes (but only when no CGC_name or Sequence_name) | If it has this tag, gene is considered good | Don't need (Public_name also in Other_name - confirm this is always the case) | No | Not if also in Other_name | Not if also in Other_name | Not if also in Other_name | No | I think we can now ignore the Public_name tag as long as there's always an Other_name value as well. If there is no Other_name, would we then look at Public_name? Looking at the script, we're not doing anything with this value. |
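The two conditions noted in the Status row ("only if value is dead" and "only if value is dead and species ~ elegans$") can be sketched as a small filter. This is an illustration only, not the logic of the actual population scripts: the function name `stores_dead_status`, its parameters, and the case-insensitive comparison against "dead" are all assumptions made for the example.

```python
import re


def stores_dead_status(status, species, elegans_only):
    """Hypothetical helper: decide whether a gene's Status would go into gin_dead.

    status       -- value of the Status tag, e.g. "Dead" or "Live" (assumed spelling)
    species      -- species name, e.g. "Caenorhabditis elegans"
    elegans_only -- True to also require the species to match the regex elegans$
    """
    # Only dead genes are stored; the case-insensitive compare is an assumption.
    if status.strip().lower() != "dead":
        return False
    # Optional species restriction, mirroring "species ~ elegans$" from the table.
    if elegans_only:
        return re.search(r"elegans$", species) is not None
    return True
```

Called with `elegans_only=True`, a dead C. briggsae gene is skipped; with `elegans_only=False`, it is kept.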
Current Scripts:
- /home/acedb/cron/populate_gin_locus.pl
- /home/acedb/cron/populate_gin.pl
New Scripts:
Some Relevant Postgres Queries:
SELECT * FROM gin_dead WHERE gin_dead ~ 'merged' AND gin_dead ~ 'split';
SELECT * FROM gin_dead WHERE gin_dead ~ 'split';
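For anyone unfamiliar with the `~` operator: in Postgres it is a case-sensitive regular-expression match, so the queries above select rows whose gin_dead value contains the given substring anywhere. A minimal Python sketch of the same filtering follows. The sample rows, the `gene` key, and the wording of the gin_dead values are made up for illustration and are not taken from the real table.

```python
import re

# Hypothetical rows; real gin_dead values and column names may differ.
rows = [
    {"gene": "WBGene00000001", "gin_dead": "Dead"},
    {"gene": "WBGene00000002", "gin_dead": "merged into WBGene00000010"},
    {"gene": "WBGene00000003", "gin_dead": "split into WBGene00000011"},
    {"gene": "WBGene00000004", "gin_dead": "merged into WBGene00000012 / split into WBGene00000013"},
]


def select_rows(rows, *patterns):
    """Keep rows whose gin_dead value matches every pattern (like chained `~ ... AND ~ ...`)."""
    return [
        row for row in rows
        if all(re.search(p, row["gin_dead"]) is not None for p in patterns)
    ]


merged_and_split = select_rows(rows, "merged", "split")  # first query above
split_only_filter = select_rows(rows, "split")           # second query above
```

Note that the second filter also returns any row matched by the first, since a value mentioning both "merged" and "split" still contains "split".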
Back to Caltech documentation