GFF Release Data and Changes


Latest revision as of 19:36, 29 August 2013

Elementary SQL Commands

Checking for names in a GFF2 table schema: `select distinct fref from fdata where fref like '%Contig2';`

Checking for names in a GFF3 table schema: `select distinct name from name where name like '%Scaffold122';`
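The two name checks above can be tried out without access to the real databases. The following is a minimal sketch, assuming an in-memory SQLite database with mock `fdata` and `name` tables that contain only the column each query uses. The helper `landmark_names` and the landmark `Bmal_Contig2` are hypothetical and exist only for illustration; the production tables have more columns and live on a different database server.

```python
import sqlite3

# Mock stand-ins for the real tables (assumption: only the queried column exists).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fdata (fref TEXT);  -- GFF2-style table, minimal mock
    CREATE TABLE name  (name TEXT);  -- GFF3-style table, minimal mock
    INSERT INTO fdata VALUES ('Bmal_Contig2'), ('Bmal_Contig2'), ('Bmal_Contig30');
    INSERT INTO name  VALUES ('As_PRJNA80881_Scaffold122'), ('As_PRJNA80881_Scaffold9');
""")


def landmark_names(conn, data_format, suffix):
    """Return the distinct landmark names that end in `suffix`."""
    queries = {
        "GFF2": "SELECT DISTINCT fref FROM fdata WHERE fref LIKE ?",
        "GFF3": "SELECT DISTINCT name FROM name WHERE name LIKE ?",
    }
    # The leading % matches any prefix, e.g. a species or Bioproject ID prefix.
    rows = conn.execute(queries[data_format], ("%" + suffix,)).fetchall()
    return sorted(row[0] for row in rows)


gff2_hits = landmark_names(conn, "GFF2", "Contig2")
gff3_hits = landmark_names(conn, "GFF3", "Scaffold122")
print(gff2_hits)
print(gff3_hits)
```

Because the pattern only fixes the end of the name, the same check finds a landmark whether or not it carries a prefix, which is how the "Landmark Includes Bioproject ID?" column can be filled in.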

WS239

Table name format: genus-initial_species_bioproject-id_release

Example: a_suum_PRJNA62057_WS239
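The naming scheme can be checked mechanically. The sketch below is an illustration only: the regular expression and the functions `parse_table_name` and `build_table_name` are assumptions derived from the table names listed on this page, not an official definition of the format.

```python
import re

# Pattern inferred from the table names on this page (assumption, not a spec).
TABLE_NAME = re.compile(
    r"^(?P<genus_initial>[a-z])_"
    r"(?P<species>[a-z0-9]+)_"
    r"(?P<bioproject>PRJ[A-Z]+[0-9]+)_"
    r"(?P<release>WS[0-9]+)$"
)


def parse_table_name(table):
    """Split a table name into its four components."""
    match = TABLE_NAME.match(table)
    if match is None:
        raise ValueError("unexpected table name: " + table)
    return match.groupdict()


def build_table_name(genus_initial, species, bioproject, release):
    """Join the four components back into a table name."""
    return "_".join([genus_initial, species, bioproject, release])


parts = parse_table_name("a_suum_PRJNA62057_WS239")
print(parts)
print(build_table_name(**parts))
```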

{| class="wikitable"
|-
! Table !! Data Format !! Format Changed from WS238 to WS239? !! Landmark Includes Bioproject ID?
|-
| a_suum_PRJNA62057_WS239 || GFF3 || no || TBD
|-
| a_suum_PRJNA80881_WS239 || GFF3 || no || yes (e.g., As_PRJNA80881_Scaffold122)
|-
| b_malayi_PRJNA10729_WS239 || GFF3 || yes || TBD
|-
| b_xylophilus_PRJEA64437_WS239 || GFF3 || no || TBD
|-
| c_angaria_PRJNA51225_WS239 || GFF3 || no || TBD
|-
| c_brenneri_PRJNA20035_WS239 || GFF3 || yes || TBD
|-
| c_briggsae_PRJNA10731_WS239 || GFF3 || yes || TBD
|-
| c_elegans_PRJNA13758_WS239 || GFF3 || yes || TBD
|-
| c_japonica_PRJNA12591_WS239 || GFF3 || yes || TBD
|-
| c_remanei_PRJNA53967_WS239 || GFF3 || yes || TBD
|-
| c_sp11_PRJNA53597_WS239 || GFF3 || no || TBD
|-
| c_sp5_PRJNA194557_WS239 || GFF3 || no || TBD
|-
| h_bacteriophora_PRJNA13977_WS239 || GFF3 || no || TBD
|-
| h_contortus_PRJEB506_WS239 || GFF3 || no || TBD
|-
| l_loa_PRJNA60051_WS239 || GFF3 || no || TBD
|-
| m_hapla_PRJNA29083_WS239 || GFF3 || no || TBD
|-
| m_incognita_PRJEA28837_WS239 || GFF3 || no || TBD
|-
| p_pacificus_PRJNA12644_WS239 || GFF3 || yes || TBD
|-
| s_ratti_PRJEA62033_WS239 || GFF3 || no || TBD
|-
| t_spiralis_PRJNA12603_WS239 || GFF3 || no || TBD
|}

GFF3 type and source assignments (via `select * from typelist;`)

So far collected from the C. briggsae table only:

  • CDS:WormBase
  • CDS:genBlastG
  • SL1_acceptor_site:SL1
  • SL2_acceptor_site:SL2
  • assembly_component:Genbank
  • assembly_component:Genomic_canonical
  • assembly_component:Link
  • complex_substitution:Variation_project
  • exon:Transposo
  • exon:WormBase
  • exon:history
  • exon:tRNA_mature_transcript
  • expressed_sequence_match:BLAT_Caen_EST_BEST
  • expressed_sequence_match:BLAT_Caen_EST_OTHER
  • expressed_sequence_match:BLAT_Caen_OST_BEST
  • expressed_sequence_match:BLAT_Caen_OST_OTHER
  • expressed_sequence_match:BLAT_Caen_RST_BEST
  • expressed_sequence_match:BLAT_Caen_RST_OTHER
  • expressed_sequence_match:BLAT_Caen_mRNA_BEST
  • expressed_sequence_match:BLAT_Caen_mRNA_OTHER
  • expressed_sequence_match:BLAT_Caen_ncRNA_BEST
  • expressed_sequence_match:BLAT_Caen_ncRNA_OTHER
  • expressed_sequence_match:BLAT_Caen_tc1_BEST
  • expressed_sequence_match:BLAT_Caen_tc1_OTHER
  • expressed_sequence_match:BLAT_EST_BEST
  • expressed_sequence_match:BLAT_EST_OTHER
  • expressed_sequence_match:BLAT_mRNA_BEST
  • expressed_sequence_match:BLAT_mRNA_OTHER
  • expressed_sequence_match:EMBL_nematode_cDNAs-BLAT
  • expressed_sequence_match:NEMATODE.NET_cDNAs-BLAT
  • expressed_sequence_match:NEMBASE_cDNAs-BLAT
  • five_prime_UTR:WormBase
  • gene:WormBase
  • intron:WormBase
  • inverted_repeat:inverted
  • low_complexity_region:dust
  • mRNA:WormBase
  • nc_primary_tra
  • nc_primary_transcript:history
  • nucleotide_match:EXONERATE_BAC_END_BEST
  • nucleotide_match:EXONERATE_BAC_END_OTHER
  • nucleotide_match:TEC_RED
  • primary_transcript:history
  • protein_match:UniProt-BLASTX
  • protein_match:bmalayi_proteins-BLASTX
  • protein_match:cbrenneri_proteins-BLASTX
  • protein_match:cbriggsae_proteins-BLASTX
  • protein_match:celegans_proteins-BLASTX
  • protein_match:cjaponica_proteins-BLASTX
  • protein_match:cremanei_proteins-BLASTX
  • protein_match:dmelanogaster_proteins-BLASTX
  • protein_match:hsapiens_proteins-BLASTX
  • protein_match:ppacificus_proteins-BLASTX
  • protein_match:scerevisiae_proteins-BLASTX
  • pseudogenic_transcript:WormBase
  • reagent:Oligo_set
  • region:
  • repeat_region:
  • sequence_motif:translated_feature
  • substitution:Allele
  • tRNA:tRNA_mature_transcript
  • tandem_repeat:tandem
  • transcript_region:RNASeq_F_asymmetry
  • transcript_region:RNASeq_reads
  • transposable_element_CDS:Transposon_CDS
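Each entry in the list above is a feature type and a source joined by a colon. As a rough sketch, the entries can be split and grouped by type as follows; the function names `split_type_source` and `sources_by_type` are made up for this example, and the sample uses only entries copied from the list. Entries with an empty source (such as `region:`) yield an empty string, while an entry without any colon yields `None`, so incomplete entries stay easy to spot.

```python
from collections import defaultdict


def split_type_source(entry):
    """Split 'type:source' into (type, source); source is None if no colon is present."""
    feature_type, sep, source = entry.partition(":")
    return feature_type, (source if sep else None)


def sources_by_type(entries):
    """Group the sources of all entries under their feature type."""
    grouped = defaultdict(list)
    for entry in entries:
        feature_type, source = split_type_source(entry)
        grouped[feature_type].append(source)
    return dict(grouped)


# A few entries copied from the list above.
sample = ["CDS:WormBase", "CDS:genBlastG", "region:", "nc_primary_tra"]
grouped = sources_by_type(sample)
print(grouped)
```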

WS238

Table name format: genus-initial_species_bioproject-id_release

Example: a_suum_PRJNA62057_WS238

{| class="wikitable"
|-
! Table !! Data Format !! Format Changed from WS237 to WS238? !! Landmark Includes Bioproject ID?
|-
| a_suum_PRJNA62057_WS238 || GFF3 || no || no
|-
| a_suum_PRJNA80881_WS238 || GFF3 || no || yes (e.g., As_PRJNA80881_Scaffold122)
|-
| b_malayi_PRJNA10729_WS238 || GFF2 || no || ?
|-
| b_xylophilus_PRJEA64437_WS238 || GFF3 || no || no
|-
| c_angaria_PRJNA51225_WS238 || GFF3 || no || no
|-
| c_brenneri_PRJNA20035_WS238 || GFF2 || no || ?
|-
| c_briggsae_PRJNA10731_WS238 || GFF2 || no || ?
|-
| c_elegans_PRJNA13758_WS238 || GFF2 || no || ?
|-
| c_japonica_PRJNA12591_WS238 || GFF2 || no || ?
|-
| c_remanei_PRJNA53967_WS238 || GFF2 || no || ?
|-
| c_sp11_PRJNA53597_WS238 || GFF3 || no || no
|-
| c_sp5_PRJNA194557_WS238 || GFF3 || no || no
|-
| h_bacteriophora_PRJNA13977_WS238 || GFF3 || no || no
|-
| h_contortus_PRJEB506_WS238 || GFF3 || no || no
|-
| l_loa_PRJNA60051_WS238 || GFF3 || no || no
|-
| m_hapla_PRJNA29083_WS238 || GFF3 || no || no
|-
| m_incognita_PRJEA28837_WS238 || GFF3 || no || no
|-
| p_pacificus_PRJNA12644_WS238 || GFF2 || no || ?
|-
| s_ratti_PRJEA62033_WS238 || GFF3 || no || no
|-
| t_spiralis_PRJNA12603_WS238 || GFF3 || no || no
|}
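
The "Format Changed" column of the WS239 table can be cross-checked against the "Data Format" columns of both releases. The sketch below does this for four tables whose values are copied from the WS238 and WS239 tables on this page; the dictionaries and the function `format_changes` are hypothetical helpers, and keys omit the release suffix so that the two releases can be compared.

```python
# Data formats copied from the WS238 and WS239 tables (release suffix dropped).
ws238 = {
    "a_suum_PRJNA62057": "GFF3",
    "b_malayi_PRJNA10729": "GFF2",
    "c_elegans_PRJNA13758": "GFF2",
    "t_spiralis_PRJNA12603": "GFF3",
}
ws239 = {
    "a_suum_PRJNA62057": "GFF3",
    "b_malayi_PRJNA10729": "GFF3",
    "c_elegans_PRJNA13758": "GFF3",
    "t_spiralis_PRJNA12603": "GFF3",
}


def format_changes(old, new):
    """Report 'yes' or 'no' for every table present in both releases."""
    shared = sorted(old.keys() & new.keys())
    return {key: ("yes" if old[key] != new[key] else "no") for key in shared}


changes = format_changes(ws238, ws239)
for table, changed in changes.items():
    print(table, changed)
```

For these four tables the result agrees with the WS239 table: only the tables that were GFF2 in WS238 are reported as changed.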