Difference between revisions of "GFF Release Data and Changes"

From WormBaseWiki
(Added overview of which datasets contain bioproject IDs as part of their landmark.)
(Added report on WS239.)
 
(2 intermediate revisions by the same user not shown)
Line 1: Line 1:
+= Elementary SQL Commands =
+
+Checking for names in a GFF2 table schema: '''select distinct fref from fdata where fref like '%Contig2';'''
+
+Checking for names in a GFF3 table schema: '''select distinct name from name where name like '%Scaffold122';'''
+
+= WS239 =
+
+Table name format: ''genus-initial''_''species''_''bioproject-id''_''release''
+
+Example: a_suum_PRJNA62057_WS239
+
+{| class="wikitable"
+|-
+! Table !! Data Format !! Format Changed from WS238 to WS239? !! Landmark Includes Bioproject ID? !!
+|-
+| a_suum_PRJNA62057_WS239 || GFF3 || no || TBD
+|-
+| a_suum_PRJNA80881_WS239 || GFF3 || no || yes (e.g., As_PRJNA80881_Scaffold122)
+|-
+| b_malayi_PRJNA10729_WS239 || GFF3 || yes || TBD
+|-
+| b_xylophilus_PRJEA64437_WS239 || GFF3 || no || TBD
+|-
+| c_angaria_PRJNA51225_WS239 || GFF3 || no || TBD
+|-
+| c_brenneri_PRJNA20035_WS239 || GFF3 || yes || TBD
+|-
+| c_briggsae_PRJNA10731_WS239 || GFF3 || yes || TBD
+|-
+| c_elegans_PRJNA13758_WS239 || GFF3 || yes || TBD
+|-
+| c_japonica_PRJNA12591_WS239 || GFF3 || yes || TBD
+|-
+| c_remanei_PRJNA53967_WS239 || GFF3 || yes || TBD
+|-
+| c_sp11_PRJNA53597_WS239 || GFF3 || no || TBD
+|-
+| c_sp5_PRJNA194557_WS239 || GFF3 || no || TBD
+|-
+| h_bacteriophora_PRJNA13977_WS239 || GFF3 || no || TBD
+|-
+| h_contortus_PRJEB506_WS239 || GFF3 || no || TBD
+|-
+| l_loa_PRJNA60051_WS239 || GFF3 || no || TBD
+|-
+| m_hapla_PRJNA29083_WS239 || GFF3 || no || TBD
+|-
+| m_incognita_PRJEA28837_WS239 || GFF3 || no || TBD
+|-
+| p_pacificus_PRJNA12644_WS239 || GFF3 || yes || TBD
+|-
+| s_ratti_PRJEA62033_WS239 || GFF3 || no || TBD
+|-
+| t_spiralis_PRJNA12603_WS239 || GFF3 || no || TBD
+|}
+
+GFF3 type and source assignments (via `select * from typelist;`)
+
+'''Only from C. briggsae right now:'''
+
+* CDS:WormBase
+* CDS:genBlastG
+* SL1_acceptor_site:SL1
+* SL2_acceptor_site:SL2
+* assembly_component:Genbank
+* assembly_component:Genomic_canonical
+* assembly_component:Link
+* complex_substitution:Variation_project
+* exon:Transposo
+* exon:WormBase
+* exon:history
+* exon:tRNA_mature_transcript
+* expressed_sequence_match:BLAT_Caen_EST_BEST
+* expressed_sequence_match:BLAT_Caen_EST_OTHER
+* expressed_sequence_match:BLAT_Caen_OST_BEST
+* expressed_sequence_match:BLAT_Caen_OST_OTHER
+* expressed_sequence_match:BLAT_Caen_RST_BEST
+* expressed_sequence_match:BLAT_Caen_RST_OTHER
+* expressed_sequence_match:BLAT_Caen_mRNA_BEST
+* expressed_sequence_match:BLAT_Caen_mRNA_OTHER
+* expressed_sequence_match:BLAT_Caen_ncRNA_BEST
+* expressed_sequence_match:BLAT_Caen_ncRNA_OTHER
+* expressed_sequence_match:BLAT_Caen_tc1_BEST
+* expressed_sequence_match:BLAT_Caen_tc1_OTHER
+* expressed_sequence_match:BLAT_EST_BEST
+* expressed_sequence_match:BLAT_EST_OTHER
+* expressed_sequence_match:BLAT_mRNA_BEST
+* expressed_sequence_match:BLAT_mRNA_OTHER
+* expressed_sequence_match:EMBL_nematode_cDNAs-BLAT
+* expressed_sequence_match:NEMATODE.NET_cDNAs-BLAT
+* expressed_sequence_match:NEMBASE_cDNAs-BLAT
+* five_prime_UTR:WormBase
+* gene:WormBase
+* intron:WormBase
+* inverted_repeat:inverted
+* low_complexity_region:dust
+* mRNA:WormBase
+* nc_primary_tra
+* nc_primary_transcript:history
+* nucleotide_match:EXONERATE_BAC_END_BEST
+* nucleotide_match:EXONERATE_BAC_END_OTHER
+* nucleotide_match:TEC_RED
+* primary_transcript:history
+* protein_match:UniProt-BLASTX
+* protein_match:bmalayi_proteins-BLASTX
+* protein_match:cbrenneri_proteins-BLASTX
+* protein_match:cbriggsae_proteins-BLASTX
+* protein_match:celegans_proteins-BLASTX
+* protein_match:cjaponica_proteins-BLASTX
+* protein_match:cremanei_proteins-BLASTX
+* protein_match:dmelanogaster_proteins-BLASTX
+* protein_match:hsapiens_proteins-BLASTX
+* protein_match:ppacificus_proteins-BLASTX
+* protein_match:scerevisiae_proteins-BLASTX
+* pseudogenic_transcript:WormBase
+* reagent:Oligo_set
+* region:
+* repeat_region:
+* sequence_motif:translated_feature
+* substitution:Allele
+* tRNA:tRNA_mature_transcript
+* tandem_repeat:tandem
+* transcript_region:RNASeq_F_asymmetry
+* transcript_region:RNASeq_reads
+* transposable_element_CDS:Transposon_CDS
 = WS238 =
Line 7: Line 134:
 {| class="wikitable"
 |-
-! Table !! Data Format !! Landmark Includes Bioproject ID !!
+! Table !! Data Format !! Format Changed from WS237 to WS238? !! Landmark Includes Bioproject ID? !!
 |-
-| a_suum_PRJNA62057_WS238 || GFF3 || no
+| a_suum_PRJNA62057_WS238 || GFF3 || no || no
 |-
-| a_suum_PRJNA80881_WS238 || GFF3 || yes (e.g., As_PRJNA80881_Scaffold122)
+| a_suum_PRJNA80881_WS238 || GFF3 || no || yes (e.g., As_PRJNA80881_Scaffold122)
 |-
-| b_malayi_PRJNA10729_WS238 || GFF2 || ?
+| b_malayi_PRJNA10729_WS238 || GFF2 || no || ?
 |-
-| b_xylophilus_PRJEA64437_WS238 || GFF3 || no
+| b_xylophilus_PRJEA64437_WS238 || GFF3 || no || no
 |-
-| c_angaria_PRJNA51225_WS238 || GFF3 || no
+| c_angaria_PRJNA51225_WS238 || GFF3 || no || no
 |-
-| c_brenneri_PRJNA20035_WS238 || GFF2 || ?
+| c_brenneri_PRJNA20035_WS238 || GFF2 || no || ?
 |-
-| c_briggsae_PRJNA10731_WS238 || GFF2 || ?
+| c_briggsae_PRJNA10731_WS238 || GFF2 || no || ?
 |-
-| c_elegans_PRJNA13758_WS238 || GFF2 || ?
+| c_elegans_PRJNA13758_WS238 || GFF2 || no || ?
 |-
-| c_japonica_PRJNA12591_WS238 || GFF2 || ?
+| c_japonica_PRJNA12591_WS238 || GFF2 || no || ?
 |-
-| c_remanei_PRJNA53967_WS238 || GFF2 || ?
+| c_remanei_PRJNA53967_WS238 || GFF2 || no || ?
 |-
-| c_sp11_PRJNA53597_WS238 || GFF3 || no
+| c_sp11_PRJNA53597_WS238 || GFF3 || no || no
 |-
-| c_sp5_PRJNA194557_WS238 || GFF3 || no
+| c_sp5_PRJNA194557_WS238 || GFF3 || no || no
 |-
-| h_bacteriophora_PRJNA13977_WS238 || GFF3 || no
+| h_bacteriophora_PRJNA13977_WS238 || GFF3 || no || no
 |-
-| h_contortus_PRJEB506_WS238 || GFF3 || no
+| h_contortus_PRJEB506_WS238 || GFF3 || no || no
 |-
-| l_loa_PRJNA60051_WS238 || GFF3 || no
+| l_loa_PRJNA60051_WS238 || GFF3 || no || no
 |-
-| m_hapla_PRJNA29083_WS238 || GFF3 || no
+| m_hapla_PRJNA29083_WS238 || GFF3 || no || no
 |-
-| m_incognita_PRJEA28837_WS238 || GFF3 || no
+| m_incognita_PRJEA28837_WS238 || GFF3 || no || no
 |-
-| p_pacificus_PRJNA12644_WS238 || GFF2 || ?
+| p_pacificus_PRJNA12644_WS238 || GFF2 || no || ?
 |-
-| s_ratti_PRJEA62033_WS238 || GFF3 || no
+| s_ratti_PRJEA62033_WS238 || GFF3 || no || no
 |-
-| t_spiralis_PRJNA12603_WS238 || GFF3 || no
+| t_spiralis_PRJNA12603_WS238 || GFF3 || no || no
 |}

Latest revision as of 19:36, 29 August 2013

Elementary SQL Commands

Checking for names in a GFF2 table schema: `select distinct fref from fdata where fref like '%Contig2';`

Checking for names in a GFF3 table schema: `select distinct name from name where name like '%Scaffold122';`
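
As a rough illustration of how the two name checks behave, the sketch below runs the same queries from Python against toy in-memory SQLite tables. Only the table and column names (`fdata.fref`, `name.name`) come from the queries above; the SQLite backend, the sample rows, and the helper `find_landmarks` are assumptions made for the example and do not describe the production database.

```python
import sqlite3

def find_landmarks(conn, schema, suffix):
    """Return distinct landmark names ending in `suffix`.

    schema "GFF2" queries fdata.fref and "GFF3" queries name.name,
    mirroring the two queries quoted above. Hypothetical helper.
    """
    queries = {
        "GFF2": "SELECT DISTINCT fref FROM fdata WHERE fref LIKE ? ORDER BY fref",
        "GFF3": "SELECT DISTINCT name FROM name WHERE name LIKE ? ORDER BY name",
    }
    rows = conn.execute(queries[schema], ("%" + suffix,)).fetchall()
    return [row[0] for row in rows]

# Toy stand-in tables (assumption: the real tables have many more columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fdata (fref TEXT)")
conn.execute("CREATE TABLE name (name TEXT)")
conn.executemany(
    "INSERT INTO fdata VALUES (?)",
    [("Contig2",), ("Contig2",), ("Contig20",)],
)
conn.executemany(
    "INSERT INTO name VALUES (?)",
    [("As_PRJNA80881_Scaffold122",), ("Scaffold122",), ("Scaffold12",)],
)

gff2_hits = find_landmarks(conn, "GFF2", "Contig2")
gff3_hits = find_landmarks(conn, "GFF3", "Scaffold122")
print(gff2_hits)
print(gff3_hits)
```

Because the pattern starts with `%`, a landmark with a prefix such as a bioproject ID still matches, which is what makes these queries useful for spotting prefixed landmark names.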

WS239

Table name format: genus-initial_species_bioproject-id_release

Example: a_suum_PRJNA62057_WS239
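
The naming scheme can be taken apart mechanically. The following is a minimal sketch, assuming that every table name follows the four-part pattern shown above and that bioproject IDs look like `PRJ` plus letters plus digits (as in the names listed on this page); `parse_table_name` and `TableName` are hypothetical names introduced for the example.

```python
import re
from typing import NamedTuple

class TableName(NamedTuple):
    genus_initial: str
    species: str
    bioproject_id: str
    release: str

# Assumed shape: <genus initial>_<species>_<PRJ...>_<WSnnn>
_TABLE_RE = re.compile(r"^([a-z])_([a-z0-9]+)_(PRJ[A-Z]+\d+)_(WS\d+)$")

def parse_table_name(table):
    """Split a table name into its four components, or raise ValueError."""
    match = _TABLE_RE.match(table)
    if match is None:
        raise ValueError(f"unexpected table name: {table!r}")
    return TableName(*match.groups())

parsed = parse_table_name("a_suum_PRJNA62057_WS239")
print(parsed)
```

The species part allows digits so that names such as `c_sp11_PRJNA53597_WS239` parse as well.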

{| class="wikitable"
|-
! Table !! Data Format !! Format Changed from WS238 to WS239? !! Landmark Includes Bioproject ID?
|-
| a_suum_PRJNA62057_WS239 || GFF3 || no || TBD
|-
| a_suum_PRJNA80881_WS239 || GFF3 || no || yes (e.g., As_PRJNA80881_Scaffold122)
|-
| b_malayi_PRJNA10729_WS239 || GFF3 || yes || TBD
|-
| b_xylophilus_PRJEA64437_WS239 || GFF3 || no || TBD
|-
| c_angaria_PRJNA51225_WS239 || GFF3 || no || TBD
|-
| c_brenneri_PRJNA20035_WS239 || GFF3 || yes || TBD
|-
| c_briggsae_PRJNA10731_WS239 || GFF3 || yes || TBD
|-
| c_elegans_PRJNA13758_WS239 || GFF3 || yes || TBD
|-
| c_japonica_PRJNA12591_WS239 || GFF3 || yes || TBD
|-
| c_remanei_PRJNA53967_WS239 || GFF3 || yes || TBD
|-
| c_sp11_PRJNA53597_WS239 || GFF3 || no || TBD
|-
| c_sp5_PRJNA194557_WS239 || GFF3 || no || TBD
|-
| h_bacteriophora_PRJNA13977_WS239 || GFF3 || no || TBD
|-
| h_contortus_PRJEB506_WS239 || GFF3 || no || TBD
|-
| l_loa_PRJNA60051_WS239 || GFF3 || no || TBD
|-
| m_hapla_PRJNA29083_WS239 || GFF3 || no || TBD
|-
| m_incognita_PRJEA28837_WS239 || GFF3 || no || TBD
|-
| p_pacificus_PRJNA12644_WS239 || GFF3 || yes || TBD
|-
| s_ratti_PRJEA62033_WS239 || GFF3 || no || TBD
|-
| t_spiralis_PRJNA12603_WS239 || GFF3 || no || TBD
|}
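
The "Format Changed" column can be derived by comparing each dataset's data format across two releases. The sketch below shows one way to do that, using three rows copied from the WS238 and WS239 tables on this page; the function `format_changed` and the choice to key datasets by name without the release suffix are assumptions for illustration.

```python
def format_changed(previous, current):
    """Map each dataset present in both releases to "yes" or "no".

    previous/current: dicts of dataset key -> data format ("GFF2"/"GFF3").
    Datasets missing from either release are skipped rather than guessed.
    """
    return {
        dataset: ("yes" if previous[dataset] != fmt else "no")
        for dataset, fmt in current.items()
        if dataset in previous
    }

# Formats copied from the WS238 and WS239 tables (subset of rows).
ws238 = {
    "a_suum_PRJNA62057": "GFF3",
    "b_malayi_PRJNA10729": "GFF2",
    "c_elegans_PRJNA13758": "GFF2",
}
ws239 = {
    "a_suum_PRJNA62057": "GFF3",
    "b_malayi_PRJNA10729": "GFF3",
    "c_elegans_PRJNA13758": "GFF3",
}

changes = format_changed(ws238, ws239)
print(changes)
```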

GFF3 type and source assignments (via `select * from typelist;`)

Currently collected from C. briggsae only:

  • CDS:WormBase
  • CDS:genBlastG
  • SL1_acceptor_site:SL1
  • SL2_acceptor_site:SL2
  • assembly_component:Genbank
  • assembly_component:Genomic_canonical
  • assembly_component:Link
  • complex_substitution:Variation_project
  • exon:Transposo
  • exon:WormBase
  • exon:history
  • exon:tRNA_mature_transcript
  • expressed_sequence_match:BLAT_Caen_EST_BEST
  • expressed_sequence_match:BLAT_Caen_EST_OTHER
  • expressed_sequence_match:BLAT_Caen_OST_BEST
  • expressed_sequence_match:BLAT_Caen_OST_OTHER
  • expressed_sequence_match:BLAT_Caen_RST_BEST
  • expressed_sequence_match:BLAT_Caen_RST_OTHER
  • expressed_sequence_match:BLAT_Caen_mRNA_BEST
  • expressed_sequence_match:BLAT_Caen_mRNA_OTHER
  • expressed_sequence_match:BLAT_Caen_ncRNA_BEST
  • expressed_sequence_match:BLAT_Caen_ncRNA_OTHER
  • expressed_sequence_match:BLAT_Caen_tc1_BEST
  • expressed_sequence_match:BLAT_Caen_tc1_OTHER
  • expressed_sequence_match:BLAT_EST_BEST
  • expressed_sequence_match:BLAT_EST_OTHER
  • expressed_sequence_match:BLAT_mRNA_BEST
  • expressed_sequence_match:BLAT_mRNA_OTHER
  • expressed_sequence_match:EMBL_nematode_cDNAs-BLAT
  • expressed_sequence_match:NEMATODE.NET_cDNAs-BLAT
  • expressed_sequence_match:NEMBASE_cDNAs-BLAT
  • five_prime_UTR:WormBase
  • gene:WormBase
  • intron:WormBase
  • inverted_repeat:inverted
  • low_complexity_region:dust
  • mRNA:WormBase
  • nc_primary_tra
  • nc_primary_transcript:history
  • nucleotide_match:EXONERATE_BAC_END_BEST
  • nucleotide_match:EXONERATE_BAC_END_OTHER
  • nucleotide_match:TEC_RED
  • primary_transcript:history
  • protein_match:UniProt-BLASTX
  • protein_match:bmalayi_proteins-BLASTX
  • protein_match:cbrenneri_proteins-BLASTX
  • protein_match:cbriggsae_proteins-BLASTX
  • protein_match:celegans_proteins-BLASTX
  • protein_match:cjaponica_proteins-BLASTX
  • protein_match:cremanei_proteins-BLASTX
  • protein_match:dmelanogaster_proteins-BLASTX
  • protein_match:hsapiens_proteins-BLASTX
  • protein_match:ppacificus_proteins-BLASTX
  • protein_match:scerevisiae_proteins-BLASTX
  • pseudogenic_transcript:WormBase
  • reagent:Oligo_set
  • region:
  • repeat_region:
  • sequence_motif:translated_feature
  • substitution:Allele
  • tRNA:tRNA_mature_transcript
  • tandem_repeat:tandem
  • transcript_region:RNASeq_F_asymmetry
  • transcript_region:RNASeq_reads
  • transposable_element_CDS:Transposon_CDS
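
Each entry in the list above pairs a feature type with a source, separated by a colon. As a small sketch of how such entries could be grouped by type, assuming they are available as plain `type:source` strings (the helper `group_sources_by_type` is hypothetical and does not query the `typelist` table itself):

```python
from collections import defaultdict

def group_sources_by_type(entries):
    """Group "type:source" strings into {type: [source, ...]}.

    Entries with nothing after the colon (e.g. "region:") or with no
    colon at all yield an empty-string source.
    """
    grouped = defaultdict(list)
    for entry in entries:
        feature_type, _, source = entry.partition(":")
        grouped[feature_type].append(source)
    return dict(grouped)

sample = ["CDS:WormBase", "CDS:genBlastG", "region:", "gene:WormBase"]
by_type = group_sources_by_type(sample)
print(by_type)
```

Using `partition` rather than `split` keeps the function tolerant of entries that lack a source, several of which appear in the list.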

WS238

Table name format: genus-initial_species_bioproject-id_release

Example: a_suum_PRJNA62057_WS238

{| class="wikitable"
|-
! Table !! Data Format !! Format Changed from WS237 to WS238? !! Landmark Includes Bioproject ID?
|-
| a_suum_PRJNA62057_WS238 || GFF3 || no || no
|-
| a_suum_PRJNA80881_WS238 || GFF3 || no || yes (e.g., As_PRJNA80881_Scaffold122)
|-
| b_malayi_PRJNA10729_WS238 || GFF2 || no || ?
|-
| b_xylophilus_PRJEA64437_WS238 || GFF3 || no || no
|-
| c_angaria_PRJNA51225_WS238 || GFF3 || no || no
|-
| c_brenneri_PRJNA20035_WS238 || GFF2 || no || ?
|-
| c_briggsae_PRJNA10731_WS238 || GFF2 || no || ?
|-
| c_elegans_PRJNA13758_WS238 || GFF2 || no || ?
|-
| c_japonica_PRJNA12591_WS238 || GFF2 || no || ?
|-
| c_remanei_PRJNA53967_WS238 || GFF2 || no || ?
|-
| c_sp11_PRJNA53597_WS238 || GFF3 || no || no
|-
| c_sp5_PRJNA194557_WS238 || GFF3 || no || no
|-
| h_bacteriophora_PRJNA13977_WS238 || GFF3 || no || no
|-
| h_contortus_PRJEB506_WS238 || GFF3 || no || no
|-
| l_loa_PRJNA60051_WS238 || GFF3 || no || no
|-
| m_hapla_PRJNA29083_WS238 || GFF3 || no || no
|-
| m_incognita_PRJEA28837_WS238 || GFF3 || no || no
|-
| p_pacificus_PRJNA12644_WS238 || GFF2 || no || ?
|-
| s_ratti_PRJEA62033_WS238 || GFF3 || no || no
|-
| t_spiralis_PRJNA12603_WS238 || GFF3 || no || no
|}
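
Filling in the "Landmark Includes Bioproject ID?" column amounts to checking whether a landmark name carries the dataset's bioproject ID. A minimal sketch follows, assuming landmarks are underscore-delimited as in the single example given on this page (As_PRJNA80881_Scaffold122); other datasets may use a different layout, and `landmark_includes_bioproject` is a hypothetical helper.

```python
def landmark_includes_bioproject(landmark, bioproject_id):
    """True if bioproject_id appears as a whole underscore-separated field."""
    return bioproject_id in landmark.split("_")

with_id = landmark_includes_bioproject("As_PRJNA80881_Scaffold122", "PRJNA80881")
without_id = landmark_includes_bioproject("Scaffold122", "PRJNA80881")
print(with_id, without_id)
```

Matching on whole fields rather than substrings avoids treating a shorter ID as present when it is only a prefix of a longer one.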