Difference between revisions of "NGASP Results"

Latest revision as of 17:54, 16 August 2010

NGASP Gene prediction competition

The nGASP project parallels recent computational prediction initiatives including CASP, GASP, and EGASP. Participation is open to all academic, private sector, and government researchers and results will be made immediately available through a public database. A summary of the results will be submitted for peer-reviewed publication.

For nGASP, a set of regions representing ~10% (10 Mb) of the C. elegans genome release WS160 was selected to evaluate the performance of the participating gene predictors. We have selected two sets of regions: the training set (10 Mb) is to be used in training your software (if needed) and includes curated gene models from WormBase. We will use the test set (also 10 Mb) for evaluating gene prediction software.


Initial results for NGASP

These are the nucleotide-level and exon-level results for NGASP. Transcript-level and gene-level results will be announced soon.

The nucleotide-level and exon-level results were calculated by using a benchmark consisting of all confirmed and non-confirmed isoforms in all genes in the test set regions.

In this analysis, when we say "exon-level" accuracy we are referring to the accuracy in predicting coding exons (CDSs). Accuracy in predicting UTRs is not considered in these results (but will be included in the final NGASP results, to be announced soon).

In the data below, "Sn" is an abbreviation for sensitivity and "Sp" is an abbreviation for specificity. The genefinders in each category are ranked according to the average of their sensitivity and specificity.
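This page does not spell out how Sn and Sp are computed. The sketch below assumes the convention common in gene prediction assessment, Sn = TP/(TP+FN) and Sp = TP/(TP+FP), and assumes that an exon counts as correct only when both boundaries match exactly. The function name `sn_sp` and the toy coordinates are hypothetical and are not taken from the NGASP evaluation codes.

```python
def sn_sp(predicted, annotated):
    """Return (Sn, Sp, average) as percentages for two sets of features.

    Assumed definitions (not stated on this page):
      Sn = TP / (TP + FN) = shared features / annotated features
      Sp = TP / (TP + FP) = shared features / predicted features
    """
    true_pos = len(predicted & annotated)
    sn = 100.0 * true_pos / len(annotated) if annotated else 0.0
    sp = 100.0 * true_pos / len(predicted) if predicted else 0.0
    return sn, sp, (sn + sp) / 2


# Nucleotide level: each feature is one coding position (toy coordinates).
annotated_bases = set(range(100, 200)) | set(range(300, 400))
predicted_bases = set(range(100, 200)) | set(range(320, 420))

# Exon level: each feature is a (start, end) pair; only exact matches count.
annotated_exons = {(100, 199), (300, 399)}
predicted_exons = {(100, 199), (320, 419)}

print(sn_sp(predicted_bases, annotated_bases))  # (90.0, 90.0, 90.0)
print(sn_sp(predicted_exons, annotated_exons))  # (50.0, 50.0, 50.0)
```

In this toy case the second predicted exon overlaps the annotation for 80 of its 100 bases, so it contributes to the nucleotide-level score but not to the exon-level score. This is one reason exon-level figures tend to be lower than nucleotide-level figures.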

The numbers given below are percentages. In the names of the programs we give the NGASP category and version. For example, "MGENE_1_2" refers to the MGENE category 1, version 2 submitted gene set.
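As a minimal sketch of how the result lines below could be read programmatically, the following splits a submission name into program, category, and version, then ranks entries by the average of Sn and Sp. The helper name `parse_result` is hypothetical. Splitting from the right is a design choice, so that a program name with its own digits, underscores, or punctuation (for example "FGENESH++C" or "SGP2") stays intact.

```python
def parse_result(line):
    """Split one result line into program, category, version and scores."""
    name, *fields = line.split()
    # Split from the right: the last two parts are category and version.
    program, category, version = name.rsplit("_", 2)
    scores = {key: float(value) for key, value in (f.split("=") for f in fields)}
    return {"program": program, "category": int(category),
            "version": int(version), **scores}


# Three category 2 nucleotide-level lines copied from this page.
lines = [
    "SGP2_2_1\taverage=86.93\tSn=83.86\tSp=89.99",
    "MGENE_2_2\taverage=92.33\tSn=93.74\tSp=90.91",
    "EUGENE_2_1\taverage=89.35\tSn=91.21\tSp=87.48",
]

results = [parse_result(line) for line in lines]

# Rank within the category by the average of sensitivity and specificity.
ranked = sorted(results, key=lambda r: (r["Sn"] + r["Sp"]) / 2, reverse=True)
for r in ranked:
    print(r["program"], r["category"], r["version"], r["average"])
```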

These results have been confirmed by two different evaluation codes written by different individuals, and therefore we are confident that they are correct. However, if you notice any anomalies with these results, please contact the NGASP organisers.

We thank all participants for taking part in NGASP, and will be announcing further results (e.g. transcript-level and gene-level results) soon.

Nucleotide-level results for NGASP

Sn = Sensitivity
Sp = Specificity
average = (Sn + Sp)/2

Category 1 (ab initio) genefinders

MGENE_1_1	average=92.31	Sn=93.14	Sp=91.48
MGENE_1_3	average=92.15	Sn=92.71	Sp=91.59
MGENE_1_2	average=92.14	Sn=92.67	Sp=91.60
FGENESH_1_1	average=91.16	Sn=95.20	Sp=87.11
CRAIG_1_1	average=91.03	Sn=91.15	Sp=90.91
GLIMMERHMM_1_1	average=90.45	Sn=93.28	Sp=87.62
AUGUSTUS_1_1	average=90.34	Sn=91.66	Sp=89.02
AUGUSTUS_1_2	average=90.31	Sn=91.31	Sp=89.30
GENEMARKHMM_1_1	average=88.91	Sn=94.76	Sp=83.06
EXONHUNTER_1_1	average=88.71	Sn=91.38	Sp=86.03
SNAP_1_1	average=87.46	Sn=90.45	Sp=84.47
GENEID_1_1	average=87.41	Sn=86.57	Sp=88.24
EUGENE_1_1	average=87.07	Sn=84.65	Sp=89.48
AGENE_1_1	average=84.22	Sn=85.02	Sp=83.41

Category 2 (dual/multi-genome) genefinders

MGENE_2_2	average=92.33	Sn=93.74	Sp=90.91
MGENE_2_1	average=92.30	Sn=93.68	Sp=90.91
EUGENE_2_1	average=89.35	Sn=91.21	Sp=87.48
NSCAN_2_1	average=89.33	Sn=90.59	Sp=88.07
SGP2_2_1	average=86.93	Sn=83.86	Sp=89.99

Category 3 genefinders (using alignments of proteins, ESTs, or mRNAs)

MGENE_3_3	average=92.54	Sn=93.19	Sp=91.88
MGENE_3_1	average=92.52	Sn=93.15	Sp=91.89
FGENESH++_3_1	average=92.33	Sn=94.95	Sp=89.70
AUGUSTUS_3_1	average=92.28	Sn=94.03	Sp=90.52
GRAMENE_3_2	average=91.49	Sn=88.21	Sp=94.77
MGENE_3_2	average=91.40	Sn=94.93	Sp=87.86
GRAMENE_3_1	average=90.82	Sn=86.21	Sp=95.42
EXONHUNTER_3_1	average=90.21	Sn=93.10	Sp=87.31
EUGENE_3_2	average=89.97	Sn=94.84	Sp=85.09
EUGENE_3_1	average=89.69	Sn=94.08	Sp=85.30
MAKER_3_1	average=85.71	Sn=82.92	Sp=88.50
MAKER_3_2	average=84.67	Sn=78.28	Sp=91.05
EXONHUNTER_3_2	average=81.49	Sn=71.02	Sp=91.96

Category 4 genefinders (combiners)

JIGSAW_4_2	average=93.80	Sn=95.93	Sp=91.66
JIGSAW_4_1	average=93.45	Sn=93.68	Sp=93.22
EVIGAN_4_1	average=93.43	Sn=97.27	Sp=89.59
GENEID_4_2	average=93.27	Sn=94.57	Sp=91.97
GENEID_4_1	average=93.23	Sn=94.96	Sp=91.50
FGENESH++C_4_1	average=92.60	Sn=95.51	Sp=89.68
GENOMIX_4_2	average=92.00	Sn=93.60	Sp=90.40
GLEAN_4_1	average=91.88	Sn=96.48	Sp=87.28
EUGENE_4_2	average=90.41	Sn=95.46	Sp=85.35
EUGENE_4_4	average=90.24	Sn=95.14	Sp=85.34
EUGENE_4_1	average=90.24	Sn=94.88	Sp=85.60
EUGENE_4_3	average=90.05	Sn=94.50	Sp=85.60
GENOMIX_4_1	average=89.84	Sn=91.13	Sp=88.55
GESECA_4_1	average=89.79	Sn=96.74	Sp=82.84
GRAMENE_4_1	average=85.61	Sn=90.33	Sp=80.89

Exon-level results for NGASP

Sn = Sensitivity
Sp = Specificity
average = (Sn + Sp)/2

Category 1 (ab initio) genefinders

MGENE_1_1	average=78.12	Sn=77.65	Sp=78.58
MGENE_1_2	average=77.87	Sn=77.04	Sp=78.70
MGENE_1_3	average=77.86	Sn=77.08	Sp=78.63
FGENESH_1_1	average=76.08	Sn=78.60	Sp=73.55
CRAIG_1_1	average=75.63	Sn=73.11	Sp=78.15
AUGUSTUS_1_2	average=74.55	Sn=74.76	Sp=74.33
AUGUSTUS_1_1	average=74.33	Sn=76.10	Sp=72.55
GLIMMERHMM_1_1	average=73.84	Sn=76.30	Sp=71.37
GENEMARKHMM_1_1	average=71.02	Sn=76.45	Sp=65.58
EUGENE_1_1	average=70.11	Sn=67.22	Sp=73.00
GENEID_1_1	average=67.34	Sn=66.05	Sp=68.63
EXONHUNTER_1_1	average=65.12	Sn=67.70	Sp=62.53
SNAP_1_1	average=64.61	Sn=67.92	Sp=61.30
AGENE_1_1	average=60.40	Sn=59.71	Sp=61.09

Category 2 (dual/multi-genome) genefinders

MGENE_2_1	average=78.58	Sn=78.80	Sp=78.35
MGENE_2_2	average=78.57	Sn=78.84	Sp=78.30
EUGENE_2_1	average=73.17	Sn=73.51	Sp=72.82
NSCAN_2_1	average=71.67	Sn=72.50	Sp=70.83
SGP2_2_1	average=67.22	Sn=64.17	Sp=70.27

Category 3 genefinders (using alignments of proteins, ESTs, or mRNAs)

FGENESH++_3_1	average=82.50	Sn=84.06	Sp=80.93
AUGUSTUS_3_1	average=81.40	Sn=82.59	Sp=80.20
MGENE_3_1	average=80.45	Sn=80.22	Sp=80.67
MGENE_3_3	average=80.44	Sn=80.27	Sp=80.61
MGENE_3_2	average=78.58	Sn=81.27	Sp=75.88
EUGENE_3_1	average=76.14	Sn=80.05	Sp=72.22
EUGENE_3_2	average=76.07	Sn=81.81	Sp=70.32
GRAMENE_3_1	average=72.93	Sn=74.09	Sp=71.76
EXONHUNTER_3_1	average=72.26	Sn=75.19	Sp=69.33
GRAMENE_3_2	average=71.77	Sn=75.76	Sp=67.77
EXONHUNTER_3_2	average=67.08	Sn=57.23	Sp=76.92
MAKER_3_1	average=65.63	Sn=64.99	Sp=66.27
MAKER_3_2	average=65.49	Sn=61.48	Sp=69.49

Category 4 genefinders (combiners)

JIGSAW_4_1	average=85.28	Sn=83.20	Sp=87.36
EVIGAN_4_1	average=84.43	Sn=86.54	Sp=82.31
FGENESH++C_4_1	average=84.39	Sn=86.07	Sp=82.70
GENEID_4_1	average=84.27	Sn=84.75	Sp=83.78
GENEID_4_2	average=84.19	Sn=83.35	Sp=85.03
JIGSAW_4_2	average=83.16	Sn=83.28	Sp=83.04
GENOMIX_4_2	average=83.04	Sn=82.55	Sp=83.53
EUGENE_4_1	average=79.80	Sn=84.53	Sp=75.07
EUGENE_4_2	average=79.42	Sn=86.20	Sp=72.63
GLEAN_4_1	average=78.79	Sn=82.21	Sp=75.37
EUGENE_4_3	average=78.21	Sn=82.24	Sp=74.18
EUGENE_4_4	average=77.81	Sn=83.85	Sp=71.76
GENOMIX_4_1	average=77.17	Sn=76.98	Sp=77.36
GESECA_4_1	average=73.44	Sn=80.07	Sp=66.81
GRAMENE_4_1	average=61.56	Sn=74.40	Sp=48.71

Overall summary

Please note that this summary is intended only to give a general flavour of which programs are best at the nucleotide (base) level and the exon level; it does not cover anything else. The transcript-level and gene-level results (which have yet to be announced) are also important components of gene prediction assessment and will reveal further detail about which programs perform best overall.

In category 1 (ab initio genefinders), MGENE version 1 does best at both the exon level and the base level.

In category 2 (dual/multi-genome gene-finders), MGENE version 1 does best at the exon level and MGENE version 2 does best at the base level, but the two versions differ by only 0.03 percentage points at the base level and 0.01 percentage points at the exon level.

In category 3 (gene-finders that use alignments of proteins, ESTs, or mRNAs), FGENESH++ does best at the exon level and MGENE version 3 does best at the base level. However, FGENESH++ is 2.06 percentage points better than MGENE version 3 at the exon level, while MGENE version 3 is only 0.21 percentage points better than FGENESH++ at the base level.

In category 4 (combiners), JIGSAW version 1 does best at the exon level and JIGSAW version 2 does best at the base level. However, JIGSAW version 1 does 2.12 percentage points better than JIGSAW version 2 at the exon level, while JIGSAW version 2 does only 0.35 percentage points better than JIGSAW version 1 at the base level. Please note, however, that not all of the category 4 genefinders used exactly the same training sets, so we will be analysing these results further to determine the effect of the training set.
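The margins quoted in this summary are differences between the "average" values in the tables above. As an illustrative check for category 4, the sketch below recomputes the two JIGSAW margins; the dictionary layout and the `margin` helper are hypothetical, and the numbers are copied from the nucleotide-level and exon-level tables.

```python
# "average" values for the two JIGSAW submissions, copied from the tables above.
averages = {
    "JIGSAW_4_1": {"base": 93.45, "exon": 85.28},
    "JIGSAW_4_2": {"base": 93.80, "exon": 83.16},
}


def margin(first, second, level):
    """Difference in average score (percentage points) of first over second."""
    return round(averages[first][level] - averages[second][level], 2)


print(margin("JIGSAW_4_1", "JIGSAW_4_2", "exon"))  # 2.12
print(margin("JIGSAW_4_2", "JIGSAW_4_1", "base"))  # 0.35
```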

Get the Predictions

The final set of predictions that will be used to form the canonical gene sets for each species is available on the WormBase FTP site (http://ftp.wormbase.org/nGASP/final_gene_predictions/predictions/).
