UserGuide:SimpleMine


SimpleMine Users' Guide

SimpleMine is designed for biologists who want essential information for a list of genes without needing any command-line or programming skills. The data fields listed below are what we consider "essential information", based on user feedback. Please feel free to contact us if you would like more information added to the list.

Input: Users can submit CGC names, sequence names, WormBase Gene IDs, WormPep IDs, UniProt IDs, TreeFam IDs, and RefSeq IDs.

Output: Users can choose an HTML display or download a tab-delimited file. Each row corresponds to one gene, and each cell contains one data field. Within a cell, information is delimited at up to three levels, in this order: commas, then bars (|), then semicolons. The explanations below describe how information is organized for each type of data.
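SimpleMine itself requires no programming, but users who want to post-process the downloaded file can unpack the three delimiter levels with a short script. The following is a minimal Python sketch, not part of SimpleMine. It assumes that the first row of the file holds the column names, that commas separate entries, bars separate the fields of an entry, and semicolons separate values within a field. The function names `split_cell` and `read_rows` and the sample row are hypothetical.

```python
import csv
import io


def split_cell(cell):
    """Split one cell into entries -> fields -> values.

    Assumption: commas separate entries, bars separate fields,
    semicolons separate values. Free-text columns (descriptions,
    references) contain ordinary commas and should not be passed in.
    """
    entries = []
    for entry in cell.split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty cells and stray separators
        fields = [[value.strip() for value in field.split(";")]
                  for field in entry.split("|")]
        entries.append(fields)
    return entries


def read_rows(handle):
    """Read a tab-delimited file into one dict per gene.

    Assumption: the first row contains the column names.
    """
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


# Made-up sample standing in for a downloaded file.
sample = (
    "Public Name\tInteracting Gene\n"
    "gene-a\tgene-b|Genetic;Physical, gene-c|Regulatory\n"
)
rows = read_rows(io.StringIO(sample))
interactions = split_cell(rows[0]["Interacting Gene"])
```

For a real download, replace `io.StringIO(sample)` with an opened file handle, and check the column names against the header of your own file.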


Names, Identifiers, Sequences, Species

WormBase Gene ID

Public Name

Species

Sequence Name

Other Name

Transcript

Operon

WormPep

Protein Domain

UniProt

Reference UniProt ID

TreeFam

RefSeq_mRNA

RefSeq_protein


Genetics, Phenotypes, Interactions

Genetic Map Position: Display the chromosome and the chromosomal position of the gene.

RNAi Phenotype Observed: Display the phenotype ontology names sorted in alphabetical order.

Allele Phenotype Observed: Display the phenotype ontology names sorted in alphabetical order.

Coding_exon Non_silent Allele: Among those alleles that were sequenced, we exclude polymorphisms and display only alleles that fall in a coding exon. Alleles are sorted and displayed in the following order of their types: Deletion, Insertion, Substitution, Tandem_duplication. Each allele entry contains three bar-separated fields: allele name, allele type, and molecular change.
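For users who process this column in a script, the three-field allele entries can be grouped by type. This is a rough Python sketch under stated assumptions: entries are taken to be comma-separated, the molecular change field is assumed to contain no commas or bars, and the names `Allele`, `parse_alleles`, and `group_by_type`, as well as the sample cell, are hypothetical.

```python
from typing import NamedTuple

# Display order of allele types, as listed in this guide.
TYPE_ORDER = ["Deletion", "Insertion", "Substitution", "Tandem_duplication"]


class Allele(NamedTuple):
    name: str
    allele_type: str
    molecular_change: str


def parse_alleles(cell):
    """Parse 'name|type|change' entries; skip anything without three fields."""
    alleles = []
    for entry in cell.split(","):
        parts = [part.strip() for part in entry.split("|")]
        if len(parts) == 3:
            alleles.append(Allele(*parts))
    return alleles


def group_by_type(alleles):
    """Map each allele type to the allele names of that type."""
    groups = {allele_type: [] for allele_type in TYPE_ORDER}
    for allele in alleles:
        groups.setdefault(allele.allele_type, []).append(allele.name)
    return groups


# Made-up cell content for illustration only.
cell = "xx1|Deletion|change-1, xx2|Substitution|change-2, xx3|Deletion|change-3"
alleles = parse_alleles(cell)
groups = group_by_type(alleles)
```

If the molecular change text in your data does contain commas, the simple comma split above will need to be replaced with a stricter pattern.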

Interacting Gene: We only display experimentally confirmed gene interactions (Physical, Genetic, and Regulatory). The genes are displayed in the following order: genes with all three types of interactions detected, genes with two out of three types of interactions detected, genes with one type of interaction detected. Each gene entry contains two bar-separated fields: gene name and interaction types. The interaction types are separated with semicolons.
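The ordering described above can be reproduced when post-processing the column. The Python sketch below is illustrative only: it assumes comma-separated entries of the form gene name, bar, semicolon-separated interaction types, and the function names and sample cell are hypothetical.

```python
def parse_interacting_genes(cell):
    """Return (gene, [interaction types]) pairs from one cell."""
    pairs = []
    for entry in cell.split(","):
        gene, sep, types = entry.strip().partition("|")
        if not sep:
            continue  # skip empty or malformed entries
        type_list = [t.strip() for t in types.split(";") if t.strip()]
        pairs.append((gene.strip(), type_list))
    return pairs


def group_by_type_count(pairs):
    """Group gene names by how many distinct interaction types they have."""
    groups = {3: [], 2: [], 1: []}
    for gene, types in pairs:
        groups.setdefault(len(set(types)), []).append(gene)
    return groups


# Made-up cell content for illustration only.
cell = "gene-x|Genetic;Physical;Regulatory, gene-y|Genetic;Regulatory, gene-z|Physical"
pairs = parse_interacting_genes(cell)
groups = group_by_type_count(pairs)
```

Here `groups[3]` holds the genes with all three interaction types detected, which correspond to the genes listed first in the cell.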


Expression

Expr_pattern Tissue: Anatomical expression based on GFP, immunoprecipitation, In_situ, etc. Anatomy names are displayed in alphabetical order.

Genomic Study Tissue: Tissue enrichment based on microarray, RNA-Seq, and proteomics studies. Anatomy names are displayed in alphabetical order.

Expr_pattern LifeStage: Developmental expression based on GFP, immunoprecipitation, In_situ, etc. Life stages are displayed in alphabetical order.

Genomic Study LifeStage: Developmental expression based on microarray, RNA-Seq, and proteomics studies. Life stages are displayed in alphabetical order.


Human Orthologs and Disease

Disease Info: Display the disease names associated with the gene. Each disease entry contains two bar-separated fields: disease name and the evidence (By Orthology or By Experiment).
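Disease names can themselves contain commas, so a plain comma split may cut an entry in the wrong place. One possible workaround, sketched in Python below, is to anchor each entry on its evidence label instead. This assumes the labels appear verbatim as "By Orthology" and "By Experiment" and that entries are comma-separated; the function name `parse_disease_info` and the sample cell are hypothetical.

```python
import re

# Each entry is 'disease name|evidence'; the evidence label marks the
# end of the entry, so commas inside the disease name are preserved.
ENTRY_PATTERN = re.compile(r"(.+?)\|(By Orthology|By Experiment)(?:,\s*|$)")


def parse_disease_info(cell):
    """Return (disease name, evidence) pairs from one cell."""
    return [(name.strip(), evidence)
            for name, evidence in ENTRY_PATTERN.findall(cell)]


# Made-up cell content for illustration only.
cell = "disease A, type 1|By Orthology, disease B|By Experiment"
diseases = parse_disease_info(cell)
experimental = [name for name, evidence in diseases
                if evidence == "By Experiment"]
```

The `experimental` list keeps only the diseases whose association is supported by experiment rather than inferred by orthology.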

Human Ortholog: Display the human orthologs of the gene. Each ortholog entry contains two bar-separated fields: ortholog name and algorithms that predicted the orthology. The algorithms are separated with semicolons.


Functional Annotation and References

Gene Ontology Association: Display the names of the gene ontology terms annotated to the gene, sorted in alphabetical order.

Concise Description: Outdated, manually written descriptions of gene function.

Automated Description: Up-to-date, machine-generated gene descriptions based on current WormBase data.

Expression Cluster Summary: Gene regulation, molecular regulation, and tissue enrichment summary based on microarray, RNA-Seq, and proteomics studies.

Reference: Primary research articles that studied the gene.