Curated data types


Data types that are flagged and extracted from each primary research publication in our database

Data type: Definition/examples
Species: species of focus in the paper, i.e., C. elegans, C. elegans other than Bristol, nematodes other than C. elegans, or non-nematode species.
Genes studied or cloned: C. elegans or other Caenorhabditis genes studied or newly identified in the paper.
Alleles: new alleles that do not already exist in WormBase.
Genetic mapping data: gene location was determined using genetic tools, e.g., 2-factor, 3-factor interval linkage, Df breakpoints, etc.
Phenotype analysis: phenotypes of mutants or phenotypic analysis of "wild-type" nematode strains.
Overexpression phenotype: phenotypes caused by overexpressing a gene from a transgene.
Small-scale and large-scale RNAi experiments: RNAi sequences used in gene function assays and their phenotypic outcomes.
Mosaic analysis: results of lineage analysis experiments assaying gene function in specific cells.
Tissue or cell site of action: results of experiments where gene function was assayed in specific cells or tissues, such as cases where gene function was rescued by cell- or tissue-specific expression of the gene.
Time of action: results of experiments where the timing of a gene's function was assayed, for example with temperature-shift experiments.
Molecular function of a gene product: a novel molecular function, or aspect of molecular function, for a gene.
Homolog of a human disease-associated gene: whether or not a gene studied in the paper is a homolog of a human gene that is directly associated with a disease.
Genetic interactions: a gene was demonstrated to affect the function of another gene. This is often made apparent by the analysis of double, triple, etc. mutants, or by experiments in which RNAi was combined with other RNAi treatments or mutations.
Functional complementation: functional redundancy was demonstrated between separate genes, e.g., the rescue of gene-A by overexpression of gene-B or any other extragenic sequence, or the rescue of gene function by a gene from another species.
Gene product interactions: physical protein-protein, RNA-protein, or DNA-protein interactions were demonstrated, including through Y2H experiments, etc.
Expression patterns: new temporal or spatial (e.g., tissue, subcellular, etc.) patterns of expression of any gene in a wild-type background, which includes reporter gene analysis, antibody staining, in situ hybridization, RT-PCR, Western or Northern blot data.
Images: images from the published literature related to gene expression studies; only images with copyright permission from the journal will be extracted and displayed.
Gene regulation: alterations in gene expression levels or patterns were reported in response to genetic, chemical, temperature, or any other experimental treatment.
Regulatory sequence features: any gene expression regulatory elements, e.g., DNA/RNA elements required for gene expression, promoters, introns, UTRs, DNA binding sites, etc.
Position frequency matrix (PFM) or Position weight matrix (PWM): PFMs or PWMs, which are typically used to define regulatory sites in genomic DNA (e.g., bound by transcription factors) or mRNA (e.g., bound by translational factors or miRNA). PFMs define simple nucleotide frequencies, while PWMs are scaled logarithmically against a background frequency.
Microarray data: self-explanatory.
Protein analysis in vitro: any in vitro protein analysis such as kinase assays, agonist pharmacological studies, or reconstitution studies.
Domain analysis: particular domains within a protein were targeted for genetic or molecular analysis.
Covalent modification: post-translational modifications of a gene product were assayed by mutagenesis or in vitro analysis.
Structural information: protein structure data from NMR, X-ray crystallography, etc.
Mass spectrometry: small peptide mass data from mass spectrometry analysis (MS/MS, LCMS, HRMS) using analysis programs such as MASCOT, SEQUEST, X!Tandem, OMSSA, MassMatrix.
C. elegans antibodies: antibodies generated against C. elegans gene products in a noncommercial laboratory.
Integrated transgenes: integrated transgenes that do not already exist in WormBase.
Transgenes used as tissue markers: reporter constructs (integrated transgenes) used to mark certain tissues, subcellular structures, or life stages, etc., as a reference to assay the site of action of gene function or location.
Small molecules: molecules, chemicals, or drugs used in gene regulation experiments, or noted as a cause or influence of a phenotype.
Gene structure correction: gene structures that differ from those in WormBase, e.g., different splice sites, 3' UTR, etc.
Sequence of mutant alleles: sequence for any mutation.
New SNPs: SNPs that do not already exist in WormBase.
Ablation data: cells or anatomical units were ablated by laser or other means (e.g., by expressing a cell-toxic protein).
Cell function: novel functions for any anatomical part (e.g., cell, tissue, etc.).
Phylogenetic data: evolutionary relationships between or among genes or gene products.
Other bioinformatics analysis: bioinformatic data not covered by other data types.
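
The distinction between a PFM and a PWM in the table above can be illustrated with a short Python sketch. This is only an illustration and not WormBase's curation code: the function names, the example binding sites, the uniform background of 0.25 per base, and the pseudocount of 1 are all assumptions chosen for the example. The PFM holds raw per-position base counts; the PWM is the log2 ratio of each (pseudocount-smoothed) frequency to the background frequency.

```python
import math

BASES = "ACGT"

def pfm_from_sites(sites):
    """Count each base at each position across equal-length aligned sites."""
    length = len(sites[0])
    pfm = {base: [0] * length for base in BASES}
    for site in sites:
        for position, base in enumerate(site):
            pfm[base][position] += 1
    return pfm

def pwm_from_pfm(pfm, background=None, pseudocount=1.0):
    """Convert counts to log2 odds against a background frequency.

    The uniform background and the pseudocount are assumptions made
    for this sketch; real analyses choose them per genome and data set.
    """
    if background is None:
        background = {base: 0.25 for base in BASES}
    length = len(pfm["A"])
    pwm = {base: [] for base in BASES}
    for position in range(length):
        # Smoothed column total, so that zero counts do not give log(0).
        total = sum(pfm[base][position] for base in BASES)
        total += pseudocount * len(BASES)
        for base in BASES:
            frequency = (pfm[base][position] + pseudocount) / total
            pwm[base].append(math.log2(frequency / background[base]))
    return pwm

# Hypothetical aligned binding sites, invented for illustration only.
sites = ["TGACGT", "TGACGA", "TGAAGT", "TGACGT"]
pfm = pfm_from_sites(sites)
pwm = pwm_from_pfm(pfm)
```

With these example sites, the first column of the PFM counts T four times and every other base zero times, while the PWM gives T a positive score and the other bases negative scores at that position.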