WormBase Model:Gene cluster

WormBase Models

Curator Comments/Description

This class appears to be almost unused: the tags that would add value (Title and Description) are not populated, and only Contains_gene, with its XREF, is in use.

I believe the data is a little stale and would benefit from a cleanup.

Model

?Gene_cluster Title UNIQUE ?Text                                        // Information on the Gene cluster 
              Description ?Text
              Contains_gene ?Gene XREF In_cluster
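
The observation that only Contains_gene is populated could be checked against a .ace dump of the class. The following is a minimal Python sketch, not WormBase tooling: the names `tag_usage` and `unused_tags`, the cluster names, and the sample text are hypothetical, and it assumes one tag per line with each object introduced by a `Class : "name"` header.

```python
from collections import Counter

# Tags defined by the current ?Gene_cluster model on this page.
MODEL_TAGS = ("Title", "Description", "Contains_gene")

# Hypothetical dump used only for illustration; cluster names are made up.
SAMPLE_DUMP = '''
Gene_cluster : "example_cluster_1"
Contains_gene "WBGene00000105"
Contains_gene "WBGene00000106"

Gene_cluster : "example_cluster_2"
Contains_gene "WBGene00002987"
'''


def tag_usage(ace_text, class_name="Gene_cluster"):
    """Count how many lines carry each tag inside objects of class_name."""
    counts = Counter()
    in_class = False
    for raw in ace_text.splitlines():
        line = raw.strip()
        if not line:
            in_class = False  # a blank line ends the current object
        elif line.startswith(class_name + " : "):
            in_class = True  # header of an object we want to tally
        elif " : " in line and line.split(" : ", 1)[0].isidentifier():
            in_class = False  # header of an object from another class
        elif in_class:
            counts[line.split(None, 1)[0]] += 1  # first word is the tag
    return counts


def unused_tags(counts, model_tags=MODEL_TAGS):
    """Return the model tags that never appear in the dump."""
    return sorted(set(model_tags) - set(counts))


counts = tag_usage(SAMPLE_DUMP)
print(dict(counts))
print(unused_tags(counts))
```

Run against a real dump, the second list would show which tags are candidates for cleanup.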

Proposed Changes

WS242 for phylogenetic-based clusters

Proposed by Karen Yook
Model changes to capture clusters of genes with correlated conservation patterns

?Gene_cluster
?Analysis classes
?WBProcess
?Paper

Proposed tags are marked with a "for WS242" comment

?Gene_cluster class

?Gene_cluster     Title    UNIQUE        ?Text
            Description    ?Text
            Contains_gene ?Gene    XREF    In_cluster
            Contains_WB_gene ?Gene 
                   DB_info Database ?Database ?Database_field ?Text //for genes from non WB species 
            Attribute_of Analysis ?Analysis XREF Gene_cluster //for specifying analysis conditions, algorithms etc, for WS242
            Associated_with WBProcess ?WBProcess XREF Gene_cluster //to allow clusters to be linked to Topics, for WS242
            Reference ?Paper //paper where cluster was determined, for WS242

?Analysis class - used for adding metadata from large-scale experiments


?Analysis    DB_info Database ?Database ?Database_field ?Text
         Title Text
             Based_on_WB_Release Int
             Based_on_DB_Release Text
         Group Project UNIQUE ?Analysis XREF Subproject
             Subproject      ?Analysis XREF Project
         Description ?Text
         Sample ?Condition XREF Analysis
         Expression_cluster ?Expression_cluster XREF Analysis
         Gene_cluster ?Gene_cluster XREF Analysis //XREF for linking analysis used to generate cluster, for WS242
         Species_in_analysis ?Species XREF In_analysis //for recording the species used to generate phylogenetic profiles, for WS242
         Reference ?Paper XREF Describes_analysis
         Conducted_by ?Person XREF Conducted  
         URL Text 

?WBProcess    Name  ?Text
        Summary ?Text #Evidence
        Process_term ?Text
        Other_name ?Text
        Related_process ?WBProcess XREF Related_process
        Taxon   NCBITaxonomyID ?Text
        Involved_entity        Gene ?Gene XREF WBProcess #Evidence  
                    Expression_cluster ?Expression_cluster XREF WBProcess  #Evidence
                    Interaction ?Interaction XREF WBProcess #Evidence
                    Anatomy_term ?Anatomy_term XREF WBProcess #Evidence
                    Life_stage ?Life_stage XREF WBProcess #Evidence
                    Molecule ?Molecule XREF WBProcess #Evidence
                    Gene_cluster ?Gene_cluster XREF WBProcess #Evidence //XREF to join clusters to topics, for WS242
        Associated_with        Phenotype ?Phenotype XREF WBProcess #Evidence
                      GO_term ?GO_term  #Evidence
        Picture ?Picture #Evidence
        Movie ?Movie  #Evidence
        Pathway DB_info Database ?Database_field ?Accession_number
        Remark ?Text #Evidence
        Reference ?Paper XREF WBProcess

?Species
    In_analysis ?Analysis XREF Species_in_analysis //for WS242
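
Each XREF in these models only works if the target class defines the named reciprocal tag. As a rough illustration of how that could be checked, here is a minimal Python sketch. The function names `parse_models` and `missing_reciprocals` are hypothetical, the tokenizer is a simplification of ACeDB model syntax, and the model text is an abbreviated excerpt: the current ?Gene_cluster model plus the proposed ?Analysis and ?Species lines from this page.

```python
# Words that are model keywords or primitive types rather than tag names.
KEYWORDS = {"UNIQUE", "XREF", "Text", "Int", "Float", "DateType"}

# Abbreviated excerpt of the models on this page (an assumption of this sketch).
MODEL_EXCERPT = """\
?Analysis Gene_cluster ?Gene_cluster XREF Analysis
          Species_in_analysis ?Species XREF In_analysis
?Gene_cluster Title UNIQUE ?Text
              Contains_gene ?Gene XREF In_cluster
?Species In_analysis ?Analysis XREF Species_in_analysis
"""


def parse_models(model_text):
    """Return ({class: set of tags}, [(class, tag, target_class, target_tag)])."""
    tags = {}
    xrefs = []
    current = None
    for raw in model_text.splitlines():
        tokens = raw.split("//", 1)[0].split()  # drop trailing comments
        if not tokens:
            continue
        if raw.startswith("?"):  # a class definition starts in column 0
            current = tokens.pop(0)[1:]
            tags.setdefault(current, set())
        if current is None:
            continue
        last_tag = None
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            if tok == "XREF" and i + 1 < len(tokens):
                target = tokens[i - 1]
                if i > 0 and target.startswith("?") and last_tag:
                    xrefs.append((current, last_tag, target[1:], tokens[i + 1]))
                i += 2  # skip the reciprocal tag name; it belongs to the target
                continue
            if tok[0] not in "?#" and tok not in KEYWORDS:
                tags[current].add(tok)
                last_tag = tok
            i += 1
    return tags, xrefs


def missing_reciprocals(model_text):
    """List XREFs whose target class is defined in the text but lacks the tag."""
    tags, xrefs = parse_models(model_text)
    problems = []
    for cls, tag, target_cls, target_tag in xrefs:
        if target_cls in tags and target_tag not in tags[target_cls]:
            problems.append(f"?{cls} {tag} -> ?{target_cls} lacks tag {target_tag}")
    return problems


for problem in missing_reciprocals(MODEL_EXCERPT):
    print(problem)
```

On this excerpt the check reports that the current ?Gene_cluster model has no Analysis tag, which is the gap the proposed Attribute_of tag is meant to fill. Classes not defined in the text, such as ?Gene, are skipped.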

<pre>

SAMPLE .ace
Gene_cluster : "Correlated conservation with RDE-1"
Description "C. elegans genes showing correlated conservation patterns with RDE-1 are predicted to function in small RNA pathways such as miRNA and siRNA-based gene silencing. The signature phylogenetic profile of the Argonaute proteins is that they are absent in 9 out of 31 Ascomycota species, 1 out of 3 Basidiomycota species, and 6 out of 14 protist species, but have not been lost in any of the 33 animal or 6 land plant species compared. High-confidence candidates from this analysis were tested for defects in RNAi silencing. RDE-1 was used to generate a rank-order list of proteins with similar phylogenetic profiles."
Attribute_of Analysis "Tabach et al., 2013 Correlated conservation"
WBProcess "RNA interference"
Contains_gene    "WBGene00000105"
Contains_gene    "WBGene00000106"
Contains_gene    "WBGene00002987"
Contains_gene    "WBGene00003031"
Contains_gene    "WBGene00003384"
Contains_gene    "WBGene00003957"
Contains_gene    "WBGene00003958"
Contains_gene    "WBGene00003959"
Contains_gene    "WBGene00004093"
Contains_gene    "WBGene00004094"
Contains_gene    "WBGene00004178"
Contains_gene    "WBGene00004179"
Contains_gene    "WBGene00004323"
Contains_gene    "WBGene00006409"
Contains_gene    "WBGene00006449"
Contains_gene    "WBGene00007106"
Contains_gene    "WBGene00007297"
Contains_gene    "WBGene00007578"
Contains_gene    "WBGene00007624"
Contains_gene    "WBGene00010263"
Contains_gene    "WBGene00010480"
Contains_gene    "WBGene00011061"
Contains_gene    "WBGene00011910"
Contains_gene    "WBGene00011945"
Contains_gene    "WBGene00012379"
Contains_gene    "WBGene00013606"
Contains_gene    "WBGene00013942"
Contains_gene    "WBGene00015143"
Contains_gene    "WBGene00015337"
Contains_gene    "WBGene00016134"
Contains_gene    "WBGene00017063"
Contains_gene    "WBGene00017138"
Contains_gene    "WBGene00017592"
Contains_gene    "WBGene00017594"
Contains_gene    "WBGene00017641"
Contains_gene    "WBGene00018757"
Contains_gene    "WBGene00018862"
Contains_gene    "WBGene00018921"
Contains_gene    "WBGene00018984"
Contains_gene    "WBGene00019666"
Contains_gene    "WBGene00019752"
Contains_gene    "WBGene00019862"
Contains_gene    "WBGene00019971"
Contains_gene    "WBGene00020172"
Contains_gene    "WBGene00020509"
Contains_gene    "WBGene00020707"
Contains_gene    "WBGene00021711"
Contains_gene    "WBGene00022877"
Contains_gene    "WBGene00044302"
Reference "WBPaper00041947"

Gene_cluster : "Correlated conservation with miRNA and siRNA pathway proteins"
Description "Validated miRNA and siRNA pathway proteins with a shared phylogenetic profile. The driving pattern of this phylogenetic profile correlation is strong conservation in all animals and particular protists, but no homologue in any of the fungi or plants tested. DCR-1 was used to generate a rank-order list of proteins with similar phylogenetic profiles."
WBProcess "RNA interference"
Contains_gene    "WBGene00000105"
Contains_gene    "WBGene00000106"
Contains_gene    "WBGene00000472"
Contains_gene    "WBGene00000473"
Contains_gene    "WBGene00000474"
Contains_gene    "WBGene00000507"
Contains_gene    "WBGene00000508"
Contains_gene    "WBGene00000797"
Contains_gene    "WBGene00001090"
Contains_gene    "WBGene00001214"
Contains_gene    "WBGene00001332"
Contains_gene    "WBGene00001816"
Contains_gene    "WBGene00002174"
Contains_gene    "WBGene00003014"
Contains_gene    "WBGene00003026"
Contains_gene    "WBGene00003172"
Contains_gene    "WBGene00003392"
Contains_gene    "WBGene00003499"
Contains_gene    "WBGene00003504"
Contains_gene    "WBGene00003507"
Contains_gene    "WBGene00003598"
Contains_gene    "WBGene00004050"
Contains_gene    "WBGene00004093"
Contains_gene    "WBGene00004094"
Contains_gene    "WBGene00004178"
Contains_gene    "WBGene00004179"
Contains_gene    "WBGene00004323"
Contains_gene    "WBGene00004369"
Contains_gene    "WBGene00004508"
Contains_gene    "WBGene00004509"
Contains_gene    "WBGene00004510"
Contains_gene    "WBGene00004795"
Contains_gene    "WBGene00004894"
Contains_gene    "WBGene00006407"
Contains_gene    "WBGene00006449"
Contains_gene    "WBGene00006477"
Contains_gene    "WBGene00006937"
Contains_gene    "WBGene00006964"
Contains_gene    "WBGene00007106"
Contains_gene    "WBGene00007297"
Contains_gene    "WBGene00007329"
Contains_gene    "WBGene00007386"
Contains_gene    "WBGene00007428"
Contains_gene    "WBGene00007578"
Contains_gene    "WBGene00007624"
Contains_gene    "WBGene00008099"
Contains_gene    "WBGene00008400"
Contains_gene    "WBGene00009163"
Contains_gene    "WBGene00009191"
Contains_gene    "WBGene00009285"
Contains_gene    "WBGene00009305"
Contains_gene    "WBGene00009940"
Contains_gene    "WBGene00010138"
Contains_gene    "WBGene00010263"
Contains_gene    "WBGene00010480"
Contains_gene    "WBGene00011061"
Contains_gene    "WBGene00011109"
Contains_gene    "WBGene00011908"
Contains_gene    "WBGene00011910"
Contains_gene    "WBGene00011945"
Contains_gene    "WBGene00011967"
Contains_gene    "WBGene00012127"
Contains_gene    "WBGene00012319"
Contains_gene    "WBGene00012730"
Contains_gene    "WBGene00013256"
Contains_gene    "WBGene00013326"
Contains_gene    "WBGene00013606"
Contains_gene    "WBGene00013942"
Contains_gene    "WBGene00014220"
Contains_gene    "WBGene00015143"
Contains_gene    "WBGene00015583"
Contains_gene    "WBGene00016566"
Contains_gene    "WBGene00016960"
Contains_gene    "WBGene00017245"
Contains_gene    "WBGene00017641"
Contains_gene    "WBGene00018156"
Contains_gene    "WBGene00018757"
Contains_gene    "WBGene00018862"
Contains_gene    "WBGene00018921"
Contains_gene    "WBGene00019001"
Contains_gene    "WBGene00019082"
Contains_gene    "WBGene00019087"
Contains_gene    "WBGene00019237"
Contains_gene    "WBGene00019403"
Contains_gene    "WBGene00019481"
Contains_gene    "WBGene00019628"
Contains_gene    "WBGene00019666"
Contains_gene    "WBGene00019752"
Contains_gene    "WBGene00019862"
Contains_gene    "WBGene00019971"
Contains_gene    "WBGene00020172"
Contains_gene    "WBGene00020391"
Contains_gene    "WBGene00020401"
Contains_gene    "WBGene00020707"
Contains_gene    "WBGene00021391"
Contains_gene    "WBGene00021711"
Contains_gene    "WBGene00022252"
Contains_gene    "WBGene00022371"
Contains_gene    "WBGene00022587"
Contains_gene    "WBGene00022877"
Contains_gene    "WBGene00185078"
Reference "WBPaper00041947"

Analysis : "Tabach et al., 2013 Correlated conservation"
Based_on_WB_Release "220"
Description "The phylogenetic profiles of approximately 20,000 C. elegans proteins were determined in 85 genomes, representing diverse taxa of the eukaryotic tree of life: 33 animals, 6 land plants, 1 alga, 31 Ascomycota fungi, 3 Basidiomycota fungi and 12 protists. A Bayesian approach was used to integrate the phylogenetic profile analysis with predictions from diverse transcriptional coregulation and proteome interaction data sets to assign a probability for each protein for a role in a small RNA pathway. A non-binary method of phylogenetic profiling was developed and used to cluster all protein sequences encoded by C. elegans genes. BLAST scores were normalized to the length of the query sequence and for relative phylogenetic distance between C. elegans and the queried organism. The matrix of 864,644 conservation scores for the 10,054 C. elegans proteins in the 86 genomes was queried either with a single protein to generate a ranking of other C. elegans proteins with the most similar pattern of conservation values or using a more global hierarchical clustering method. Correlation coefficients were calculated using the normalized phylogenetic profile matrix (NPP) and genes were rank ordered."
Species_in_analysis    "Caenorhabditis elegans"
Species_in_analysis    "Caenorhabditis briggsae"
Species_in_analysis    "Caenorhabditis remanei"
Species_in_analysis    "Caenorhabditis japonica"
Species_in_analysis    "Caenorhabditis brenneri"
Species_in_analysis    "Pristionchus pacificus"
Species_in_analysis    "Aedes aegypti"
Species_in_analysis    "Anopheles gambiae"
Species_in_analysis    "Drosophila melanogaster"
Species_in_analysis    "Pediculus humanus"
Species_in_analysis    "Ixodes scapularis"
Species_in_analysis    "Danio rerio"
Species_in_analysis    "Oryzias latipes"
Species_in_analysis    "Gasterosteus aculeatus"
Species_in_analysis    "Takifugu rubripes"
Species_in_analysis    "Tetraodon nigroviridis"
Species_in_analysis    "Xenopus tropicalis"
Species_in_analysis    "Gallus gallus"
Species_in_analysis    "Taeniopygia guttata"
Species_in_analysis    "Anolis carolinensis"
Species_in_analysis    "Ornithorhynchus anatinus"
Species_in_analysis    "Dasypus novemcinctus"
Species_in_analysis    "Homo sapiens"
Species_in_analysis    "Pan troglodytes"
Species_in_analysis    "Dipodomys ordii"
Species_in_analysis    "Mus musculus"
Species_in_analysis    "Rattus norvegicus"
Species_in_analysis    "Canis familiaris"
Species_in_analysis    "Felis catus"
Species_in_analysis    "Tursiops truncatus"
Species_in_analysis    "Pteropus vampyrus"
Species_in_analysis    "Ciona savignyi"
Species_in_analysis    "Ciona intestinalis"
Species_in_analysis    "Malassezia globosa"
Species_in_analysis    "Coprinopsis cinerea"
Species_in_analysis    "Filobasidiella neoformans"
Species_in_analysis    "Phaeosphaeria nodorum"
Species_in_analysis    "Botryotinia fuckeliana"
Species_in_analysis    "Sclerotinia sclerotiorum"
Species_in_analysis    "Gibberella zeae"
Species_in_analysis    "Chaetomium globosum"
Species_in_analysis    "Podospora anserina"
Species_in_analysis    "Neurospora crassa"
Species_in_analysis    "Neosartorya fischeri"
Species_in_analysis    "Penicillium chrysogenum"
Species_in_analysis    "Aspergillus terreus"
Species_in_analysis    "Aspergillus oryzae"
Species_in_analysis    "Aspergillus niger"
Species_in_analysis    "Aspergillus flavus"
Species_in_analysis    "Aspergillus clavatus"
Species_in_analysis    "Paracoccidioides brasiliensis"
Species_in_analysis    "Arthroderma otae"
Species_in_analysis    "Uncinocarpus reesii"
Species_in_analysis    "Ajellomyces dermatitidis"
Species_in_analysis    "Clavispora lusitaniae"
Species_in_analysis    "Candida albicans"
Species_in_analysis    "Yarrowia lipolytica"
Species_in_analysis    "Lachancea thermotolerans"
Species_in_analysis    "Kluyveromyces lactis"
Species_in_analysis    "Candida glabrata"
Species_in_analysis    "Zygosaccharomyces rouxii"
Species_in_analysis    "Saccharomyces cerevisiae"
Species_in_analysis    "Lodderomyces elongisporus"
Species_in_analysis    "Debaryomyces hansenii"
Species_in_analysis    "Scheffersomyces stipitis"
Species_in_analysis    "Schizosaccharomyces japonicus"
Species_in_analysis    "Schizosaccharomyces pombe"
Species_in_analysis    "Chlamydomonas reinhardtii"
Species_in_analysis    "Sorghum bicolor"
Species_in_analysis    "Brachypodium distachyon"
Species_in_analysis    "Oryza sativa"
Species_in_analysis    "Vitis vinifera"
Species_in_analysis    "Arabidopsis thaliana"
Species_in_analysis    "Populus trichocarpa"
Species_in_analysis    "Thalassiosira pseudonana"
Species_in_analysis    "Phaeodactylum tricornutum"
Species_in_analysis    "Dictyostelium discoideum"
Species_in_analysis    "Theileria annulata"
Species_in_analysis    "Babesia bovis"
Species_in_analysis    "Plasmodium vivax"
Species_in_analysis    "Plasmodium falciparum"
Species_in_analysis    "Cryptosporidium parvum"
Species_in_analysis    "Entamoeba histolytica"
Species_in_analysis    "Trypanosoma brucei"
Species_in_analysis    "Leishmania major"
Species_in_analysis    "Giardia intestinalis"
Reference "WBPaper00041947"
</pre>
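
The two sample clusters share many genes, and a quick way to see the overlap is to parse the .ace text and intersect the Contains_gene lists. The following is a minimal Python sketch under stated assumptions: `parse_ace` and `shared_genes` are hypothetical helper names, the parser only handles simple `Tag "value"` lines, and the embedded text is a three-gene excerpt of each cluster from the sample, not the full lists.

```python
from collections import defaultdict

# Short excerpt of the two sample clusters; the full gene lists are in the sample.
ACE_EXCERPT = '''
Gene_cluster : "Correlated conservation with RDE-1"
Contains_gene    "WBGene00000105"
Contains_gene    "WBGene00000106"
Contains_gene    "WBGene00002987"

Gene_cluster : "Correlated conservation with miRNA and siRNA pathway proteins"
Contains_gene    "WBGene00000105"
Contains_gene    "WBGene00000106"
Contains_gene    "WBGene00000472"
'''


def parse_ace(text):
    """Parse .ace text into {(class, name): {tag: [values]}}.

    Assumes each object starts with a 'Class : "name"' header and each
    following line is 'Tag "value"'; a blank line or new header ends it.
    """
    objects = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None
            continue
        head, sep, rest = line.partition(" : ")
        if sep and " " not in head and rest.startswith('"'):
            current = defaultdict(list)
            objects[(head, rest.strip('"'))] = current
            continue
        if current is None:
            continue  # stray line outside any object
        tag, _, value = line.partition(" ")
        current[tag].append(value.strip().strip('"'))
    return objects


def shared_genes(objects, name_a, name_b):
    """Return the sorted gene IDs present in both named Gene_cluster objects."""
    genes_a = set(objects[("Gene_cluster", name_a)]["Contains_gene"])
    genes_b = set(objects[("Gene_cluster", name_b)]["Contains_gene"])
    return sorted(genes_a & genes_b)


objects = parse_ace(ACE_EXCERPT)
print(shared_genes(
    objects,
    "Correlated conservation with RDE-1",
    "Correlated conservation with miRNA and siRNA pathway proteins",
))
```

Because the header test does not rely on blank lines, the same parser also copes with objects that follow one another directly.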

Unused tags

Title Description