Difference between revisions of "WormBase Model:Sequence"

From WormBaseWiki
Latest revision as of 09:29, 5 October 2010

WormBase Models

Curator Comments/Description

This is a very large class, and many of its tags are deprecated or no longer used.



Model

///////////////////////////////////////////////////////////////////////
//
// ?Sequence class
//
// Stores: genomic DNA objects (chromosomes, superlinks, and clones)
//         cDNAs (ESTs, OSTs, mRNAs)
//         sequences from other species (e.g. targets of BLAT_NEMATODE)
//
///////////////////////////////////////////////////////////////////////

?Sequence DNA UNIQUE ?DNA UNIQUE Int            // Int is the length
                // To register a length without a DNA sequence, use a dummy sequence object, e.g. "-".
                // This ensures that when a real sequence appears, its length takes precedence.
          SMap S_parent UNIQUE Canonical_parent UNIQUE ?Sequence XREF Genomic_non_canonical 
                               Genomic_parent   UNIQUE ?Sequence XREF Nongenomic
                               AGP_parent       UNIQUE ?Sequence XREF AGP_fragment // added to hold briggsae data [krb 020726]
               S_child Gene_child ?Gene       XREF Sequence UNIQUE Int UNIQUE Int #SMap_info  // maximal extent of Gene objects
                       CDS_child  ?CDS        XREF Sequence UNIQUE Int UNIQUE Int #SMap_info  // For ?CDS class [031104 krb]
                       Transcript ?Transcript XREF Sequence UNIQUE Int UNIQUE Int #SMap_info  // for ?Transcript class [021126 krb]
                       Pseudogene ?Pseudogene XREF Sequence UNIQUE Int UNIQUE Int #SMap_info  // For ?Pseudogene class [030801 krb]
                       Transposon ?Transposon XREF Sequence UNIQUE Int UNIQUE Int #SMap_info  // for transposons   [020128 dl]
                       Genomic_non_canonical ?Sequence XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info
                       Nongenomic ?Sequence XREF Genomic_parent UNIQUE Int UNIQUE Int #SMap_info       // for Lincoln RNAi  [010226 dl]
                       PCR_product ?PCR_product XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info // for Lincoln RNAi  [010226 dl]
                       Operon ?Operon XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info           // for Operon data   [dl]
                       AGP_fragment ?Sequence XREF AGP_parent Int UNIQUE Int #SMap_info                // for Briggsae data [020726 krb]
                       Allele ?Variation XREF Sequence UNIQUE Int UNIQUE Int #SMap_info                // SMapped Allele class [021217 krb]
                       Oligo_set ?Oligo_set XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info     // added for Oligo_set class
                       Feature_object ?Feature XREF Sequence UNIQUE Int UNIQUE Int #SMap_info          // dl feature play
                       Feature_data ?Feature_data XREF Sequence UNIQUE Int UNIQUE Int #SMap_info
                       Homol_data ?Homol_data XREF Sequence UNIQUE Int UNIQUE Int #SMap_info
          Structure  From Source UNIQUE ?Sequence XREF Subsequence
                          Source_exons Int UNIQUE Int // start at 1 
                          // WARNING: is this still needed when ?CDS class is present???
                     Subsequence ?Sequence XREF Source UNIQUE Int UNIQUE Int
                     Overlap_right UNIQUE ?Sequence XREF Overlap_left UNIQUE Int // potentially use Overlap_right integer for auto-linking
                     Overlap_left UNIQUE ?Sequence XREF Overlap_right
                     Gap_right UNIQUE Int Text            // 000909 dl added to track gap sizes
                     Clone_left_end ?Clone XREF Clone_left_end UNIQUE Int
                     Clone_right_end ?Clone XREF Clone_right_end UNIQUE Int
                     Flipped
          DB_info    Database ?Database ?Database_field UNIQUE ?Accession_number XREF Sequence
                     Protein_id ?Sequence UNIQUE Text UNIQUE Int // DB_info tag added [011030 krb]
                     Secondary_accession ?Accession_number XREF Sequence
                     DB_remark ?Text    #Evidence       // EMBL/Genbank
                     Keyword ?Keyword   // EMBL/Genbank
                     DB_annotation ?Database UNIQUE ?LongText
                     EMBL_dump_info #EMBL_dump_info
          Origin  From_database UNIQUE ?Database UNIQUE Int     // release number
                  From_author ?Author XREF Sequence
                  From_laboratory UNIQUE ?Laboratory
                  Genetic_code UNIQUE ?Genetic_code // krb 030506
                  Date DateType Text                    // Text for comments on operation
                  Date_directory UNIQUE Text            // date of this version for cosmids, changed sdm 000731 to text
                  Life_stage UNIQUE ?Life_stage // to capture details of ESTs that are from different libraries
                  Species UNIQUE ?Species
                  Library UNIQUE ?Library
                  Strain UNIQUE ?Strain
                  Anatomy_term ?Anatomy_term   // life-stage and tissues that the sequence comes from
                  Analysis UNIQUE ?Analysis    // analysis or project that produced this set of data 
                  Read_coverage UNIQUE Float   // average read-coverage in a short-read cluster sequence
          Visible Title UNIQUE ?Text
                  Matching_CDS ?CDS XREF Matching_cDNA #Evidence               // to link ESTs/mRNAs to CDS 
                  Matching_transcript ?Transcript XREF Matching_cDNA #Evidence // to link ESTs/mRNAs to RNA genes
                  Matching_pseudogene ?Pseudogene XREF Matching_cDNA #Evidence // to link ESTs/mRNAs to Pseudogenes
                  Clone ?Clone XREF Sequence 
                  GO_term ?GO_term XREF Sequence ?GO_code #Evidence
                  Gene ?Gene XREF Other_sequence  // for where mRNAs etc. correspond to a gene
                  Paired_read ?Sequence XREF Paired_read          // dl 020110
                  Reference ?Paper XREF Sequence
                  Expr_pattern ?Expr_pattern XREF Sequence
                  RNAi ?RNAi XREF Sequence
                  Confidential_remark ?Text               
                  Remark ?Text #Evidence
                // tag2 system: names of all objects following next tag are shown in the 
                //   general annotation display column as "tag:objname"
          Properties    Genomic_canonical Gene_count UNIQUE Int                  // added tag for dot name tracking [020128 dl]
                        Briggsae_canonical
                        Genomic                                                  // added tag to define genomic sequences not from the consortium
                        cDNA cDNA_EST
                             EST_5                     // Indicate whether this is a 5' or 3' read [010423 dl]
                             EST_3
                             Capped_5                  // Indicates capped 5' end - for Lincoln [06/12/01 krb]
                             TSL_tag                   // Indicates a short RT-PCR product for TSL detection [030220 dl]
                        EST_consensus                  // Designates this object as a consensus
                        // if RNA tag, acedb outputs U in place of T in sequence output
                        RNA UNIQUE mRNA UNIQUE Processed_mRNA       // 
                                               Unprocessed_mRNA
                                   tRNA Type UNIQUE Text            // ck1 [030926] krb
                                        Anticodon UNIQUE Text       // 
                                   rRNA UNIQUE Text
                                   snRNA UNIQUE Text
                                   snoRNA UNIQUE Text               // [030102 krb]
                                   scRNA UNIQUE Text
                                   miRNA UNIQUE Text                // [020306 kj]
                                   ncRNA UNIQUE Text                // true non-coding RNA molecules
                        Ignore #Evidence                            // tag to flag problem Sequence objects to avoid certain analysis [031120 krb]
                        Show_in_reverse_orientation                 // Draw 3' reads in reverse orientation [010423 dl]
                        Status  Received UNIQUE DateType
                                Library_construction UNIQUE DateType
                                Shotgun UNIQUE DateType
                                Shotgun_complete UNIQUE DateType
                                Contiguous UNIQUE DateType
                                Finished UNIQUE DateType
                                Submitted UNIQUE DateType
                                Annotated UNIQUE DateType
                                Archived UNIQUE DateType UNIQUE Text // Date Disk
                        Match_type  UNIQUE Match_with_function
                                           Match_without_function
                                // These are designed specifically for measuring 
                                // statistics.  What you match should be listed in 
                                // Brief_id, Remark etc.  The aim now is to use Brief_id
                                // exactly for what you would like a half-line summary to
                                // contain, for making tables etc.
                        Link // Enable gene curation of link genes [020805 krb]
          Splices       Confirmed_intron  Int Int #Splice_confirmation
                        Predicted_5 ?Method Int Int UNIQUE Float // (x, x+1) or (x, x-1)
                        Predicted_3 ?Method Int Int UNIQUE Float // (x, x+1) or (x, x-1)
          Cluster_information Contains_reads       ?Sequence XREF Contained_in_cluster  // Links cluster contig and
                              Contained_in_cluster ?Sequence XREF Contains_reads        // individual reads
          Map ?Map XREF Sequence #Map_position          // use in particular for Genomic_canonical
          Interpolated_map_position UNIQUE ?Map UNIQUE Float // For updated CDS-based interpolated map positions [krb 030502]
          Oligo ?Oligo XREF In_sequence Int UNIQUE Int  // for OSP and human mapping mostly
          Defines_feature ?Feature XREF Defined_by_sequence #Evidence       // Feature data model [dl 030304]
          Assembly_tags Text Int Int Text                                   // type, start, stop, comment
          Gene_regulation Cis_regulator    ?Gene_regulation XREF Cis_regulator_seq // Wen
          YH_bait   ?YH XREF Sequence_bait ?Text // for yeast two-hybrid data
          YH_target ?YH XREF Sequence_target ?Text
          Homol DNA_homol ?Sequence XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
                Pep_homol ?Protein XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
                Motif_homol ?Motif XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
                Homol_homol ?Homol_data XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
                        // We will generate a column for each distinct ?Method.  So for
                        // distinct Worm_EST and Worm_genomic columns, use ?Method objects
                        // Worm_EST_Blastn and Worm_genomic_Blastn.
          Method UNIQUE ?Method
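
The comment on the Homol tags says that one display column is generated for each distinct ?Method. As a rough illustration of that grouping, here is a minimal Python sketch. It assumes the hits have already been read into plain tuples; the tuple layout, the function name columns_by_method, and the target names are hypothetical and not part of ACeDB or WormBase code. Only the two method names come from the model comment.

```python
from collections import defaultdict


def columns_by_method(hits):
    """Group (target, method, score, start, end) hits into one list per method."""
    columns = defaultdict(list)
    for target, method, score, start, end in hits:
        columns[method].append((target, score, start, end))
    return dict(columns)


# Made-up hits for illustration; only the method names appear in the model comment.
hits = [
    ("est_A", "Worm_EST_Blastn", 98.0, 100, 350),
    ("clone_B", "Worm_genomic_Blastn", 87.5, 120, 900),
    ("est_C", "Worm_EST_Blastn", 91.2, 400, 650),
]
columns = columns_by_method(hits)
```

Each key of the returned dictionary stands for one column, which is why separate EST and genomic columns need separate ?Method objects rather than a single shared one.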

Proposed Changes

Unused tags

AGP_parent

AGP_fragment

Source_exons

Flipped

Library - will probably be used when creating the ?Library objects for the original C_elegans sequencing.

Strain

Anatomy_term

Read_coverage

GO_term

Confidential_remark

Gene_count

Briggsae_canonical

EST_consensus

Processed_mRNA

Unprocessed_mRNA

tRNA Type

tRNA Anticodon

snoRNA

scRNA

Received

Library_construction

Shotgun_complete

Archived

Match_type

Match_with_function

Match_without_function

Predicted_5

Predicted_3

Cluster_information

Contains_reads

Contained_in_cluster

Gene_regulation

Cis_regulator

YH_bait

DNA_homol

Pep_homol
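
Before any of the tags listed above are removed, it may help to confirm that each name still matches a tag in the model text. The following is a minimal Python sketch of such a check, not an existing WormBase tool. The tokenising rules are a simplifying assumption (class references start with ?, hash references with #, and the token after XREF names a tag in another class), and the function names model_tags and missing_from_model are hypothetical.

```python
# Tokens that are model syntax rather than tag names (assumed list).
KEYWORDS = {"UNIQUE", "XREF", "Int", "Float", "Text", "DateType"}


def model_tags(model_text):
    """Collect candidate tag names from ACeDB-style model text."""
    tags = set()
    for line in model_text.splitlines():
        code = line.split("//", 1)[0]  # drop trailing comments
        skip_next = False
        for tok in code.split():
            if skip_next:
                skip_next = False  # token after XREF is a tag in another class
                continue
            if tok == "XREF":
                skip_next = True
                continue
            if tok in KEYWORDS or tok[0] in "?#":
                continue
            tags.add(tok)
    return tags


def missing_from_model(unused, tags):
    """Return the listed names that do not occur as tags in the model."""
    return [name for name in unused if name not in tags]


# Small excerpt of the model above, used only to exercise the sketch.
MODEL_EXCERPT = """
?Sequence DNA UNIQUE ?DNA UNIQUE Int            // Int is the length
          Structure  From Source UNIQUE ?Sequence XREF Subsequence
                          Source_exons Int UNIQUE Int // start at 1
                     Flipped
          Origin  Library UNIQUE ?Library
                  Strain UNIQUE ?Strain
"""

tags = model_tags(MODEL_EXCERPT)
missing = missing_from_model(
    ["Source_exons", "Flipped", "Library", "Strain", "Pep_homol"], tags
)
```

With the short excerpt, Pep_homol is reported as missing only because the Homol lines were left out; run against the full model text, every name in the list should be found. Two-word entries such as "tRNA Type" would need to be split into their individual tag names first.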