Difference between revisions of "WormBase Model:Sequence"
From WormBaseWiki
== Curator Comments/Description ==
This is a huge class with many deprecated elements.
== Model ==
<pre>
///////////////////////////////////////////////////////////////////////
//
// ?Sequence class
//
// Stores: genomic DNA objects (chromosomes, superlinks, and clones)
//         cDNAs (ESTs, OSTs, mRNAs)
//         sequences from other species (e.g. targets of BLAT_NEMATODE)
//
///////////////////////////////////////////////////////////////////////

?Sequence DNA UNIQUE ?DNA UNIQUE Int // Int is the length
// if you want to register a length without a DNA sequence, then use a dummy sequence object, e.g. "-"
// This ensures that when a real sequence appears, its length dominates.
SMap S_parent UNIQUE Canonical_parent UNIQUE ?Sequence XREF Genomic_non_canonical
Genomic_parent UNIQUE ?Sequence XREF Nongenomic
AGP_parent UNIQUE ?Sequence XREF AGP_fragment // added to hold briggsae data [krb 020726]
S_child Gene_child ?Gene XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // maximal extent of Gene objects
CDS_child ?CDS XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // For ?CDS class [031104 krb]
Transcript ?Transcript XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // for ?Transcript class [021126 krb]
Pseudogene ?Pseudogene XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // For ?Pseudogene class [030801 krb]
Transposon ?Transposon XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // for transposons [020128 dl]
Genomic_non_canonical ?Sequence XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info
Nongenomic ?Sequence XREF Genomic_parent UNIQUE Int UNIQUE Int #SMap_info // for Lincoln RNAi [010226 dl]
PCR_product ?PCR_product XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info // for Lincoln RNAi [010226 dl]
Operon ?Operon XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info // for Operon data [dl]
AGP_fragment ?Sequence XREF AGP_parent Int UNIQUE Int #SMap_info // for Briggsae data [020726 krb]
Allele ?Variation XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // SMapped Allele class [021217 krb]
Oligo_set ?Oligo_set XREF Canonical_parent UNIQUE Int UNIQUE Int #SMap_info // added for Oligo_set class
Feature_object ?Feature XREF Sequence UNIQUE Int UNIQUE Int #SMap_info // dl feature play
Feature_data ?Feature_data XREF Sequence UNIQUE Int UNIQUE Int #SMap_info
Homol_data ?Homol_data XREF Sequence UNIQUE Int UNIQUE Int #SMap_info
Structure From Source UNIQUE ?Sequence XREF Subsequence
Source_exons Int UNIQUE Int // start at 1
// WARNING: is this still needed when ?CDS class is present???
Subsequence ?Sequence XREF Source UNIQUE Int UNIQUE Int
Overlap_right UNIQUE ?Sequence XREF Overlap_left UNIQUE Int // potentially use Overlap_right integer for auto-linking
Overlap_left UNIQUE ?Sequence XREF Overlap_right
Gap_right UNIQUE Int Text // 000909 dl added to track gap sizes
Clone_left_end ?Clone XREF Clone_left_end UNIQUE Int
Clone_right_end ?Clone XREF Clone_right_end UNIQUE Int
Flipped
DB_info Database ?Database ?Database_field UNIQUE ?Accession_number XREF Sequence
Protein_id ?Sequence UNIQUE Text UNIQUE Int // DB_info tag added [011030 krb]
Secondary_accession ?Accession_number XREF Sequence
DB_remark ?Text #Evidence // EMBL/Genbank
Keyword ?Keyword // EMBL/Genbank
DB_annotation ?Database UNIQUE ?LongText
EMBL_dump_info #EMBL_dump_info
Origin From_database UNIQUE ?Database UNIQUE Int // release number
From_author ?Author XREF Sequence
From_laboratory UNIQUE ?Laboratory
Genetic_code UNIQUE ?Genetic_code // krb 030506
Date DateType Text // Text for comments on operation
Date_directory UNIQUE Text // date of this version for cosmids, changed sdm 000731 to text
Life_stage UNIQUE ?Life_stage // to capture details of ESTs that are from different libraries
Species UNIQUE ?Species
Library UNIQUE ?Library
Strain UNIQUE ?Strain
Anatomy_term ?Anatomy_term // life-stage and tissues that the sequence comes from
Analysis UNIQUE ?Analysis // analysis or project that produced this set of data
Read_coverage UNIQUE Float // average read-coverage in a short-read cluster sequence
Visible Title UNIQUE ?Text
Matching_CDS ?CDS XREF Matching_cDNA #Evidence // to link ESTs/mRNAs to CDS
Matching_transcript ?Transcript XREF Matching_cDNA #Evidence // to link ESTs/mRNAs to RNA genes
Matching_pseudogene ?Pseudogene XREF Matching_cDNA #Evidence // to link ESTs/mRNAs to Pseudogenes
Clone ?Clone XREF Sequence
GO_term ?GO_term XREF Sequence ?GO_code #Evidence
Gene ?Gene XREF Other_sequence // for where mRNAs etc. correspond to a gene
Paired_read ?Sequence XREF Paired_read // dl 020110
Reference ?Paper XREF Sequence
Expr_pattern ?Expr_pattern XREF Sequence
RNAi ?RNAi XREF Sequence
Confidential_remark ?Text
Remark ?Text #Evidence
// tag2 system: names of all objects following next tag are shown in the
// general annotation display column as "tag:objname"
Properties Genomic_canonical Gene_count UNIQUE Int // added tag for dot name tracking [020128 dl]
Briggsae_canonical
Genomic // added tag to define genomic sequences not from the consortium
cDNA cDNA_EST
EST_5 // Indicate whether this is a 5' or 3' read [010423 dl]
EST_3
Capped_5 // Indicates capped 5' end - for Lincoln [06/12/01 krb]
TSL_tag // Indicates a short RT-PCR product for TSL detection [030220 dl]
EST_consensus // Designates this object as a consensus
// if RNA tag, acedb outputs U in place of T in sequence output
RNA UNIQUE mRNA UNIQUE Processed_mRNA //
Unprocessed_mRNA
tRNA Type UNIQUE Text // ck1 [030926] krb
Anticodon UNIQUE Text //
rRNA UNIQUE Text
snRNA UNIQUE Text
snoRNA UNIQUE Text // [030102 krb]
scRNA UNIQUE Text
miRNA UNIQUE Text // [020306 kj]
ncRNA UNIQUE Text // true non-coding RNA molecules
Ignore #Evidence // tag to flag problem Sequence objects to avoid certain analysis [031120 krb]
Show_in_reverse_orientation // Draw 3' reads in reverse orientation [010423 dl]
Status Received UNIQUE DateType
Library_construction UNIQUE DateType
Shotgun UNIQUE DateType
Shotgun_complete UNIQUE DateType
Contiguous UNIQUE DateType
Finished UNIQUE DateType
Submitted UNIQUE DateType
Annotated UNIQUE DateType
Archived UNIQUE DateType UNIQUE Text // Date Disk
Match_type UNIQUE Match_with_function
Match_without_function
// These are designed specifically for measuring
// statistics. What you match should be listed in
// Brief_id, Remark etc. The aim now is to use Brief_id
// exactly for what you would like a half-line summary to
// contain, for making tables etc.
Link // Enable gene curation of link genes [020805 krb]
Splices Confirmed_intron Int Int #Splice_confirmation
Predicted_5 ?Method Int Int UNIQUE Float // (x, x+1) or (x, x-1)
Predicted_3 ?Method Int Int UNIQUE Float // (x, x+1) or (x, x-1)
Cluster_information Contains_reads ?Sequence XREF Contained_in_cluster // Links cluster contig and
Contained_in_cluster ?Sequence XREF Contains_reads // individual reads
Map ?Map XREF Sequence #Map_position // use in particular for Genomic_canonical
Interpolated_map_position UNIQUE ?Map UNIQUE Float // For updated CDS-based interpolated map positions [krb 030502]
Oligo ?Oligo XREF In_sequence Int UNIQUE Int // for OSP and human mapping mostly
Defines_feature ?Feature XREF Defined_by_sequence #Evidence // Feature data model [dl 030304]
Assembly_tags Text Int Int Text // type, start, stop, comment
Gene_regulation Cis_regulator ?Gene_regulation XREF Cis_regulator_seq // Wen
YH_bait ?YH XREF Sequence_bait ?Text // for yeast two hybrid data
YH_target ?YH XREF Sequence_target ?Text
Homol DNA_homol ?Sequence XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
Pep_homol ?Protein XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
Motif_homol ?Motif XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
Homol_homol ?Homol_data XREF DNA_homol ?Method Float Int UNIQUE Int Int UNIQUE Int #Homol_info
// We will generate a column for each distinct ?Method. So for
// distinct Worm_EST and Worm_genomic columns, use ?Method objects
// Worm_EST_Blastn and Worm_genomic_Blastn.
Method UNIQUE ?Method
</pre>
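As a reading aid for the model above, the following Python sketch splits a single tag line into its leading tag names, the classes it points to, the XREF back-tag, any `#` sub-model, and the trailing comment. The function name `parse_tag_line`, the `PRIMITIVES` set, and the simplified grammar are assumptions made for illustration; the sketch ignores the indentation-based nesting of a real model file and is not how ACeDB itself reads models.

```python
# Illustrative only: a simplified reader for one line of the ?Sequence model.
PRIMITIVES = {"Int", "Float", "Text", "DateType"}

def parse_tag_line(line):
    """Return the parts of one model line as a dict (hypothetical helper)."""
    # Everything after the first // is a comment.
    body, _, comment = line.partition("//")
    tokens = body.split()

    # Leading tokens are tag names until a class (?X), sub-model (#X),
    # primitive type, or the UNIQUE keyword appears.
    tags = []
    for tok in tokens:
        if tok == "UNIQUE" or tok in PRIMITIVES or tok[0] in "?#":
            break
        tags.append(tok)
    rest = tokens[len(tags):]

    # The token after XREF names the tag filled in on the target object.
    xref = None
    if "XREF" in rest:
        idx = rest.index("XREF")
        if idx + 1 < len(rest):
            xref = rest[idx + 1]

    return {
        "tags": tags,
        "classes": [t[1:] for t in rest if t.startswith("?")],
        "xref": xref,
        "submodels": [t[1:] for t in rest if t.startswith("#")],
        "comment": comment.strip(),
    }

example = parse_tag_line(
    "CDS_child ?CDS XREF Sequence UNIQUE Int UNIQUE Int #SMap_info "
    "// For ?CDS class [031104 krb]"
)
print(example)
```

For the `CDS_child` line this reports one tag, the target class `CDS`, the back-tag `Sequence`, and the `SMap_info` sub-model that carries the mapping details.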
== Proposed Changes ==

==Unused tags==

AGP_parent

AGP_fragment

Source_exons

Flipped

Library - will probably use this when creating the ?Library objects for the original C_elegans sequencing.

Strain

Anatomy_term

Read_coverage

GO_term

Confidential_remark

Gene_count

Briggsae_canonical

EST_consensus

Processed_mRNA

Unprocessed_mRNA

tRNA Type

tRNA Anticodon

snoRNA

scRNA

Received

Library_construction

Shotgun_complete

Archived

Match_type

Match_with_function

Match_without_function

Predicted_5

Predicted_3

Cluster_information

Contains_reads

Contained_in_cluster

Gene_regulation

Cis_regulator

YH_bait

DNA_homol

Pep_homol
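
One way a list of unused tags like this could be checked is to count, in an .ace dump, how many Sequence objects carry each tag. The Python sketch below assumes the usual .ace layout (a `Class : "name"` header line, one tag per following line, objects separated by blank lines). The function name `count_tag_usage` and the sample objects are made up for illustration and are not WormBase data. It matches only the first word of each line, so two-level entries such as "tRNA Type" would need extra handling.

```python
def count_tag_usage(ace_text, class_name, tags):
    """Count objects of class_name that use each tag (hypothetical helper)."""
    counts = {tag: 0 for tag in tags}
    expect_header = True   # the next non-blank line starts a new object
    in_class = False       # whether the current object belongs to class_name
    seen = set()           # tags already counted for the current object

    for raw in ace_text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current object.
            expect_header = True
            in_class = False
            continue
        if expect_header:
            in_class = line.startswith(class_name + " : ")
            seen = set()
            expect_header = False
            continue
        if in_class:
            tag = line.split()[0]
            if tag in counts and tag not in seen:
                counts[tag] += 1
                seen.add(tag)
    return counts

# Made-up sample dump, for illustration only.
SAMPLE = '''
Sequence : "SEQ_A"
Species "Caenorhabditis elegans"
Flipped

Sequence : "SEQ_B"
Method "Genomic_canonical"

Clone : "CLONE_A"
Strain "N2"
'''

print(count_tag_usage(SAMPLE, "Sequence", ["Flipped", "Strain", "AGP_parent"]))
```

A tag whose count stays at zero across a full dump would be a candidate for this list; the `Strain` line in the sample is ignored because it belongs to a non-Sequence object.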
Latest revision as of 09:29, 5 October 2010