Source and maintenance of non-WBGene info
Revision as of 22:07, 20 February 2015
Geneace dump from Hinxton
Information for gene, variation, clone, strain, rearrangement, and laboratory is provided in a nightly JSON dump from Hinxton. Gene information is discussed on a separate page. This page outlines the processing of all non-gene information supplied through the dump.
nightly_geneace.pl
/home/postgres/work/pgpopulation/obo_oa_ontologies/geneace/nightly_geneace.pl
For variations:
- populates the obo_name_<datatype> and obo_data_<datatype> tables, where <datatype> is variation, clone, strain, or rearrangement
- adds any WBVarID that is not in the nightly geneace dump but is on obo_tempfile at /home/azurebrd/public_html/cgi-bin/data/obo_tempfile
- compares the WBVar to Public_name mapping in both files, by both public_name and WBVar, and emails the curator (Karen) if the mappings differ. The curator needs to edit obo_tempfile to resolve the differences; otherwise the email will continue to be sent.
- NOTE: objects added to obo_tempfile are immediately available in the variation dropdown and remain on obo_tempfile until the object shows up in the nightly geneace dump.
For clones:
- the script extracts only clones of Type plasmid from ftp://ftp.sanger.ac.uk/pub/consortia/wormbase/STAFF/mh6/nightly_geneace/clones2.ace.gz
- this needs to change to take all clones: construct curation will require annotating with cDNAs, fosmids, cosmids, etc.
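The variation steps above (merge obo_tempfile entries, then cross-check the mappings) can be sketched as follows. This is a minimal illustration, not the code in nightly_geneace.pl: the function name `reconcile`, the in-memory dict form of both files, the way conflicts are handled, and the example IDs and names are all assumptions made for the example.

```python
def reconcile(dump, tempfile):
    """Merge tempfile entries into the dump mapping and report mismatches.

    dump, tempfile: dicts mapping WBVarID -> public_name (hypothetical
    in-memory form of the nightly dump and of obo_tempfile).
    Returns (merged, conflicts).
    """
    merged = dict(dump)
    conflicts = []
    dump_by_name = {name: var_id for var_id, name in dump.items()}

    for var_id, name in tempfile.items():
        if var_id in dump:
            # Same ID in both files: the public names must agree.
            if dump[var_id] != name:
                conflicts.append((var_id, name, "dump has name " + dump[var_id]))
        elif name in dump_by_name:
            # Same public name in both files, but under a different ID.
            conflicts.append((var_id, name, "dump has ID " + dump_by_name[name]))
        else:
            # Not in the dump yet: keep the temporary object available.
            merged[var_id] = name
    return merged, conflicts


# Made-up IDs and names, for illustration only.
dump = {"WBVar00000001": "xx1", "WBVar00000002": "xx2"}
temp = {"WBVar00000003": "xx3", "WBVar00000002": "xx2b", "WBVar00000004": "xx1"}
merged, conflicts = reconcile(dump, temp)
for var_id, name, reason in conflicts:
    print(var_id, name, reason)
```

In this sketch a conflicting entry is reported and not merged, so the dump value wins; what the real script does with a conflicting entry beyond sending the email is not stated on this page.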
Variations
For each variation with one of the accepted Methods (listed in the table below), the following information is retrieved:
- WBVar ID
- public_name
- gene association
- references
- method
- status
If the variation does not have an attached method, it is not retrieved. These data will populate obo_name_variation and obo_data_variation.
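The Method filter described above can be sketched like this. The accepted Method names come from the table at the end of this page; the record layout (a list of dicts with keys such as `id` and `method`) is a hypothetical stand-in for the JSON dump, whose actual structure is not documented here, and `select_variations` is an illustrative name.

```python
# Method names as listed in the table at the end of this page.
ACCEPTED_METHODS = {
    "Allele", "Deletion_allele", "Deletion_and_insertion_allele",
    "Deletion_polymorphism", "Insertion_allele", "Insertion_polymorphism",
    "KO_consortium_allele", "Mos_insertion", "NBP_knockout_allele",
    "NemaGENETAG_consortium_allele", "Substitution_allele",
    "Transposon_insertion",
}

# Hypothetical keys for the six retrieved fields.
FIELDS = ("id", "public_name", "gene", "references", "method", "status")


def select_variations(records):
    """Keep only records with an accepted Method, reduced to the six fields."""
    kept = []
    for rec in records:
        # A missing method yields None, which is never in the accepted set.
        if rec.get("method") not in ACCEPTED_METHODS:
            continue
        kept.append({field: rec.get(field) for field in FIELDS})
    return kept


# Made-up records, for illustration only.
records = [
    {"id": "WBVar00000001", "public_name": "xx1", "method": "Allele"},
    {"id": "WBVar00000002", "public_name": "xx2"},
    {"id": "WBVar00000003", "public_name": "xx3", "method": "Other_method"},
]
selected = select_variations(records)
print([rec["id"] for rec in selected])
```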
During curation, if a variation does not exist in the geneace dump, and hence not in the obo_name_variation and obo_data_variation tables, it can be temporarily added to obo_name_variation through the TempVariationObo form (http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=TempVariationObo). If mistakes were made in entering objects, edit the flatfile at /home/azurebrd/public_html/cgi-bin/data/obo_tempfile directly. NOTE: you should also be able to enter lists of temporary variations directly in the flatfile; however, those objects will not be visible until a cron job is triggered (probably tied to the geneace update). Adding objects through generic.cgi adds them to both the flatfile AND the obo tables, so those objects are visible right away.
Curators first need to retrieve a WBVarID from the variation nameserver in Hinxton (http://www.sanger.ac.uk/sanger/Worm_NameServer). Once a WBVarID is received, curators enter the public name and WBVarID, separated by a space OR tab, into the TempVariationObo form; the form accepts multiple lines as long as each line has the form "allele" "WBVarID". The information is added immediately to obo_name_variation and is available through the OA (a reload may be necessary).
If the allele already has a WBVarID but does not exist in the nightly geneace dump, curators should still enter the object through generic.cgi. When the object gets into geneace, its information will be captured and overwritten in obo_data_variation.
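The input format described above ("allele" "WBVarID", one pair per line, separated by a space or tab) can be checked before pasting into the form with a sketch like the one below. It does not reproduce the form's own validation, which is not documented here; `parse_temp_variations` is an illustrative name, and the check that IDs start with "WBVar" is an assumption.

```python
def parse_temp_variations(text):
    """Split pasted lines into (public_name, WBVarID) pairs.

    Returns (pairs, errors); errors holds (line_number, line) for every
    non-blank line that is not exactly two whitespace-separated fields
    with the second one starting with "WBVar" (assumed ID prefix).
    """
    pairs, errors = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()  # splits on spaces and tabs alike
        if len(fields) != 2 or not fields[1].startswith("WBVar"):
            errors.append((lineno, line))
            continue
        pairs.append((fields[0], fields[1]))
    return pairs, errors


# Made-up names and IDs, for illustration only.
pasted = "xx1 WBVar00000001\nxx2\tWBVar00000002\n\nWBVar00000003 xx3\n"
pairs, errors = parse_temp_variations(pasted)
print(pairs)
print(errors)
```

The last line of the example has the two columns swapped, so it lands in `errors` and is not treated as a pair.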
Clone
- clone
- type
- strain
- general_remark
- location
- accession_number
Strain
- strain
- genotype
- location
Laboratory
- laboratory
- representative
- registered_lab_members
- allele_designation
- strain_designation
- mail*
Rearrangement
- rearrangement
- gene_inside
- gene_outside
- map
Non-WBGene objects retrieved through geneace
AceDB tag | Postgres table | Current - Nameserver nightly dump | Current - WS bimonthly release | Future - Geneace nightly dump | Future - WS bimonthly release | Use - Paper or meeting abstract gene connection | Use - OA data type curation | Use - OA term info | Use - Dumping scripts | Use - Text mining/SVM | Use - Updating GSA Lexicon | Comment |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Variation | obo_name_variation, obo_data_variation | yes | yes | yes | no | no | yes | yes | yes | no | no | WBVariationID |
Variation public_name | obo_name_variation, obo_data_variation | no | yes | yes | no | no | yes | yes | no | For Mary Ann's Variation first pass/SVM | For Variation lexicon | In multiple OAs |
Variation - Gene | obo_data_variation | no | yes | yes | no | no | no | yes; display WBGeneID and gin_locus | no | no | no | |
Variation - Reference | obo_data_variation | no | yes | yes | no | no | no | yes | no | yes? for MA's scripts?? | no | |
Variation - Method | obo_data_variation | no | no; used to query for Variation type Allele and Transposon | yes | no | no | no | yes | no | no | no | Only take in data from Variation objects with these Methods: "Allele" "Deletion_allele" "Deletion_and_insertion_allele" "Deletion_polymorphism" "Insertion_allele" "Insertion_polymorphism" "KO_consortium_allele" "Mos_insertion" "NBP_knockout_allele" "NemaGENETAG_consortium_allele" "Substitution_allele" "Transposon_insertion" |
Status | obo_data_variation | yes | yes | yes | no | no | no | yes | no | no | no | |
Rearrangement | obo_name_rearrangement, obo_data_rearrangement | no | yes | yes | no | no | yes | yes | no | no | yes | |
Rearrangement - map | obo_data_rearrangement | no | yes | yes | no | no | no | yes | no | no | no | |
gene_inside | obo_data_rearrangement | no | yes | yes | no | no | no | yes; display gin_locus (do not need WBGeneID) | no | no | no | |
gene_outside | obo_data_rearrangement | no | yes | yes | no | no | no | yes; display gin_locus (do not need WBGeneID) | no | no | no | |
Strain | obo_name_strain, obo_data_strain | no | yes | yes | no | no | yes | yes | no | no | yes | |
Strain - genotype | obo_data_strain | no | yes | yes | no | no | no | yes | no | no | no | |
Strain - location | obo_data_strain | no | yes | yes | no | no | no | yes | no | no | no | |
Clone | obo_name_clone, obo_data_clone | no | yes | yes | no | no | yes (expr_pattern) | yes | no? | no | yes | |
Clone - Type | Not sure you need a table for this. All clones that populate the clone tables will be of one type = PLASMID | no | yes | yes | no | no | no | yes | no | no | no | |
Clone - Transgene | obo_data_clone | no | yes | yes | no | no | no | yes | no | no | no | I don't think there is any data in this tag in the ftp cloness.ace |
Clone - strain | obo_data_clone | no | yes | yes | no | no | no | yes | no | no | no | |
Clone - general_remark | obo_data_clone | no | yes | yes | no | no | no | yes | no | no | no | |
Clone - location | obo_data_clone | no | yes | yes | no | no | no | yes | no | no | no | |
Clone - accession_number | obo_data_clone | no | yes | yes | no | no | no | yes | no | no | no | |
Laboratory | obo_name_laboratory, obo_data_laboratory | no | yes | yes | no | no | yes | yes | no | no | no | |
Laboratory - Representative | obo_name_laboratory, obo_data_laboratory | no | yes | yes | no | no | yes | yes | no | no | no | |
Laboratory - Registered_lab_members | obo_data_laboratory - actually I don't know if this needs to be displayed in the term info | no | yes | yes | no | no | no | yes | no | no | no | |
Laboratory - allele_designation | obo_data_laboratory | no | yes | yes | no | no | no | yes | no | yes; MA's script | yes? use for text markup regex? | |
Laboratory - strain_designation | obo_data_laboratory | no | yes | yes | no | no | no | yes | no | no | no | |
Laboratory - Mail | obo_data_laboratory | no | yes | yes | no | no | no | yes | no | no | no | |