Difference between revisions of "Updating ontology (.obo) files for the OA"


Revision as of 01:49, 25 January 2011

This SOP is for updating the object-name .obo files in the OA.


Updating local acedb to latest available WS (instructions from Wen)

Download the latest build from the Sanger website. From the command line (X11 on Mac OS), go to the local directory where the old WS is installed, then log in anonymously to Sanger’s ftp site:

bash-3.2$ ftp ftp.sanger.ac.uk
Connected to ftpservice2.sanger.ac.uk.
220-ftp.sanger.ac.uk NcFTPd Server (free educational license) ready.
220-Wellcome Trust Sanger Institute FTP server
220-
220-Problems after login? Try using '-' as the first character of you
220-password.
220-
220-****
220-****
220-**** 7/9/06 FTP Server upgraded please report any problems to
220-****    ftpadmin@sanger.ac.uk
220-****
220 
Name (ftp.sanger.ac.uk:Yook): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password: 

Go to the directory containing the WS releases, download the whole release (takes about 1 hour), then quit ftp.

FTP> cd pub/wormbase
FTP> get WS188.tar 
[or get -R WS188 for the NcFTP client]
FTP> bye
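
The interactive session above could also be scripted. The following is a minimal sketch, assuming curl is installed and that the tarball is still served at the host and path used in the session; the function names ws_release_url and download_ws_release are hypothetical and not part of the original SOP.

```shell
#!/bin/sh
# Sketch only: host and path are taken from the FTP session above.
# The function names are illustrative assumptions.

# Print the anonymous-FTP URL of a release tarball, e.g. WS188.
ws_release_url() {
    release="$1"
    printf 'ftp://ftp.sanger.ac.uk/pub/wormbase/%s.tar\n' "$release"
}

# Download the tarball into the current directory.
# curl logs in anonymously by default for ftp:// URLs.
download_ws_release() {
    curl -O "$(ws_release_url "$1")"
}
```

For example, download_ws_release WS188 would fetch WS188.tar into the current directory, after which the unpack and install steps apply unchanged.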

Unpack the tar file, change into the new WS directory, and run INSTALL to install the new database (~15 minutes).

$ tar -xvvf WS188.tar
$ cd WS188
$ ./INSTALL

The readout after installation is as follows:

ACEDB installation script:
Yook will be known as the acedb-administrator
We are going to install the acedb system in the present directory: 
    /Users/Yook/WS_latest/WS188
This is your available disk space in this directory: 
Filesystem   1024-blocks      Used Available Capacity  Mounted on
/dev/disk0s2   488050672 149925748 337868924    31%    /
The amount of space you need will depend on what data you are installing.
For the source code and binary, you need around 15 Mb.
Should we proceed?  Please answer yes/no : yes

Replace the older release with the newest one by removing the old release and/or changing the ./xace launch path to point to the new release.

$ rm <old WS release>

AQL Queries for updating the variation file for the OA

Instructions for retrieving object connections from the latest WS build

Variation-gene and variation-paper connections are used to filter the variation term drop-down list, which keeps the size of the autocomplete file manageable. The total variation object class is filtered for those variations that are called 'alleles' or have a gene connection (which includes transposon variations). The files used for filtering this list need to be updated with each release. To update this info, run the following AQL queries on the newest release and deposit the results on tazendra for the updating scripts.


Variation_gene connections

Find all variations in the allele group (excludes SNPs etc.) along with the WBGeneID and public gene name of the gene they are assigned to, if available. You will be making a file named WS200_vargene.txt that is a combination of WS200_vargene0.txt and WS200_transposons.txt.

  • WS200_vargene.txt
select g, g->gene, g->gene->public_name, g->reference from g in class variation where exists_tag g->allele 
Export as WS200_vargene.txt to your desktop (set the Separator character to blank (TAB)).
  • WS200_transposons.txt
select v, v->gene, v->gene->public_name, v->reference from v in class variation where exists_tag v->transposon_insertion and exists v->gene
Export as WS200_transposons.txt to your desktop (set the Separator character to blank (TAB)).
  • Make WS200_vargene.txt by appending the contents of WS200_transposons.txt to the end of WS200_vargene.txt and saving the result as WS200_vargene.txt.
  • total_variations.txt
select v, v->gene, v->gene->public_name, v->reference from v in class variation
Export as total_variations.txt to your desktop (set the Separator character to blank (TAB)).

This file is required for building an exclusion list that filters out SNPs.
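
The copy-and-paste step that merges the transposon export into the allele export can also be done from the command line. The following is a minimal sketch, assuming both exports are plain tab-separated text files; the function name combine_vargene is hypothetical and not part of the original SOP.

```shell
#!/bin/sh
# Sketch only: combine_vargene is an illustrative helper name.
# Usage: combine_vargene ALLELE_EXPORT TRANSPOSON_EXPORT OUTPUT
# OUTPUT may be the same file as ALLELE_EXPORT (in-place append).
combine_vargene() {
    alleles="$1"
    transposons="$2"
    out="$3"
    tmp="${out}.tmp.$$"
    # 'awk 1' prints every line and supplies a missing final newline,
    # so the last allele row cannot fuse with the first transposon row.
    awk 1 "$alleles" "$transposons" > "$tmp" && mv "$tmp" "$out"
}
```

For example, combine_vargene WS200_vargene.txt WS200_transposons.txt WS200_vargene.txt appends the transposon rows in place. Writing to a temporary file first avoids reading and overwriting the same file at once.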

Transgene_summary_paper connections

List transgenes already linked to a paper

  • WS200_transpapsum.txt
select t, t->reference, t->summary from t in class transgene where exists t->reference
Export as WS200_transpapsum.txt to your desktop (set the Separator character to blank (TAB)).

Rearrangement_inside_gene connections

List rearrangements with LG, 'gene inside' and 'gene outside' (public names only)

  • WS200_rearragene.txt
select r, r->map, r->gene_inside->public_name, r->gene_outside->public_name from r in class rearrangement
Export as WS200_rearragene.txt to your desktop (set the Separator character to blank (TAB)).



Repopulating .obo's

Two scripts use these files to update the .obo's for the OA. The scripts are on tazendra and read the files Variation_gene.txt, transgene_summary_reference.txt and rearr_simple.txt, so the exported files need to be transferred to tazendra and renamed so that the scripts recognize them.

Transfer files to tazendra

From within the directory that contains the files you just exported, send the files to acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries

  • Send and rename WS200_vargene.txt to Variation_gene.txt
$ scp WS200_vargene.txt acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries/Variation_gene.txt 
  • Send and rename WS200_rearragene.txt to rearr_simple.txt
$ scp WS200_rearragene.txt acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries/rearr_simple.txt 
  • Send and rename WS200_transpapsum.txt to transgene_summary_reference.txt
$ scp WS200_transpapsum.txt acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries/transgene_summary.txt 
  • Send total_variations.txt (no rename)
$ scp total_variations.txt acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries/total_variations.txt
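
The four transfers can be wrapped in one helper so the rename mapping lives in a single place. The following is a minimal sketch; the function names send_one and send_ws_files and the DRY_RUN switch are hypothetical, and the destination file names are copied from the scp commands above.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative assumptions.
REMOTE="acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries"

# Copy one local file to the remote directory under a new name.
# With DRY_RUN=1 the scp command is printed instead of executed.
send_one() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp $1 $REMOTE/$2"
    else
        scp "$1" "$REMOTE/$2"
    fi
}

# Usage: send_ws_files RELEASE   (e.g. send_ws_files WS200)
# Stops at the first transfer that fails.
send_ws_files() {
    ws="$1"
    send_one "${ws}_vargene.txt"     Variation_gene.txt &&
    send_one "${ws}_rearragene.txt"  rearr_simple.txt &&
    send_one "${ws}_transpapsum.txt" transgene_summary.txt &&
    send_one total_variations.txt    total_variations.txt
}
```

Running DRY_RUN=1 first prints the exact scp commands, which makes it easy to confirm the target names before anything is sent.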

Scripts that repopulate the OA '.obo' files

  • Cron job: populate_newobjects_cgi_postgres_tables.pl updates information based on Variation_gene.txt and transgene_summary.txt. This script is required for posting new allele and transgene entries to the New objects cgi and for sending notifications to the relevant curators. (Make sure the files are named accordingly or the program won’t see them.)

Obsolete

Note: As of 6/10/2010, we have discontinued the use of the 'New Variation!' object cgi page

make_obo.pl creates a text .obo based on rearr_simple.txt and Variation_gene.txt. This script used to populate the WS current info (which was needed for the Term info display in Phenote). It is now obsolete and has been replaced by a web-launchable script: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=AddToVariationObo. The web script can be launched after entering a new allele in the variation name server to perform an on-the-fly update of the variation list.

Both apps are on tazendra in the same directory as the updated variation info.

$ cd /home/acedb/jolene/WS_AQL_queries/
$ ./populate_newobjects_cgi_postgres_tables.pl
$ ./make_obo.pl


-Jolene


NOTE: populate_gin_variation updates data based on the variation_tab_wbgene file (in postgres / cgi), which is no longer current.

To check whether the re-population scripts worked, look at the WS_current info field. The date tells you when it was last updated; it should reflect the date the script was run.



--kjy