Difference between revisions of "Dumping Script"

From WormBaseWiki

Revision as of 20:54, 22 June 2011

Overview:

Papers are dumped in .ace file format on every Thursday whose date is a 20- or 30-something.

Every Thursday at 4 am, a cronjob on spica copies the file from tazendra to spica into the Data_from_Kimberly directory.
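
The date rule in the overview can be sketched as follows. This is only an illustration, assuming that "20- or 30-something" refers to the day of the month (the 20th through the 31st). The function name is_dump_thursday is hypothetical and is not taken from wrapper.pl.

```python
from datetime import date


def is_dump_thursday(day: date) -> bool:
    """Return True if `day` is a Thursday on or after the 20th of its month.

    Assumption: "20- or 30-something" means the day of the month is in
    the 20s or 30s. This is a guess at the rule, not the logic of wrapper.pl.
    """
    is_thursday = day.weekday() == 3  # Monday is 0, so Thursday is 3
    return is_thursday and day.day >= 20


if __name__ == "__main__":
    print(is_dump_thursday(date(2011, 6, 23)))  # a Thursday in the 20s
    print(is_dump_thursday(date(2011, 6, 16)))  # a Thursday before the 20th
```

Because the cronjob itself fires every Thursday, a check of this kind would have to live inside the script that the cronjob calls.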


Details:

The papers cronjob is on the acedb account:

0 2 * * thu /home/postgres/work/citace_upload/papers/wrapper.pl

The dumping script lives here:

/home/postgres/work/citace_upload/papers/dumpPapAce.pl

The papers.ace file is dumped automatically at 2 am on the Thursday of the upload and copied to spica at 4 am on that same Thursday.

The dumping script checks for any dead gene IDs attached to papers and comments them out of the .ace file until a curator fixes them in postgres or deletes them from it.

The AQL query that finds all dead genes in WB is:

select all class gene where ->Species like "*elegans" and ->Status like "Dead"
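
The dead-gene handling described above might look roughly like the sketch below. It is not the code in dumpPapAce.pl. It assumes that the list of dead gene IDs has already been obtained (for example from the AQL query), that lines are commented out with the // prefix used in .ace files, and that a gene ID appears as a whitespace-separated, optionally quoted token. The function name and the sample IDs are hypothetical.

```python
def comment_out_dead_genes(ace_lines, dead_genes):
    """Return a copy of ace_lines with every line naming a dead gene commented out.

    Assumptions: `dead_genes` is a set of gene IDs, and a gene ID appears
    in a line as a whitespace-separated token, possibly in double quotes.
    """
    result = []
    for line in ace_lines:
        # Strip surrounding double quotes so "WBGene..." matches WBGene...
        tokens = {token.strip('"') for token in line.split()}
        if tokens & dead_genes:
            result.append("// " + line)
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    # Illustrative placeholder data, not real dump output.
    sample = [
        'Paper : "WBPaper00000001"',
        'Gene "WBGene00000001"',
        'Gene "WBGene00000002"',
    ]
    for out_line in comment_out_dead_genes(sample, {"WBGene00000002"}):
        print(out_line)
```

Commenting the line out rather than dropping it keeps the dead association visible in the dump, which fits the description that it stays that way until a curator deals with it.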

The dumping script also checks that all associated genes are in the format WBGenennnnnnnn, where the 8 n's are digits.
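
A format check of this kind can be expressed as a regular expression. The sketch below is an illustration only, assuming the rule is exactly "WBGene" followed by eight digits with nothing before or after. The names WBGENE_PATTERN and is_valid_wbgene are hypothetical and do not come from the dumping script.

```python
import re

# "WBGene" followed by exactly eight ASCII digits.
WBGENE_PATTERN = re.compile(r"WBGene[0-9]{8}")


def is_valid_wbgene(gene_id: str) -> bool:
    """Return True if gene_id is 'WBGene' plus exactly eight digits."""
    return WBGENE_PATTERN.fullmatch(gene_id) is not None


if __name__ == "__main__":
    for candidate in ["WBGene00000001", "WBGene0000001", "WBGene0000000A"]:
        print(candidate, is_valid_wbgene(candidate))
```

Using fullmatch rather than search means that an ID with extra characters on either side is rejected instead of being accepted on a partial match.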


Back to 2010_-_Paper_Pipeline:_Documentation_and_Instructions