Paper Pipeline Scripts

From WormBaseWiki

PMID Downloads and Paper Editor

  • - downloads new XML records from the daily PubMed search for 'elegans'
    • resides here: /home/postgres/work/pgpopulation/wpa_papers/pmid_downloads/
  • - processes the PubMed XML records according to the actions chosen in paper_editor.cgi
    • resides here: /home/postgres/work/pgpopulation/pap_papers/new_papers/
    • Updated script on 2017-05-12 to strip the leading '0' from single-digit dates (it will be added back when we dump the .ace file; see line 151 in /home/postgres/work/citace_upload/papers/)
      • We made this change because in late 2016 PubMed changed their date format from '1' to '01', '2' to '02', etc.
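
The date handling described above can be sketched roughly as follows. This is an illustration only, not the code in the actual scripts; the function names `strip_day_padding` and `pad_day` are hypothetical, and the sketch assumes the date field arrives as a short numeric string.

```python
import re


def strip_day_padding(value: str) -> str:
    """Drop the leading zero from a single-digit date field.

    '01' -> '1', '09' -> '9'; values such as '10' or '7' are returned unchanged.
    (Hypothetical helper, assumed for illustration.)
    """
    return re.sub(r"^0(\d)$", r"\1", value.strip())


def pad_day(value: str) -> str:
    """Restore the two-digit form at dump time: '1' -> '01', '12' -> '12'.

    (Hypothetical helper, assumed for illustration.)
    """
    return value.strip().zfill(2)


if __name__ == "__main__":
    for raw in ("01", "9", "10", "31"):
        stored = strip_day_padding(raw)
        print(raw, "->", stored, "->", pad_day(stored))
```

Stripping on the way in and padding on the way out keeps stored values in one form regardless of which format PubMed sends.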

Ace File Generation


Before dumping the papers file, double-check the 'Find Dead Genes' list on the paper editor and make any necessary updates.

The papers cronjob is on the acedb account (it runs at 02:00 every Thursday): 0 2 * * thu /home/postgres/work/citace_upload/papers/

It creates a file at :


and symlinks it to :


so you can see it on the web at :

While the cronjob runs every Thursday, the wrapper only dumps when the day of the month is in the 20s or 30s (the 20th through the 31st).
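
A minimal sketch of that day-of-month check, assuming "20something or 30something" means the 20th through the 31st. The function name `should_dump` is hypothetical and this is not the wrapper's actual code.

```python
from datetime import date
from typing import Optional


def should_dump(today: Optional[date] = None) -> bool:
    """Return True when the day of the month is 20 or later.

    Assumption: the wrapper's rule is equivalent to day >= 20.
    Cron already restricts the run to Thursdays, so the weekday
    is not checked here.
    """
    if today is None:
        today = date.today()
    return today.day >= 20


if __name__ == "__main__":
    print(should_dump(date(2017, 5, 18)))  # day 18, before the window
    print(should_dump(date(2017, 5, 25)))  # day 25, inside the window
```

Under this reading a month can contain two qualifying Thursdays (for example the 20th and the 27th), so more than one dump per month is possible.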

If you ever need to run it in a different week, try:

 /home/postgres/work/citace_upload/papers/ > /home/postgres/work/citace_upload/papers/out/papers.ace.<date>
 rm /home/postgres/public_html/cgi-bin/data/papers.ace
 ln -s /home/postgres/work/citace_upload/papers/out/papers.ace.<date> /home/postgres/public_html/cgi-bin/data/papers.ace

You can then pick it up from spica: ssh into spica, cd to the directory, remove the existing file, and run:

 wget ""
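
The rm and ln -s step in the manual run above could also be done in one call, as in the sketch below. It swaps in a different technique from the commands shown: it creates the new symlink under a temporary name and renames it over the old one, so the link never goes missing in between. The function name `repoint_symlink` and the `.tmp` suffix are assumptions for illustration, and the sketch assumes a POSIX filesystem.

```python
import os


def repoint_symlink(target: str, link_path: str) -> None:
    """Point link_path at target, replacing any existing link.

    Hypothetical helper: builds the new symlink next to the old one,
    then renames it into place (rename is atomic on POSIX).
    """
    tmp_path = link_path + ".tmp"  # assumed temporary name
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # clear a leftover from an earlier failed run
    os.symlink(target, tmp_path)
    os.replace(tmp_path, link_path)
```

For example, `repoint_symlink(dump_file, "/home/postgres/public_html/cgi-bin/data/papers.ace")`, where `dump_file` is the path of the newly written papers.ace.<date> file.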