Importing Protein Structure Data Images


The structure data imported to WormBase (see Importing Protein Structure Data) contains a large number of records. Most of these records are for proteins that are still in the structure determination pipeline of the genomics center processing them. A small portion of the records point to proteins whose structures have been determined. For these records, the structure image is acquired and displayed on the gene page.

Most of these images come from PDB, and a small portion come from NESGC. The images from NESGC are downloaded manually.

To update images from PDB:

Code: $WORMBASE/util/import_export/compile_3d_images/
grep 'DB_info Database "PDB"' <MERGED.ace> | cut -d' ' -f5 | cut -d'_' -f1 | sort -u > pdb_ids.txt
(see cut_pdb_ids.sh; <MERGED.ace> stands for the path to the merged .ace file)
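
To see what the pipeline above does, it can be tried on a small sample before running it on the real file. The sketch below is only illustrative: the sample lines are a hypothetical layout, chosen so that the fifth space-separated field holds an ID followed by an underscore and a chain suffix. The actual layout of the merged .ace file may differ, and if the fifth field is quoted there, the quote character would stay in the output.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# Hypothetical sample lines; the real merged .ace layout is an assumption here.
cat > "$workdir/sample_merged.ace" <<'EOF'
DB_info Database "PDB" PDB_ID 1abc_A
DB_info Database "PDB" PDB_ID 1abc_B
DB_info Database "PDB" PDB_ID 2xyz_A
DB_info Database "OtherDB" other_id 9zzz_A
EOF

# Same pipeline as above:
#   grep     keeps only the PDB cross-reference lines
#   cut -f5  takes the fifth space-separated field (ID plus suffix)
#   cut -f1  drops everything from the first underscore onward
#   sort -u  removes duplicate IDs
grep 'DB_info Database "PDB"' "$workdir/sample_merged.ace" \
  | cut -d' ' -f5 \
  | cut -d'_' -f1 \
  | sort -u > "$workdir/pdb_ids.txt"

cat "$workdir/pdb_ids.txt"
# prints:
# 1abc
# 2xyz
```

The two entries for 1abc collapse into one, and the line for the other database is dropped.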
  • Make a list of URLs by running get_3d_image_urls.pl on pdb_ids.txt to obtain the URLs of the relevant images. The script expects a directory and stores all of its output in that directory. It also creates a file named image_index.log.
  • Acquire the images by running get_3d_images.pl with the image_index.log file. You can pass the same directory to this script; the acquired images are stored there.
  • Create thumbnails of the acquired images.
Thumbnail script: $WORMBASE/util/import_export//make_thumbnails.pl
  • Move the thumbnail directory to:
$WORMBASE/html/images/structure-images/prot2prot_yymmdd_hhmm/

  • Symlink to this directory from:
$WORMBASE/html/images/structure-images/prot2prot
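
The last two steps (move the thumbnails to a timestamped directory, then point the prot2prot symlink at it) can be sketched as a small shell function. This is a sketch under assumptions, not the project's own tooling: the function name publish_thumbnails, the scratch tree, and the sample file name 1abc.jpg are all hypothetical, and the relative symlink is a design choice that the steps above do not prescribe. It relies on ln -n, which GNU and BSD ln support but POSIX does not specify.

```shell
# publish_thumbnails ROOT THUMB_DIR  (hypothetical helper)
# Moves THUMB_DIR to ROOT/html/images/structure-images/prot2prot_yymmdd_hhmm
# and repoints the prot2prot symlink at it. Prints the new directory path.
publish_thumbnails() {
    root=$1
    thumb_dir=$2
    base="$root/html/images/structure-images"
    stamp=$(date +%y%m%d_%H%M)          # yymmdd_hhmm
    target="$base/prot2prot_$stamp"

    mkdir -p "$base"
    # Refuse to reuse a directory created earlier in the same minute;
    # otherwise mv would nest the thumbnails inside it.
    if [ -e "$target" ]; then
        echo "already exists: $target" >&2
        return 1
    fi
    mv "$thumb_dir" "$target"
    # -f replaces an existing link; -n stops ln from following the old
    # prot2prot symlink and creating the new link inside the old directory.
    ln -sfn "prot2prot_$stamp" "$base/prot2prot"
    echo "$target"
}

# Dry run against a scratch tree rather than a real installation.
scratch=$(mktemp -d)
mkdir "$scratch/thumbs"
touch "$scratch/thumbs/1abc.jpg"        # stand-in for a real thumbnail
published=$(publish_thumbnails "$scratch" "$scratch/thumbs")
ls -l "$scratch/html/images/structure-images"
```

On a real installation the call would be something like publish_thumbnails "$WORMBASE" /path/to/thumbnails. Because earlier timestamped directories are left in place, rolling back means pointing the symlink at a previous one.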