Revision as of 22:57, 20 October 2010
Pictures
Picture Curation
The immediate goal of picture curation is to be able to obtain images of gene expression data from the literature and individual laboratories and display them in the WormBase gene expression page.
- We want to display images showing the temporal or spatial (e.g., tissue, subcellular, etc.) localization of any gene in a wild-type background, from different data types:
- Reporter gene analysis
- Antibody staining
- In situ hybridization
- RT-PCR
- Western or Northern blot data
Pipeline
In the early phases of curation, pictures will be taken from open access journals (e.g. PLoS). While PLoS images are being curated, other publishers will be contacted to obtain copyright permissions.
The images should be saved and stored according to the following guidelines. The example shown below refers to a PLoS Biology paper but the rules of handling the pictures are universal and not "paper specific".
Overview
This is a mock-up of the expression page for gene K07C11.4. We would like to see panels B and F highlighted, with the figure caption describing the expression of the gene, AND be able to access the original figure by clicking the "See original figure" button.
Downloading and saving the images
Pictures are downloaded in TIFF format from the original paper.
Pictures are saved under their original name to minimize editing by the curator. In this case the file is called “journal.pbio.0020352.g006”. The files are converted directly into JPEG, because TIFF is not suitable as a web display format. Avoid special characters such as ' * / in the file name.
The file is saved in a directory named after the WB paper ID. E.g.: WBPaper00024505, meaning that picture “journal.pbio.0020352.g006” has been downloaded from WBPaper00024505.
Together, these two names (WBPaper00024505_journal.pbio.0020352.g006) form the UNIQUE IDENTIFIER of the object, which we call Picture object 1 (WBPicture000000001). The ID WBPicture000000001 will be the NAME of the object (?Picture) in the Picture Data Model.
The path WBPaper00024505_journal.pbio.0020352.g006 will define the SOURCE of the object in the Picture Data Model.
Now look at the picture above: In our WormBase expression pattern page we don’t want to display the whole picture because it contains information not pertinent to the expression data. We therefore need to CROP the 2 pictures depicting expression of the gene in the Wild Type. We want to have only panel B and F.
Each panel is cropped from the original picture in Photoshop, and the files are saved as “journal.pbio.0020352.g006_B” and “journal.pbio.0020352.g006_F” in the same directory as before: WBPaper00024505.
These will be Picture object 2 (WBPicture000000002) and Picture object 3 (WBPicture000000003), respectively.
To summarize so far:
Picture object 1: WBPicture000000001: WBPaper00024505_journal.pbio.0020352.g006
Picture object 2: WBPicture000000002: WBPaper00024505_journal.pbio.0020352.g006_B
Picture object 3: WBPicture000000003: WBPaper00024505_journal.pbio.0020352.g006_F
where WBPicture000000001 corresponds to the NAME of the object in the Picture Data Model and WBPaper00024505_journal.pbio.0020352.g006 corresponds to the SOURCE of the object in the Picture Data Model.
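The naming rules summarized above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the nine-digit padding is inferred from the examples above, and the set of characters to avoid is assumed to be just the three given as examples.

```python
# Characters the guidelines list as examples to avoid (assumption: the
# real list may be longer than these three).
FORBIDDEN_CHARS = set("'*/")


def source_id(paper_id: str, file_name: str, panel: str = "") -> str:
    """Build the SOURCE string: <WBPaper ID>_<file name>[_<panel>]."""
    if any(ch in FORBIDDEN_CHARS for ch in file_name + panel):
        raise ValueError(f"special character in file name: {file_name!r}")
    base = f"{paper_id}_{file_name}"
    return f"{base}_{panel}" if panel else base


def picture_name(number: int) -> str:
    """Build the object NAME, zero-padded to nine digits as in the examples."""
    return f"WBPicture{number:09d}"


# Picture object 2 from the summary: panel B cropped from figure g006.
print(picture_name(2), source_id("WBPaper00024505", "journal.pbio.0020352.g006", "B"))
```

Calling source_id without a panel gives the SOURCE of the parental figure, so the same helper covers all three objects in the summary.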
Question to web team: is it OK to keep the file names as proposed? -> Yes (answer from TH, October 6th)
At the same time, the figure legend for the entire figure (WBPicture000000001) is saved in a text file with the same name as the figure (journal.pbio.0020352.g006) and a .doc extension. This way we can tell which figure legend goes with which picture. The .doc file itself is not needed for picture curation, because the figure legend will be entered in the "description" tag in the Picture Data Model.
Special case: what do I do when a single panel refers to multiple genes? In the example below, panel B displays the expression of 3 different genes. We will simply name the pictures Fig3_B1, Fig3_B2, Fig3_B3.
Let's go one step further...
Picture lineage
Picture object 1 is our PARENTAL IMAGE; we will display it only when the user clicks on a “see original figure” link. Picture objects 2 and 3 are our DAUGHTER IMAGES, which will be displayed on the gene expression page. See mock page below for a visual example:
We would like to keep the lineage relationship in order to know how images should be handled. In other words, we would like to know which image should be displayed in the expression pattern page and which should be displayed next to the "See original figure" link.
For that purpose, in the Picture Data Model we have the "Image lineage" tag.
There are cases in which parental image = daughter image. See picture below.
Question to the web team: in this case is the Picture Data Model proposed sufficient to determine that this picture should be displayed as PARENTAL or DAUGHTER? -> Yes
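One possible reading of how the lineage information decides where a picture is shown is sketched below. It assumes the Crop_picture and Cropped_from tags of the proposed Picture Data Model are the only inputs; the function name and the two role labels are hypothetical, and the web team may well implement the rule differently.

```python
from typing import List, Optional, Set


def display_roles(cropped_from: Optional[str], crop_pictures: List[str]) -> Set[str]:
    """Return where a picture is shown, based only on its lineage tags.

    cropped_from  -- name of the parental picture, or None if there is none
    crop_pictures -- names of the pictures cropped out of this one
    """
    roles = set()
    if cropped_from is None:
        # Nothing above it in the lineage: it is an original figure.
        roles.add("original_figure")
    if not crop_pictures:
        # Nothing cropped out of it: it is what the expression page shows.
        roles.add("expression_page")
    return roles


# Parental image, daughter image, and the parental = daughter case.
print(display_roles(None, ["WBPicture000000002", "WBPicture000000003"]))
print(display_roles("WBPicture000000001", []))
print(display_roles(None, []))
```

Under this reading, a picture with neither tag filled in plays both roles, which matches the parental = daughter case.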
Picture size and format
All the pictures should be in JPEG format, if possible.
The picture size for thumbnails shown in the main gene expression page should be 200x200 pixels.
The picture size for the full view should be 600x600 pixels.
Picture size for the original file will be as big as needed.
NB: the 200x200 and 600x600 pixel sizes do not distort the pictures; they only put a constraint on the maximum size of the thumbnail or the full view.
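To make the "maximum size without distortion" rule concrete, here is a minimal sketch of the arithmetic. The function name is hypothetical, and the choice never to enlarge a picture that is already smaller than the limit is an assumption, not something the guidelines state.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale (width, height) to fit a max_side x max_side box, keeping the aspect ratio."""
    # Use the tighter of the two constraints; cap at 1.0 so small
    # pictures are never enlarged (assumption).
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 1200x800 panel as thumbnail (200) and as full view (600).
print(fit_within(1200, 800, 200))
print(fit_within(1200, 800, 600))
```

Only the longer side reaches the limit, so a 1200x800 panel becomes 200 pixels wide as a thumbnail rather than a stretched 200x200 square.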
Generating 200x200px thumbnails
Thumbnails are generated using the freeware "ThumbsUp" (v4.4), a simple drag-and-drop utility that creates thumbnails for a batch of pictures and supports all image formats of Mac OS X and QuickTime (including PDF documents). See http://www.macupdate.com/info.php/id/11898/thumbsup
Automation was also tried with Photoshop (automated image processor) and MacOSX Automator (creation of thumbnail images). With the Photoshop image processor it is NOT possible to save the thumbnails in the same folder as the originals. With MacOSX Automator it is not possible to create thumbnails larger than 128px. ThumbsUp can generate 200x200 thumbnails in the same folder as the original files.
The file name for thumbnails is the same as the original picture with a _thumb suffix
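As an illustration of the thumbnail naming rule, the sketch below places the _thumb suffix just before the file extension. That placement is an assumption (the rule only says "suffix"), and the function name is hypothetical.

```python
import os


def thumbnail_name(file_name: str) -> str:
    """Insert _thumb between the base name and the extension."""
    root, ext = os.path.splitext(file_name)
    return f"{root}_thumb{ext}"


print(thumbnail_name("journal.pbio.0020352.g006_B.jpg"))
```

The input must carry its real extension (e.g. .jpg): for a bare name such as journal.pbio.0020352.g006, os.path.splitext would treat ".g006" as the extension and put the suffix in the wrong place.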
Generating 600x600px full view
600x600 images are generated with Photoshop (scripts -> image processor) and stored in a separate folder called nameofthejournal_600, for example PLoS_600. The architecture of the sub-folders is the same as the original. OICR should take the 600x600 full view files from here.
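The folder mirroring described above could look like the following sketch. It assumes the original files sit under a top-level folder named after the journal (for example PLoS/WBPaper00024505/...); the function name and that layout are illustrative, not confirmed.

```python
from pathlib import PurePosixPath


def full_view_path(original: str, journal: str) -> str:
    """Map <journal>/<sub-folders>/<file> to <journal>_600/<sub-folders>/<file>."""
    parts = PurePosixPath(original).parts
    if not parts or parts[0] != journal:
        raise ValueError(f"{original!r} is not inside the {journal!r} folder")
    # Keep every sub-folder and the file name, change only the top folder.
    return str(PurePosixPath(f"{journal}_600", *parts[1:]))


print(full_view_path("PLoS/WBPaper00024505/journal.pbio.0020352.g006_B.jpg", "PLoS"))
```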
Picture Data Model Proposal
////////////////////////////////////////////////////////////////////////////////////
?Picture Description ?Text
         Source ?Text
         Image_lineage Crop_picture ?Picture XREF Cropped_from
                       Cropped_from ?Picture XREF Crop_picture
         Pick_me_to_call Text Text
         Expr_pattern ?Expr_pattern XREF Picture
         Reference ?Paper XREF Picture
         Remark ?Text #Evidence
         Cellular_component ?GO_term XREF Pictures
         Anatomy_term ?Anatomy_term XREF Picture
////////////////////////////////////////////////////////////////////////////////////
Picture Data Model step by step explanation
Picture
Name of the picture object. E.g. WBPicture0000000001
Description
Figure legend
Source
For actual picture names. This is the name of the path leading to the picture file. The source includes the name of the directory where the picture comes from AND the name of the picture file. e.g. WBPaper00024505_journal.pbio.0020352.g006
Image Lineage
This is the picture object lineage. Large figures will be cropped into sections when they represent different data. We want to maintain the picture lineage so that clicking the "see original figure" button gives access to the entire image.
Pick_me_to_call
It is probably an XACE command to call the image. Asked via the WormBase Dev mailing list; waiting for an answer.
Expr_pattern
For linking to Expr_pattern data. This will be the Expr_pattern object that is associated with the picture.
Reference
For the source of the picture, e.g. WBPaper12345678
Remark
For curator notes. This tag will also be used when the picture comes from a lab and not from a publication, e.g. "Picture provided by Ian Hope". The Remark tag will also be used to record information on permission/copyright.
GO_Term
This links to the GO term e.g. if a picture depicts sub-cellular localization
Anatomy_term
It will link the picture object directly to an Anatomy Object
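For readers who find a record easier to follow than a model definition, the tags explained above can be mirrored in a small Python data class. This is an illustration only, not part of the ACeDB model: the class and field names are hypothetical, Pick_me_to_call is left out because its meaning is still open, and the #Evidence attached to Remark is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Picture:
    name: str                                   # ?Picture, e.g. WBPicture000000001
    description: str = ""                       # Description: figure legend
    source: str = ""                            # Source: directory + file name
    crop_picture: List[str] = field(default_factory=list)  # daughters cropped from this picture
    cropped_from: Optional[str] = None          # parental picture, if any
    expr_pattern: List[str] = field(default_factory=list)  # linked Expr_pattern objects
    reference: Optional[str] = None             # Reference: WBPaper ID
    remark: str = ""                            # Remark: curator notes, permissions
    cellular_component: List[str] = field(default_factory=list)  # GO terms
    anatomy_term: List[str] = field(default_factory=list)        # Anatomy objects


# The parental figure and one daughter panel from the example paper.
parent = Picture(
    name="WBPicture000000001",
    source="WBPaper00024505_journal.pbio.0020352.g006",
    crop_picture=["WBPicture000000002", "WBPicture000000003"],
    reference="WBPaper00024505",
)
panel_b = Picture(
    name="WBPicture000000002",
    source="WBPaper00024505_journal.pbio.0020352.g006_B",
    cropped_from=parent.name,
    reference="WBPaper00024505",
)
print(panel_b.cropped_from, "->", panel_b.name)
```

In ACeDB the XREF keeps Crop_picture and Cropped_from in sync automatically; in this sketch both sides have to be filled in by hand.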
Draft OA for picture curation
Picture conversion
cd Desktop/WormBase/PLoS/Gene<tab>/ (enter): this lets me choose the directory from which I get the files
scp -r <directory_name I have chosen before> acedb@tazendra.caltech.edu:draciti/pictures/ (enter): I copy a folder and all its files to tazendra <enter password>
Open a new terminal
ssh acedb@tazendra.caltech.edu (enter): log in to tazendra <enter password>
cd draciti/pictures/ (enter): this is the directory on tazendra where I will put my files
cd <directory name> (enter): you can type ls for a list of the files present in the directory
igal2 -bigy 400 (enter): the program converts the pictures; bigy 400 is an arbitrary size for vertical pixels
ls -al: lists all the files present
If you want to open a new command window, just press command+N on the keyboard
The folder for receiving the converted pictures is called Incoming from Tazendra. We want to bring the files back:
cd Desktop/Incoming\ from\ Tazendra/ (enter)
scp -r acedb@tazendra.caltech.edu:draciti/pictures/WBPaper00024399 . (enter)
(ssh = secure shell)
(scp = secure copy; scp -r = recursive secure copy for directories)
(cd = change directory)
(pwd = show current directory)
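The two copy steps of the walkthrough can also be written down as a small sketch that only builds the scp command lines and does not run them. The host and remote directory are the ones used above; the function names are hypothetical, and the igal2 conversion step on tazendra is left out.

```python
from typing import List

REMOTE = "acedb@tazendra.caltech.edu"
REMOTE_DIR = "draciti/pictures/"


def upload_command(local_dir: str) -> List[str]:
    """scp command that copies a local paper folder to tazendra."""
    return ["scp", "-r", local_dir, f"{REMOTE}:{REMOTE_DIR}"]


def download_command(paper_dir: str, dest: str = ".") -> List[str]:
    """scp command that brings a converted paper folder back."""
    return ["scp", "-r", f"{REMOTE}:{REMOTE_DIR}{paper_dir}", dest]


print(" ".join(upload_command("WBPaper00024399")))
print(" ".join(download_command("WBPaper00024399")))
```

To actually run a command, the list can be passed to subprocess.run; scp will then prompt for the password as in the manual steps.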