From WormBaseWiki
Revision as of 00:36, 22 October 2013 by Cgrove (talk | contribs) (→‎Notes)

Antibody curation SOPs

Antibody curation

1. Antibody paper first pass:

  • Antibody papers are identified by a script written by Juancarlos. The first-pass results for antibody curation are here:

  • The results are presented as an HTML page that lists paper names and the antibodies associated with them.

2. A few steps need to be completed before starting curation:

  • Save the file on the desktop as 'anti_protein_20110113.txt' (include the date in the file name).
  • Open a terminal.
    • cd to curation, then my_acefiles, then antibody_curation
    • cp /Users/xiaodongwang/Desktop/anti_protein_20110113.txt . (copies the file into the antibody_curation directory)
    • [OBSOLETE STEP: scp anti_protein_20110113.txt anti_protein.txt (copies the contents into anti_protein.txt, which was input file 1 when running TextpressoABFinder)]
    • Run Yuling's new script (written on 3/12/2012) to get the new antibody papers:
      • ./ anti_protein_20120320.txt WBAbPaperList.ace AbCurationLog.txt > aaa
        • The script removes the papers listed in 'WBAbPaperList.ace' and 'AbCurationLog.txt' from the input list and writes the remaining new papers to the file 'aaa'.
  • Copy and paste the new papers into 'Ab_curation_spreadsheet.xlsx', my own curation log in the antibody curation folder.
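
The subtraction performed in step 2 can be illustrated with a minimal Python sketch. This is not Yuling's script: the function names are hypothetical, and it assumes that papers can be recognized in all three files by IDs of the form WBPaper followed by eight digits (as in WBPaper00036348), whatever else each line contains.

```python
import re

# Assumption: paper identifiers look like "WBPaper" followed by 8 digits.
PAPER_ID = re.compile(r"WBPaper\d{8}")


def paper_ids(text):
    """Return the set of WBPaper IDs mentioned anywhere in the text."""
    return set(PAPER_ID.findall(text))


def new_papers(first_pass_text, curated_texts):
    """Return the sorted IDs in the first-pass list that no curated list contains."""
    already_done = set()
    for text in curated_texts:
        already_done |= paper_ids(text)
    return sorted(paper_ids(first_pass_text) - already_done)


# Hypothetical file contents, for illustration only.
first_pass = "WBPaper00000001 anti-ABC-1\nWBPaper00000002 anti-XYZ-1\nWBPaper00000003 anti-EPG-2\n"
old_list = 'Paper : "WBPaper00000001"\n'
curation_log = "WBPaper00000003\tcurated\n"
print(new_papers(first_pass, [old_list, curation_log]))
```

In practice the three strings would be read from the first-pass file, WBAbPaperList.ace, and AbCurationLog.txt.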

3. When curation is finished, copy the papers from the spreadsheet to the 'AbCurationLog.txt' file:

Curators need to record the status of every paper in 'Ab_curation_spreadsheet.xlsx' in the curation log file AbCurationLog.txt so that the same paper does not appear again the next time the script is run.

AbCurationLog.txt is located at: /Users/xiaodongwang/curation/my_acefiles/antibody_curation
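
Appending to the log can also be scripted rather than pasted by hand. The sketch below is only an illustration: the function name is hypothetical, and it assumes one paper entry per line and treats two entries as the same only when the whole line matches, which may not reflect the real layout of AbCurationLog.txt.

```python
import tempfile
from pathlib import Path


def append_to_log(log_path, curated_lines):
    """Append lines to the log, skipping blanks and lines already present."""
    log = Path(log_path)
    old_text = log.read_text() if log.exists() else ""
    seen = set(old_text.splitlines())
    fresh = []
    for line in curated_lines:
        line = line.strip()
        if line and line not in seen:
            fresh.append(line)
            seen.add(line)
    with log.open("a") as handle:
        # Make sure the first new entry starts on its own line.
        if old_text and not old_text.endswith("\n"):
            handle.write("\n")
        for line in fresh:
            handle.write(line + "\n")
    return fresh


# Demonstration on a temporary file with hypothetical paper IDs.
with tempfile.TemporaryDirectory() as folder:
    demo_log = Path(folder) / "AbCurationLog.txt"
    demo_log.write_text("WBPaper00000001\n")
    print(append_to_log(demo_log, ["WBPaper00000001", "WBPaper00000002"]))
```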

4. A few important documents:

  • AbCurationLog.txt -- The curator maintains this curation log file for all antibody papers that have already been curated. It is located at: /Users/xiaodongwang/curation/my_acefiles/antibody_curation
    • This file must be updated after each round of curation by appending the newly curated papers (the paper list may be copied from the Excel file and pasted into this file).
  • WBAbPaperList.ace -- Antibody papers curated before the Textpresso first pass was applied

OBSOLETE STEP (3/20/2012): 3. The curation log file listed above only documents papers that were curated after the Textpresso first pass was applied. Antibodies curated before that are kept in this file: WBAbPaperList.ace

OBSOLETE STEP (3/20/2012): 4. A script written by Wen screens file 1, filters out the papers in files 2 and 3 (which were already curated), and gives the new paper list. The script is called:

OBSOLETE STEP (3/20/2012):[

Here is how to use the

(wen@athena:~/TextPresso/TextPressoAb$ ./

/Users/xiaodongwang/curation/my_acefiles/antibody_curation/ ./

This script checks the Textpresso results, compares them with the antibody paper list dumped from citace, and looks for antibody papers that have not been curated.

Input file 1: anti_protein.txt -- all antibody papers found by Textpresso

Input file 2: WBAbPaperList.ace -- Antibody papers curated before the Textpresso first pass was applied

Input file 3: CurationLog/AbCurationLog.txt -- Antibody curation log.

Output file 1: NewAbPaper.txt -- New antibody papers

    • I can change the name of output file 1 to NewAbPaper_20110113 in the script.
    • Two places in the script need to be changed each time.

Output file 2: TPAbFalsePositive.txt -- All false positive antibody papers.

1789 papers were flagged by Textpresso: 1734 are curated and 55 need to be checked. Among the uncurated papers, 30 have an anti-XXX pattern and 25 do not. 1626 papers are curated in citace: 1347 were found by Textpresso and 279 were not. Recall is 0.828413284132841. 518 papers identified by Textpresso are false positives. Precision is 0.710452766908888.
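
The two ratios reported above can be reproduced from the counts. The formulas below are inferred from the numbers and are not taken from the script itself: recall matches papers found by Textpresso divided by papers curated in citace, and precision matches flagged papers minus false positives, divided by flagged papers. The variable names are illustrative.

```python
# Counts copied from the script output above.
flagged_by_textpresso = 1789
false_positives = 518
curated_in_citace = 1626
curated_and_found = 1347

# Assumed formulas, chosen because they reproduce the reported values.
recall = curated_and_found / curated_in_citace
precision = (flagged_by_textpresso - false_positives) / flagged_by_textpresso

print(f"recall    = {recall:.6f}")
print(f"precision = {precision:.6f}")
```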

5. The result of the script is NewAbPaper.txt, the list of antibody papers that need to be curated. I copy this file to my desktop and keep my own Excel file at desktop/curation forms/antibody curation/AB_curation_spreadsheet.xlsx, with a separate sheet named by date for each round.

6. Add the curated papers to AbCurationLog.txt under /Users/xiaodongwang/curation/my_acefiles/antibody_curation so that these papers can be subtracted from NewABPaper.txt next time. ]

Antibody curation controlled vocabulary

Antibody controlled vocabulary

  • Remark "Commercial Antibody."
  • Remark "Tissue Specific Antibody Marker."
  • Summary "Rabbit polyclonal antibody against XXX recombinant protein."
  • Summary "Rabbit polyclonal peptide antibody against XXX."
  • Summary "Mouse monoclonal peptide antibody against XXX."

Antibody curation guideline

WormBase requires the following information for Antibody:

1. Antibody Name: for consistency, use [WBPaperID]:anti-genename, with the gene name in CAPITALS (add _1, _2, etc. if several antibodies are made for the same gene), e.g. [WBPaper00036348]:anti-EPG-2

2. Original reference where the antibody was first reported. For antibodies that are published for the first time, list the original publication and mark the antibody as "Original_publication" antibody (these are good and valid antibody objects in WormBase.)

3. Target gene (abc-1, xyz-1 ...), clonality (polyclonal or monoclonal), and animal (rabbit, mouse ...)

4. Antigen used to generate antibody (peptide or protein sequence)

5. If the antibody is from another paper, find the original antibody object and add the reference to it.

6. If the antibody has no original reference, create a new antibody object and mark it as "No_original_reference". If you suspect the antibody is the same as another one that was previously published, record that antibody in the "Possible_pseudonym" field.
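
The naming convention in item 1 can be expressed as a small helper. This is a sketch with a hypothetical function name. It assumes that the _1, _2 suffix is appended at the very end of the name, which the guideline does not state explicitly.

```python
def antibody_name(paper_id, gene_name, index=None):
    """Build a name of the form [WBPaperID]:anti-GENENAME, optionally with _N."""
    base = f"[{paper_id}]:anti-{gene_name.upper()}"
    # Assumption: the numeric suffix goes after the gene name.
    return base if index is None else f"{base}_{index}"


print(antibody_name("WBPaper00036348", "epg-2"))
print(antibody_name("WBPaper00036348", "epg-2", 2))
```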

Antibody dumper

  • dumper is located on tazendra:
    • the module and a are on the tazendra at :
      • /home/postgres/work/citace_upload/antibody/
      • /home/postgres/work/citace_upload/antibody/
    • They symlinked the on the tazendra at :
      • /home/acedb/xiaodong/antibody/
  • The dumper checks for dead genes in the 'Gene' field and for invalid papers in the 'Original publication' and 'Reference' fields, and writes the results to the err.out file.
  • The cronjob for dumping antibodies for upload was canceled.

[is located in tazendra:( /home/acedb/wen/phenote-antibody/


dump out file: ./

file name: antibody.ace

I usually change the file name: cp antibody.ace antibody.ace.date_of_dump

then copy file to spica: scp antibody.ace.20110503
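
The dated copy of the dump file can be made with a short helper instead of typing the date by hand. This is a sketch with a hypothetical function name. It assumes the suffix is the dump date written as YYYYMMDD, as in antibody.ace.20110503, and it does not cover the transfer to spica.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def dated_copy(dump_path, day=None):
    """Copy the dump file to <name>.<YYYYMMDD> beside it and return the new path."""
    day = day or date.today()
    source = Path(dump_path)
    target = source.with_name(f"{source.name}.{day:%Y%m%d}")
    shutil.copy(source, target)
    return target


# Demonstration on a temporary file with placeholder content.
with tempfile.TemporaryDirectory() as folder:
    dump = Path(folder) / "antibody.ace"
    dump.write_text("placeholder dump content\n")
    print(dated_copy(dump, date(2011, 5, 3)).name)
```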


Changed the postgres tables for ---05/22/2011

reference -> paper

location -> laboratory

The cronjob was canceled. The dump is now done manually after checking the err.out file for each dump. - 06/04/2012

Email from Juancarlos related to the cronjob --- 06/06/2011

Set the cronjob to run every Thursday at 02:00: 0 2 * * thu /home/acedb/xiaodong/oa_antibody_dumper/ It puts the file at:


So you can see it at :

If you need to run it manually, just paste into the shell :


Then log onto spica, cd into the directory where you want it, remove the existing antibody.ace file, and do :

wget ""

Dumper change for historical_gene tag: 05/22/2013

-model change: refer to Chris's document:

-dumper change via Skype with J: [5/22/13 1:57:44 PM] j chan: -> /home/postgres/work/citace_upload/antibody/

[5/22/13 1:59:04 PM] j chan:

[5/22/13 2:08:06 PM] j chan: gin_dead

[5/22/13 2:08:19 PM] j chan: Dead -> dead

[5/22/13 2:08:27 PM] j chan: merged_into WBGene -> merged

[5/22/13 2:08:31 PM] j chan: split_into -> split

[CG added 10-21-2013] cgrove: Suppressed -> suppressed

[5/22/13 2:09:06 PM] j chan: looping through the genes where something happened to make sure they don't also point at something else

[5/22/13 2:09:16 PM] j chan: abp_gene

[5/22/13 2:09:47 PM] j chan: merged -> historical_gene + remark AND gene <gene> Inferred_automatically

[5/22/13 2:09:56 PM] j chan: dead -> historical_gene + remark

[5/22/13 2:10:05 PM] j chan: split -> historical_gene + remark AND error message

[CG added 10-21-2013] cgrove: Suppressed -> historical_gene + remark

[5/22/13 2:10:12 PM] j chan: normal ones -> just tag + value

-tested with genes:
  • A split gene: WBGene00012507
  • A merged gene: WBGene00007524
  • A dead gene: WBGene00007814

-migrated to tazendra on the same day
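
The status handling described in the chat log above can be summarized in a short Python sketch. This is not the dumper's code: the function name, the tuple layout, and the remark and error wording are all hypothetical. Only the mapping from gene status to output follows the notes (merged, dead, split, suppressed, normal).

```python
def dump_gene(gene_id, status="normal", merged_into=None):
    """Return (records, error) for one gene attached to an antibody.

    Each record is a (tag, value, extra) tuple; error is None or a message.
    """
    if status == "normal":
        # Normal genes: just the tag and its value.
        return [("gene", gene_id, None)], None
    if status not in ("dead", "merged", "split", "suppressed"):
        raise ValueError(f"unknown gene status: {status}")

    # Every non-normal gene is kept as historical_gene with a remark.
    records = [("historical_gene", gene_id, f"remark: gene is {status}")]
    error = None
    if status == "merged":
        if merged_into is None:
            raise ValueError("a merged gene needs the gene it was merged into")
        # The surviving gene is attached and flagged as inferred.
        records.append(("gene", merged_into, "Inferred_automatically"))
    elif status == "split":
        # A split gene cannot be resolved automatically, so report it.
        error = f"{gene_id} has been split; choose the replacement gene manually"
    return records, error


# WBGene00000001 is a placeholder target, not the real merge partner.
print(dump_gene("WBGene00007524", "merged", merged_into="WBGene00000001"))
print(dump_gene("WBGene00012507", "split"))
print(dump_gene("WBGene00007814", "dead"))
```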