Difference between revisions of "Genetics Markup by Textpresso and First Pass"

From WormBaseWiki
 
(5 intermediate revisions by the same user not shown)
Line 4: Line 4:
  
 
==[[Mark Up Work Flow]]==
 
 +
[[Media:slide1.jpg]]
  
[[Media:GSA_TextP_WB_workflow.ppt]]: <br>
 
 
this file diagrams the most recent and most detailed workflow we established, which includes editorial roles at Dartmouth. The steps below is a brief overview of the process. <br>
 
  
Line 18: Line 18:
 
Genetics (3) A WBPaperID and URL are acquired through the ticketing form at: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/journal/journal_paper_ticket.cgi
 
  
Genetics (4) The WBPaperID from the ticket issuer is attached to the XML and
+
Genetics (4) The WBPaperID from the ticket issuer is attached to the XML  
  
Genetics (5)  the author is sent the generated URL to access the author form
+
Genetics (5)  the author is sent the generated URL to access the author form
  
Author  (6) Authors provide the final source files and fills out of the author first pass form within 48 hours of acceptance to GENETICS .
+
Author  (6) Authors provide the final source files and fills out the author first pass form within 48 hours of acceptance to GENETICS.
* Author data populates journal first-pass tables in postgres, new objects made available to Arun.  
+
* Author data populates journal first-pass tables in postgres, new objects are automatically added to the appropriate lexicon.  
* New object data from the journal first-pass form is sent to Karen, who makes sure the information goes to the appropriate data curator for acedb object creation.
+
* New object data from the journal first-pass form is sent to Karen, who makes sure the information is entered correctly
 +
* New data goes to the appropriate data curator for acedb object creation.
 
* All other data populates author first-pass tables and paper is placed in the pipeline for first-pass curation (Juancarlos will make a filter so these papers will be prioritized for first-pass).  
 
 
 
Dartmouth (~5-8) GENETICS sends the final source file (in XML format) to Arun
+
DJS (Dartmouth Journal Services) (~5-8) sends the final source file (in XML format) to Arun - ''deposits XML on a web service?'' http://textpresso-dev.caltech.edu/gsa/worm/incoming_xml/
 +
* An e-mail is sent to the QC curator that a paper is coming and to expect a follow up e-mail with a link to the linked paper and entity table.
  
Arun (8) runs the markup.   
+
Automatic (8) the paper is run through the linking script.   
The linking program links all the objects in wormbase and the ones
+
* The script links all the objects in wormbase and the ones provided by the author first pass form
provided by the author first pass form and sends back the linked XML.
+
* The linked file is deposited here http://textpresso-dev.caltech.edu/gsa/worm/html/
 +
* An initial entity link table is created and deposited in http://textpresso-dev.caltech.edu/gsa/worm/first_pass_entity_link_tables/
 +
* QC curator receives a link to the linked XML in a QCFast interface and a link to the entity table.  
  
Juancarlos (~9) gets XML from Arun, populates bibliography information for WormBase.
+
Juancarlos (~9) gets XML from Arun, populates bibliography information for WormBase.''?''
  
Arun (10) sends markup back to authors and to Tim for quality control
+
Script (10) sends markup back to DJS, this action is prompted by a submit button on QCFast.
 +
* A final entity table is created http://textpresso-dev.caltech.edu/gsa/worm/entity_link_tables/
 +
* An e-mail is sent to the curator that the file has been deposited and includes a link to the final entity table.
  
==[[Instructions for Genetics]]==
+
--[[User:Kyook|kjy]] 19:44, 19 April 2012 (UTC)
 +
 
 +
==Instructions for Genetics==
 
The following are instructions to the journal and to the authors.
 
  
Line 78: Line 86:
 
The purpose of the genetics first pass form is to collect data objects that do not exist in WB already so that Arun can mark-up the Genetics paper and provide links for these objects.<br>
 
  
Objects are collected for the following data types:  
+
New entities (postgres table) are collected for the following data types:  
*genesymbol
+
*gene (genesymbol)
*extvariation
+
*variations (extvariation)
*newstrains
+
*CGC-sent strains (newstrains)
*newbalancers
+
*integrated transgene (transgene)
*antibody
+
*balancers
*transgene
+
*cell/anatomy(newcell)
*newsnp
+
*phenotype
*newcell
 
  
 
All of these fields, except genesymbol, do not show on the normal afp_form.  We opted to make a hybrid of the afp_form for the Genetics authors so that they would not be requested to fill out another WB generated form for us after their paper was published and because we needed this extra information from them asap.   
 
Line 92: Line 99:
 
It is also my understanding that these authors would be required to fill out the form as part of the publication process, so this was an opportunity to have 100% author feedback for paper flagging.  
 
  
The pipeline for alerting data curators from this table still needs to be worked out, right now it is dealt with manually.
+
Data curators are automatically alerted to the new data when the form is submitted.
  
 
==[http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/journal/journal_first_pass.cgi?action=Curate&paper=00000001&passwd=1241223658.1418967 Sample GENETICS Firstpass form]==
 
Line 104: Line 111:
  
 
[[Category:Curation]]
 
 +
[[Category:GSA_markup]]

Latest revision as of 20:36, 23 July 2013



Mark up policy

Mark Up Work Flow

Media:slide1.jpg

This file diagrams the most recent and most detailed workflow we established, which includes editorial roles at Dartmouth. The steps below are a brief overview of the process.

The following steps outline the workflow for Textpresso markup, starting with the acceptance of the paper by GENETICS and continuing through to the incorporation of the paper into WB.

The responsible party is noted before each step.

Genetics (1) The paper gets accepted.

Genetics (2) A DOI is assigned to the paper.

Genetics (3) A WBPaperID and URL are acquired through the ticketing form at: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/journal/journal_paper_ticket.cgi

Genetics (4) The WBPaperID from the ticket issuer is attached to the XML.

Genetics (5) The author is sent the generated URL to access the author form.

Author (6) Authors provide the final source files and fill out the author first pass form within 48 hours of acceptance to GENETICS.

  • Author data populates journal first-pass tables in postgres; new objects are automatically added to the appropriate lexicon.
  • New object data from the journal first-pass form is sent to Karen, who makes sure the information is entered correctly.
  • New data goes to the appropriate data curator for acedb object creation.
  • All other data populates author first-pass tables and the paper is placed in the pipeline for first-pass curation (Juancarlos will make a filter so these papers will be prioritized for first-pass).

DJS (Dartmouth Journal Services) (~5-8) sends the final source file (in XML format) to Arun (deposits XML on a web service? http://textpresso-dev.caltech.edu/gsa/worm/incoming_xml/)

  • An e-mail is sent to the QC curator saying that a paper is coming and to expect a follow-up e-mail with a link to the linked paper and entity table.

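Automatic (8) The paper is run through the linking script.

  • The script links all the objects in WormBase and the ones provided by the author first pass form.
  • The linked file is deposited here: http://textpresso-dev.caltech.edu/gsa/worm/html/
  • An initial entity link table is created and deposited in http://textpresso-dev.caltech.edu/gsa/worm/first_pass_entity_link_tables/
  • The QC curator receives a link to the linked XML in a QCFast interface and a link to the entity table.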
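
The linking step can be illustrated with a minimal Python sketch. This is not the actual linking script, which is not shown on this page; the function name `link_entities`, the lexicon structure, and the example.org URLs are all assumptions made for illustration. It wraps each known entity name in a link and builds a simple entity table of (name, URL, link count) rows.

```python
import re
from collections import Counter


def link_entities(text, lexicon):
    """Wrap each known entity name in a link and count the matches.

    lexicon maps an entity name to a target URL. Both the structure and
    the URLs are hypothetical placeholders, not the real WormBase lexicon.
    """
    if not lexicon:
        return text, []
    # Try longer names first so a longer name wins over a shorter one
    # that starts at the same position.
    names = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(re.escape(n) for n in names) + r")(?![\w-])"
    )
    counts = Counter()

    def replace(match):
        name = match.group(1)
        counts[name] += 1
        return '<a href="{}">{}</a>'.format(lexicon[name], name)

    linked = pattern.sub(replace, text)
    # One row per linked entity: (name, URL, number of links made).
    table = [(name, lexicon[name], counts[name]) for name in sorted(counts)]
    return linked, table
```

Because every name in the lexicon is linked wherever it appears as a whole token, a misspelled or overly generic entry would be linked everywhere too, which is why the submitted names are checked by a curator. The sketch treats its input as plain text; a real script working on XML would also have to avoid matching inside tags and attributes.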

Juancarlos (~9) gets XML from Arun, populates bibliography information for WormBase.?

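Script (10) sends markup back to DJS; this action is prompted by a submit button on QCFast.

  • A final entity table is created: http://textpresso-dev.caltech.edu/gsa/worm/entity_link_tables/
  • An e-mail is sent to the curator saying that the file has been deposited; it includes a link to the final entity table.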

--kjy 19:44, 19 April 2012 (UTC)

Instructions for Genetics

The following are instructions to the journal and to the authors.

Note to authors:

The following message is sent to authors by GENETICS upon acceptance of their paper and after the DOI has been assigned and WBPaperID retrieved.

"GENETICS is working with textpresso (www.textpresso.org) and WormBase (www.wormbase.org) to create links between genetic and genomic objects that are in your paper to the appropriate page in WormBase. These links will be included in both the online full text and PDF formats of your paper.

If you want any genes, alleles, transgenes, CGC-destined strains, anatomy terms, etc., discovered or described in your paper to be linked to WormBase please enter the names of these objects using the form at the following link: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/journal/journal_first_pass.cgi?action=Curate&paper=00040237&passwd=1317224998.8903801

Follow the examples carefully as your submitted data will be processed automatically. If you would rather upload a file, please contact kyook@caltech.edu.

Thank you for your help."

previous notes to authors
--kjy 19:40, 2 December 2011 (UTC)

Note to GENETICS:

This form (link sent) represents a user-friendly interface to a postgres database table stored at WormBase Caltech.

Once the authors hit 'flag' (the button at the bottom of the page) their data will:

  • be added to Arun's object markup list, so new objects can be marked up in the paper along with known WB objects. NOTE: these objects will need to be entered correctly and cleanly as everything that appears on his markup list will be linked.
  • populate the postgres table and the corresponding first pass paper curation form used by fp curators
  • be marked in our first pass curation list as having author data, therefore marked for priority curation
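
The three effects listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the real form writes to postgres tables at WormBase Caltech, whereas here a plain dict named `state` stands in for the markup list, the table, and the curation list, and the function name `handle_flagged_submission` is hypothetical.

```python
def handle_flagged_submission(paper_id, new_objects, state):
    """Apply the three effects of a submitted ("flagged") author form.

    new_objects maps a data type (e.g. "genesymbol") to a list of names.
    state is a plain dict standing in for the markup list, the postgres
    table, and the first pass curation list (all hypothetical structures).
    """
    for data_type, names in new_objects.items():
        for name in names:
            cleaned = name.strip()
            if not cleaned:
                continue  # skip empty entries
            # 1. New objects join the list used for markup.
            state["markup_list"].add((data_type, cleaned))
            # 2. The same data is stored for first pass curators.
            state["table_rows"].append((paper_id, data_type, cleaned))
    # 3. The paper is marked as having author data, for priority curation.
    state["priority_papers"].add(paper_id)
    return state
```

The sketch only trims whitespace and drops empty entries; it does not replace the curator's check described next.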

The WB QC curator will receive an e-mail alerting them to the submitted data, at which point it will be checked over to make sure it is in the proper format for Arun's scripts and corrected if needed.

We won't be able to take uploaded tables for the data types; the parsing of this data will be problematic for Textpresso as we anticipate people uploading the data in many different formats. Nonetheless, if they do want to upload a file, for data that doesn't exist in their supplemental materials, they are welcome to contact one of us (kyook@caltech.edu, for now) to do that.

--kjy 19:53, 2 December 2011 (UTC)

Journal first-pass form (jfp) : GENETICS papers only

This is not a postgres table but is a form that collects extra data from Genetics authors, which is then stored in our author first pass table (afp).

The form URL is generated by Genetics through the DOI ticket form.

The purpose of the genetics first pass form is to collect data objects that do not exist in WB already so that Arun can mark up the Genetics paper and provide links for these objects.
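
Judging from the form links shown on this page, the generated URL carries three query parameters: `action`, `paper`, and `passwd`. The Python sketch below only assembles a URL of that shape. How the ticket form actually generates the `passwd` value is not described here, so it is passed in as an opaque string, and the function name `build_form_url` is hypothetical.

```python
from urllib.parse import urlencode

FORM_BASE = (
    "http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/journal/"
    "journal_first_pass.cgi"
)


def build_form_url(paper_id, passwd):
    """Assemble an author-form URL in the shape used on this page.

    paper_id is the numeric part of the WBPaperID; passwd is whatever
    token the ticket form issued (treated as opaque here).
    """
    query = urlencode({"action": "Curate", "paper": paper_id, "passwd": passwd})
    return FORM_BASE + "?" + query
```

For example, `build_form_url("00040237", "1317224998.8903801")` reproduces the link quoted in the note to authors.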

New entities (postgres table) are collected for the following data types:

  • gene (genesymbol)
  • variations (extvariation)
  • CGC-sent strains (newstrains)
  • integrated transgene (transgene)
  • balancers
  • cell/anatomy (newcell)
  • phenotype

None of these fields, except genesymbol, appears on the normal afp_form. We opted to make a hybrid of the afp_form for the Genetics authors so that they would not be asked to fill out another WB-generated form for us after their paper was published, and because we needed this extra information from them as soon as possible.

It is also my understanding that these authors would be required to fill out the form as part of the publication process, so this was an opportunity to have 100% author feedback for paper flagging.

Data curators are automatically alerted to the new data when the form is submitted.
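
The data types collected by the journal first-pass form can be written down as a simple lookup table. The Python sketch below is an illustration only: the names `JFP_FIELDS` and `fields_with_known_names` are hypothetical, and field names are recorded only where the list above gives one in parentheses.

```python
# Data type label -> field name as given in parentheses in the list above.
# None means the list does not give a field name; none is guessed here.
JFP_FIELDS = {
    "gene": "genesymbol",
    "variations": "extvariation",
    "CGC-sent strains": "newstrains",
    "integrated transgene": "transgene",
    "balancers": None,
    "cell/anatomy": "newcell",
    "phenotype": None,
}


def fields_with_known_names(fields=JFP_FIELDS):
    """Return the field names that the list states explicitly, sorted."""
    return sorted(name for name in fields.values() if name is not None)
```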

Sample GENETICS Firstpass form

jfp postgres table details

Data types used on first-pass forms