Genetics Markup by Textpresso and First Pass

From WormBaseWiki

Revision as of 23:22, 13 August 2010



Mark up policy

Mark Up Work Flow

Media:GSA_TextP_WB_workflow.ppt:
this file diagrams the most recent and most detailed workflow we established, which includes editorial roles at Dartmouth. The steps below are a brief overview of the process.

The following steps outline the workflow for Textpresso markup, from acceptance of the paper by GENETICS through to the incorporation of the paper into WB.

The responsible party is noted before each step.

Genetics (1) The paper gets accepted.

Genetics (2) A DOI is assigned to the paper.

Genetics (3) A WBPaperID and URL are acquired through the ticketing form at: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/journal/journal_paper_ticket.cgi

Genetics (4) The WBPaperID from the ticket issuer is attached to the XML.

Genetics (5) The author is sent the generated URL to access the author form.

Author (6) Authors provide the final source files and fill out the author first-pass form within 48 hours of acceptance by GENETICS.

  • Author data populates the journal first-pass tables in postgres, and new objects are made available to Arun.
  • New object data from the journal first-pass form is sent to Karen, who makes sure the information goes to the appropriate data curator for acedb object creation.
  • All other data populates the author first-pass tables, and the paper is placed in the pipeline for first-pass curation (Juancarlos will make a filter so these papers will be prioritized for first-pass).
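
The routing described in the bullets above can be sketched as follows. This is a minimal illustration, not the actual form handler: the field names come from the data-type list given for the journal first-pass form on this page, while `split_submission`, the dictionary layout, and the example values are assumptions.

```python
# Field names are taken from the data-type list on this page.
# Everything else here (function name, dict layout) is hypothetical.
NEW_OBJECT_FIELDS = (
    "genesymbol", "extvariation", "newstrains", "newbalancers",
    "antibody", "transgene", "newsnp", "newcell",
)


def split_submission(form_data):
    """Separate new-object fields from all other first-pass data.

    Empty or whitespace-only values are dropped from both groups.
    """
    new_objects = {}
    other_data = {}
    for field, value in form_data.items():
        text = (value or "").strip()
        if not text:
            continue
        if field in NEW_OBJECT_FIELDS:
            new_objects[field] = text   # would go to markup and object curation
        else:
            other_data[field] = text    # would go to the author first-pass tables
    return new_objects, other_data


if __name__ == "__main__":
    # Made-up submission; "some_flag" stands in for any other form field.
    submission = {"transgene": " obj-1 ", "antibody": "", "some_flag": "checked"}
    new_objects, other_data = split_submission(submission)
    print("new objects:", new_objects)
    print("other data:", other_data)
```

Keeping the new-object fields in one tuple means that both destinations can be derived from a single pass over the submission.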

Dartmouth (~5-8) GENETICS sends the final source file (in XML format) to Arun.

Arun (8) Runs the markup. The linking program links all the objects in WormBase and the ones provided by the author first-pass form, and sends back the linked XML.

Juancarlos (~9) Gets the XML from Arun and populates bibliography information for WormBase.

Arun (10) Sends the markup back to the authors and to Tim for quality control.
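
The division of labor in the steps above can be summarized as a small data structure. The following is only an illustrative sketch: the step labels and responsible parties are copied from the list above, while the names `WORKFLOW` and `steps_by_party` are hypothetical and do not correspond to any existing WormBase script.

```python
# Step labels and parties are copied from the workflow list above.
# WORKFLOW and steps_by_party are hypothetical names used for illustration.
WORKFLOW = [
    ("Genetics", "1", "The paper gets accepted"),
    ("Genetics", "2", "A DOI is assigned to the paper"),
    ("Genetics", "3", "A WBPaperID and URL are acquired through the ticketing form"),
    ("Genetics", "4", "The WBPaperID is attached to the XML"),
    ("Genetics", "5", "The author is sent the URL of the author form"),
    ("Author", "6", "Source files are provided and the first-pass form is filled out"),
    ("Dartmouth", "~5-8", "The final source file (XML) is sent to Arun"),
    ("Arun", "8", "The markup is run and the linked XML is sent back"),
    ("Juancarlos", "~9", "Bibliography information is populated for WormBase"),
    ("Arun", "10", "The markup is sent to the authors and to Tim for quality control"),
]


def steps_by_party(workflow):
    """Group step labels by responsible party, keeping the listed order."""
    grouped = {}
    for party, label, _description in workflow:
        grouped.setdefault(party, []).append(label)
    return grouped


if __name__ == "__main__":
    for party, labels in steps_by_party(WORKFLOW).items():
        print(f"{party}: steps {', '.join(labels)}")
```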

Instructions for Genetics

The following are the current instructions to the journal and to the authors. I am working on changes to the instructions as well as to the form itself. You can see the changes to these instructions here.

Note to GENETICS:

This form represents a user-friendly interface to a postgres database table stored here at WormBase Caltech.

Once the authors hit 'flag' (the button at the bottom of the page) their data will:

  • populate the postgres table and the corresponding first pass paper curation form used by fp curators
  • be marked in our first pass curation list as having author data, therefore marked for priority curation
  • be parsed so that Textpresso will receive any new objects the authors report, so new objects can be marked up in the paper along with known WB objects.


We won't be able to take uploaded tables for the data types; parsing this data will be problematic for Textpresso, as we anticipate people uploading the data in many different formats. If we required that data be uploaded in a certain format, there would be no benefit to uploading a table over cutting and pasting the data into the box area on the form.

Nonetheless, if they do want to upload a file, for data that doesn't exist in their supplemental materials, they are welcome to contact one of us (kyook@caltech.edu, for now) to do that. The ultimate goal is to have this pipeline automated to get away from manual flagging of the data types. If it turns out that there will always have to be a curator that receives data, which has to be manually parsed, then it is not the ideal situation, but it is doable and in the end still a massive step forward for literature curation.

Note to authors:

Congratulations on your paper being accepted by GENETICS for publication!

Now that your paper has been accepted, you have an opportunity to see your data incorporated into WormBase on a much faster time scale than data published in other journals. GENETICS has agreed to help you get your data into our database faster by requesting that you fill out the following form before your paper goes to press, rather than having us wait for your paper to appear in PubMed, which can take weeks to months. In addition, by giving us slightly more detailed information than we would ask of authors of papers from other journals, you can help your paper become a more comprehensive resource portal to worm biology.

To fill out the form, either check the box if a specific data type exists in the body of your paper, or list the specific details that you used as part of your experiments. In the latter case, the details we ask for are objects that you have created or used during the course of your research, such as alleles, RNAi probes, antibodies, integrated transgenes, etc., which do not already exist in WormBase. Please enter the objects into the space provided. If you would rather upload a file, please contact kyook@caltech.edu.

Thank you for your cooperation. We expect you to benefit greatly from participating in our data extraction and markup partnership with GENETICS, Textpresso and WormBase.


Best Wishes,

WormBase
http://www.wormbase.org
wormbase-help@wormbase.org

Please click the box next to the type of data found in the body of your publication.

If this is not a primary research article, please click here. You may ignore the fields below. Thank you.
Click the "?" to find out more about the data type.

Journal first-pass form (jfp) : GENETICS papers only

This is not a postgres table but is a form that collects extra data from Genetics authors, which is then stored in our author first pass table (afp).

The form URL is generated by Genetics through the DOI ticket form.

The purpose of the Genetics first-pass form is to collect data objects that do not already exist in WB, so that Arun can mark up the Genetics paper and provide links for these objects.

Objects are collected for the following data types:

  • genesymbol
  • extvariation
  • newstrains
  • newbalancers
  • antibody
  • transgene
  • newsnp
  • newcell

None of these fields, except genesymbol, appears on the normal afp_form. We opted to make a hybrid of the afp_form for the Genetics authors so that they would not be asked to fill out another WB-generated form after their paper was published, and because we needed this extra information from them as soon as possible.
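
Because authors paste new objects into text boxes rather than uploading tables, the pasted text has to be split into individual object names before it can be used for markup. Below is a minimal sketch of such a step, assuming that names are separated by commas, semicolons, or line breaks. This page does not specify a format, and `parse_pasted_objects` is a hypothetical helper, not part of the actual pipeline.

```python
import re


def parse_pasted_objects(text):
    """Split pasted text into object names.

    Assumes commas, semicolons, and line breaks are the only separators.
    Blank entries and repeated names are dropped; first-seen order is kept.
    """
    seen = set()
    names = []
    for piece in re.split(r"[,;\r\n]+", text):
        name = piece.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return names


if __name__ == "__main__":
    # Placeholder names, not real WormBase objects.
    pasted = "obj-1, obj-2;\nobj-1\n\n obj-3 "
    print(parse_pasted_objects(pasted))
```

A real submission may use other separators (tabs, "and", free-form sentences), so output from a step like this would still need curator review.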

It is also my understanding that these authors would be required to fill out the form as part of the publication process, so this was an opportunity to have 100% author feedback for paper flagging.

The pipeline for alerting data curators from this table still needs to be worked out; right now it is handled manually.

Sample GENETICS Firstpass form

jfp postgres table details

Data types used on first-pass forms