First-pass to Curation

From WormBaseWiki

Caltech documentation
First-pass flagging pipelines
First-pass schedule, instructions, automation

First pass check out list

Once a paper has been entered into Postgres, and provided it has not been labeled as "review" or "for functional analysis only", the following happens:

1. It appears on the first-pass curation checkout page. The paper is in grey until a pdf or document is available for a curator to access. It also remains grey until the paper has been through Textpresso (?).

2. It goes through a Textpresso scan for the presence of predetermined values, patterns, etc. for specific data types. These values are entered into the Textpresso first pass tables (tfp) and show up on the curator first pass form (cfp form) in the first column. If a paper has been through Textpresso, it will have a "T" next to the WBPaper ID in the checkout form.

3. An automated script scans the paper for an e-mail address. If one is found, a URL to an author first pass form (afp form) is sent to that address. URLs are sent in batches of 50 every Thursday. Once a URL is sent, an "A" is added next to the paper in the checkout form. For the first 7 days the "A" is red; after that it is cyan. If the author responds, the "A" turns dark blue, or becomes a red "R" if the author labeled the paper as a review.
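The marker rules in the three steps above can be sketched in a few lines of Python. This is only an illustration, not the actual pipeline code: the `PaperStatus` record, its field names, the helper functions and the paper ID used in the usage note are all hypothetical. It also assumes that "the first 7 days" means fewer than 7 full days since the URL was sent, and that a paper stays grey until it has both a document and a Textpresso scan (the text above is itself unsure about the second condition).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class PaperStatus:
    """Hypothetical record of what the checkout form knows about one paper."""
    paper_id: str
    has_document: bool
    textpresso_done: bool
    afp_sent_on: Optional[date] = None   # date the afp URL was e-mailed, if any
    author_responded: bool = False
    author_says_review: bool = False


def is_grey(p: PaperStatus) -> bool:
    # Assumption: grey until a document is available AND Textpresso has run.
    return not (p.has_document and p.textpresso_done)


def checkout_markers(p: PaperStatus, today: date) -> List[Tuple[str, str]]:
    """Return (letter, colour) pairs shown next to the WBPaper ID."""
    markers = []
    if p.textpresso_done:
        markers.append(("T", "plain"))
    if p.afp_sent_on is not None:
        if p.author_responded:
            # A response overrides the age-based colour.
            if p.author_says_review:
                markers.append(("R", "red"))
            else:
                markers.append(("A", "dark blue"))
        elif (today - p.afp_sent_on).days < 7:
            markers.append(("A", "red"))
        else:
            markers.append(("A", "cyan"))
    return markers


def next_afp_batch(pending: List[str], today: date, batch_size: int = 50) -> List[str]:
    """Pick the papers whose afp URL goes out today (Thursdays only)."""
    if today.weekday() != 3:  # Monday is 0, so Thursday is 3
        return []
    return pending[:batch_size]
```

For example, a paper with the made-up ID `WBPaper00000001` whose URL was sent on a Thursday would show a red "A" three days later and a cyan "A" once a week has passed, unless the author responds in the meantime.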

Curator checks out a paper

To first pass a paper, a first-pass curator will "check-out" a paper, that is, select a paper to curate by pressing 'curate' on the check out form. Their name will be added in the 'check-out' column of the form so that other curators know the paper is being worked on.

The curator will be taken to the cfp form, where they can query Postgres for data entered for the paper by Textpresso, by authors, or by other curators.

If there are data from authors, the curator can approve or reject them by clicking the check box in the author column for each data type. If the curator wants to modify what the author entered, they can merge the author data into the curator table and edit the data there.

The curator cannot approve or reject Textpresso results. However, they should feel free to comment on them in the curator table.
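The review rules above (author data can be approved, rejected or merged; Textpresso results are read-only) can be modeled roughly as follows. This is a sketch under assumptions rather than a description of the cfp form's real behavior: the `DataTypeRow` class, its field names and the "append on a new line" merge strategy are all invented for illustration, as is the data type name in the usage note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataTypeRow:
    """Hypothetical view of one data-type row on the cfp form."""
    data_type: str
    textpresso: str = ""          # read-only for the curator
    author: str = ""
    curator: str = ""
    author_approved: Optional[bool] = None  # None means not reviewed yet


def review_author_data(row: DataTypeRow, approve: bool) -> None:
    """Record the curator's approve/reject decision on the author column."""
    if not row.author:
        raise ValueError(f"no author data to review for {row.data_type}")
    row.author_approved = approve


def merge_author_into_curator(row: DataTypeRow) -> None:
    """Copy author text into the curator column so it can be edited there.

    Assumption: existing curator text is kept and the author text is
    appended on a new line; the real form may behave differently.
    """
    parts = [text for text in (row.curator, row.author) if text]
    row.curator = "\n".join(parts)
```

A row such as `DataTypeRow("rnai", author="RNAi phenotype reported")` could then be merged and edited in the curator column, while its `textpresso` field is never modified by either helper.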

Once the curator is happy with the first-pass, they hit "flag!".

Data curator alerted

More detailed information can be found here: First-pass flagging pipelines

For those data types that are actively curated, the data are sent to the e-mail address(es) noted in the last column of the cfp form. In some cases the data curator is instead alerted that a paper has been flagged for them through a check-out form of their own.
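As a rough illustration of the e-mail alert step, the sketch below builds one message per flagged data type that has recipients and skips data types with none. The function name, the recipient mapping, the sender address and the subject wording are assumptions made for this example; the messages are only constructed with Python's standard `email.message.EmailMessage`, not sent.

```python
from email.message import EmailMessage
from typing import Dict, List


def build_flag_alerts(
    paper_id: str,
    flagged: Dict[str, str],           # data type -> text entered on the cfp form
    recipients: Dict[str, List[str]],  # data type -> curator e-mail addresses
    sender: str = "firstpass@example.org",  # placeholder address
) -> List[EmailMessage]:
    """Build one alert per flagged data type that has at least one recipient."""
    alerts = []
    for data_type, text in flagged.items():
        addresses = recipients.get(data_type)
        if not addresses:
            continue  # not actively curated, so nobody to alert
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = ", ".join(addresses)
        msg["Subject"] = f"{paper_id} flagged for {data_type}"
        msg.set_content(text)
        alerts.append(msg)
    return alerts
```

Keeping the recipients in a plain mapping mirrors the last column of the cfp form: changing who receives a data type would mean editing one entry rather than the alert logic.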

Each curator has their own way of receiving these flags and will need to be asked individually for this information.

The papers that have been flagged for a specific data type are recorded in Postgres and can be retrieved from the curation status form here

Data curator finishes a paper

How a paper is marked as curated once the data curator is done is also handled on an individual basis; see the documentation on the curation status page here