Populating dois
Current Pipeline
- Put into place in 2014-01
- Uses pmid2doi mapping service hosted by NBIC (http://www.pmid2doi.org/)
- This service has dois for many older papers for which PubMed does not have a doi
- Script that queries NBIC service is here:
/home/postgres/work/pgpopulation/pap_papers/20140107_doi_from_pmid/get_doi_from_pmid.pl
- And is called by this script here:
/home/postgres/work/pgpopulation/wpa_papers/pmid_downloads/get_new_elegans_xml.pl
- This runs on the 19th as part of the &pubmedNotFinal routine where we query PubMed again to update metadata for papers that weren't fully indexed by PubMed when they came into WormBase (which is usually all of them).
- To check the latest pap_identifier entries that are dois with Kimberly as the curator, run:
SELECT * FROM pap_identifier WHERE pap_curator = 'two1843' AND pap_identifier ~ 'doi' ORDER BY pap_timestamp DESC;
- The script can probably also be run manually by logging on as acedb and pasting:
/home/postgres/work/pgpopulation/pap_papers/20140107_doi_from_pmid/get_doi_from_pmid.pl
Pipeline Updates, 2014-07-21
pap_match.pm was changed to insert the doi from the PubMed XML when papers are approved:
my @temp = ( "pmid$pmid", $doi ); # some data is %multi, so needs to be passed to &changeXmlPg as an array
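As a rough illustration of the extraction step, here is a minimal sketch in Python (not the Perl used in pap_match.pm). It assumes the doi sits in an ArticleId element with IdType="doi" under PubmedData/ArticleIdList in the PubMed XML. The function name extract_doi and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def extract_doi(pubmed_xml: str) -> Optional[str]:
    """Return the article's own doi from a PubMed XML record, or None."""
    root = ET.fromstring(pubmed_xml)
    # Only look at the article's own ArticleIdList, not the ArticleIdLists
    # that can appear inside cited references.
    for node in root.findall(".//PubmedData/ArticleIdList/ArticleId"):
        if node.get("IdType") == "doi" and node.text:
            return node.text.strip()
    return None


# Hypothetical record, trimmed to the elements that matter here.
sample = """<PubmedArticle>
  <PubmedData>
    <ArticleIdList>
      <ArticleId IdType="pubmed">12345678</ArticleId>
      <ArticleId IdType="doi">10.1000/example.doi</ArticleId>
    </ArticleIdList>
  </PubmedData>
</PubmedArticle>"""

print(extract_doi(sample))  # prints 10.1000/example.doi
```

Restricting the search to PubmedData/ArticleIdList avoids picking up the doi of a cited paper when the article itself has none.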
A new script was created to add dois to existing WBPaper objects for which we have a PubMed XML. The script only adds a doi to papers that have no existing doi entry, so it skips papers that have a doi from the Genetics/G3 pipeline and papers that have a manually added doi. The evidence for these doi entries is PubMed. Note that some dois from the PubMed XML contain HTML code; we may need to clean these up at some point.
The new script is on tazendra here:
/home/postgres/work/pgpopulation/pap_papers/20140722_doi_population_from_xml/doi_populate_from_xml.pl
Timestamp for when we first ran the script on tazendra and populated pap_identifier table: 2014-07-22 09:47:43
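The skip rule and the possible cleanup described above can be sketched as follows. This is a minimal sketch in Python, not the logic of doi_populate_from_xml.pl. It assumes the "HTML code" is character entities such as &lt; and &gt;, and it decodes them rather than stripping anything that looks like a tag, because some older dois legitimately contain angle brackets. The names clean_doi and dois_to_add, the paper IDs, and the dois are all hypothetical.

```python
import html
from typing import Dict, Set


def clean_doi(raw: str) -> str:
    """Decode HTML character entities (e.g. &lt;) and trim whitespace."""
    return html.unescape(raw).strip()


def dois_to_add(xml_dois: Dict[str, str], has_doi: Set[str]) -> Dict[str, str]:
    """Keep only papers with no existing doi entry, with cleaned dois."""
    return {
        paper: clean_doi(doi)
        for paper, doi in xml_dois.items()
        if paper not in has_doi  # skip Genetics/G3 and manually added dois
    }


# Hypothetical input: dois read from PubMed XML, keyed by paper ID.
xml_dois = {
    "WBPaper00000001": "10.1000/abc&lt;1::AID-X1&gt;3.0.CO;2-A",
    "WBPaper00000002": "10.1000/already.present",
}
# Hypothetical set of papers that already have some doi entry.
has_doi = {"WBPaper00000002"}

result = dois_to_add(xml_dois, has_doi)
print(result)
```

Papers in has_doi are left untouched regardless of what the XML says, which matches the rule that existing doi entries are never overwritten.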
Older Pipeline
Populating dois for WormBase Paper Objects
1) Add dois to papers for which we have a PubMed XML and the doi is in the XML
a) Script is on mangolassi: /home/postgres/work/pgpopulation/pap_papers/20111014_doi_pii_population/doi_pii_populate.pl
2) Add dois to newly approved papers in the paper editor
a) Update code of paper_editor.cgi and pap_match.pm - Need to confirm this
3) Add dois to papers that don't have a doi in the PubMed XML, or don't have a PubMed XML
a) Use crossref.org account
b) How many papers are there?
c) How do we keep track of what we've done, or should we just periodically run a query?