Contacting the Community

From WormBaseWiki

List of papers in WormBase

  • Papers are listed as they come into WormBase, in real time, with the latest at the bottom:

http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=WpaXref

Criteria for choosing papers for community gene description requests

  1. Should have paper editor tag: pubmed_final final
  2. Should have paper editor tag: Status valid
  3. Should have paper editor tag: Type Journal_article
  4. Should not have paper editor tag: Curation flags E-mailed_community_gene_descrip
  5. Should not have been e-mailed for author first pass in the last 3 months
  6. Should have a PDF in WormBase
  7. Should have a first author who has a WBPerson with an e-mail address
  8. Should have at least one gene connected, either via the 'Inferred_automatically' script that connects genes from abstracts to papers or from manual first pass curation
  9. Should not have an entry in the 'Reference' field of the OA postgres table con_paper (meaning it should not have been used in a concise description)
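
The nine criteria above can be read as a single yes/no test on a paper record. The following is a minimal sketch in Python, not the logic of the Perl script itself: the `Paper` class and all of its field names are hypothetical stand-ins for the paper editor tags and lookups, and "3 months" is approximated as 90 days.

```python
from dataclasses import dataclass, field, replace
from datetime import date, timedelta
from typing import List, Optional, Set


@dataclass
class Paper:
    # Every field name here is a hypothetical stand-in for a paper editor
    # tag or a lookup named in the criteria; none is a real column name.
    pubmed_final: str = "final"
    status: str = "valid"
    paper_type: str = "Journal_article"
    curation_flags: Set[str] = field(default_factory=set)
    last_afp_email: Optional[date] = None     # last author first pass e-mail
    has_pdf: bool = True
    first_author_email: Optional[str] = None  # e-mail on the first author's WBPerson
    genes: List[str] = field(default_factory=list)
    in_con_paper_reference: bool = False      # already used in a concise description


def eligible_for_gene_description_request(p: Paper, today: date) -> bool:
    """Apply criteria 1-9 in order; '3 months' is approximated as 90 days."""
    cutoff = today - timedelta(days=90)
    return (
        p.pubmed_final == "final"
        and p.status == "valid"
        and p.paper_type == "Journal_article"
        and "E-mailed_community_gene_descrip" not in p.curation_flags
        and (p.last_afp_email is None or p.last_afp_email < cutoff)
        and p.has_pdf
        and bool(p.first_author_email)
        and len(p.genes) >= 1
        and not p.in_con_paper_reference
    )


today = date(2017, 8, 4)
paper = Paper(first_author_email="author@example.org", genes=["gene-placeholder"])
recently_emailed = replace(paper, last_afp_email=date(2017, 7, 1))
print(eligible_for_gene_description_request(paper, today))             # True
print(eligible_for_gene_description_request(recently_emailed, today))  # False
```

Writing the criteria as one predicate keeps each rule on its own line, so a rule can be relaxed or tightened without touching the others.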

Script location: /home/acedb/ranjana/concise_emailing/generate_list_papers_concise_email.pl

See output at: http://tazendra.caltech.edu/~acedb/ranjana/concise_list_to_email.html (Note: the above script has been replaced by the community curation tracker CGI; see below.)

Criteria for choosing papers for phenotype requests

  1. Should be flagged for "newmutant"
  2. Should NOT be curated in the Phenotype OA
  3. Should have paper editor tag: pubmed_final final
  4. Should have paper editor tag: Status valid
  5. Should have paper editor tag: Type Journal_article
  6. Should have the "Primary_data" flag set to Primary

Letter for e-mailing the community for gene descriptions

Subject: WormBase request for community curation of gene descriptions

Dear Authors,

In an effort to keep the gene descriptions in WormBase updated, we are requesting your assistance either to update an existing gene description or write a new gene description if none exists, for any genes studied in your publication:

<paper citation>

Gene descriptions appear in the 'Overview' widget on WormBase gene pages. We would greatly appreciate if you, or any of the other authors, could use our simple web-based tool, to either write or update gene descriptions:

http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/community_gene_description.cgi

If you have any questions, comments or concerns, please let us know.

Thank you so much!

Best Regards,
The WormBase Gene Description Team

Letter for phenotype requests

Dear <author>,

In an effort to improve WormBase's coverage of phenotypes, we are requesting 
your assistance to annotate nematode phenotypes from the following paper:

<citation>

WormBase would greatly appreciate if you, or any of the other authors, could 
take a moment to contribute phenotype connections using our simple web-based tool:

http://www.wormbase.org/submissions/phenotype.cgi?input_1_pmid=23797102

If you have any questions, comments or concerns, please let us know.

Thank you so much!

Best regards,

The WormBase Phenotype Team

Community Contact & Curation Tracker CGI

Sandbox: http://mangolassi.caltech.edu/~postgres/cgi-bin/community_curation_tracker.cgi

Live: http://tazendra.caltech.edu/~postgres/cgi-bin/community_curation_tracker.cgi

EMAIL FUNCTIONALITY IS LIVE! PLEASE NOTE!

The Community Contact & Curation Tracker CGI allows curators (1) to get a filtered list of papers that fit the criteria for sending e-mail requests to authors, asking them to fill out the Concise Description Form or the Allele-Phenotype Form, and (2) to track the papers for which request e-mails have been sent. For each form (concise description and allele-phenotype), there is a "Ready" CGI that provides curators with the filtered list of ready papers (#1 above) and a "Tracker" CGI that tracks the progress of requests sent for each paper (#2 above).


New Mutant Ready CGI

For phenotypes ("newmutant" data type), the "New Mutant Ready" form generates a filtered list of papers that meet all of the following conditions:

  1. the paper has been predicted/indicated to have phenotype data (by newmutant or RNAi SVM) and has not been curated in the Phenotype or RNAi OAs (by a WB curator or a community curator), as determined by the Curation Status Form: http://tazendra.caltech.edu/~postgres/cgi-bin/curation_status.cgi?action=listCurationStatisticsPapersPage&select_curator=two2987&listDatatype=newmutant&method=any%20pos%20ncur&checkbox_cfp=on&checkbox_afp=on&checkbox_str=on&checkbox_svm=on
  2. the paper has already had an AFP request sent
  3. if the first author e-mail is available, the first author has not been e-mailed for phenotypes in the last month
  4. if the first author e-mail is NOT available, the corresponding author has not been e-mailed for phenotypes in the last month
  5. the paper is valid, pubmed final, a journal article, and has a PDF
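
Conditions 3 and 4 above amount to choosing one author to check and then applying a one-month waiting period. The sketch below illustrates that reading in Python. The function name, its parameters, and the 30-day approximation of "the last month" are assumptions made for this example; the CGI may implement the check differently.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def pick_recipient(
    first_author_email: Optional[str],
    corresponding_email: Optional[str],
    last_phenotype_email: Dict[str, date],
    today: date,
    cooldown_days: int = 30,  # "the last month", approximated as 30 days
) -> Optional[str]:
    """Return the address to contact, or None if the paper should be held back."""
    # Conditions 3 and 4: check the first author if an e-mail is available,
    # otherwise fall back to the corresponding author.
    candidate = first_author_email or corresponding_email
    if not candidate:
        return None
    last_sent = last_phenotype_email.get(candidate)
    if last_sent is not None and today - last_sent < timedelta(days=cooldown_days):
        return None  # this author was e-mailed for phenotypes too recently
    return candidate


today = date(2017, 8, 4)
history = {"first@example.org": date(2017, 7, 20)}
print(pick_recipient("first@example.org", "corr@example.org", history, today))  # None
print(pick_recipient(None, "corr@example.org", history, today))  # corr@example.org
```

Note that in this reading a recently contacted first author holds the paper back entirely; the check does not fall through to the corresponding author.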

The CGI limits the number of papers displayed to 100.

The CGI has eighteen (18) columns:

  • generate
  • skip
  • WBPaper
  • pmids
  • pdfs
  • first author initials
  • first author's person name
  • first author's person id
  • first author's person emails
  • corresponding author name
  • corresponding person
  • corresponding email
  • afp author name
  • afp person
  • afp email
  • pdf author name
  • pdf person
  • pdf email


[Screenshot: Community contact & curation tracker CGI, New Mutant Ready table, 8-4-2017]


Column purposes:

  • generate: this column contains only a "generate email" button. Pressing it loads a new web page displaying an auto-generated e-mail addressed to the corresponding and first authors, with the form letter and PubMed citation in the body of the message.

[Screenshot: Community contact & curation tracker CGI, auto-generated e-mail, 11-3-2015]

The target e-mail addresses, the subject line, and the body of the e-mail can all be edited at this stage. Once ready, the curator can click the "send email" button to send the e-mail. Recipients will receive the e-mail from outreach@wormbase.org. Any replies to that request e-mail will be sent automatically to the mailing list curation@wormbase.org.

  • WBPaper & pmids: The WormBase paper ID and PubMed ID, respectively, for each paper
  • first author initials: The first entry in the author list from the WB paper editor, usually of the form LastName FirstInitial
  • first author's person name, first author's person id, and first author's person emails: The verified name, WormBase ID, and e-mail address, respectively, of the first author of the paper. If the WBPerson has no affiliated e-mail address, the WBPerson name and ID are not shown. These columns may be empty because the author has no available e-mail address or because Cecilia has not yet verified the connection between the author and the person
  • corresponding email: The e-mail address to which the Author First Pass (AFP) request was sent for that paper
  • corresponding author name and corresponding person: The WormBase standard name and WormBase Person ID, respectively, that match the corresponding email to which the AFP request was sent
  • pdfs: Links to the PDF and supplementals (if applicable)
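
The "generate email" step described above can be pictured as filling a form letter with the citation and setting the sender and reply addresses. The following Python sketch uses the standard library `email.message.EmailMessage` class and only builds a draft; it sends nothing. The function name, the subject line, and the use of a `Reply-To` header to route replies to the mailing list are assumptions for illustration, not a description of how the CGI does it.

```python
from email.message import EmailMessage
from typing import List


def build_request_email(recipients: List[str], subject: str,
                        letter: str, citation: str) -> EmailMessage:
    msg = EmailMessage()
    msg["From"] = "outreach@wormbase.org"
    msg["To"] = ", ".join(recipients)
    # Assumption: replies reach the mailing list through a Reply-To header.
    msg["Reply-To"] = "curation@wormbase.org"
    msg["Subject"] = subject
    # The curator can still edit addresses, subject, and body before sending.
    msg.set_content(letter.replace("<citation>", citation))
    return msg


letter = "Dear Authors,\n\nwe are requesting your assistance for:\n\n<citation>\n"
draft = build_request_email(
    ["first@example.org", "corr@example.org"],
    "Example subject line",            # hypothetical subject
    letter,
    "Example citation (placeholder)",  # stands in for the PubMed citation
)
print(draft)  # shown for review only; nothing is sent in this sketch
```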

New Mutant Tracker CGI

The purpose of this CGI is to track all e-mails that have been sent via the "New Mutant Ready" table. Each paper is represented on one row. Each column is sortable. The table has eight (8) columns:

  • email response
  • remark
  • allele-phenotype email date
  • email addresses sent request
  • community curated
  • WBPaper
  • pmids
  • pdfs


[Screenshot: Community contact & curation tracker CGI, New Mutant Tracker table, 11-3-2015]


Column purposes:

  • email response: This is a large free-text field for manually entering dates and notes about any e-mail responses (to the curation@wormbase.org mailing list) for each paper curation request
  • remark: Another large free-text field for manually entering any additional notes about the paper and its status
  • allele-phenotype email date: The date when the e-mail request was sent
  • email addresses sent request: The e-mail addresses that were sent the request
  • community curated: Whether or not a community curator has submitted any allele-phenotype data via the Allele-Phenotype form
  • WBPaper: WormBase paper ID
  • pmids: PubMed paper ID
  • pdfs: Links to the PDF and supplementals (if applicable)

Concise Description Tracker

Concise Description Ready CGI

Similar to the New Mutant CGI, with the following differences:

  1. it does not track any of the Curation Status Form information
  2. pap_primary_data must be 'primary'
  3. pap_gene must have a gene for the paper
  4. the tracker displays genes in pap_gene (genes that show in the paper editor) + corresponding gin_locus
  5. the e-mail works the same way, but with different text
  6. the tracker works the same way, but also displays genes that are community curated
  7. the paper needs to be in the Textpresso list of gene occurrences in the results section of a paper:

http://textpresso-dev.caltech.edu/concise_descriptions/textpresso/textpresso_papers_results_genes.txt

  1. considers both the primary and .sup papers in the TP list and takes the unique set of genes across them
  2. looks at the above list to get the genes and maps them to gene IDs (by looking at the Public name to WBGene mapping?)
  • Need to update the new com_con_emailsent table with the e-mails that have already been sent, that is, the list of paper IDs, e-mail addresses, and timestamps.
  • The pap_curation_flags doesn't have the e-mail addresses to which an e-mail was sent.

Generating the Textpresso list of papers with genes in the 'Results' section

File: http://textpresso-dev.caltech.edu/concise_descriptions/textpresso/textpresso_papers_results_genes.txt

Genes in this file are listed under their first name, that is, the first name (after the gene ID) in the gene/synonym list.

  • All synonyms, including cosmid names, are mapped back to the gene ID, and hence to the first name
  • Synonyms that look like 'Run' are discarded by keeping only synonyms of the form string-dash-number, or string-dot-string or string-dot-number
  • The list will be modified to include only papers with 1-5 genes; papers with more than 5 genes will not be included
  • All files will be kept under the proper names. The full list (all papers, with all genes in the results section) is:

http://textpresso-dev.caltech.edu/concise_descriptions/textpresso/textpresso_papers_results_genes.all.txt

  • The processed file, which has only papers with up to 5 different genes, is named:

http://textpresso-dev.caltech.edu/concise_descriptions/textpresso/textpresso_papers_results_genes.txt

since the community curation tracker for e-mailing looks at this URL.
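
The two filtering rules above (the synonym shape rule and the 1-5 gene limit) can be sketched as follows. This is an illustration in Python under stated assumptions: the regular expression is one possible reading of "string dash number or string dot string or number", the function names are invented for this example, and the input is an in-memory mapping rather than the actual file, whose format is not described here. The gene names and paper IDs are illustrative strings only.

```python
import re
from typing import Dict, List

# One reading of the rule: "string-dash-number" or "string-dot-(string or number)".
GENE_LIKE = re.compile(r"^(?:[A-Za-z0-9]+-[0-9]+|[A-Za-z0-9]+\.[A-Za-z0-9]+)$")


def keep_synonym(name: str) -> bool:
    """True if the synonym has a gene-like shape and should be kept."""
    return GENE_LIKE.match(name) is not None


def limit_to_few_genes(paper_genes: Dict[str, List[str]],
                       max_genes: int = 5) -> Dict[str, List[str]]:
    """Keep only papers that have between 1 and max_genes different genes."""
    kept = {}
    for paper, genes in paper_genes.items():
        unique = sorted(set(genes))
        if 1 <= len(unique) <= max_genes:
            kept[paper] = unique
    return kept


all_papers = {
    "paperA": ["gene-1", "gene-2", "gene-1"],      # 2 different genes: kept
    "paperB": [f"gene-{i}" for i in range(1, 8)],  # 7 different genes: dropped
    "paperC": [],                                  # no genes: dropped
}
print(keep_synonym("Run"), keep_synonym("unc-22"), keep_synonym("XX12.3"))
print(limit_to_few_genes(all_papers))
```

Here `all_papers` plays the role of the ".all.txt" list and the returned mapping plays the role of the processed list that the tracker reads.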

Concise Description tracker CGI

  • The tracker only knows that a paper has been community-curated after a curator has entered the community description in the concise OA, with the relevant reference and the WBPerson from the community
  • Consults the above TP list for genes, same as the 'concise ready' CGI
  • Queries for papers (the Reference field in the concise OA) where there is any WBPerson (the Person field in the concise OA)
  • For papers where there is a WBPerson, also gets the genes, to show which genes are community curated
  • No longer valid: the tracker used the list of genes that show in the paper editor for a given paper, as WBGene ID and CGC name (from the pap_gene + corresponding gin_locus tables in postgres)

Postgres tables

For New Mutant e-mail tracker:

  • com_app_emailsent : captures e-mail addresses sent request as well as timestamp when the e-mail was sent
  • com_app_emailresponse : stores text in the "email response" free text field
  • com_app_remark : stores text in the "remark" free text field

For the Concise Description e-mail tracker, the corresponding tables are:

  • com_con_emailsent : captures e-mail addresses sent request as well as timestamp when the e-mail was sent
  • com_con_emailresponse : stores text in the "email response" free text field
  • com_con_remark : stores text in the "remark" free text field
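
To show how the three tables per tracker relate to what a tracker row displays, here is a small sketch that uses an in-memory SQLite database as a stand-in for postgres. Only the table names come from this page. The three-column layout (`paper_id`, `value`, `recorded_at`) and the helper functions are hypothetical, and the real table definitions may differ. The same pattern would apply to the com_con_* tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not the real postgres database
TABLES = ("com_app_emailsent", "com_app_emailresponse", "com_app_remark")
for table in TABLES:
    # Hypothetical three-column layout; the real table definitions may differ.
    conn.execute(f"CREATE TABLE {table} (paper_id TEXT, value TEXT, recorded_at TEXT)")


def record(conn, table, paper_id, value, recorded_at):
    """Store one entry (an address, a response note, or a remark) for a paper."""
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", (paper_id, value, recorded_at))


def tracker_row(conn, paper_id):
    """Collect what the tracker would show for one paper from the three tables."""
    row = {}
    for column, table in zip(("emails sent", "email response", "remark"), TABLES):
        row[column] = conn.execute(
            f"SELECT value, recorded_at FROM {table} WHERE paper_id = ?", (paper_id,)
        ).fetchall()
    return row


record(conn, "com_app_emailsent", "paperA", "first@example.org", "2017-08-04")
record(conn, "com_app_remark", "paperA", "example remark", "2017-08-05")
print(tracker_row(conn, "paperA"))
```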

Related scripts (these predate the trackers)

On tazendra, in this directory: /home/postgres/work/pgpopulation/concise_description/20150702_papgenes_after_lastupdate

  • gene_to_published_after - has a set of WBGenes + locus + count of papers + list of papers with a gene connection made after the gene's date last updated (unsorted, but the file can be copied to a local machine and sorted in Excel or a similar tool)
  • pis_to_papers_with_genes_published_after - has PIs sorted in descending order by count of papers that mention a gene with a gene connection made after the gene's date last updated. A paper with multiple such genes is counted just once.
  • The script takes some 40 minutes to run, because it takes the date last updated and then, for each of the 138056 genes, runs a separate query to get the paper-gene connections with a timestamp after that date. If work on this script continues, it would be good to work out what we want first, since each run takes (relatively) long.
  • Chris has a similar list for newmutant-flagged papers that still need curation, so you can both talk about what you want to do next.
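
The computation behind the two output files can be sketched on in-memory data. The Python example below walks over all paper-gene connections once instead of issuing one query per gene. It is offered only as one possible alternative under assumptions: whether a single pass is practical against the actual postgres tables is not established here, and the function names and data shapes are invented for this illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Set, Tuple


def genes_to_papers_after(
    last_updated: Dict[str, date],
    connections: Iterable[Tuple[str, str, date]],  # (gene, paper, connection date)
) -> Dict[str, Set[str]]:
    """For each gene, collect papers connected after the gene's date last updated."""
    result = defaultdict(set)
    for gene, paper, connected_on in connections:
        updated = last_updated.get(gene)
        if updated is not None and connected_on > updated:
            result[gene].add(paper)
    return dict(result)


def pi_paper_counts(gene_papers: Dict[str, Set[str]],
                    paper_pis: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Count papers per PI, counting each paper once even if it has several genes."""
    papers = set().union(*gene_papers.values())
    counts = defaultdict(int)
    for paper in papers:
        for pi in paper_pis.get(paper, ()):
            counts[pi] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


last_updated = {"g1": date(2015, 1, 1), "g2": date(2015, 6, 1)}
connections = [
    ("g1", "p1", date(2015, 3, 1)),   # after g1's last update: counted
    ("g2", "p1", date(2015, 7, 1)),   # same paper, second gene
    ("g1", "p2", date(2014, 12, 1)),  # before g1's last update: ignored
]
gene_papers = genes_to_papers_after(last_updated, connections)
print(gene_papers)
print(pi_paper_counts(gene_papers, {"p1": ["PI-A"], "p2": ["PI-B"]}))
```

In the example, paper p1 is connected to two genes but contributes a count of one to its PI, matching the "counted just once" behaviour described above.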
