Person


Form URL

http://mangolassi.caltech.edu/~postgres/cgi-bin/cecilia/person_editor.cgi (sandbox for testing)

New postgreSQL Tables

Some tables have no history; they only have joinkey (two#), value, and two_timestamp (timestamp):

  • two -- second column 'two' is an integer
  • two_unsubscribe -- second column is text
  • two_curator_ip -- second column is text


All other tables have joinkey (two#), two_order (integer), two_<table> (data -- text), two_curator (two# -- text), two_timestamp (timestamp). New history tables have the letter h in front to keep track of old timestamps (e.g. h_two_status):
  • two_firstname -- single value. always has a value.
  • two_middlename -- single value.
  • two_lastname -- single value. always has a value.
  • two_standardname -- single value. always has a value.
  • two_street -- multi value. (add it to Person editor form section - always shows at least 4 fields.)
  • two_city -- single value.
  • two_state -- single value.
  • two_post -- single value.
  • two_country -- single value.
  • two_institution -- multi value.
  • two_old_institution -- multi value. (add it to Person submission form section - always shows at least 3 fields.)
  • two_old_inst_date -- multi value.
  • two_mainphone -- multi value.
  • two_labphone -- multi value.
  • two_officephone -- multi value.
  • two_otherphone -- multi value.
  • two_fax -- multi value.
  • two_email -- multi value.
  • two_old_email -- multi value.
  • two_old_email_date -- multi value.
  • two_pis -- multi value.
  • two_lab -- multi value.
  • two_oldlab -- multi value.
  • two_left_field -- single value.
  • two_unable_to_contact -- single value.
  • two_privacy -- single value.
  • two_aka_firstname -- multi value.
  • two_aka_middlename -- multi value.
  • two_aka_lastname -- multi value.
  • two_webpage -- multi value.
  • two_wormbase_comment -- multi value.
  • two_hide -- single value.
  • two_status -- single value.
  • two_mergedinto -- single value.
  • two_acqmerge -- single value.
  • two_comment -- multi value.
  • two_usefulwebpage -- multi value.

Changes between old and new tables

  • Adding history tables as h_two_...
  • Removing two_apu_ tables, two_groups
  • Adding two_usefulwebpage, two_old_inst_date, two_old_email_date
  • Changing two_comment into normal table with order
  • Changing old_timestamp column to two_curator (all values will be 'two1')

Questions

Please confirm that no one uses this form: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/confirm_paper.cgi In the last year 111 IPs have used it (mostly once each), but a couple of them used it over 100 times each. If you don't use it, is it in the emails you send out or in the submit forms in WormBase? -- J

answer Please keep the form; it must be the one linking to old papers that are associated but not verified. -- Ceci

Sounds good, added to the list of forms to check. -- J

Thanks. -- Ceci

1681 two_middlename rows have no value; is that okay? SELECT * FROM two_middlename WHERE two_middlename = ''; (no space between the single quotes) -- J

answer It was from when we thought that data was needed in every tag; please delete them if not needed. -- Ceci

1004 two_middlename rows have NULL order, value, and old_timestamp; what should we do with them? SELECT * FROM two_middlename WHERE two_order IS NULL;

answer I've been adding aka_middlename = NULL to every aka entry. If it is not necessary then we can delete it, but if it is necessary please keep it. Talk tomorrow? I am also fixing some typos that show up when searching in the new Person editor. -- Ceci

I don't know if you need it to say 'NULL' or not, but I thought you were never supposed to have blank values. You can Skype me whenever you want. -- J

We will talk in person about all middle/aka names and empty values.

1) two_aka_middlename always has NULL (when there is no middle name to fill in). 2) two_middlename: at first we said all tags should be filled. Later, as I seem to recall, we said that when people have no middle name nothing is entered. http://www.wormbase.org/db/misc/person?name=WBPerson1823;class=Person If this is wrong, we have to put 1 empty space back for those that have no data in two_middlename.

I don't know if it needs to be a certain way, when I set it up I let you know how it should be and you should make a note of it so you can tell me if something is wrong later.


two427 had under wormbase_comment :

  • two427 | 1 | Due to illness, please get in touch with: Mark Edgley (edgley@interchange.ubc.ca) for gene knockout inquiries. Teresa Rogalski (rogalski@zoology.ubc.ca) for muscle lab matters. 2003-03-07 18:36:25.089309-08 | 2003-03-19 11:38:57.295736-08 |
  • two427 | 1 | Lab:DM or VC | 2003-03-10 16:23:56.877045-08 | 2003-03-19 11:38:57.295736-08
  • two427 | 2 | DM his own lab and VC head of the KO | 2008-01-31 00:00:00-08 | 2008-01-30 10:04:25.884154-08

This has two rows with order '1'. I've deleted the first one; if you want, add it again as order '3'. -- J answer Deleted OK, thanks. -- Ceci

Please comment on these changes :

  • I've added history tables, and changed the postgres tables to store the curator in the 4th column, where the old_timestamp used to be. Updating data updates the data table with the current timestamp, removes any history data for that table-joinkey-order from the last 10 minutes, and inserts into the history table with the current timestamp. answer OK -- Ceci
  • Street field always has at least 4 fields. answer YES ceci
  • name and aka_name now show first/middle/last in a row with timestamp below them. answer Like it Ceci
  • old_email shows data from old_email_date horizontally with timestamp. answer OK ceci
  • old_institution shows old_inst_date data, and webpage shows usefulwebpage vertically because I assume you need more space, if you'd rather display them horizontally like old_email, let me know. answer OK ceci
  • All changes to the person editing section of the person editor are done; please let me know if I overlooked something. answer Looks OK, will look in more detail, thanks Juancarlos. -- Ceci Cool, let me know. Thanks -- J two_comment is missing. -- Ceci Added. -- J Thanks. -- C
  • In the Search Paper (to create people from XML) section, do you want to see input fields to create new people for authors that are already verified 'YES'? Right now people that are verified 'YES' have a grey background; if you don't want to see the inputs, do you still want the grey background? Probably easier to talk about this in person / on Skype. Some papers I tried were 00003865 and 00026893. -- J
  • Wow, thanks, looks really great, will look in more detail. Yes, let's talk in person, there is lots of new stuff; when do you have time? I have to be in Pasadena on Friday, is that OK with you? See email, we can talk until around 1pm. OK, answered email.
  • I like the grey part. Paper 00026893 is showing aid77392 Davis J - xml Jerel C. Davis - 1 matches two4351 (Ralph E. Davis) -> Are there matches to last names even when the first initial or first name doesn't match? -- Ceci No, it's the same matching script that we wrote a few weeks ago; it matches the full name exactly. We went over how that worked, right? If not, we can go over it in person.


  • Last name Davis, first initial J: shouldn't it match two3329 Justin R Davis, two4914 Jamie Davis, and two7656 Joseph S. Davis? -- Ceci It matches exactly on the XML first + last name, not on initials. The single_match and histogram script you run manually matches with initials and creates them automatically; this is for manual curation.
  • I like the grey background, but when I clicked on Create people from XML it created Nadia D. Singh again. Will there be a button to mark 'Do not create', or to ignore the person? I really love the multiple institutions - addresses (hide-show-assign), really like it!!! -- Ceci If you assign an institution it does something; if you don't, it ignores the person. Yes, I noticed after trying several different things; it is in my notes to talk to you about tomorrow.
  • Could you add old_institution? It will be useful when creating Persons from old papers, or when the PDF shows the person currently at a different affiliation, so I could add both. -- C - Create new people from XML: please add old_institution to this part (some people are no longer at the institution). Just 1 line, no address. Are those two issues the same thing? If so, the old institution would be one of the 20 normal institutions, but you'd assign it under an old_inst dropdown instead of the normal inst dropdown. So you're saying that if selected in the old_inst dropdown, it would only take the institution line from that inst# and add it to two_old_institution, and the current date to two_old_inst_date. Would there only ever be one old institution? Would you ever want to add both an old institution and a current institution? Yes, sometimes I'd add both current and old institutions.
  • WBPaper00026893: when editing another institution, it was showing the street from the 1st institution when creating people from XML. Fixed. OK, thanks.
  • We only need to change this script for person stats: /home/postgres/work/get_stuff/for_paul/curation_stats/wbperson_creation_stats/get_recent.pl which keys off of the two_display.cgi (unless we change the two_display.cgi to keep working; do you want it to?) YES. From the To Do below, it's decided that we'll have a display and editor in the person_editor with a checkbox for display mode (default off), and links between display - editor and editor - display. New question: sometimes outside people have to look at the paper display; will they ever need to look at some kind of person display? If so, we don't want them to see the editor, so we should keep the two forms separate in that case. OK, good idea. When people see their contact data to update, does it link to two_display? http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/person.cgi?action=Display&number=WBPerson1 No, the person.cgi gets its own person display independent from the two_display.cgi. People outside Caltech don't ever use the two display, so we can get rid of it and just use the new person editor in display vs. edit mode.
  • Confirm paper:

When they connect papers through confirm_paper.cgi (always), it says "WBPerson$two Thank you for updating your Author Person Paper connection".

When they connect papers through confirm_paper.cgi (saying 'NO'), it says "assigned new join $join to author $aid because said no to paper $joinkey".

When they comment through confirm_paper.cgi, it says "$two $curator comment for paper connections".

When someone verifies a paper through generic.cgi, it says "${wbperson}, thank you for updating your Author Person Paper connection", but only if /home/postgres/public_html/cgi-bin/data/confirm_paper_mailing.txt hasn't changed any data in the last day.

1.- generic.cgi (where is it?) You can see almost all the forms (including this one) in the site map http://tazendra.caltech.edu/~azurebrd/cgi-bin/index.cgi You can also see it in the emails that you send people to verify papers; when anyone clicks yes or no, it's a link to that form.

Looking at site map/ generic http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi shows: Your IP is : I clicked yes to a paper http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=VerifyPaper&two_number=two10606&aid=105080&pap_join=1&yes_no=YES

So when you ask me where is the generic.cgi should i tell you the above link or is there a shorter generic version?

The generic.cgi has multiple uses. The default is to show the IP, the one that relates to person verification is http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=VerifyPaper

but the whole link is more useful because it makes it clear what values need to be passed in, and in what format.

2.- confirm_paper.cgi http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/confirm_paper.cgi yes

  • Kimberly flag papers for Person curation

She's going to flag them in the table pap_curation_flags as 'author_person_priority' if they're priority and it will not have a value if they're not priority. So when you look at the checkout you can just look at the ones that are 'author_person_priority'.

Kimberly, this made me realize that since it's going into curation_flags the value can only either be there or not be there, so we might want to get rid of the 'blank' value because it implies that blank and not-priority are different, when they're stored the same way. We could make it a checkbox like functional annotation. Also, to both of you, should I just set all existing papers to 'priority'? Sounds good to me. -- C

I'm fine with leaving it as a drop-down but with only two values and priority as the default. We could set all existing papers to priority for now, but that doesn't preclude us from re-categorizing some if we decide at a later time that Reviews, for example, could reasonably be set to not_priority, right? Kimberly

In the Enter New Papers section:

  • the author-person select at the top, with the box for free pmids, has an author-person dropdown with values priority and not_priority, which populate pap_curation_flags with 'author_person'
  • each pmid has an aut-per_priority dropdown that works the same way

I've updated the pap_match.pm to allow the extra flag

I'll now work on populating all the papers except for those with functional_annotation as 'author_person' priority in pap_curation_flags. Cecilia, this means that in the future Kimberly will always flag this for you and not set that for those that are functional annotation, so you don't need to think about functional annotation anymore, just whether it's flagged as priority or not.



Cecilia, please organize below this line

  • WBPaper00000000 identifier link_to_paper_editor curated What's curated ? -- J

Person curation, ceci does this work now ? I don't understand, plz explain

  • New script: that looks at all the stuff that was verified in the last 24 hours or by day, except for those exceptions below.

I want to receive a daily email notification or list every time a paper is verified as NOT mine, unless:

  • pap_author_verified is 'NO Cecilia Nakamura'
  • pap_author_verified is YES after being verified NOT on the same day
  • there is a YES in the latest timestamp (do not read the NOT in that case)

We'd get the data from pap_author_verified. We can't tell what came in there from generic.cgi as opposed to confirm_paper.cgi; we can only know the values and their timestamps. Thanks, ceci

And also filter out the papers you've already done, right? Do you want to give me the range of papers that you've already done so I can enter them? Done, but I haven't documented everything I have already curated; I will mark them done/delete them when I see them. ceci

  • change two_display.cgi to keep working

keys off means that it gets values from. In this case, it means that it knows what tables to look at by looking at the code of the two_display which lists the tables. So the two_display would need to list all the new tables and not the old tables.

I do want to keep using the two_display!

What's the advantage of the two_display vs. the future paper_editor? -- J

http://tazendra.caltech.edu/~postgres/cgi-bin/cecilia/two_display.cgi I like the two_display because it shows only data, and I can do searches. I use it all the time.

Ok, the searches should be better in the future person_editor, but the only-data display would be better in the two_display.cgi Or we could add a section in the paper_editor to do only display.

Would this section also be able to do searches? Otherwise I'd like a two_display.cgi where I can do both, search and display. -- C

You could probably toggle on the front page whether you wanted to use it in editor mode or in display mode. But once you were in the search results page you couldn't change your mind at that point. Or maybe in the editor there could be a link to the display and vice versa. What would you prefer? (Think about it, let me know in the wiki.) -- J

I'd prefer a link to the display and vice versa. -- C

OK

Ceci, please fix the multiple entries in the postgres table two by running /home/cecilia/work/gaps_in_twos/get_recent.pl which now looks for gaps and tells you of multiples :

  • testdb=# SELECT * FROM two WHERE two = 871;
    • two871 | 871 | 2002-08-12 09:25:21.17393-07
    • two871 | 871 | 2004-11-27 14:44:38.614083-08
  • testdb=# SELECT * FROM two WHERE two = 871 AND two_timestamp > '2004-11-26' AND two_timestamp < '2004-11-28';
    • two871 | 871 | 2004-11-27 14:44:38.614083-08

Once you do the SELECT, you can do the DELETE to get rid of just that one
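
As a rough illustration of the "SELECT first, then DELETE" step above, here is a minimal sketch in Python using an in-memory SQLite table in place of the postgres table two. The function names `find_multiples` and `delete_in_window` are hypothetical and are not part of get_recent.pl; timestamps are stored as text, so the window comparison is a plain string comparison.

```python
import sqlite3

def find_multiples(conn):
    """Return (joinkey, row count) for every joinkey that appears more than once in two."""
    return conn.execute(
        "SELECT joinkey, COUNT(*) FROM two "
        "GROUP BY joinkey HAVING COUNT(*) > 1 ORDER BY joinkey"
    ).fetchall()

def delete_in_window(conn, number, after, before):
    """Delete the single row for `number` whose timestamp lies strictly between after and before."""
    where = "two = ? AND two_timestamp > ? AND two_timestamp < ?"
    params = (number, after, before)
    # Run the SELECT first and refuse to delete unless it isolates exactly one row.
    matches = conn.execute(f"SELECT * FROM two WHERE {where}", params).fetchall()
    if len(matches) != 1:
        raise ValueError(f"expected exactly 1 row, found {len(matches)}")
    conn.execute(f"DELETE FROM two WHERE {where}", params)
    return matches[0]
```

Checking the row count before deleting is a design choice of the sketch: it guards against a window that is too wide and would remove both copies.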

To Do

For new person editor

Check these forms / scripts (get rid of two_fullname view and replace with code looking at firstname middlename and lastname) :

  • /home/postgres/public_html/cgi-bin/cecilia/two_display.cgi
  • /home/postgres/public_html/cgi-bin/cecilia/twoeditor.cgi
  • /home/postgres/public_html/cgi-bin/paper_editor.cgi
  • /home/postgres/work/get_stuff/for_paul/curation_stats/wbperson_creation_stats/get_recent.pl
  • /home/azurebrd/public_html/cgi-bin/forms/paper_display.cgi
  • /home/azurebrd/public_html/cgi-bin/forms/person.cgi # DONE
  • /home/azurebrd/public_html/cgi-bin/forms/person_lineage.cgi
  • /home/azurebrd/public_html/cgi-bin/forms/confirm_paper.cgi
  • /home/azurebrd/public_html/cgi-bin/forms/generic.cgi?action=VerifyPaper&two_number=two11941&aid=125345&pap_join=1&yes_no=YES
  • /home/azurebrd/work/parsings/authorperson/citaceLineage/update_twos_in_two_lineage.pl
  • /home/cecilia/UPLOAD/new-upload/connect_single_match_authors_and_get_histogram.pl
  • /home/cecilia/UPLOAD/new-upload/verify_by_labs_or_lineage.pl
  • /home/cecilia/UPLOAD/new-upload/email_connected_authors.pl
  • /home/acedb/cecilia/citace_upload/get_pap_person_ace.pl
  • /home/cecilia/work/gaps_in_twos/get_recent.pl


--- Juancarlos, is this still current? I don't know how this works. I don't know; if you're using it we have to talk about it, if not then we don't. ---

  • /cecilia/home/postgres/work/pgpopulation/pap_papers/author_person -- directory to associate single matches, verify papers by lineage and labs, and send emails for verification: Cecilia, find the correct path and reformat this
  • /home/postgres/work/pgpopulation/pap_papers/author_person/email_connected_authors.pl

For upload (Cecilia, reformat this): /home/acedb/cecilia/citace_upload/get_pap_person_ace.pl Creating: /home/acedb/cecilia/citace_upload/errors_in_pap_person.ace


Cecilia, put the ranges of papers to mark as curated for author_person here, to populate pap_curation_done with value 'author_person' for those WBPaperID joinkeys:

33058-33200 33201-33300 33301-33400 33401-33500 33501-33586 33601-33644 34636-34641 35601-35700 35701-35800 35801-35900 35901-36000 36738 36750 36765 36796 36870 37108-37109 37135 37144 34157-37161 37637 37909 37923 37998 38037 38126 38142 38144 38152 38193

For Person verification


Change the subject line of the email sent to authors for verification to show the standard name instead of the WBPerson id. NO to the email to Cecilia with the standard automatic reply to paper verification; YES to cronjobs to keep track of responses instead of the email to Cecilia. Possibly we could get rid of that and implement a separate way to track who's responded. We're already planning to have a cronjob check all the 'NO' verifications in a 24 hour period; we could have it check all the verifications and sort them into 'NO' and 'YES', or use some other system that she'll devise. Cecilia, please clarify what you want to change and how to change it.

  • script to check last 24 hours of YES/NO Cecilia, tell me when you want this to run, and how you want the data grouped (by person / by paper / by yes/no ?)

Run at noon. Data grouped by Papers verified NO, showing aid# + Person#. At the end of run, Persons verified YES and count of #Papers.
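
The grouping requested above, together with the exclusions listed earlier on this page for the daily notification, might look roughly like the sketch below. It is only an illustration under several assumptions: the row shape (paper, aid, person, verified, timestamp) is hypothetical, the verified value is assumed to start with YES or NO, and the function name `daily_report` is made up. Only the latest entry per paper and author is read, so a YES that follows a NO hides that NO.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_report(rows, now):
    """Group the last 24 hours of verifications.

    rows: iterable of (paper, aid, person, verified, timestamp) tuples
    (a hypothetical shape, not the real pap_author_verified layout).
    Returns (NO entries grouped by paper, count of papers verified YES per person).
    """
    since = now - timedelta(hours=24)
    latest = {}
    for paper, aid, person, verified, stamp in rows:
        if stamp <= since or stamp > now:
            continue  # outside the 24 hour window
        key = (paper, aid)
        # Keep only the most recent entry for each paper/author pair.
        if key not in latest or stamp > latest[key][2]:
            latest[key] = (person, verified, stamp)

    no_by_paper = defaultdict(list)
    yes_papers = defaultdict(set)
    for (paper, aid), (person, verified, _) in sorted(latest.items()):
        if verified.startswith("YES"):
            yes_papers[person].add(paper)
        elif verified.startswith("NO") and "Cecilia Nakamura" not in verified:
            # NO entries made by the curator herself are skipped.
            no_by_paper[paper].append((aid, person))
    yes_counts = {person: len(papers) for person, papers in yes_papers.items()}
    return dict(no_by_paper), yes_counts
```

A cron job running at noon could call this with the current time and mail the two parts in order: papers verified NO first, then the YES counts per person.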

Update contact information form to have extra lines for old_institution, to not show akas, I don't want it to be editable, we'll talk about this later ok

Full name (discontinued) changed to manually get values

  • The full_name table reads from 3 tables (first, middle, and last names); if an entry is missing from one of those tables, it doesn't see it.
  • Changed to manually get the 3 values (first_name, middle_name, last_name) if available and combine them instead of using full_name. This resolves the problem of people with no middle name, which used to require adding an empty value to the middle_name tag.
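
A minimal sketch of the "get the 3 values if available and combine them" approach, written in Python for illustration only (the real forms and scripts are not shown on this page, and the function name `full_name` is hypothetical). Missing or blank parts are simply skipped, so no empty middle name has to be stored.

```python
def full_name(first, middle, last):
    """Join whichever name parts are present, skipping None and blank values."""
    parts = (first, middle, last)
    return " ".join(part.strip() for part in parts if part and part.strip())
```

For example, a person with no middle name entry still gets a clean "first last" name with a single space in between.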

two_display -> Person editor

Old two_display replaced by this really awesome multi tasking new one. Can do multiple combination searches.

Person editor, display mode

  • Functionality:
  1. Search
    1. Priority = two#, will ignore rest.
    2. Combine search = unlimited search fields
      1. example- email + caltech + case insensitive = gives all People with caltech in their email address.
      2. example- ^Li (caret) (last_name) substring = last names starting with Li
      3. example- ard$ (last_name) + caltech (email) substring = two699 Alison Claire Woollard lastname : Woollard
  2. Create New Person: Manual curation
  3. Search Paper: Individual discriminate paper curation. Will show data from Pubmed xml
    1. Bold red message Person curation done.
    2. WBPaper00026893 paper editor link.
    3. Identifiers and pdf links
    4. pmid 15965246 xml found
    5. Affiliation : Stanford University, Stanford, California 94305, USA. ndsingh@stanford.edu
    6. Inst1 - click to show another institution - hide another institution
      1. institution : create all institutions linked to paper.
      2. street
      3. street
      4. street
      5. street
      6. city
      7. state
      8. post
      9. country
    7. Under this section will show data from xml and postgres pap_ tables:
      1. From postgres pap_ tables aid,author, possible two#,sent, verified
      2. From pubmed xml and Cecilia manual curation: standard name, firstname, middlename, lastname
      3. From Cecilia manual curation: email, inst, old inst.
    8. Enter each author's data (not yet a Person). Under inst select #1, #2 etc., as well as for oldinst. With no institution # selected, no new Person will be created (for an existing Person, do not mark any Institution here).
    9. CREATE people from XML
    10. Creating persons from authors in WBPaper00026893 paper editor link.
    11. Checkout next paper id WBPaper00026894 checkout page.
  4. Checkout Papers: Will show papers not yet Person curated, flagged from 'author_person' priority in pap_curation_flags, Valid papers.
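
The combined search examples in the list above (substring match, ^ for "starts with", $ for "ends with", optional case insensitivity, all fields required to match) can be illustrated with the following Python sketch. It is not the person editor's code: the data layout (joinkey mapped to field name mapped to a list of values) and the function name `search_people` are assumptions made for the example.

```python
import re

def search_people(people, criteria, case_insensitive=False):
    """Return joinkeys where every field in `criteria` has a value matching its pattern.

    people:   {joinkey: {field: [values, ...]}}  (hypothetical in-memory layout)
    criteria: {field: pattern}, where the pattern is a substring or a
              regular expression using ^ and $ as anchors.
    """
    flags = re.IGNORECASE if case_insensitive else 0
    hits = []
    for joinkey, fields in people.items():
        matches_all = all(
            any(re.search(pattern, value, flags) for value in fields.get(field, []))
            for field, pattern in criteria.items()
        )
        if matches_all:
            hits.append(joinkey)
    return sorted(hits)
```

Multi value fields such as email match if any one of their values matches, which is why each field maps to a list.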

scripts that run automatically

Note: make sure to keep this list updated when adding new scripts and forms, or when I stop using others (also let Juancarlos know when I stop using some so he can remove them from tazendra).

Scripts for person stats: these are run to get stats, and they're called by the script /home/postgres/work/get_stuff/for_paul/curation_stats/wrapper.sh, which runs every Monday at 2am.

/home/postgres/work/get_stuff/for_paul/curation_stats/wbperson_creation_stats/get_recent.pl

This one for person lineage stats : /home/postgres/work/get_stuff/for_paul/curation_stats/wbperson_lineage_stats/get_recent.pl

This one for paper-author-person stats : /home/postgres/work/get_stuff/for_paul/curation_stats/wbpaper_author_person_stats/get_recent.pl

Note: April 17, 2011. We only need to change the first one, which keys off of the two_display.cgi (unless we change the two_display.cgi to keep working; do you want it to?) YES. "Keys off" means that it gets values from. In this case, it means that it knows what tables to look at by looking at the code of the two_display, which lists the tables. So the two_display would need to list all the new tables and not the old tables.

scripts and forms

Current May 18, 2011

  • /home/postgres/public_html/cgi-bin/cecilia/two_display.cgi
  • /home/postgres/public_html/cgi-bin/cecilia/twoeditor.cgi
  • /home/postgres/public_html/cgi-bin/paper_editor.cgi
  • /home/postgres/work/get_stuff/for_paul/curation_stats/wbperson_creation_stats/get_recent.pl
  • /home/azurebrd/public_html/cgi-bin/forms/paper_display.cgi
  • /home/azurebrd/public_html/cgi-bin/forms/person.cgi
  • /home/azurebrd/public_html/cgi-bin/forms/person_lineage.cgi
  • /home/azurebrd/public_html/cgi-bin/forms/confirm_paper.cgi
  • /home/azurebrd/work/parsings/authorperson/citaceLineage/update_twos_in_two_lineage.pl
  • /home/cecilia/UPLOAD/new-upload/connect_single_match_authors_and_get_histogram.pl
  • /home/cecilia/UPLOAD/new-upload/verify_by_labs_or_lineage.pl
  • /home/cecilia/UPLOAD/new-upload/email_connected_authors.pl
  • /home/cecilia/work/gaps_in_twos/get_recent.pl
  • /home/acedb/cecilia/citace_upload/get_pap_person_ace.pl
  • http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=VerifyPaper
  • http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/confirm_paper.cgi
  • http://tazendra.caltech.edu/~azurebrd/cgi-bin/index.cgi (site map)

Notes

  • Update to Paper editor (2011-05-18): updated to add the author_person priority flag, creating the table pap_curation_done and moving the genestudied_done to pap_curation_done as 'genestudied'.

In the Enter New Papers section:

  • the author-person select at the top, with the box for free pmids, has an author-person dropdown with values priority and not_priority, which populate pap_curation_flags with 'author_person'
  • each pmid has an aut-per_priority dropdown that works the same way

I've updated the pap_match.pm to allow the extra flag

I'll now work on populating all the papers except for those with functional_annotation as 'author_person' priority in pap_curation_flags. Cecilia, this means that in the future Kimberly will always flag this for you and not set that for those that are functional annotation, so you don't need to think about functional annotation anymore, just whether it's flagged as priority or not.

Useful links

http://www.mediawiki.org/wiki/Help:Formatting