Difference between revisions of "Expression Pattern"

From WormBaseWiki

Revision as of 01:40, 8 March 2011

Expression Pattern

Tags currently used in Expr_pattern objects (based on WS221):

Laboratory Expr_pattern Pattern Life_stage Gene Antibody Subcellular_localization GO_term Western Transgene Protein_description In_Situ Author Anatomy_term Reporter_gene Picture Date Reference Expressed_in Antibody_info Protein Northern Clone Cell RT_PCR Strain Remark MovieURL Pseudogene Curated_by Sequence

Types of fields Juancarlos can implement:

   * text : text
   * bigtext : text box expanded
   * dropdown : few values
   * ontology : controlled vocabulary 
   * multiontology / multidropdown : allows multiple values
   * toggle : on / off

OA interface

Tab1

  • Pgdbid -- postgres database ID, generates automatically upon entry.
  • Expr_pattern -- same as in interaction
    • Clicking 'new' does not generate an Expr_pattern ID. With 'duplicate', the ID from the old entry is copied into the field and needs to be deleted in order to get a new ID.
    • Expr_pattern IDs should be assigned by a cronjob daily at 4 am. Start assigning with Expr10001.
  • Reference -- multiontology on paper WBPaperID - Daniela add wish list for term info
  • Gene -- ontology on genes WBGeneID - show WBID, locus, and synonym in term info as in GO OA
  • Anatomy term - multiontology. Controlled vocabulary found here: http://brebiou.cshl.edu/viewcvs/*checkout*/Wao/WBbt.obo (same as in Picture OA). We need to have 3 different Anatomy term boxes: one for the Partial, one for the Certain and one for the Uncertain qualifiers. We also have to think about how to implement the text options for each anatomy term. Normally the text would go with the Partial qualifiers.
  • GO_term - multiontology of GO_Term like gop_goid.
  • Subcellular_localization - bigtext, details on subcellular localization.
  • Life_stage - multiontology like in the phenotype OA and picture OA
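
The ID assignment described for Expr_pattern above could be sketched roughly as follows. This is only an illustration in Python, not the actual cronjob: the row layout (dicts with `pgdbid` and `expr_pattern` keys) and the function names are assumptions made for the example. The only details taken from the notes are that new entries start without an ID and that numbering starts at Expr10001.

```python
import re

# From the notes above: start assigning with Expr10001.
FIRST_NEW_NUMBER = 10001


def next_expr_number(used_ids):
    """Return the next free number, never lower than FIRST_NEW_NUMBER."""
    numbers = []
    for expr_id in used_ids:
        match = re.fullmatch(r"Expr(\d+)", expr_id)
        if match:
            numbers.append(int(match.group(1)))
    highest = max(numbers, default=0)
    return max(highest + 1, FIRST_NEW_NUMBER)


def assign_ids(rows, legacy_ids=()):
    """Fill in the empty expr_pattern fields, in pgdbid order.

    rows       -- hypothetical list of dicts with 'pgdbid' and 'expr_pattern'
    legacy_ids -- IDs already used outside these rows (e.g. old data)
    Returns a dict mapping pgdbid -> newly assigned ID.
    """
    used = list(legacy_ids) + [r["expr_pattern"] for r in rows if r["expr_pattern"]]
    number = next_expr_number(used)
    assigned = {}
    for row in sorted(rows, key=lambda r: r["pgdbid"]):
        if not row["expr_pattern"]:
            row["expr_pattern"] = f"Expr{number}"
            assigned[row["pgdbid"]] = row["expr_pattern"]
            number += 1
    return assigned
```

A crontab entry of the form `0 4 * * *` would run such a script daily at 4 am; the script path and the way rows are read from postgres are left out here because they are not specified on this page.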

Tab2

  • Type - Wen suggested a multidropdown selecting from: Antibody, Reporter_gene, In_situ, RT_PCR, Northern, Western, but this is not possible because we have text associated with those values. For J: it would be ideal to have a dropdown and, once we choose from the dropdown, a text box associated with the choice.

NB:

    • Antibody^t " this tag was used 462 times and has text associated -> not possible just to toggle
    • Reporter_gene^t " this tag was used 7273 times and has been used twice for the same object! -> We need a separator between lines. We will add lots of text and it would be good to have that text split into parts
    • In_situ^t " this tag was used 434 times and always has text -> not possible just to toggle
    • RT_PCR^t " this tag was used 165 times and has text associated -> not possible just to toggle
    • Northern^t " this tag was used 347 times and has text or just the Northern label -> not possible just to toggle
    • Western^t " this tag was used 19 times and always has text -> not possible just to toggle

all those above are the values of "type", right ?

right D

From the Reporter_gene description, does this mean you need to add text to this dropdown ? Do you want a "type" dropdown and a "type text" bigtext ?

yes, it would be great to be able to select one of the above with a dropdown and, once selected, have a bigtext box next to it D

Well, we can have a Type multidropdown and a Type_text bigtext, but each of the types you pick in the multidropdown won't be associated with anything specific in the big block of bigtext. If you wanted to have associations, you'd have to pick RT_PCR and Antibody (for example) in the multidropdown, then in the bigtext you'd have to type RT_PCR <some rtpcr text> | Antibody <some antibody text>, using the pipe ( | ) as a divider to separate the different things. At this point there's no point in having a multidropdown because you're typing everything in the bigtext field anyway. If you want to do things this way, add a "Type_text" bigtext field. I would instead suggest that if you want a tag + text associated with each other, you get rid of "Type" and make a lot of toggle_text fields, one for each of the types; then you could just click the toggle and type the text. We should probably talk about this in person since I'm not sure how you were originally picturing it working - J
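
To make the pipe-divided Type_text option concrete, here is a minimal Python sketch of how such a bigtext value might be split back into tag + text pairs. It assumes each chunk starts with one of the six type names followed by a space, which is only one possible reading of the "RT_PCR <some rtpcr text> | Antibody <some antibody text>" convention; `parse_type_text` is a hypothetical helper, not part of the OA.

```python
# The six values of "type" listed in the NB above.
TYPES = ("Antibody", "Reporter_gene", "In_situ", "RT_PCR", "Northern", "Western")


def parse_type_text(bigtext):
    """Split a Type_text value on '|' into (tag, text) pairs.

    Assumes every chunk begins with a type name; a chunk that is only
    a type name (e.g. just "Northern") yields an empty text.
    """
    entries = []
    for chunk in bigtext.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore empty pieces, e.g. a trailing divider
        tag, _, text = chunk.partition(" ")
        if tag not in TYPES:
            raise ValueError(f"unknown type tag: {tag!r}")
        entries.append((tag, text.strip()))
    return entries


pairs = parse_type_text("RT_PCR some rtpcr text | Antibody some antibody text")
```

With toggle_text fields instead, no parsing of this kind would be needed, since each tag would already have its own text box.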

  • Picture - multiontology on Picture -> notify picture person when there is a new picture to curate
    • Notify how ? -- J
    • I put that note for myself, but in the long run it would be good to have a way to notify other curators when there is a new object they should curate. For the Expr_pattern OA this applies to Picture, Transgene and Antibody D
    • It's still unclear to me how curators should get notified that there's a new value. We should probably talk about this. If this is something that "would be nice, but isn't important" but is still necessary for this field to exist, then okay, we don't have to talk about it. But if it turns out that we set it up in a way that won't work, I'm not going to want to talk about it after all the code's done and rewrite the code. Of course, we're not doing anything yet, we're just talking about how we will do this _eventually_, so there's no huge rush to talk about it -- J
  • Antibody_info -- multiontology on antibodies
  • Reporter_gene - bigtext, details on reporter gene construct. Multiline
    • Not sure what you mean by multiline. If you mean the .ace file should have the tags multiple times (yes), we'd have to decide what the separator would be, you'd type the separator manually, and we'd have the dumper split on it -- J
    • yes, I think the way to go is to add a separator manually D
    • okay, we've pretty much always used | so just use that to separate entries, and let me know when we write the dumper to split on | and print out data in different tags. -- J
  • pattern bigtext, details on tissue distribution. Multiline
  • remark bigtext, if any comments required. Multiline
  • transgene multiontology on transgenes.
  • Curator - Multiontology on people
  • No dump - Toggle
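
For the Multiline fields above (Reporter_gene, pattern, remark), the agreed approach is to type | between entries and have the dumper print each entry under its own tag. A rough Python sketch of that split follows. The exact .ace layout shown (tag, tab, quoted text, with quotes and backslashes escaped by a backslash) is an assumption for illustration, and `ace_lines` / `dump_object` are hypothetical names rather than the real dumper.

```python
def ace_lines(tag, value, separator="|"):
    """Return one 'Tag<TAB>"text"' line per separator-divided entry."""
    lines = []
    for part in value.split(separator):
        part = part.strip()
        if not part:
            continue  # skip empty entries such as a trailing separator
        # Assumed escaping: backslash first, then double quotes.
        escaped = part.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{tag}\t"{escaped}"')
    return lines


def dump_object(expr_id, fields):
    """Build one .ace paragraph for a single Expr_pattern object.

    fields -- hypothetical dict mapping tag name to the bigtext value
    """
    out = [f'Expr_pattern : "{expr_id}"']
    for tag, value in fields.items():
        out.extend(ace_lines(tag, value))
    return "\n".join(out) + "\n"


example = dump_object("Expr10001", {"Pattern": "first entry | second entry"})
```

Here a Pattern value with one | would come out as two Pattern lines under the same object, which matches the "tags multiple times" behaviour discussed for Reporter_gene.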

Tab3

  • Protein_description - text (30 objects)
  • Clone - multiontology on clones (341 objects) (when OA is in place discuss with Chris on the clone class). Is there a better place to get clones than http://textpresso-dev.caltech.edu/gsa/worm/known_entities/Clone?
  • Strain - multiontology on strains (812 objects). Is there a better place to take the strain list than http://textpresso-dev.caltech.edu/gsa/worm/known_entities/Strain?
  • Cell - multiontology on anatomy terms (26 objects)-> when this is live consolidate these objects with the Anatomy_term field
  • Pseudogene - text (1 object)
  • Sequence - text (13 objects): F54E2 2x (clone), R05D8 2x (clone), Y38B5A (clone), "Z28375" -C "EMBL Z28375" (sequence), "Z28376" -C "EMBL Z28376" (sequence), "Z28377" -C "EMBL Z28377" (sequence), R11H6 (clone), Y40H4A (clone), U14525, C47G2 (clone), Z32673 (sequence). We should consolidate these objects with Clones or Genes
    • not sure what this consolidation means -- J.
    • We will keep these objects as text in the beginning (this is to parse into old Expr_pattern data) but Wen and I have to find a way to get rid of this category in the long run and merge the Sequence with the clone, when possible. D
    • okay, we're not working on this OA for a while yet, so if that gives you time to clean up this data, that'd be good. otherwise we can do it down the line -- J
  • MovieURL - text (32 objects)
    • text ? -- J
    • yes D
  • Laboratory - ontology (17 objects)
    • There's an ontology of laboratories used for 3 OAs, if you want to use that. The labs are not updated though, so if you want to use "new" labs, text is fine -- J
    • great, then we can use the laboratory ontology D
    • I've changed the type to ontology, if you want multiontology, go ahead and change it -- J
  • Author - Multiontology on people
  • Date - text (2617 objects)
  • Curated_by - text (6228 objects)
    • Not Curator, meaning WBPerson ? The curator field is required already, but this is a different thing ? -- J.
    • no, this is a legacy thing, the values are only Hinxton and Caltech. Wen would like to get rid of it eventually but for the moment we are keeping it there D
    • ah, ok -- J

notes: In the future we will get rid of the following tags: CDS, Sequence, Pseudogene and Protein and we will propose a model change for that. We will also get rid of Protein_description, Cell, Cell_group.

Tags used only once that should be fixed

  • Expressed_in - text 1 entry
  • Protein - text 1 entry could be put in Protein_description