Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
  
 
[[WormBase-Caltech_Weekly_Calls_March_2011|March]]
 
 
 
 
==April 7, 2011==
 
 
 
Transgene Model
 
*On Wiki
 
*Sent out to people
 
*Have a look; report any concerns
 
*Can follow on BitBucket; search for transgene; link to Wiki
 
*No objections at Caltech; Karen will send to Paul Davis
 
*Changes to ACE dumping script; Karen will talk to Juancarlos
 
*Changes needed in OA (softer deadline than dump)
 
 
 
Interactions
 
*Murky genetic interaction curation?
 
*Err on the side of generality/trusting author statements
 
*When in doubt, curate as "genetic interaction"
 
*Chris is working on decision tree/pipeline for curation
 
*Kimberly working on Physical Interaction model
 
 
 
BioGRID meeting at Princeton in May
 
*Call in
 
*What will Rose propose?
 
 
 
Expression Pattern Curation (Daniela/Wen)
 
*Daniela sent out picture page for review
 
*Expr Pattern OA wiki is in place:
 
**http://wiki.wormbase.org/index.php/Expression_Pattern
 
*As soon as Juancarlos is done with the modularization, he will start working on the code.
 
*In the meantime, Daniela will curate expression patterns by writing .ace files
 
*Expr_pattern OA should be ready by the next upload (May 26th). (I really doubt this: parsing in data, writing dumpers, and checking it all take a long time. Picture and Interaction each probably took longer than 2 months, and we're not starting Expr until May at the earliest -- Juancarlos)
 
 
 
Patch file/Interbuild (Raymond)
 
*Developed good patch file
 
*Tested patch file to update WS224 to WS225 - seems OK
 
*Less than 5 minutes for upload
 
*Testing now should be done by Todd/OICR team
 
 
 
Uma started
 
*Working on concise descriptions of gene classes
 
*Karen has reviewed with Uma; Uma is reading papers
 
*Discussing details of descriptions
 
*Inconsistencies/discrepancies of gene class names
 
*>2400 gene classes
 
*Can work on generating formula for this curation
 
*Arun can help with automation
 
*May need to get Uma an interface to enter data into postgres
 
*Adapt concise description CGI for her? (probably write a whole new interface depending on goal -- Juancarlos)
 
*Gene class name and a text field
 
*Using Textpresso/WormMart output; sentence saver?
 
 
 
eggNOG data into citace?
 
*Who's going to handle the data? curate?
 
*Michael? OK
 
 
 
 
==April 14, 2011==
 
 
Gene Class Descriptions
 
*Concerns about maintenance and redundancy
 
*Uma here for ~ 3 months
 
*How many gene classes have alleles?
 
*How many are named by phenotype rather than just molecular data?
 
*How is this different from gene concise descriptions?
 
*Should it be a summary of all gene concise descriptions of the class?
 
*Things currently focused on:
 
**using WormMart to look at genes in a class
 
**pulls out all concise descriptions
 
**look at similarities
 
**interesting things to highlight
 
*Gene concise descriptions vs class descriptions
 
**Gene-centric vs Class-centric
 
**Consolidating/pooling all concise descriptions from individual genes?
 
*Going for maintenance-free statements
 
*Potentially building an interface
 
*Richard Durbin: development vs behavior?
 
*Prioritization?
 
*Focus on phenotype-based classes like UNC?
 
*Factors for prioritization:
 
**Numbers of genes curated
 
**molecular vs phenotype-based
 
**Amount of info currently available?
 
**Historical points
 
**Most actively worked currently? (most mentioned in last year's publications?)
 
*Uma and Karen could communicate with Kimberly and Ranjana about
 
*What is most efficient for Uma to focus on?
 
*Uma can look at whether a gene class description makes sense
 
*Skip gene classes for which only one gene exists
 
*GO term stats on each class?
 
 
 
Papers missing from Textpresso
 
*Issue: Genetics papers for GSA markup are missing from SVM analysis
 
*Juancarlos' file on caprica
 
*Discrepancy between papers on Textpresso and those gone through SVM
 
*SVM doesn't pick up GSA papers
 
*Generate a filter to detect which papers have been missed by SVM
 
*Michael looking into reasons why the pipeline isn't working
 
*Tazendra vs Textpresso discrepancies?
 
*Ruihua will process 56 missing papers retroactively
 
*Still working on how to avoid this in the future
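
The filter discussed above, which detects papers present in Textpresso but missed by SVM, can be sketched as a set difference. This is only a minimal illustration: the function name, the paper IDs, and the idea that both lists are available as plain ID lists are assumptions, not a description of the actual pipeline.

```python
def find_missed_papers(textpresso_ids, svm_ids):
    """Return the sorted paper IDs that are in Textpresso but not in the SVM results."""
    # Set difference: everything Textpresso has, minus everything SVM processed.
    return sorted(set(textpresso_ids) - set(svm_ids))


# Hypothetical example IDs, for illustration only.
textpresso_ids = ["WBPaper00000001", "WBPaper00000002", "WBPaper00000003"]
svm_ids = ["WBPaper00000001", "WBPaper00000003"]

missed = find_missed_papers(textpresso_ids, svm_ids)
print(missed)
```

Running the same comparison after each SVM run would flag newly missed papers early, so they need not be processed retroactively.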
 
 
 
 
==April 21, 2011==
 
 
 
WormMart
 
*Stable for IWM?
 
*WormMart presentation for WormBase workshop
 
**Query examples
 
**Data content
 
**Features
 
**Plans for the next year; what data to be made available
 
**Discuss stability?
 
 
 
Igor's machine: elbrus
 
*Can have problems
 
*Input info has changed for WS225
 
*RNAi script stopped working
 
*Migration issues
 
**Decide on priorities
 
**Who will maintain what?
 
**Migrate things over to newer machines, locally
 
*On elbrus now:
 
**RNAi scripts (sequence mapping); Migrating to EBI
 
**Anatomy ontology browser (optional)
 
**Microarray tool (Wen will take over)
 
**Get-Sequence CGI (how many people use it?)
 
*Meet with Norie at IWM?
 
*We should write everything down first
 
 
 
BioGRID-WormBase data exchange:
 
*Questions about how to best do this
 
*How to handle/organize genetic interactions
 
*BioGRID meeting in May; will not be sufficient to solve the genetic interaction issues
 
*Still working on genetic interaction organization scheme
 
 
 
SPELL/Microarray
 
*Discuss at IWM?
 
*Patching changes into WS225
 
 
 
ModENCODE Meeting
 
*Gary Williams should go
 
 
 
Picture Page
 
*New developments
 
*Elsevier publishing; how to handle them? ($17.5 per figure!!!)
 
 
 
Informatics Resources Assessment
 
*Can we develop a plan to determine what informatics resources are available?
 
*Long-term plans for sustainable informatics resources
 
 
 
 
==April 28, 2011==
 
 
Cecilia - Person report
 
*"AKA" ("Also Known As") added manually
 
**"also publishes as" created automatically based on verified author-person connections
 
**Overpopulating AKA manually is redundant
 
**Populating unique or new AKAs manually is still necessary
 
*Verified names happen automatically
 
**Author-Person connections happen with a script that Cecilia runs weekly manually
 
**A connection is created if an author name matches exactly 1 person's name / AKA (if 0 or 2+ matches, no connection is made)
 
**Connection verifications happen weekly with a script that Cecilia also runs manually; they are based on other verified authors in the paper sharing a Lineage or Laboratory
 
**Manual verification requests are emailed monthly to the Persons, who verify them.
 
*Connection to person in GSA markup pipeline
 
**Automatic if script unambiguously identifies one individual person
 
**Karen (/ Daniela ?) will stay in touch with Cecilia manually to create people for GSA, because people links are necessary for the URLs.
 
*Extracting person info from lab web pages, papers, worm meeting registration?
 
**This takes varying amounts of time for questionable benefit
 
**Consensus at the meeting was that lab website + PI website + papers + meetings are good sources.
 
*WormBase policy? Should unverified person info be included?
 
**Raymond had question about whether it's a good idea in general to create people without explicit contact with the person via email.
 
**Juancarlos agrees that the data is better if the person makes explicit contact, but it's not necessarily better in a practical sense, and anyone can spoof an email.
 
*Address? Remove details irrelevant to mailing address? Institute name?
 
**Leave it as is
 
*Prioritize name and e-mail verification for new paper/person connections?
 
**Juancarlos + Cecilia + Raymond will talk to Paul about priorities
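
The automatic author-person rule described above (connect only when an author name matches exactly one person's name or AKA) can be sketched as follows. This is a minimal illustration, not Cecilia's actual script: the function names, the name normalization, and the person IDs and names are all hypothetical.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(name.lower().split())


def build_name_index(persons):
    """Map each normalized name or AKA to the set of person IDs that use it."""
    index = defaultdict(set)
    for person_id, names in persons.items():
        for name in names:
            index[normalize(name)].add(person_id)
    return index


def connect_author(author_name, index):
    """Return a person ID only when exactly one person matches, otherwise None."""
    matches = index.get(normalize(author_name), set())
    if len(matches) == 1:
        return next(iter(matches))
    return None  # 0 or 2+ matches: no automatic connection


# Hypothetical persons with their names and AKAs, for illustration only.
persons = {
    "WBPerson0001": ["Jane Q. Doe", "J. Doe"],
    "WBPerson0002": ["John Doe", "J. Doe"],
    "WBPerson0003": ["Ann Smith"],
}
index = build_name_index(persons)

print(connect_author("Ann Smith", index))  # unique match
print(connect_author("J. Doe", index))     # ambiguous, so no connection
```

Returning nothing for ambiguous names keeps the automatic step conservative; those cases remain for manual curation or later verification.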
 
 
 
Karen - Abby, Aldrin Montana - Google Summer of Code
 
*Aldrin didn't get accepted
 
*He asks whether he could work with us anyway
 
*11 weeks during summer, full time
 
*would work remotely
 
*any ideas? modENCODE, microarray, pre-canned queries, Reactome (assigning confidence values to pathways)
 
*we can collect more ideas
 
*has coding experience (JS, Perl, CSS) but not much with bioinformatics; he wants to learn
 
*Aldrin is working on a bioinformatics masters
 

Revision as of 18:28, 5 May 2011