Difference between revisions of "WormBase-Caltech Weekly Calls"

Revision as of 17:21, 3 May 2012

2009 Meetings

2011 Meetings

2012 Meetings

May 3, 2012

Curator Timestamps

Determining what data was provided directly by curator vs. what was populated automatically (e.g. mapping scripts)
Older data provided by curators who are no longer here will be problematic
We should archive all data-processing scripts in GitHub
Scripts can be made to create a unique timestamp that identifies that script after the fact
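
One possible way to give each script an identifying timestamp is sketched below. This is only an illustration under stated assumptions, not the approach agreed on in the call: the script names, the reserved microsecond values, and the idea that curator-entered timestamps are stored at whole-second resolution are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical registry: each data-processing script reserves one microsecond
# value. Curator-entered timestamps are ASSUMED to be stored at whole-second
# resolution, so they never carry one of these values.
SCRIPT_SIGNATURES = {
    "map_variation_to_gene": 100001,
    "convert_interaction_model": 100002,
}


def script_timestamp(script_name, when=None):
    """Return a timestamp whose microsecond field encodes the given script."""
    when = when or datetime.now(timezone.utc)
    return when.replace(microsecond=SCRIPT_SIGNATURES[script_name])


def source_of(timestamp):
    """Return the script that wrote a timestamp, or 'curator' if none matches."""
    for name, signature in SCRIPT_SIGNATURES.items():
        if timestamp.microsecond == signature:
            return name
    return "curator"
```

A script would stamp every value it writes with script_timestamp(...), and a later audit could call source_of(...) on stored timestamps to separate script-populated data from curator-entered data. Keeping the registry in the same GitHub repository as the scripts would keep the mapping recoverable.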

Interaction model change

Pulling Variation, Transgene, Antibody, and Expr_pattern tags out of the Interactor_info hash and into the main model
This was originally to be able to capture intragenic interactions
The problem is the inherent disconnect between an interactor entity (e.g. gene) and these objects
Making this change would force a post-curation mechanism that ties these entities together for intuitive data display
- Such a linking mechanism may be error prone, faulty, and potentially a headache for the web team
Is there a better way to handle this type of data?
Chris will discuss with Todd to see how much of a problem this would pose to the web team

@@ Line 14: / Line 14: @@
 [[WormBase-Caltech_Weekly_Calls_March_2012|March]]
+[[WormBase-Caltech_Weekly_Calls_April_2012|April]]
-== April 12, 2012 ==
-RNAi OA
-*OA almost ready to go live
-*Testing now with test curation
-*Should go live next week for official curation
-New Website
-*Most problems are being fixed in a timely manner
-*Curators can now edit links and add custom widgets
-*Issues (tracked on GitHub) being dealt with quickly
-BioCurator Meeting
-*Good meeting, bigger than before
-*Common themes: data standards, how to educate users about database materials and how to use them (and think critically)
-*How can MODs work better with journals and PubMed to solve the 'triage' problem?
-**Streamlining the paper acquisition/curation process
-**MODs should ask NLM to take the burden of retrieving PDFs
-**Get lawyers involved to make available?
-**Publishers tend to be lax on text mining rules, maybe will evolve into an easier process
-*Maybe write a grant for research project as a proof-of-principle that triage can be done in an effective/efficient manner
-*May ask ISB (Int Society Biocurators) for help with this
-*Sequence and protein curation: tools, databases (topic-specific; pathways, cancer, etc.)
-*GeneWiki for human gene annotation
-**One page for each gene; already have ~10,000 articles
-**~Dozen editors, credibility of authors checked (?)
-**Reasonably satisfied with coverage of human disease genes
-*Whole-genome sequencing of individuals
-**Newly identified genetic disorder
-**VAST instead of BLAST
-*Tool to identify primers from papers and map them to the genome automatically
-*Intermine discussed
-**Comparable to WormMart
-**Object-oriented database
-**Performs similarly to WormBase
-**Many pre-canned queries
-**Advanced search Query-builder available
-**MODs switched over to Intermine from BioMart
-**WormMart - Will Spooner tried to provide queries that are more natural
-**We can work to build an interface on top of Intermine, etc.
-**Todd has made progress with getting Intermine for WormBase
-*Lots of specialized talks, which reduced productivity (compared to the BioCreative meeting)
-*Curators explaining their curation pipeline
-*Textpresso still popular ;)
-**Six out of seven MODs using Textpresso
-**Discussed text mining in particular applications (e.g. CCC)
-**Textpresso only tool using full-text for mining
-**Pete from FlyBase: SVM results are deteriorating (similar to WormBase)
-***Start training from scratch; hopefully get better recall/precision numbers
-*Natural language processing on figure legends/captions
-**Tries to find text in the body that relates to figure
-**Possible collaboration with Textpresso
-*NLP research group in Germany
-**'Actor', 'agent' etc. and relationships (RDF triplets)
-*Doug Howe (ZFIN), zebrafish corpus small enough, doesn't need Textpresso
-*Julio Collado-Vides, Textpresso for E. coli fell apart, but trying to get back together
-Paul will meet someone from Elsevier
-*Image curation/ rights issues
-Genetic Interaction ontology
-*SGD on board with ontology so far; performing trial curation
-*FlyBase interested in using as well; will meet with Chris and Rose in May to discuss
-== April 19, 2012 ==
-Interaction object displays on WormBase website
-*Chris and Maher will sort out on GitHub
-*Chris will map data from old tags to new tags and suggest display changes for new data types where necessary
-*One issue to deal with is the complex objects with multiple Interaction_types (which were intended to be separate objects)
-Interaction model and intragenic suppression
-*We need to make some modifications to the new Interaction model if we want to accommodate intragenic suppression (or other intragenic) events
-*Proposed change is to:
-#Make each allele a separate object
-#Move the Variation (and Transgene) tag out of the Interactor_info hash and into the main Interaction model under the Interactor tag
-#Add a Cis_intragenic_suppression and a Trans_intragenic_suppression tag under the Interaction_type tag (perhaps also intragenic_enhancement?)
-*With these changes:
-**Each variation (and transgene) can be listed as an interactor with Interactor_info indicating Affected, Effector, or Non_directional
-**Genes associated with intragenic, interacting variations will display (in Cytoscape view) as interacting with themselves via a Genetic Interaction
-**Mary Ann can then indicate/curate the flanking sequences for each allele
-Life_stage objects still dump as names, not IDs
-*This is because ACEDB only handles names, not IDs
-*Daniela is in charge of this class; we can discuss with her when she's back
-*We likely want to change to a system where we use only IDs in .ACE objects
-URL Constructors for GSA markup
-*Todd has taken care of much of the issue of URL construction for GSA marked-up papers
-*Karen will send Todd examples of Anatomy_term/Anatomy_name links that need to be checked
-*GSA papers will need to be rechecked to ensure that all links are working
-Network outages
-*Various office network ports are non-functional as of yesterday
-*IMSS/Network admins aware of issue and working on it
-Interaction and Gene_regulation objects for next upload
-*Conversion scripts will need to be run again to convert objects to new model format
-*Chris will look into whether or not the mapping files (needed to update Gene_regulation objects) will need to be updated for the newest data
-*Xiaodong will dump Gene_regulation objects out of the OA using the old dumping script
-== April 26, 2012 ==
-Meeting with Elsevier rep
-*Elsevier getting more open to text-mining
-*People build apps and then put them on the ScienceDirect site (e.g. TAIR app)
-*Wanted a couple sentences on what we want from text-mining
-*GO consortium would like text-mining for triage of new papers
-*'Climate is better now'
-Yeast-two-hybrid data issues
-*Lots of redundancies, bogus objects, many objects per bait/target (Sequences, CDSs, genes, etc.)
-*Provenance of data isn't clear
-*Should mv PCR products be mapped each build to genes?
-*May want to start from scratch and collect YH data from Vidal and Walhout labs
-*Check if BioGrid is curating this data already
-Next WormBase grant due in 6 months
-*30 pages
-*Need to figure out what we want to do in next 5 years; how we want to organize
-*Combine SAB meeting and grant writing?
-*New page types lagged behind due to updating of web site: e.g. Process pages
-*What is reasonable/realistic for what new content can get online?
-Curation wish-list on Wiki (Ranjana)
-*Many papers on new topics coming out
-*Drug-screening, drug interaction
-*Infection, parasitism
-Anatomy links from Worm Atlas broken
-*Links need to be fixed/cleaned up
-*Going forward, may need some sort of DOI system (stable links indefinitely)
-*An issue of GSA markup as well
-*Published links will never change; will need to accommodate
-Ontology searches
-*Trying to adapt AmiGO to use our .OBO files
-*National Center for Biomedical Ontology uses Protege instead of OBO-Edit
-*Consider adopting Protege/OWL files? Conversion could be trivial
-*Parent-child relationships file for C. elegans cell lineage; need to accommodate indeterminacy
-*Use synonym assignment to handle different possible outcomes/identities?
-Elbrus has reached capacity limit
-*Broke RNAi curation pipeline
-*Useful bits of code on elbrus
-**Data submission forms (RNAi data)
-**Microarray query tool (broken/toss)
-*Should put (working) code on GitHub repository
-User datamining demands
-*We need to accommodate users' requests for data
-*Fix WormMart/incorporate Intermine
-*Bring back Batch Gene query
-*Custom query building (by curators) based on user requests?
-*Look at help desk e-mails and determine what users want
-*Pre-canned queries?
-*AcePerl scripts could perform batch gene queries
