|
|
Line 14: |
Line 14: |
| [[WormBase-Caltech_Weekly_Calls_March_2012|March]] | | [[WormBase-Caltech_Weekly_Calls_March_2012|March]] |
| | | |
− | | + | [[WormBase-Caltech_Weekly_Calls_April_2012|April]] |
− | | |
− | | |
− | | |
− | == April 12, 2012 ==
| |
− | | |
− | RNAi OA
| |
− | *OA almost ready to go live
| |
− | *Testing now with test curation
| |
− | *Should go live next week for official curation
| |
− | | |
− | | |
− | New Website
| |
− | *Most problems are being fixed in a timely manner
| |
− | *Curators can now edit links and add custom widgets
| |
− | *Issues (tracked on GitHub) being dealt with quickly
| |
− | | |
− | | |
− | BioCurator Meeting
| |
− | *Good meeting, bigger than before
| |
− | *Common themes: data standards, and how to educate users about database materials and how to use them (and think critically)
| |
− | *How can MODs work better with journals and PubMed to solve the 'triage' problem?
| |
− | **Streamlining the paper acquisition/curation process
| |
− | **MODs should ask NLM to take the burden of retrieving PDFs
| |
− | **Get lawyers involved to make available?
| |
− | **Publishers tend to be lax on text mining rules, maybe will evolve into an easier process
| |
− | *Maybe write a grant for research project as a proof-of-principle that triage can be done in an effective/efficient manner
| |
− | *May ask ISB (Int Society Biocurators) for help with this
| |
− | *Sequence and protein curation: tools, databases (topic-specific; pathways, cancer, etc.)
| |
− | *GeneWiki for human gene annotation
| |
− | **One page for each gene; already have ~10,000 articles
| |
− | **~Dozen editors, credibility of authors checked (?)
| |
− | **Reasonably satisfied with coverage of human disease genes
| |
− | *Whole-genome sequencing of individuals
| |
− | **Newly identified genetic disorder
| |
− | **VAST instead of BLAST
| |
− | *Tool to identify primers from papers and map them to the genome automatically
| |
− | *InterMine discussed
| |
− | **Comparable to WormMart
| |
− | **Object-oriented database
| |
− | **Performs similarly to WormBase
| |
− | **Many pre-canned queries
| |
− | **Advanced search Query-builder available
| |
− | **MODs switched over to InterMine from BioMart
| |
− | **WormMart - Will Spooner tried to provide queries that are more natural
| |
− | **We can work to build an interface on top of InterMine, etc.
| |
− | **Todd has made progress with getting InterMine for WormBase
| |
− | *Lots of specialized talks, which reduced productivity (compared to the BioCreative meeting)
| |
− | *Curators explaining their curation pipeline
| |
− | *Textpresso still popular ;)
| |
− | **Six out of seven MODs using Textpresso
| |
− | **Discussed text mining in particular applications (e.g. CCC)
| |
− | **Textpresso only tool using full-text for mining
| |
− | **Pete from FlyBase: SVM results are deteriorating (similar to WormBase)
| |
− | ***Start training from scratch; hopefully get better recall/precision numbers
| |
− | *Natural language processing on figure legends/captions
| |
− | **Tries to find text in the body that relates to figure
| |
− | **Possible collaboration with Textpresso
| |
− | *NLP research group in Germany
| |
− | **'Actor', 'agent' etc. and relationships (RDF triplets)
| |
− | *Doug Howe (ZFIN): zebrafish corpus is small enough that it doesn't need Textpresso
| |
− | *Julio Collado-Vides: Textpresso for E. coli fell apart, but they are trying to get it back together
| |
− | | |
− | | |
− | Paul will meet someone from Elsevier
| |
− | *Image curation/rights issues
| |
− | | |
− | | |
− | Genetic Interaction ontology
| |
− | *SGD on board with ontology so far; performing trial curation
| |
− | *FlyBase interested in using as well; will meet with Chris and Rose in May to discuss
| |
− | | |
− | | |
− | | |
− | == April 19, 2012 ==
| |
− | | |
− | Interaction object displays on WormBase website
| |
− | *Chris and Maher will sort out on GitHub
| |
− | *Chris will map data from old tags to new tags and suggest display changes for new data types where necessary
| |
− | *One issue to deal with is complex objects with multiple Interaction_types (which were intended to be separate objects)
| |
− | | |
− | | |
− | Interaction model and intragenic suppression
| |
− | *We need to make some modifications to the new Interaction model if we want to accommodate intragenic suppression (or other intragenic) events
| |
− | *Proposed change is to:
| |
− | #Make each allele a separate object
| |
− | #Move the Variation (and Transgene) tag out of the Interactor_info hash and into the main Interaction model under the Interactor tag
| |
− | #Add a Cis_intragenic_suppression and a Trans_intragenic_suppression tag under the Interaction_type tag (perhaps also intragenic_enhancement?)
| |
− | *With these changes:
| |
− | **Each variation (and transgene) can be listed as an interactor with Interactor_info indicating Affected, Effector, or Non_directional
| |
− | **Genes associated with intragenic, interacting variations will display (in Cytoscape view) as interacting with themselves via a Genetic Interaction
| |
− | **Mary Ann can then indicate/curate the flanking sequences for each allele
| |
− | | |
− | | |
− | Life_stage objects still dump as names, not IDs
| |
− | *This is because ACEDB only handles names, not IDs
| |
− | *Daniela is in charge of this class; we can discuss with her when she's back
| |
− | *We likely want to change to a system where we use only IDs in .ACE objects
| |
− | | |
− | | |
− | URL Constructors for GSA markup
| |
− | *Todd has taken care of much of the issue of URL construction for GSA marked-up papers
| |
− | *Karen will send Todd examples of Anatomy_term/Anatomy_name links that need to be checked
| |
− | *GSA papers will need to be rechecked to ensure that all links are working
| |
− | | |
− | | |
− | Network outages
| |
− | *Various office network ports are non-functional as of yesterday
| |
− | *IMSS/Network admins aware of issue and working on it
| |
− | | |
− | | |
− | Interaction and Gene_regulation objects for next upload
| |
− | *Conversion scripts will need to be run again to convert objects to new model format
| |
− | *Chris will look into whether or not the mapping files (needed to update Gene_regulation objects) will need to be updated for the newest data
| |
− | *Xiaodong will dump Gene_regulation objects out of the OA using the old dumping script
| |
− | | |
− | | |
− | | |
− | == April 26, 2012 ==
| |
− | | |
− | | |
− | Meeting with Elsevier rep
| |
− | *Elsevier getting more open to text-mining
| |
− | *People build apps and then put them on the ScienceDirect site (e.g. TAIR app)
| |
− | *Wanted a couple of sentences on what we want from text-mining
| |
− | *GO consortium would like text-mining for triage of new papers
| |
− | *'Climate is better now'
| |
− | | |
− | | |
− | Yeast-two-hybrid data issues
| |
− | *Lots of redundancies, bogus objects, many objects per bait/target (Sequences, CDSs, genes, etc.)
| |
− | *Provenance of data isn't clear
| |
− | *Should mv PCR products be mapped each build to genes?
| |
− | *May want to start from scratch and collect YH data from Vidal and Walhout labs
| |
− | *Check if BioGRID is curating this data already
| |
− | | |
− | | |
− | Next WormBase grant due in 6 months
| |
− | *30 pages
| |
− | *Need to figure out what we want to do in next 5 years; how we want to organize
| |
− | *Combine SAB meeting and grant writing?
| |
− | *New page types lagged behind due to updating of web site: e.g. Process pages
| |
− | *What is a reasonable/realistic amount of new content to get online?
| |
− | | |
− | | |
− | Curation wish-list on Wiki (Ranjana)
| |
− | *Many papers on new topics coming out
| |
− | *Drug-screening, drug interaction
| |
− | *Infection, parasitism
| |
− | | |
− | | |
− | Anatomy links from WormAtlas broken
| |
− | *Links need to be fixed/cleaned up
| |
− | *Going forward, may need some sort of DOI system (stable links indefinitely)
| |
− | *An issue of GSA markup as well
| |
− | *Published links will never change; will need to accommodate
| |
− | | |
− | | |
− | Ontology searches
| |
− | *Trying to adapt AmiGO to use our .OBO files
| |
− | *National Center for Biomedical Ontology uses Protege instead of OBO-Edit
| |
− | *Consider adopting Protege/OWL files? Conversion could be trivial
| |
− | *Parent-child relationships file for C. elegans cell lineage; need to accommodate indeterminacy
| |
− | *Use synonym assignment to handle different possible outcomes/identities?
| |
− | | |
− | | |
− | Elbrus has reached capacity limit
| |
− | *Broke RNAi curation pipeline
| |
− | *Useful bits of code on elbrus
| |
− | **Data submission forms (RNAi data)
| |
− | **Microarray query tool (broken/toss)
| |
− | *Should put (working) code on GitHub repository
| |
− | | |
− | | |
− | User datamining demands
| |
− | *We need to accommodate users' requests for data
| |
− | *Fix WormMart/incorporate InterMine
| |
− | *Bring back Batch Gene query
| |
− | *Custom query building (by curators) based on user requests?
| |
− | *Look at help desk e-mails and determine what users want
| |
− | *Pre-canned queries?
| |
− | *AcePerl scripts could perform batch gene queries
| |
| | | |
| | | |