Difference between revisions of "WormBase-Caltech Weekly Calls"
From WormBaseWiki
Revision as of 19:28, 11 January 2018
Previous Years
2018 Meetings
January 4, 2018
WS264 Upload
- Citace upload to Wen, Tuesday January 16th, by 10am PST
- Upload to Hinxton on Jan 19th
Strain data import to AGR for disease
- Will begin to consider pulling strains into AGR
- Will need to think about how genotypes are built and stored at other MODs
- We should encourage authors to include strain IDs
- Diseases are annotated to genes, alleles, and strains within WB
Curating phenotypes and diseases to strains or genotypes
- Should we generate a ?Genotype class to capture genotypes without a known strain name? Or to capture relevant/relative genotypes thought to be responsible for a phenotype or disease?
- We could create un-named strain objects that use a new unique identifier as the primary identifier and represent the entire genotype of the strain used
- Introducing a unique serial identifier (like WBStrain00001) as a new ?Strain class attribute would be very costly to implement; we would need to consider how crucial this is before implementing
- Instead of creating a new unique ID attribute for un-named strains, we can use new (public) strain names like "WBPaper00012345_Strain1", etc.
- When curating phenotypes to strains, we will want to specify the relevant/relative genotype that is causative for, or correlated with, the disease or phenotype observation
- Would be best if the specification of the relevant genotype used controlled vocabularies (when possible) and free text (when needed); would need to work out the logistics/mechanics of such curation
- Transgene-phenotype curation currently specifies causative gene, but would be more complicated for strains
- Alternatively, we could create the ?Genotype class to represent the abstract "relative"/"relevant" genotype thought to be responsible for the phenotype or disease, and annotate directly to that ?Genotype object
- ?Strain approach:
- Use strain if named (but important to know if the control strain is not simply N2)
- If control strain is simply N2, causative genotype (and respective components) can be inferred from strain genotype
- If control strain is not N2, causative genotype and components would need to be specified at the moment of phenotype/disease curation (by mechanism to be worked out)
- If no strain name provided, create "un-named" strain that contains the entire genotype provided by authors
- Control strain issues above would still need to be addressed
- ?Genotype approach:
- ?Genotype class could represent individual instances of relevant/relative genotypes that are suggested to be causative for a disease or phenotype
- ?Genotype objects would be created with formal construction, with DB associations to each component object (e.g. alleles, transgenes, etc.) as well as free text descriptions (for components with no corresponding DB object)
- Such ?Genotype objects could be used repeatedly throughout a paper when applicable, but would likely not be used in any other papers (we would likely accumulate redundant objects in the DB)
- We may want to consider strains with same public name that have diverged
- Apply new strain names with prefixes/suffixes? Create new strain objects? Keep original?
- Need to determine how each AGR member DB curates phenotypes or diseases to genotypes: is each "genotype" a relative or absolute genotype?
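A minimal sketch of the ?Strain approach discussed above, for illustration only. The `Genotype` and `Strain` classes, their fields, and the function names are hypothetical and do not correspond to existing WormBase models; the only details taken from the notes are the "WBPaper00012345_Strain1" naming pattern and the rule that the causative genotype can be inferred when the control strain is N2.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class Genotype:
    """Hypothetical stand-in for a genotype description."""
    components: list[str] = field(default_factory=list)  # parts with DB objects (alleles, transgenes)
    free_text: list[str] = field(default_factory=list)   # parts with no corresponding DB object


@dataclass
class Strain:
    """Hypothetical stand-in for a ?Strain object."""
    public_name: str
    genotype: Genotype


def unnamed_strain_namer(paper_id: str):
    """Return a function that yields names like 'WBPaper00012345_Strain1'."""
    counter = count(1)
    return lambda: f"{paper_id}_Strain{next(counter)}"


def causative_genotype(strain: Strain, control_name: str,
                       curated: Genotype | None = None) -> Genotype:
    """Pick the genotype to record as causative for a phenotype or disease.

    If the control is N2, the full strain genotype is inferred to be causative.
    Otherwise the curator has to supply it; how that would be done is not
    settled in the notes, so this sketch simply raises when it is missing.
    """
    if control_name == "N2":
        return strain.genotype
    if curated is None:
        raise ValueError("control is not N2: causative genotype must be curated")
    return curated
```

Example use, with an allele chosen purely as an illustration:

```python
next_name = unnamed_strain_namer("WBPaper00012345")
strain = Strain(next_name(), Genotype(components=["daf-2(e1370)"]))
print(strain.public_name)  # WBPaper00012345_Strain1
```

The ?Genotype approach would instead annotate directly to a `Genotype`-like object, without creating a strain at all.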
January 11, 2018
IWM swag
- Eppendorf tube openers with WormBase logo?
Update on AFP Form
- Idea is to move from author flagging to author validation of text mining and data submission wherever possible
- Goal is to flag all data types in a paper and either curate at WB or share with a group that does curate that data
- SVM flags and author flags can/will be used as filters in TPC
- Provide examples of what we want for each type of data to help avoid confusion
- Recognize entities automatically and show list to author
- Species, strains, genes, alleles, etc.
- Ask author to verify the list and add any unrecognized entities
- Could show known/existing objects with checkboxes
- Possibly include unrecognized objects found by pattern matching? Ask author to verify whether these are real?
- For strains:
- Show recorded genotype for verification; maybe ask to update/modify if needed?
- Mapping data: still ask for? Maybe for balancers, but no one is reporting that. Could still ask if there's interest
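A rough sketch of how the entity recognition step for the AFP form could work, assuming a set of known object names is available. The regular expressions are simplified illustrations of strain and allele name shapes, not the patterns actually used by the text mining pipeline, and the function names and checklist fields are hypothetical.

```python
import re

# Illustrative patterns only (assumptions): strain names as 2-3 uppercase
# letters plus digits, allele names as 1-3 lowercase letters plus digits.
STRAIN_PATTERN = re.compile(r"\b[A-Z]{2,3}\d+\b")
ALLELE_PATTERN = re.compile(r"\b[a-z]{1,3}\d+\b")


def recognize_entities(text, known, pattern):
    """Split pattern matches into known objects and unrecognized candidates."""
    matches = sorted(set(pattern.findall(text)))
    recognized = [m for m in matches if m in known]
    candidates = [m for m in matches if m not in known]
    return recognized, candidates


def build_checklist(recognized, candidates):
    """Build form rows: known objects pre-checked, candidates left for the author."""
    rows = [{"name": n, "status": "known", "checked": True} for n in recognized]
    rows += [{"name": n, "status": "unverified", "checked": False} for n in candidates]
    return rows
```

Example use, where "XY999" is a made-up name standing in for a strain not yet in the database:

```python
text = "We crossed CB1370 with XY999 carrying e1370."
recognized, candidates = recognize_entities(text, {"CB1370"}, STRAIN_PATTERN)
for row in build_checklist(recognized, candidates):
    print(row["name"], row["status"])
# CB1370 known
# XY999 unverified
```

Keeping unrecognized matches in a separate, unchecked group means the author has to confirm them explicitly rather than having them accepted by default.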