Associating genes with papers

From WormBaseWiki

Latest revision as of 19:42, 10 August 2010

There are several possible strategies to test for associating genes with papers.

Ideally, associations should be made as quickly as possible, which is better for users and helps curation.

Different pipelines will likely be needed for different types of publications:

  1. Smaller scale research papers
  2. High-throughput, large-scale papers
  3. Reviews, Comments, etc.

Results may differ for sectioned vs non-sectioned papers, and will likely differ for Reviews, etc.

  1. Abstracts
  2. Gene frequency
  3. Genes in Results (or equivalent)
  4. Gene frequency in Results (or equivalent)
  5. Genes mentioned along with a word in the Figure or Table category
  6. Genes for which there is curated data
  7. Some combination of the above
  8. What to do about large-scale papers
  9. What to do about new gene names not yet in WB
  10. What to do about supplemental data
  11. What to do about non-standard nomenclature
  12. Current pipeline - what happens when gene IDs are made invalid?
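
As a rough illustration of the frequency-based strategies above (gene frequency overall, and gene frequency in Results or an equivalent section), here is a minimal Python sketch. It is not the existing pipeline. The regular expression for gene names, the section headings, the `min_count` threshold, and the function names (`gene_frequency`, `extract_section`, `candidate_genes`) are all assumptions made for the example. The pattern only covers names shaped like `unc-22`, so it will miss non-standard nomenclature and can pick up ordinary hyphenated words such as "anti-5".

```python
import re
from collections import Counter

# Assumption: gene names look like "unc-22" or "daf-16" (3-4 lowercase
# letters, a hyphen, a number). Real nomenclature is broader than this.
GENE_NAME = re.compile(r"\b[a-z]{3,4}-\d+(?:\.\d+)?\b")


def gene_frequency(text, known_genes=None):
    """Count gene-name mentions in text.

    If known_genes is given (a hypothetical set of valid names), mentions
    of names outside that set are dropped.
    """
    counts = Counter(GENE_NAME.findall(text))
    if known_genes is not None:
        counts = Counter({g: n for g, n in counts.items() if g in known_genes})
    return counts


def extract_section(text, heading="Results",
                    next_headings=("Discussion", "Methods",
                                   "Materials and Methods", "References")):
    """Return the body of a section, or None for non-sectioned papers.

    Assumption: each heading sits alone on its own line.
    """
    stops = {h.lower() for h in next_headings}
    body, inside = [], False
    for line in text.splitlines():
        label = line.strip().lower()
        if not inside:
            inside = label == heading.lower()
            continue
        if label in stops:
            break
        body.append(line)
    return "\n".join(body) if inside else None


def candidate_genes(text, min_count=2):
    """Genes mentioned at least min_count times in Results, falling back
    to the whole text when no Results section is found."""
    section = extract_section(text)
    counts = gene_frequency(section if section is not None else text)
    return sorted(g for g, n in counts.items() if n >= min_count)


paper = (
    "Abstract\n"
    "We study unc-22 and daf-16.\n"
    "Results\n"
    "unc-22 mutants twitch. unc-22 interacts with lin-12.\n"
    "Discussion\n"
    "daf-2 may be involved.\n"
)
print(gene_frequency(paper))
print(candidate_genes(paper))
```

The fallback to the full text for non-sectioned papers is one possible choice. Because results may differ between sectioned and non-sectioned papers, any threshold would probably need to be tuned separately for each case.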

Back to Paper Pipeline