Difference between revisions of "Pipeline for identifying papers with drugs"

Revision as of 20:44, 6 August 2012

  • The initial plan was to use the 'molecules' list to identify papers with drugs. The output was overloaded with papers that mention biomolecules, so this approach does not work for drugs.

Building the drug lexicon: The following sources were used:

  • Antifungal agents: http://en.wikipedia.org/wiki/Antifungal_medication
  • Antibiotics (antimicrobial, anti-fungal, anti-viral, anti-parasitic and anti-tumor agents):

http://antibiotics.toku-e.com/antimicrobial

  • Antiparasitic drugs: Aldicarb, Ivermectin, Levamisole
  • Anti-depressants, anticonvulsants, anti-psychotic and psycho-active drugs:

http://en.wikipedia.org/wiki/List_of_antidepressants

http://www.nimh.nih.gov/health/publications/mental-health-medications/alphabetical-list-of-medications.shtml

http://en.wikipedia.org/wiki/Psychoactive_drug (Table under the heading Affected neurotransmitter systems, capture columns 'Classification' and 'Examples')

http://www.surgeryencyclopedia.com/Fi-La/Immunosuppressant-Drugs.html

Exceptions:

  • Skip pure numbers
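
As a rough illustration of how the source lists and the pure-number exception could fit together, here is a minimal Python sketch. The function name build_lexicon, the case-insensitive de-duplication, and the exact definition of a "pure number" (digits with an optional sign, decimal point, or thousands separators) are assumptions made for the example; the notes do not say how the actual script handles these cases.

```python
import re

# Assumption: a "pure number" is digits with an optional sign, decimal
# point, or thousands separators, e.g. "42", "3.5", "1,000".
PURE_NUMBER = re.compile(r"[+-]?\d[\d.,]*")


def build_lexicon(*term_lists):
    """Merge several term lists into one lexicon (hypothetical helper).

    Blank entries and pure numbers are skipped. Duplicates are dropped
    case-insensitively, keeping the first spelling that was seen.
    """
    seen = set()
    lexicon = []
    for terms in term_lists:
        for term in terms:
            cleaned = term.strip()
            key = cleaned.lower()
            if not cleaned or PURE_NUMBER.fullmatch(cleaned) or key in seen:
                continue
            seen.add(key)
            lexicon.append(cleaned)
    return lexicon
```

A term such as "Ca2" is kept by this filter because it is not made up of digits only.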

08/06/12:

Notes:

  • Need to add Resveratrol, Ginkgo biloba and the anticonvulsants Ethosuximide and Trimethadione, if they are not already present in the lists.
  • Will drop the above nutritional supplement list: it is huge (thousands of terms), too big to clean, and the Textpresso run output it produces is really bad.
  • After dropping the supplement list and re-running the script, the following terms still cause problems:
AGa  ATP  alanine  acetate  aspartic acid  Ca2  bovine serum albumin
biotin  chloroform  choline  date  deoxyribonucleic acid  DRAKE  ethanol
EDTA  fluoride  histidine  glycine  hydroxyapatite  same  soma  ROS
NADH  sodium dodecyl sulfate  violet  nucleotides  nucleic  tetramisole
tyrosine  tryptophan
pears  potassium  protease  PEPE  sodium phosphate  GABA  glycerol  phenylalanine
pyruvic acid  alanine  glutamate  glutathione  baker's yeast  protamine  lysine
fatty acid  fluoride  methionine  nitrogen  succinic acid  sulphate  oatmeal
cholinergic  acetic acid  amber  serine  steroid  calcium  AMP  constancy
liver extract  bovine serum albumin  rabbits  valine  Saccharomyces cerevisiae
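
The notes do not say how these problem terms will be handled. One possible option, shown here purely as an assumption, is a stop list that removes them from the lexicon before the run. The names STOP_TERMS and remove_stop_terms are hypothetical, and the set below contains only a few of the terms listed above.

```python
# Hypothetical stop list: a small sample of the problem terms listed above.
STOP_TERMS = {"ATP", "alanine", "ethanol", "EDTA", "bovine serum albumin"}


def remove_stop_terms(lexicon, stop_terms=STOP_TERMS):
    """Return the lexicon without any stop-listed term (case-insensitive)."""
    blocked = {term.lower() for term in stop_terms}
    return [term for term in lexicon if term.lower() not in blocked]
```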

Ideas worth trying:

  • Skip the sections: 'Materials and Methods', 'Materials', 'Methods'
  • Make rule: Any list term has to occur with the word 'drug' at document level.
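
The two ideas above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the paper is assumed to be available as a list of (heading, text) pairs, matching is case-insensitive, the plural 'drugs' is accepted as well as 'drug', and the 'drug' check is applied to the text that remains after the methods sections are removed. None of these choices come from the notes, and the function names are hypothetical.

```python
import re

# Section headings to skip, taken from the first idea above.
SKIPPED_HEADINGS = {"materials and methods", "materials", "methods"}


def strip_methods_sections(sections):
    """Drop methods-type sections from a list of (heading, text) pairs."""
    return [
        (heading, text)
        for heading, text in sections
        if heading.strip().lower() not in SKIPPED_HEADINGS
    ]


def find_drug_terms(sections, lexicon):
    """Return lexicon terms found in the paper, or [] if 'drug' never occurs."""
    kept = strip_methods_sections(sections)
    body = " ".join(text for _, text in kept)

    # Second idea: a term only counts if the word 'drug' occurs in the document.
    # Assumption: the plural 'drugs' also satisfies the rule.
    if not re.search(r"\bdrugs?\b", body, re.IGNORECASE):
        return []

    hits = []
    for term in lexicon:
        # Match whole terms only, so 'AMP' does not match inside 'sample'.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, body, re.IGNORECASE):
            hits.append(term)
    return hits
```

With this sketch, a term such as ethanol that appears only in a 'Materials and Methods' section is not reported, and a paper that never uses the word 'drug' returns no hits at all.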