In vitro flagging
Papers flagged for in vitro data were originally identified by manual first pass. They contained data on enzymatic and transporter activity, as well as other data such as lipid quantitation and protein-protein interactions. To semi-automate this area of GO Molecular Function curation, we need a training set that is homogeneous, at least with respect to enzymatic and transporter activity.
To build this training set:
1) All papers previously flagged by manual first pass were reviewed.
Several different classes of papers were identified:
--papers that are true positives for describing an in vitro activity, but the activity cannot be traced to a specific protein based on the data presented in the paper. This is observed particularly with older papers that describe an 'activity' but do not clone the gene. These papers would be SVM positive for in vitro training, but are not curatable.
--papers that have in vitro experiments (e.g., Western blots of worm lysates) but do not describe any enzymatic activity or other GO-curatable Molecular Function. These papers would be SVM negatives and were removed from the training set.
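
The distinction between the classes above can be sketched as a small decision rule. This is only an illustration of the reasoning, not the procedure actually used: the review was done manually, and the names `Triage` and `triage_paper` and the two boolean inputs are hypothetical. The final case (activity traced to a specific protein) is an assumption that follows from the descriptions above rather than something stated directly.

```python
from typing import NamedTuple


class Triage(NamedTuple):
    svm_label: str   # "positive" or "negative" for in vitro SVM training
    curatable: bool  # can the activity be annotated to a specific protein?


def triage_paper(describes_activity: bool, traced_to_protein: bool) -> Triage:
    """Classify a manually flagged in vitro paper (hypothetical helper)."""
    if not describes_activity:
        # In vitro experiments only (such as Western blots of worm lysates),
        # with no enzymatic activity or other GO-curatable Molecular Function.
        return Triage("negative", False)
    if not traced_to_protein:
        # An activity is described, but the data do not tie it to a specific
        # protein (for example, older papers that did not clone the gene).
        return Triage("positive", False)
    # Assumption: an activity traced to a specific protein is both a
    # training positive and curatable.
    return Triage("positive", True)
```

For example, `triage_paper(True, False)` describes an older paper reporting an activity without a cloned gene: useful as a training positive, but not curatable.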