Prediction of Post-translational Modifications

Dataset

We evaluated the performance of PostMod on the Phospho.ELM database. Evaluating the method requires a reference set of positive (phosphorylated) and negative (non-phosphorylated) peptides. Positive peptides were extracted from Phospho.ELM. From the database we selected the kinase groups that contain more than 20 known phosphorylation sites, which yielded 48 kinase groups for the test. Negative peptides were randomly selected from sequences that contain at least one positive peptide and share the same phosphorylation residues as the positive peptides. We selected 10 times as many negative peptides as positive peptides.
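The negative-set construction can be sketched as follows. This is only an illustration under stated assumptions, not the PostMod implementation: the function names, the input layout (a dict of sequences and a set of annotated sites), the peptide window size `flank`, and the reading of "same phosphorylation residues" as "same residue types as the positive sites" are all hypothetical.

```python
import random


def window(seq, pos, flank):
    """Return the peptide centred on pos, truncated at the sequence ends."""
    left = max(0, pos - flank)
    return seq[left:pos + flank + 1]


def sample_negatives(proteins, positive_sites, flank=7, ratio=10, seed=0):
    """Draw negative peptides for one kinase group (illustrative sketch).

    proteins:       dict mapping protein id -> amino-acid sequence
    positive_sites: set of (protein id, zero-based position) known sites
    flank:          residues kept on each side (assumed, not from the source)
    ratio:          negatives per positive (10 in the text)
    """
    rng = random.Random(seed)

    # Residue types observed at the positive sites (e.g. S, T or Y).
    residues = {proteins[pid][pos] for pid, pos in positive_sites}

    # Only sequences that contain at least one positive site are used.
    source_ids = sorted({pid for pid, _ in positive_sites})

    # Candidate negatives: same residue type, not annotated as positive.
    candidates = [
        (pid, pos)
        for pid in source_ids
        for pos, aa in enumerate(proteins[pid])
        if aa in residues and (pid, pos) not in positive_sites
    ]

    # Cap at the number of available candidates.
    n = min(len(candidates), ratio * len(positive_sites))
    chosen = rng.sample(candidates, n)
    return [(pid, pos, window(proteins[pid], pos, flank)) for pid, pos in chosen]
```

When a group has fewer candidate sites than 10 times its positives, this sketch returns all candidates rather than failing; how the original procedure handled that case is not stated.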

Performance Assessment

We assessed the prediction performance with leave-one-out cross-validation (LOOCV). Accuracy (ACC), precision (P), and recall (R) were calculated to measure the performance. We defined phosphorylated peptides as the positive class; TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
  • ACC = (TP+TN)/(TP+FP+TN+FN)
  • P = TP/(TP+FP)
  • R = TP/(TP+FN)
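
The three measures and the leave-one-out loop can be sketched as below. This is a minimal illustration, not the evaluation code used for PostMod: `fit_predict` stands in for any predictor, and `majority_baseline` is a hypothetical placeholder so that the example runs on its own. Labels are assumed to be 1 for phosphorylated peptides and 0 otherwise.

```python
import math


def majority_baseline(train_x, train_y, test_x):
    """Placeholder predictor: returns the majority class of the training labels."""
    return 1 if 2 * sum(train_y) > len(train_y) else 0


def loocv_predictions(samples, labels, fit_predict):
    """Hold out each sample once, train on the rest, predict the held-out one."""
    preds = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        preds.append(fit_predict(train_x, train_y, samples[i]))
    return preds


def confusion_counts(labels, preds):
    """Count TP, TN, FP, FN with 1 as the positive class."""
    tp = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(labels, preds) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(labels, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 0)
    return tp, tn, fp, fn


def scores(tp, tn, fp, fn):
    """Return (ACC, P, R); a measure with a zero denominator is NaN."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else math.nan
    precision = tp / (tp + fp) if (tp + fp) else math.nan
    recall = tp / (tp + fn) if (tp + fn) else math.nan
    return acc, precision, recall


if __name__ == "__main__":
    labels = [1, 0, 0, 0]
    samples = ["pepA", "pepB", "pepC", "pepD"]
    preds = loocv_predictions(samples, labels, majority_baseline)
    print(scores(*confusion_counts(labels, preds)))
```

With one positive and ten negatives, a predictor that labels every peptide negative reaches ACC = 10/11 while R = 0, so on a 10:1 reference set accuracy is best read together with precision and recall.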

Algorithm

  • The detailed search procedure will be posted after publication.