Table 2 Test data set analysis.

From: RepSeq – A database of amino acid repeats present in lower eukaryotic pathogens

| Repeat threshold | Total proteins | SRR proteins | 2+ SRRs: Total | 2+ SRRs: True positives | 2+ SRRs: False positives | 3+ SRRs: Total | 3+ SRRs: True positives | 3+ SRRs: False positives |
|---|---|---|---|---|---|---|---|---|
| Loose | 5000 | 250 | 342 | 250 (100%) | 92 | 248 | 248 (99.2%) | 0 |
| Loose | 5000 | 1250 | 1306 | 1250 (100%) | 56 | 1237 | 1237 (99.0%) | 0 |
| Loose | 10000 | 500 | 674 | 500 (100%) | 174 | 492 | 492 (98.4%) | 0 |
| Loose | 10000 | 2500 | 2633 | 2500 (100%) | 133 | 2466 | 2466 (98.6%) | 0 |
| Normal | 5000 | 250 | 256 | 250 (100%) | 6 | 248 | 248 (99.2%) | 0 |
| Normal | 5000 | 1250 | 1253 | 1248 (99.8%) | 5 | 1237 | 1237 (99.0%) | 0 |
| Normal | 10000 | 500 | 506 | 499 (99.8%) | 7 | 492 | 492 (98.4%) | 0 |
| Normal | 10000 | 2500 | 2504 | 2496 (99.8%) | 8 | 2466 | 2466 (98.6%) | 0 |
| Strict | 5000 | 250 | 245 | 245 (98.0%) | 0 | 244 | 244 (97.6%) | 0 |
| Strict | 5000 | 1250 | 1220 | 1220 (97.6%) | 0 | 1219 | 1219 (97.5%) | 0 |
| Strict | 10000 | 500 | 485 | 485 (97.0%) | 0 | 484 | 484 (96.8%) | 0 |
| Strict | 10000 | 2500 | 2424 | 2424 (97.0%) | 0 | 2420 | 2420 (96.8%) | 0 |
  1. Proteomes containing 5000 or 10000 proteins (5% or 25% of which contained repeat regions) were created and analysed using RepSeq.
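
The derived figures in the table can be checked with a short calculation. The sketch below is not part of RepSeq; it assumes, based on the values in the table, that each percentage is the number of true positives divided by the number of SRR proteins seeded into the test proteome, and that false positives are the total hits minus the true positives. The function name `summarise` and the choice of rows are illustrative only.

```python
def summarise(srr_proteins, total_hits, true_positives):
    """Return (false_positives, percent_recovered) for one table row.

    Assumption: percent recovered = true positives / seeded SRR proteins,
    rounded to one decimal place as in the table.
    """
    false_positives = total_hits - true_positives
    percent_recovered = round(100.0 * true_positives / srr_proteins, 1)
    return false_positives, percent_recovered


# A few rows taken from the 2+ SRRs columns of the table:
# (threshold, SRR proteins, total hits, true positives)
rows = [
    ("Loose", 250, 342, 250),
    ("Normal", 1250, 1253, 1248),
    ("Strict", 2500, 2424, 2424),
]

for threshold, srr, total, tp in rows:
    fp, pct = summarise(srr, total, tp)
    print(f"{threshold}: false positives = {fp}, recovered = {pct}%")
```

Read this way, the rows show the trade-off between thresholds: the loose setting recovers every seeded protein but reports extra hits, while the strict setting reports no false positives but misses a few percent of the seeded proteins.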