
Table 1

From: Identification of putative domain linkers by a neural network – application to a large sequence database

| Sequence regions detected | No. of sequences (a) | No. of sequence regions (b) | No. of residues (c) | % residues (d) |
|---|---|---|---|---|
| All | 101602 | | 37315215 | 100.00 |
| PDB | 38470 | 410090 | 10210325 | 27.36 |
| CDD | 64349 | 124888 | 16207467 | 43.43 |
| Low-complexity regions (45, 3.4, 3.75) (e) | 48641 | 70373 | 8474412 | 22.71 |
| Low-complexity regions (45, 2.9, 3.2) | 6735 | 8539 | 803001 | 2.15 |
| Low-complexity regions (45, 2.6, 2.9) | 3208 | 3970 | 359227 | 0.96 |
| Low-complexity regions (45, 2.45, 2.75) | 2340 | 2786 | 250796 | 0.67 |
| Putative domain linkers (0.90) (f) | 14239 | 20876 | 1051607 | 2.82 |
| Putative domain linkers (0.91) | 12670 | 18193 | 953097 | 2.55 |
| Putative domain linkers (0.92) | 11160 | 15620 | 856149 | 2.29 |
| Putative domain linkers (0.93) | 9554 | 13053 | 752119 | 2.02 |
| Putative domain linkers (0.94) | 7977 | 10591 | 644472 | 1.73 |
| Putative domain linkers (0.95) | 6387 | 8133 | 529884 | 1.42 |
| Putative domain linkers (0.96) | 4819 | 5892 | 415150 | 1.11 |
| Putative domain linkers (0.97) | 3099 | 3592 | 281009 | 0.75 |
| Putative domain linkers (0.98) | 1326 | 1469 | 128455 | 0.34 |
| Low-complexity regions (45, 2.9, 3.2) + Putative domain linkers (0.95) (g) | 10364 | 13946 | 1139983 | 3.06 |
Statistics of SWISSPROT sequences.
(a) Number of SWISSPROT sequences that contained the detected sequence regions.
(b) Number of sequence regions detected in the SWISSPROT sequences.
(c) Total number of residues in the detected sequence regions.
(d) Percentage of residues in the detected regions relative to all residues in the SWISSPROT sequences.
(e) The three parameters used for the SEG program, namely the trigger window, the trigger complexity, and the extension complexity, are listed in parentheses.
(f) The cutoff parameter used for our neural network is given in parentheses.
(g) Predictions obtained by merging the putative domain linkers and the low-complexity regions.
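As a quick check of footnote d, the percentage column can be reproduced from the residue counts in the table. The sketch below is only an illustration: the function name `percent_residues` and the constant `TOTAL_RESIDUES` are hypothetical names introduced here, and the rounding to two decimals is an assumption based on how the values are printed in the table.

```python
# Total number of residues in all SWISSPROT sequences ("All" row of Table 1).
TOTAL_RESIDUES = 37_315_215


def percent_residues(n_residues: int, total: int = TOTAL_RESIDUES) -> float:
    """Percentage of residues in detected regions relative to all residues.

    Rounded to two decimals (assumed, to match the table's formatting).
    """
    return round(100.0 * n_residues / total, 2)


# Residue counts copied from the "No. of residues" column of Table 1.
rows = {
    "PDB": 10_210_325,
    "CDD": 16_207_467,
    "Putative domain linkers (0.95)": 529_884,
}

for label, n_residues in rows.items():
    print(f"{label}: {percent_residues(n_residues):.2f}%")
```

For the three rows used here, the computed values agree with the tabulated 27.36, 43.43 and 1.42.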