
Table 1

From: Identification of putative domain linkers by a neural network – application to a large sequence database

| Sequence regions detected | No. of sequences (a) | No. of sequence regions (b) | No. of residues (c) | % residues (d) |
|---|---|---|---|---|
| All | 101602 | | 37315215 | 100.00 |
| PDB | 38470 | 410090 | 10210325 | 27.36 |
| CDD | 64349 | 124888 | 16207467 | 43.43 |
| Low-complexity regions (45, 3.4, 3.75) (e) | 48641 | 70373 | 8474412 | 22.71 |
| Low-complexity regions (45, 2.9, 3.2) | 6735 | 8539 | 803001 | 2.15 |
| Low-complexity regions (45, 2.6, 2.9) | 3208 | 3970 | 359227 | 0.96 |
| Low-complexity regions (45, 2.45, 2.75) | 2340 | 2786 | 250796 | 0.67 |
| Putative domain linkers (0.90) (f) | 14239 | 20876 | 1051607 | 2.82 |
| Putative domain linkers (0.91) | 12670 | 18193 | 953097 | 2.55 |
| Putative domain linkers (0.92) | 11160 | 15620 | 856149 | 2.29 |
| Putative domain linkers (0.93) | 9554 | 13053 | 752119 | 2.02 |
| Putative domain linkers (0.94) | 7977 | 10591 | 644472 | 1.73 |
| Putative domain linkers (0.95) | 6387 | 8133 | 529884 | 1.42 |
| Putative domain linkers (0.96) | 4819 | 5892 | 415150 | 1.11 |
| Putative domain linkers (0.97) | 3099 | 3592 | 281009 | 0.75 |
| Putative domain linkers (0.98) | 1326 | 1469 | 128455 | 0.34 |
| Low-complexity regions (45, 2.9, 3.2) + Putative domain linkers (0.95) (g) | 10364 | 13946 | 1139983 | 3.06 |

Statistics of SWISSPROT sequences.

(a) Number of SWISSPROT sequences that contained the detected sequence regions.
(b) Number of sequence regions detected in the SWISSPROT sequences.
(c) Total number of residues in the detected sequence regions.
(d) Percentage of residues in the detected regions relative to all residues in the SWISSPROT sequences.
(e) The three parameters used for the SEG program (the trigger window, the trigger complexity, and the extension complexity) are listed in parentheses.
(f) The cutoff parameter used for our neural network is given in parentheses.
(g) Predictions obtained by merging the putative domain linkers and the low-complexity regions.
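The percentage column can be reproduced from the residue counts using the definition in note d. The following is a minimal sketch, not the authors' code. The function name `percent_residues` and the constant `TOTAL_RESIDUES` are illustrative assumptions, and the residue counts are copied from the table.

```python
# Total number of residues in all SWISSPROT sequences ("All" row of Table 1).
TOTAL_RESIDUES = 37_315_215


def percent_residues(region_residues: int, total_residues: int = TOTAL_RESIDUES) -> float:
    """Percentage of residues in detected regions relative to all residues,
    rounded to two decimals as in the table (hypothetical helper)."""
    return round(100.0 * region_residues / total_residues, 2)


# Residue counts taken from three rows of the table.
rows = {
    "PDB": 10_210_325,
    "CDD": 16_207_467,
    "Putative domain linkers (0.95)": 529_884,
}

for label, n_residues in rows.items():
    print(f"{label}: {percent_residues(n_residues):.2f}%")
```

Under these assumptions, the three rows give 27.36, 43.43, and 1.42, which match the values listed in the table.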