Table 1 The coefficients of the 74 predictive variables in the three methods

From: Estimating Phred scores of Illumina base calls by logistic regression and sparse modeling

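| x | Description | L1LR | BE-AIC | BE-BIC |
|---|---|---|---|---|
| x0 | intercept | 1.09 | 11.47 | 7.63 |
| x1 | largest intensity | 1.48 | - | - |
| x2 | second largest intensity | -1.73 | -4.84 | -4.42 |
| x3 | average of x1 | -1.18 | - | - |
| x4 | average of (x1-x2) | -4.65 | -6.2 | -5.65 |
| x5 | standard error of (x1-x2) | 3.19 | -10.03 | - |
| x6 | 1/x3 | -2.37 | -3.22 | -2.88 |
| x7 | x5 | 0.54 | 1.42 | 0.77 |
| x8 | log(x5) | -0.93 | 2.69 | - |
| x9 | piecewise function of \|x1-x2\| | 0.59 | - | - |
| x10 |  | 3.53 | 4.94 | 4.71 |
| x11 |  | 3.45 | 6.62 | 6.3 |
| x12 |  | 2.42 | 9.32 | 8.74 |
| x13 |  | 1.44 | 12.35 | 11.43 |
| x14 |  | 0.34 | 15.41 | 14.21 |
| x15 |  | - | 23.06 | 21.45 |
| x16 |  | - | 118.87 | 46.79 |
| x17 |  | - | - | - |
| x18 | current cycle number | -0.016 | -0.019 | -0.018 |
| x19 | inverse distance | -0.24 | - | - |
| x20 | indicators of the first 7 cycles | -0.3 | -2.99 | - |
| x21 |  | -0.15 | - | - |
| x22 |  | - | - | - |
| x23 |  | - | - | - |
| x24 |  | -0.25 | - | - |
| x25 |  | -0.54 | -1.22 | - |
| x26 |  | 0.32 | 12.49 | - |
| x27 | A(AC) | -0.11 | - | - |
| x28 | A(AG) | -0.91 | -2.21 | -1.32 |
| x29 | A(AT) | -0.67 | -3.39 | -1.15 |
| x30 | A(CA) | 1.29 | - | - |
| x31 | A(CG) | 0.86 | -2.89 | - |
| x32 | A(CT) | 0.25 | -5.31 | - |
| x33 | A(GA) | 1.44 | -3.23 | - |
| x34 | A(GC) | 0.21 | -5.8 | - |
| x35 | A(GT) | 1.66 | -6.51 | - |
| x36 | A(TA) | 0.89 | -6.96 | - |
| x37 | A(TC) | 0.44 | -8.77 | - |
| x38 | A(TG) | - | -10.79 | - |
| x39 | C(AC) | 2.27 | 2.88 | 2.29 |
| x40 | C(AG) | - | -1.34 | - |
| x41 | C(AT) | - | -2.77 | - |
| x42 | C(CA) | -0.95 | -2.65 | -1.4 |
| x43 | C(CG) | -0.7 | -5.29 | - |
| x44 | C(CT) | -0.7 | -5.29 | - |
| x45 | C(GA) | -1.29 | -7.09 | -1.68 |
| x46 | C(GC) | 0.89 | -3.51 | - |
| x47 | C(GT) | 0.63 | -5.31 | - |
| x48 | C(TA) | 0.68 | -7.14 | - |
| x49 | C(TC) | - | -9.25 | - |
| x50 | C(TG) | -0.54 | -11.32 | - |
| x51 | G(AC) | 0.58 | -1.09 | - |
| x52 | G(AG) | 0.05 | -1.09 | - |
| x53 | G(AT) | -0.45 | -3.32 | -1.1 |
| x54 | G(CA) | 0.18 | -1.4 | - |
| x55 | G(CG) | -0.18 | -4.54 | - |
| x56 | G(CT) | -1.02 | -6.89 | -1.52 |
| x57 | G(GA) | 1.6 | -2.78 | - |
| x58 | G(GC) | 0.24 | -5.76 | - |
| x59 | G(GT) | -0.75 | -9.81 | -1.28 |
| x60 | G(TA) | 0.93 | -7.26 | - |
| x61 | G(TC) | 0.24 | -9.18 | - |
| x62 | G(TG) | 0.7 | -10.12 | - |
| x63 | T(AC) | - | - | - |
| x64 | T(AG) | -0.23 | -1.68 | - |
| x65 | T(AT) | 2.03 | - | - |
| x66 | T(CA) | 0.21 | -1.28 | - |
| x67 | T(CG) | -0.74 | -5.27 | - |
| x68 | T(CT) | -0.1 | -5.72 | - |
| x69 | T(GA) | 0.16 | -4.64 | - |
| x70 | T(GC) | 0.73 | -5.15 | - |
| x71 | T(GT) | 1.94 | -6.55 | - |
| x72 | T(TA) | - | -8.09 | - |
| x73 | T(TC) | -0.29 | -9.76 | - |
| x74 | T(TG) | -0.99 | -11.72 | - |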

  1. We denote these 74 variables by x = (x0, x1, …, x74). In the first row of the table, ‘L1LR’ denotes L1-regularized logistic regression, ‘BE-AIC’ denotes backward deletion with AIC, and ‘BE-BIC’ denotes backward deletion with BIC. The variables in each row are described in detail in Methods. x27 to x74 correspond to the 3-letter sequences, which give the type of the base in the previous cycle, followed by the types of the bases with the largest and the second largest intensity in the current cycle. A ‘-’ indicates that the method removed the feature.
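
The ordering of the 3-letter variables x27 to x74 in the table can be reproduced programmatically. The sketch below is only an illustration, not the authors' code: the function names `context_label` and `context_index` are hypothetical, and it assumes, following the footnote, that the letter outside the parentheses is the previous-cycle base and the two letters inside are the bases with the largest and second largest intensity, in that order.

```python
from itertools import permutations

BASES = "ACGT"
# Ordered pairs of distinct bases: AC, AG, AT, CA, ..., TG (12 in total).
PAIRS = ["".join(p) for p in permutations(BASES, 2)]
FIRST_INDEX = 27  # x27 is labelled A(AC) in Table 1
LAST_INDEX = FIRST_INDEX + len(BASES) * len(PAIRS) - 1  # 74


def context_label(index):
    """Return the table label, e.g. 'A(AC)', for a variable index in 27..74."""
    if not FIRST_INDEX <= index <= LAST_INDEX:
        raise ValueError(f"index must be in {FIRST_INDEX}..{LAST_INDEX}")
    prev, pair = divmod(index - FIRST_INDEX, len(PAIRS))
    return f"{BASES[prev]}({PAIRS[pair]})"


def context_index(prev_base, largest, second_largest):
    """Return the variable index for a (previous, largest, second largest) triple.

    Assumption: the two bases in the current cycle are distinct, as in the table.
    """
    if largest == second_largest:
        raise ValueError("largest and second largest bases must differ")
    pair = PAIRS.index(largest + second_largest)
    return FIRST_INDEX + BASES.index(prev_base) * len(PAIRS) + pair


if __name__ == "__main__":
    for i in (27, 39, 56, 74):
        print(f"x{i}: {context_label(i)}")
```

Under these assumptions, `context_index("G", "C", "T")` gives 56, which matches the row labelled G(CT) in the table.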