
Table 2. The contribution of each feature in the three methods: the backward deletion with either AIC or BIC and the L1 regularization method

From: Estimating Phred scores of Illumina base calls by logistic regression and sparse modeling

 

Selected methods: L1 & AIC & BIC

| No. | Feature | Description | Contribution (L1) | Contribution (AIC) | Contribution (BIC) |
|---|---|---|---|---|---|
| 1 | x2 | second largest intensity | -7.2243 | -20.211 | -18.458 |
| 2 | x4 | average of (x1-x2) | -21.93 | -29.24 | -26.646 |
| 3 | x6 | 1/x3 | -8.696 | -11.815 | -10.567 |
| 4 | x7 | x5 | 0.47013 | 1.2363 | 0.67037 |
| 5 | x10 | piecewise function of \|x1-x2\| | 35.389 | 49.525 | 47.219 |
| 6 | x11 |  | 14.579 | 27.974 | 26.622 |
| 7 | x12 |  | 7.4602 | 28.731 | 26.943 |
| 8 | x13 |  | 4.3013 | 36.89 | 34.142 |
| 9 | x14 |  | 2.3397 | 106.05 | 97.787 |
| 10 | x18 | current cycle number | -0.00054878 | -0.00065167 | -0.00061738 |
| 11 | x28 | A(AG) | -5.3348 | -12.956 | -7.7384 |
| 12 | x29 | A(AT) | -4.2171 | -21.337 | -7.2382 |
| 13 | x39 | C(AC) | 14.771 | 18.74 | 14.901 |
| 14 | x42 | C(CA) | -8.0916 | -22.571 | -11.925 |
| 15 | x45 | C(GA) | -10.411 | -57.22 | -13.558 |
| 16 | x53 | G(AT) | -3.3127 | -24.44 | -8.0976 |
| 17 | x56 | G(CT) | -7.7223 | -52.163 | -11.508 |
| 18 | x59 | G(GT) | -5.893 | -77.08 | -10.057 |

 

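Selected methods: AIC & BIC

| No. | Feature | Description | Contribution (L1) | Contribution (AIC) | Contribution (BIC) |
|---|---|---|---|---|---|
| 1 | x15 | piecewise function of \|x1-x2\| | 0 | 859.11 | 799.13 |
| 2 | x16 |  | 0 | 19180 | 7549.6 |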

 

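Selected methods: L1 & AIC

| No. | Feature | Description | Contribution (L1) | Contribution (AIC) | Contribution (BIC) |
|---|---|---|---|---|---|
| 1 | x5 | standard error of (x1-x2) | 121.95 | -383.44 | 0 |
| 2 | x8 | log(x5) | -2.8995 | 8.3866 | 0 |
| 3 | x20 | indicators of the first 7 cycles | -3.0296 | -30.195 | 0 |
| 4 | x25 |  | -5.4533 | -12.32 | 0 |
| 5 | x26 |  | 3.2316 | 126.13 | 0 |
| 6 | x31 | A(CG) | 5.7108 | -19.191 | 0 |
| 7 | x32 | A(CT) | 1.864 | -39.591 | 0 |
| 8 | x33 | A(GA) | 9.9989 | -22.428 | 0 |
| 9 | x34 | A(GC) | 1.4679 | -40.542 | 0 |
| 10 | x35 | A(GT) | 11.712 | -45.93 | 0 |
| 11 | x36 | A(TA) | 5.7597 | -45.042 | 0 |
| 12 | x37 | A(TC) | 2.8418 | -56.643 | 0 |
| 13 | x43 | C(CG) | -5.6759 | -42.894 | 0 |
| 14 | x44 | C(CT) | -5.8779 | -44.42 | 0 |
| 15 | x46 | C(GC) | 7.3038 | -28.805 | 0 |
| 16 | x47 | C(GT) | 5.051 | -42.573 | 0 |
| 17 | x48 | C(TA) | 4.8103 | -50.508 | 0 |
| 18 | x50 | C(TG) | -4.0949 | -85.841 | 0 |
| 19 | x51 | G(AC) | 4.0089 | -7.5339 | 0 |
| 20 | x52 | G(AG) | 0.3575 | -7.7936 | 0 |
| 21 | x54 | G(CA) | 1.1807 | -9.1835 | 0 |
| 22 | x55 | G(CG) | -1.2149 | -30.643 | 0 |
| 23 | x57 | G(GA) | 13.802 | -23.981 | 0 |
| 24 | x58 | G(GC) | 1.9919 | -47.805 | 0 |
| 25 | x60 | G(TA) | 6.2969 | -49.157 | 0 |
| 26 | x61 | G(TC) | 1.9621 | -75.049 | 0 |
| 27 | x62 | G(TG) | 5.7278 | -82.807 | 0 |
| 28 | x64 | T(AG) | -1.6306 | -11.911 | 0 |
| 29 | x66 | T(CA) | 1.6158 | -9.8488 | 0 |
| 30 | x67 | T(CG) | -5.0538 | -35.991 | 0 |
| 31 | x68 | T(CT) | -0.72808 | -41.646 | 0 |
| 32 | x69 | T(GA) | 1.0712 | -31.065 | 0 |
| 33 | x70 | T(GC) | 5.0709 | -35.774 | 0 |
| 34 | x71 | T(GT) | 12.661 | -42.749 | 0 |
| 35 | x73 | T(TC) | -1.7016 | -57.268 | 0 |
| 36 | x74 | T(TG) | -6.1194 | -72.443 | 0 |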

 

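Selected methods: AIC

| No. | Feature | Description | Contribution (L1) | Contribution (AIC) | Contribution (BIC) |
|---|---|---|---|---|---|
| 1 | x38 | A(TG) | 0 | -72.981 | 0 |
| 2 | x40 | C(AG) | 0 | -8.7049 | 0 |
| 3 | x41 | C(AT) | 0 | -19.167 | 0 |
| 4 | x49 | C(TC) | 0 | -63.57 | 0 |
| 5 | x72 | T(TA) | 0 | -48.82 | 0 |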

 

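Selected methods: L1

| No. | Feature | Description | Contribution (L1) | Contribution (AIC) | Contribution (BIC) |
|---|---|---|---|---|---|
| 1 | x1 | largest intensity | 1.0896 | 0 | 0 |
| 2 | x3 | average of x1 | -6.0558 | 0 | 0 |
| 3 | x9 | piecewise function of \|x1-x2\| | 13.791 | 0 | 0 |
| 4 | x19 | inverse distance | -2.0626 | 0 | 0 |
| 5 | x21 | indicators of the first 7 cycles | -1.5148 | 0 | 0 |
| 6 | x24 |  | -2.5247 | 0 | 0 |
| 7 | x27 | A(AC) | -0.59405 | 0 | 0 |
| 8 | x30 | A(CA) | 11.7 | 0 | 0 |
| 9 | x65 | T(AT) | 15.749 | 0 | 0 |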

 

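Selected methods: None

| No. | Feature | Description | Contribution (L1) | Contribution (AIC) | Contribution (BIC) |
|---|---|---|---|---|---|
| 1 | x17 | piecewise function of \|x1-x2\| | 0 | 0 | 0 |
| 2 | x22 | indicators of the first 7 cycles | 0 | 0 | 0 |
| 3 | x23 |  | 0 | 0 | 0 |
| 4 | x63 | T(AC) | 0 | 0 | 0 |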

  1. The contribution is defined as the t-score, namely the coefficient divided by its standard error. All 74 features are grouped according to the methods that selected them.
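
As a minimal sketch of the two definitions in the footnote, the Python snippet below computes a t-score as a coefficient divided by its standard error and derives a feature's group from its three contributions. The function names `t_score` and `selection_group` are hypothetical, and the coefficient and standard error passed to `t_score` are made-up illustrative numbers, not values from the paper. It also assumes that a contribution of exactly 0 means the method did not select the feature, which matches how the rows are grouped in the table.

```python
def t_score(coefficient: float, standard_error: float) -> float:
    """Contribution of a feature: its coefficient divided by its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / standard_error


def selection_group(l1: float, aic: float, bic: float) -> str:
    """Name the group of a feature from its three contributions.

    Assumption: a contribution of exactly 0 means the method dropped the feature.
    """
    selected = [name for name, value in (("L1", l1), ("AIC", aic), ("BIC", bic))
                if value != 0]
    return " & ".join(selected) if selected else "None"


# Hypothetical coefficient and standard error (illustrative only).
print(t_score(-0.5, 0.25))                         # -2.0

# Contributions copied from the table rows for x2, x15, x5, and x17.
print(selection_group(-7.2243, -20.211, -18.458))  # L1 & AIC & BIC
print(selection_group(0, 859.11, 799.13))          # AIC & BIC
print(selection_group(121.95, -383.44, 0))         # L1 & AIC
print(selection_group(0, 0, 0))                    # None
```

Comparing against exact zero is adequate here because the table reports unselected features as a literal 0; with raw floating-point output from a fitting routine, a small tolerance may be more appropriate.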