Table 5 CNN hyperparameter space explored with hg19 and hg38 data through Bayesian optimization. A. Architecture and learning hyperparameters of the fixed-CNN; B. Architecture and hyperparameter space of the Bayesian-CNN models trained on the hg19 dataset; C. Architecture and hyperparameter space of the Bayesian-CNN models trained on the hg38 dataset

From: Boosting tissue-specific prediction of active cis-regulatory regions through deep learning and Bayesian optimization techniques

A: fixed-CNN

| Layers | Type               | Units | Kernel | Activation | Notes           |
|--------|--------------------|-------|--------|------------|-----------------|
| 3      | Convolutional      | 64    | 5      | ReLU       |                 |
| 1      | Max pooling 1D     |       |        |            | Size 2          |
| 3      | Convolutional      | 128   | 3      | ReLU       |                 |
| 1      | Max pooling 1D     |       |        |            | Size 2          |
| 3      | Convolutional      | 128   | 3      | ReLU       |                 |
| 1      | Average pooling 1D |       |        |            |                 |
| 1      | Dropout            |       |        |            | Probability 0.5 |
| 2      | Dense              | 10    |        | ReLU       |                 |
| 1      | Dropout            |       |        |            | Probability 0.5 |
| 1      | Dense              | 1     |        | Sigmoid    |                 |

Learning parameters: learning rate 0.002; batch size 256; optimizer Nadam; epochs 100.
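As a reading aid, the fixed architecture in panel A can be written down directly. The following Keras/TensorFlow sketch assumes a hypothetical one-hot input of shape (200, 4) and binary cross-entropy loss, neither of which is specified on this page; the table's single average-pooling 1D layer is rendered as global average pooling so the dense block can follow without an explicit flatten.

```python
# A minimal sketch of the fixed-CNN in Table A, assuming Keras/TensorFlow.
# The input shape (200 positions x 4 channels) is hypothetical; the paper's
# actual input encoding is not shown on this page.
import tensorflow as tf
from tensorflow.keras import layers

def build_fixed_cnn(input_shape=(200, 4)):
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
    for _ in range(3):                               # 3 conv layers, 64 filters, kernel 5
        model.add(layers.Conv1D(64, 5, activation="relu"))
    model.add(layers.MaxPooling1D(pool_size=2))      # max pooling 1D, size 2
    for _ in range(3):                               # 3 conv layers, 128 filters, kernel 3
        model.add(layers.Conv1D(128, 3, activation="relu"))
    model.add(layers.MaxPooling1D(pool_size=2))      # max pooling 1D, size 2
    for _ in range(3):                               # 3 conv layers, 128 filters, kernel 3
        model.add(layers.Conv1D(128, 3, activation="relu"))
    # The table lists one "Average pooling 1D" layer with no size; global
    # average pooling is assumed here so Dense can follow directly.
    model.add(layers.GlobalAveragePooling1D())
    model.add(layers.Dropout(0.5))
    for _ in range(2):                               # 2 dense layers, 10 units each
        model.add(layers.Dense(10, activation="relu"))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(1, activation="sigmoid"))  # binary output unit
    model.compile(optimizer=tf.keras.optimizers.Nadam(learning_rate=0.002),
                  loss="binary_crossentropy",        # assumed loss for a sigmoid output
                  metrics=["accuracy"])
    return model

# Training with model.fit(X, y, batch_size=256, epochs=100) matches the
# learning parameters listed above.
```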

    

B: Bayesian-CNN (hg19 dataset)

| Layers | Type                       | Units (hyperparameter space) | Kernel (hyperparameter space) | Activation | Notes           |
|--------|----------------------------|------------------------------|-------------------------------|------------|-----------------|
| 3      | Convolutional + batch norm | {32, 64, 128}                | 5                             | ReLU       |                 |
| 1      | Max pooling 1D             |                              |                               |            | Size 2          |
| 1      | Convolutional + batch norm | {32, 64, 128}                | {5, 10}                       | ReLU       |                 |
| 1      | Max pooling 1D             |                              |                               |            | Size 2          |
| 1      | Flatten                    |                              |                               |            |                 |
| 1      | Dense                      | {10, 32, 64}                 |                               | ReLU       |                 |
| 1      | Dropout                    |                              |                               |            | Probability 0.1 |
| 1      | Dense                      | {10, 32, 64}                 |                               | ReLU       |                 |
| 1      | Dropout                    |                              |                               |            | Probability 0.1 |
| 1      | Dense                      | 1                            |                               | Sigmoid    |                 |

Learning parameters: learning rate 0.002; batch size 256; optimizer Nadam; epochs 100.
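Panel B fixes the layer sequence and lets Bayesian optimization choose only filter counts, one kernel size, and dense-layer widths. A minimal sketch of that search space follows, using KerasTuner's BayesianOptimization tuner as a stand-in for the authors' optimizer; the input shape, objective, and trial budget are likewise assumptions.

```python
# A minimal sketch of the Table B search space with KerasTuner.
import tensorflow as tf
from tensorflow.keras import layers
import keras_tuner as kt

def build_bayesian_cnn_hg19(hp):
    model = tf.keras.Sequential([tf.keras.Input(shape=(200, 4))])  # hypothetical input
    # One shared filter count for the 3-layer block (interpretation of the table row).
    filters1 = hp.Choice("conv1_filters", [32, 64, 128])
    for _ in range(3):                       # 3x (Conv1D + batch norm), kernel fixed at 5
        model.add(layers.Conv1D(filters1, 5, activation="relu"))
        model.add(layers.BatchNormalization())
    model.add(layers.MaxPooling1D(pool_size=2))
    model.add(layers.Conv1D(hp.Choice("conv2_filters", [32, 64, 128]),
                            hp.Choice("conv2_kernel", [5, 10]),
                            activation="relu"))  # 1x Conv1D + batch norm
    model.add(layers.BatchNormalization())
    model.add(layers.MaxPooling1D(pool_size=2))
    model.add(layers.Flatten())
    for i in (1, 2):                         # two Dense + Dropout(0.1) blocks
        model.add(layers.Dense(hp.Choice(f"dense{i}_units", [10, 32, 64]),
                               activation="relu"))
        model.add(layers.Dropout(0.1))
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer=tf.keras.optimizers.Nadam(learning_rate=0.002),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.BayesianOptimization(build_bayesian_cnn_hg19,
                                objective="val_accuracy",  # assumed objective
                                max_trials=20)             # assumed budget
# tuner.search(X_train, y_train, validation_data=(X_val, y_val),
#              batch_size=256, epochs=100)
```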

    

C: Bayesian-CNN (hg38 dataset)

| Layers                                                 | Hyperparameter space | Activation |
|--------------------------------------------------------|----------------------|------------|
| No. of convolutional groups                            | [0 … 2]              |            |
| No. of hidden convolutional layers composing the group | {0, …, 3}            | ReLU       |
| No. of filters in the convolutional layer              | [0 … 128]            |            |
| 2D kernel size in the convolutional layer              | [2 … 8] × [1, 2]     |            |
| Max pooling 2D                                         | [1 … 8] × [1, 2]     |            |
| Dropout                                                | [0 … 0.5]            |            |
| No. of dense groups                                    | [0 … 2]              |            |
| No. of hidden dense layers composing the group         | {0, …, 3}            | ReLU       |
| No. of units in the dense layer                        | [0 … 64]             |            |
| Dropout                                                | [0 … 0.5]            |            |
| Output                                                 | 1                    | Sigmoid    |

Learning parameters: learning rate 0.002; l1 regularizer 0.0001; l2 regularizer 0.0001; batch size 256; optimizer Nadam; epochs 100.

    
1. In Tables B and C, the search space is shown for each optimized hyperparameter: square brackets denote continuous hyperparameter spaces, while curly brackets denote discrete ones. “Max pooling 1D” and “Max pooling 2D” refer, respectively, to 1D and 2D max-pooling layers, “Average pooling 1D” refers to a 1D average-pooling layer, “Dropout” refers to dropout layers, and “Batch norm” refers to batch normalization layers
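Panel C searches over the architecture itself: the number of convolutional and dense groups, the depth of each group, and the per-group filter, kernel, pooling, and dropout settings. A hedged sketch of how such a conditional space could be encoded with KerasTuner follows; the flat parameter naming, 2D input shape, and trial budget are assumptions, and a lower bound of 1 replaces the table's 0 for filters and units, since 0 would denote an absent layer.

```python
# A minimal sketch of the Table C (hg38) search space, again using KerasTuner's
# BayesianOptimization as a stand-in for the authors' optimizer.
import tensorflow as tf
from tensorflow.keras import layers, regularizers
import keras_tuner as kt

def build_bayesian_cnn_hg38(hp):
    reg = regularizers.l1_l2(l1=1e-4, l2=1e-4)  # l1/l2 regularizers from Table C
    model = tf.keras.Sequential([tf.keras.Input(shape=(200, 4, 1))])  # hypothetical input
    for g in range(hp.Int("conv_groups", 0, 2)):          # 0..2 convolutional groups
        filters = hp.Int(f"g{g}_filters", 1, 128)         # table: [0 .. 128]
        kernel = (hp.Int(f"g{g}_kernel_h", 2, 8),         # table: [2 .. 8] x [1, 2]
                  hp.Choice(f"g{g}_kernel_w", [1, 2]))
        for _ in range(hp.Int(f"g{g}_layers", 0, 3)):     # 0..3 hidden conv layers
            model.add(layers.Conv2D(filters, kernel, padding="same",
                                    activation="relu", kernel_regularizer=reg))
        model.add(layers.MaxPooling2D(                    # table: [1 .. 8] x [1, 2]
            pool_size=(hp.Int(f"g{g}_pool_h", 1, 8),
                       hp.Choice(f"g{g}_pool_w", [1, 2]))))
        model.add(layers.Dropout(hp.Float(f"g{g}_dropout", 0.0, 0.5)))
    model.add(layers.Flatten())
    for g in range(hp.Int("dense_groups", 0, 2)):         # 0..2 dense groups
        units = hp.Int(f"d{g}_units", 1, 64)              # table: [0 .. 64]
        for _ in range(hp.Int(f"d{g}_layers", 0, 3)):     # 0..3 hidden dense layers
            model.add(layers.Dense(units, activation="relu", kernel_regularizer=reg))
        model.add(layers.Dropout(hp.Float(f"d{g}_dropout", 0.0, 0.5)))
    model.add(layers.Dense(1, activation="sigmoid"))      # output unit
    model.compile(optimizer=tf.keras.optimizers.Nadam(learning_rate=0.002),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.BayesianOptimization(build_bayesian_cnn_hg38,
                                objective="val_accuracy", max_trials=50)  # assumed
```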