DISTEMA: distance map-based estimation of single protein model accuracy with attentive 2D convolutional neural network
BMC Bioinformatics volume 23, Article number: 141 (2022)
Abstract
Background
Estimation of the accuracy (quality) of protein structural models is important for both prediction and use of protein structural models. Deep learning methods have been used to integrate protein structure features to predict the quality of protein models. Inter-residue distances are key information for predicting protein tertiary structures and therefore have good potential for predicting the quality of protein structural models. However, few methods have been developed to take full advantage of predicted inter-residue distance maps to estimate the accuracy of a single protein structural model.
Result
We developed an attentive 2D convolutional neural network (CNN) with channel-wise attention that takes only a raw difference map, computed between the inter-residue distance map calculated from a single protein model and the distance map predicted from the protein sequence, as input to predict the quality of the model. The network comprises multiple convolutional layers, batch normalization layers, dense layers, and Squeeze-and-Excitation blocks with attention to automatically extract features relevant to protein model quality from the raw input without using any expert-curated features. We evaluated DISTEMA’s capability of selecting the best models for CASP13 targets in terms of the ranking loss of GDT-TS score. The ranking loss of DISTEMA is 0.079, lower than that of several state-of-the-art single-model quality assessment methods.
Conclusion
This work demonstrates that using raw inter-residue distance information with deep learning can predict the quality of protein structural models reasonably well. DISTEMA is freely available at https://github.com/jianlincheng/DISTEMA.
Introduction
Estimation of protein model accuracy (EMA), also called assessment of protein model quality (QA), is an important problem in protein structure prediction. Since the seventh Critical Assessment of Techniques for Protein Structure Prediction (CASP7) [1], EMA (or QA) has been a prediction category in CASP experiments. Many methods have been developed to evaluate the quality of protein models [2,3,4,5,6]. These EMA methods fall into two main categories: multi-model methods and single-model methods. A multi-model method takes a pool of predicted structural models of the same target as input and evaluates their quality based on the similarity between the models and possibly other structural features. A single-model method predicts the quality of a single protein model without comparing it to any other structural models. A multi-model method’s performance depends on the proportion of good models in the pool, and it may perform poorly when there are only a few good models. In contrast, a single-model EMA method [7] can estimate the accuracy of a single protein model without being influenced by the existence of other models. A recent study [8] shows that single-model methods can perform better than multi-model methods in some cases. Moreover, unlike multi-model methods, which can only predict the relative quality of models in a pool, single-model methods can predict the absolute quality of a single model, which is important for users deciding how to use the model. Therefore, single-model quality assessment is receiving more and more attention, even though its average performance was still lower than that of multi-model methods in the past several CASP experiments.
Numerous machine-learning methods have been developed to combine various protein structural features to assess the quality of protein models. ProQ2 [18] and Model Evaluator [7] applied support vector machines (SVM) with residue contacts, secondary structure information, solvent accessible surface area, and/or sequence features to predict a global quality score, i.e., the global similarity between a protein model and its native structure. ProQ3 [9] added the Talaris energy as a new feature on top of ProQ2. ProQ3D [10] used a multi-layer perceptron with the same features used in ProQ3 for protein model quality prediction. Recently, deep learning-based models have been applied to improve the estimation of model accuracy. DeepQA [3] utilized deep belief networks to predict the global quality score. ProQ4 [11] exploited transfer learning and a 1D convolutional neural network (CNN) to predict the Local Distance Difference Test (LDDT) score [12]. DeepRank [5] applied deep learning to integrate multiple features, including residue-residue contact features, to predict model quality and performed best in selecting the best protein models in the CASP13 experiment. DeepRank2 [6] added a new inter-residue distance feature with a deeper and wider neural network to predict global model quality. Some recent methods leverage more complex deep learning architectures. Treating a protein structural model as a graph, ProteinGCN [13], GraphQA [14] and VoroCNN [15] applied graph convolutional networks (GCN) to estimate the model accuracy. ResNetQA [16] and DeepAccNet [17] used deep residual networks to address the problem.
In addition to the inference technology, the performance of an EMA method depends on its input features. In CASP13, DeepRank [5] demonstrated that accurate residue-residue contacts (a simplified representation of distances between residues) predicted by deep learning improved the prediction of the quality of protein structural models, suggesting that more detailed residue-residue distance predictions could further improve EMA. However, only a few methods [6, 16, 17] use residue-residue distances to estimate the accuracy of protein structural models.
Instead of extracting features from the predicted residue-residue distance maps based on human intuition or expertise, as most existing methods do, we designed a 2D convolutional neural network (2D-CNN) with channel-wise attention that directly uses the raw difference map between the distance map of a model and the distance map predicted from the protein sequence to estimate the accuracy of a single protein model. On the CASP13 dataset, our method, DISTEMA, achieved better performance than other state-of-the-art single-model methods in terms of the ranking loss of selecting the best models for protein targets. The results show that the attentive 2D-CNN method can automatically extract useful information from raw residue-residue distance maps alone to predict the quality of a single protein model without using other protein structural features.
Results and discussion
Results on the CASP13 dataset and comparison with single-model QA methods
We compared DISTEMA with several single-model EMA methods on the CASP13 dataset. The results of ProQ2 [18], ProQ3 [9], ProQ3D [10], ProQ4 [11], and the two VoroMQA methods [19] on the CASP13 dataset were taken from [8]. The results of a distance-based method, QDeep [20], were obtained by running it on the same CASP13 dataset. The average ranking loss and Pearson’s correlation coefficient (PCC) of these methods are reported in Table 1. DISTEMA and ProQ4 have the lowest ranking loss of 0.079, which is a 9% improvement over the second-lowest loss of 0.086. The PCC of DISTEMA is 0.929, which is 38% higher than the second-highest PCC, achieved by QDeep.
Figure 1 is a scatter plot of the true GDT-TS score of the best model of each target against the true GDT-TS score of the top model selected by DISTEMA for the target. The solid red line denotes the regression line between predicted GDT-TS and true GDT-TS, and the yellow line is the 45-degree line on which points have 0 loss. The larger the distance between a data point and the yellow line, the larger the loss. For four targets (i.e., T0949, T0987, T0980s2, T1019s2), the best models were successfully selected as top models by DISTEMA, yielding a loss of 0. Two outliers, T1008 and T1022s2, have the largest loss for DISTEMA. Figure 2 illustrates the distribution of the ranking losses of DISTEMA on the CASP13 dataset. The vertical dashed black line marks the mean loss. More data points are located on the left side of the black line. The skewness of the distribution, \(\frac{\frac{1}{n}\sum_{i = 1}^{n} \left( x_{i} - \overline{x} \right)^{3}}{\left( \frac{1}{n}\sum_{i = 1}^{n} \left( x_{i} - \overline{x} \right)^{2} \right)^{\frac{3}{2}}}\), is 2.377.
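The skewness formula above can be evaluated directly from a list of per-target ranking losses. The following is a minimal sketch in plain Python; the function name `skewness` and the loss values are hypothetical and are not the CASP13 results.

```python
def skewness(values):
    """Moment-based skewness: third central moment over the 3/2 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical ranking losses: most are small and one is large,
# which gives a right-skewed (positive skewness) distribution.
example_losses = [0.0, 0.01, 0.02, 0.03, 0.30]
example_skew = skewness(example_losses)
```

A positive value indicates a long right tail, which matches the description of most targets lying to the left of the mean loss.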
In addition to the ranking loss, we applied a non-parametric method, the Kolmogorov–Smirnov test (KS test), to measure the distance between the distribution of true GDT-TS scores and that of predicted GDT-TS scores of the CASP13 models. We conducted the KS test on two datasets. The first dataset contains the true GDT-TS scores of all the CASP13 models and their GDT-TS scores predicted by DISTEMA; the distributions of the two kinds of scores were compared. The second dataset includes the true GDT-TS scores of the best models for the CASP13 targets and the true GDT-TS scores of the top models selected for them. For both tests, we used the same null hypothesis H0: there is no difference between the two distributions. We calculated the Kolmogorov–Smirnov statistic \(D_{n,m} = \sup_{x} \left| F_{1,n}\left( x \right) - F_{2,m}\left( x \right) \right|\), which measures the difference between the two distributions. Here, \(\sup_{x}\) is the supremum function, which in this case can be treated as the \(\max\) function. \(F_{1,n}\) and \(F_{2,m}\) are the empirical distribution functions of the first and second samples, respectively, where \(n\) and \(m\) are the sample sizes.
On dataset 1, the D-statistic of the KS test is 0.11266, and the p-value (2.2e-16) is smaller than the significance threshold (i.e., 0.05), which means that the two samples come from different distributions. Figure 3 shows the cumulative distribution function (CDF) curves of the two samples. The red vertical dashed line marks the D-statistic, the maximum absolute difference between the two CDFs. The shapes of the two curves are different.
The analysis shows that the true GDT-TS scores of all the models and their predicted GDT-TS scores follow somewhat different distributions, indicating that the quality scores of some models (e.g., some models of very low quality) are hard to predict. Enlarging the training dataset may alleviate the problem.
In contrast, on dataset 2, the D-statistic is 0.2 and the p-value of the KS test is 0.08152, higher than the threshold, so the null hypothesis cannot be rejected. That is, the distribution of the true GDT-TS score of the best model for each target is not significantly different from the distribution of the true GDT-TS score of the top model selected for each target. Figure 4 illustrates the CDF curves of the two distributions, where the blue line and the red line have generally the same shape. The similar distributions of the GDT-TS scores of the best models and the top selected models further support the ranking capability of DISTEMA.
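The D-statistic used in the two tests above is the largest gap between two empirical distribution functions. The sketch below computes it in plain Python for illustration only; the function name `ks_statistic` and the sample values are hypothetical, and it does not compute the p-value, for which a statistics package would normally be used.

```python
from bisect import bisect_right


def ks_statistic(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistic D = max over x of |F1(x) - F2(x)|."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n, m = len(s1), len(s2)
    d = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to evaluate the gap at every observed value.
    for x in s1 + s2:
        f1 = bisect_right(s1, x) / n  # fraction of sample 1 that is <= x
        f2 = bisect_right(s2, x) / m  # fraction of sample 2 that is <= x
        d = max(d, abs(f1 - f2))
    return d


# Hypothetical GDT-TS scores: true scores of best models vs. selected top models.
best_scores = [0.2, 0.4, 0.6, 0.8]
top_scores = [0.3, 0.5, 0.7, 0.9]
d_example = ks_statistic(best_scores, top_scores)
```

Identical samples give D = 0, and samples with no overlap give D = 1, so the reported values of 0.11266 and 0.2 both describe fairly close distributions; the difference in p-values is largely driven by the sample sizes.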
Comparison with a distance-based EMA method
We further compared DISTEMA with the distance-based single-model method QDeep on the same 3000 models of 20 CASP13 targets whose true structures are publicly available [21], which were used to evaluate QDeep in [20]. QDeep uses a one-dimensional CNN with inter-residue distance features derived from distance predictions, sequence information, and energy scores, while DISTEMA uses only the raw distance maps as input. Table 2 reports the results of the two methods. DISTEMA performed better than QDeep according to the ranking loss, even though it used only one kind of input information, but worse than QDeep according to PCC. The results show that using only raw distance maps with deep learning can predict the quality of a single protein model reasonably well, and that integrating other features with the distance information may further improve the prediction performance.
Contribution of squeeze-and-excitation (SE) blocks with attention
We trained two deep learning networks to investigate the impact of SE blocks with attention. The two networks have the same architecture except that one has SE blocks and the other does not. The two networks were trained with the same experimental setting and were evaluated on the CASP13 dataset. The network with SE blocks has a ranking loss of 0.079, 7.5% lower than the 0.085 of the network without SE blocks, indicating that attentive SE blocks can improve the performance of model quality prediction. The attention mechanism in SE blocks can more effectively pick up the relevant features anywhere in the input and assign them higher weights to improve the prediction performance.
Conclusion and future work
We designed and developed an attentive 2D CNN with channel-wise attention to directly leverage a raw inter-residue distance map to predict the global quality of a single protein model. Using only the protein distance information, the deep learning method with the attention mechanism is able to automatically extract features relevant to model quality from the raw input and achieves a lower model ranking loss than other state-of-the-art single-model EMA methods that use various expert-curated protein structural features. The results demonstrate that raw protein distance maps contain substantial information that can be captured by advanced deep learning methods to estimate the accuracy of a single protein model. In the future, larger training datasets, additional input features, and more advanced deep learning architectures [22] can be used to further improve distance map-based methods for predicting protein model quality.
Methods and materials
Difference map as input feature
We applied a real-value distance predictor, DeepDist [23], to predict an inter-residue distance map from the sequence of a protein target as a matrix \(A \left( {L \times L} \right)\), where \(L\) denotes the sequence length and \(A\left[ {i,j} \right]\) is the distance between residues \(i\) and \(j\). \(A\) was compared with the distance matrix \(B \left( {L \times L} \right)\) calculated from the coordinates of residues in a protein structure model to generate a difference map \(D\). Because \(A\) can be considered the expected distances between residues and \(B\) the actual distances between residues in a model, \(D\) measures how well the model meets the expectation and therefore provides useful information about the quality of the model. Considering that large distances tend to have little impact on the fold of a protein structure, before the generation of the difference map, a distance threshold (i.e., \(16\) Angstrom) was applied to \(A\) and \(B\) to filter out the distances that are greater than the threshold. That is, if either \(A\left[ {i,j} \right]\) or \(B\left[ {i,j} \right]\) is greater than 16, both \(A\left[ {i,j} \right]\) and \(B\left[ {i,j} \right]\) were set to 0, producing two filtered distance matrices \(A^{*}\) and \(B^{*}\). The difference map \(D\) was the element-wise subtraction between \(A^{*}\) and \(B^{*}\). Since \(A^{*}\) and \(B^{*}\) are symmetric, \(D\) is also symmetric. To speed up the training of the deep learning method, the values in the lower triangle of \(D\) are set to \(0\) to produce a matrix \(U\). \(U\), which contains only the values of the upper triangle of \(D\), is used as input for the deep learning method to predict model quality. For example, Fig. 5 visualizes \(A^{*}\), \(B^{*}\), \(D\), and \(U\) of a model of CASP13 target T0949.
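The construction of \(U\) described above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the DISTEMA code: the function names are hypothetical, the subtraction order (\(A^{*}\) minus \(B^{*}\)) is assumed because the text does not state it, and each residue is assumed to be represented by a single 3D coordinate.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances B (L x L) from one 3D coordinate per residue."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def difference_map(pred_dist, model_dist, threshold=16.0):
    """Build the upper-triangular difference map U from two L x L distance maps."""
    a = np.asarray(pred_dist, dtype=float)   # A: distances predicted from the sequence
    b = np.asarray(model_dist, dtype=float)  # B: distances measured in the model
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("both inputs must be square matrices of the same size")
    # If either distance of a residue pair exceeds the threshold, zero it in both maps.
    too_far = (a > threshold) | (b > threshold)
    a_star = np.where(too_far, 0.0, a)
    b_star = np.where(too_far, 0.0, b)
    d = a_star - b_star  # difference map D (assumed order: A* minus B*)
    return np.triu(d)    # U: keep the upper triangle, set the lower triangle to 0


# Toy three-residue example with hypothetical values.
pred = np.array([[0.0, 5.0, 20.0],
                 [5.0, 0.0, 6.0],
                 [20.0, 6.0, 0.0]])
coords = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [4.0, 3.0, 0.0]])
u = difference_map(pred, distance_matrix(coords))
```

In the toy example the pair (0, 2) is zeroed because its predicted distance of 20 exceeds 16, while the pairs (0, 1) and (1, 2) keep their differences of 1 and 3.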
Deep learning architecture and training
The architecture of the deep learning network of DISTEMA is illustrated in Fig. 6. The network takes the difference map U of a model as input to predict the global distance test total score (GDT-TS) [24] of the model. A GDT-TS score ranges from 0 to 1 and measures the global similarity between a model and its corresponding native structure. The higher the score, the better the model quality. The true GDT-TS score of a model can be calculated by comparing the model with its native structure using tools such as TM-score [25] if the latter is known. Otherwise, the GDT-TS score needs to be estimated or predicted from the features of the model.
The protein models of CASP8-12 whose true GDT-TS scores are known were used to train the deep learning method to predict their GDT-TS scores. The input size for the deep learning method is \(b \times 1 \times L \times L\), and the output size is \(b \times 1\). Here \(b\) denotes the batch size, \(1 \times L \times L\) the size of the difference map, and 1 the number of input channels (i.e., the distance difference value). We let the inputs in the same batch have the same \(L\) to speed up training, even though \(L\) can differ between batches. The deep network is composed of four convolutional (Conv) blocks, a global pooling layer, a flatten layer, and four dense blocks. The four Conv blocks extract features from the input. Each of the first two Conv blocks contains a squeeze-and-excitation (SE) block [26] with the channel-wise attention mechanism to automatically assign higher weights to more relevant features. Both SE blocks use the same squeeze-and-excitation ratio (i.e., 16). In each SE block, the global average pooling layer extracts a single average value from each channel of the previous convolutional layer; two fully connected layers first shrink the number of neurons and then restore it to the original number of channels; the sigmoid function scales each value into the range [0, 1], which is treated as an independent weight score for each channel; and the output channels of the previous convolutional layer are multiplied by these weight scores to produce the rescaled features.
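The four steps of an SE block described above (squeeze, shrink, restore, rescale) can be traced on a single feature map with a few lines of NumPy. This is only a forward-pass sketch with random weights, not the PyTorch implementation used in DISTEMA; the function name `se_block`, the tensor sizes, and the ReLU between the two fully connected layers (taken from the original SE design [26]) are assumptions.

```python
import numpy as np


def se_block(features, w1, b1, w2, b2):
    """Channel-wise attention for one feature map of shape (C, H, W)."""
    # Squeeze: global average pooling gives one value per channel, shape (C,).
    squeezed = features.mean(axis=(1, 2))
    # Excitation, step 1: shrink C -> C // ratio (ReLU assumed here).
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)
    # Excitation, step 2: restore C and squash to per-channel weights in [0, 1].
    scores = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))
    # Rescale: multiply every channel by its own weight score.
    return features * scores[:, None, None], scores


rng = np.random.default_rng(0)
channels, ratio, length = 32, 16, 8  # hypothetical sizes; ratio 16 as in the text
x = rng.normal(size=(channels, length, length))
w1 = rng.normal(size=(channels // ratio, channels)) * 0.1
b1 = np.zeros(channels // ratio)
w2 = rng.normal(size=(channels, channels // ratio)) * 0.1
b2 = np.zeros(channels)
rescaled, weights = se_block(x, w1, b1, w2, b2)
```

The output has the same shape as the input, so the block can be placed after a convolutional layer without changing the rest of the network.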
The four Conv blocks increase the number of channels from 1 to 256. A global max-pooling layer is applied to the output of the last Conv block to extract each channel’s maximum value. A flatten layer combines these features and reshapes them to size \(b \times 256\). The following four dense blocks reduce the feature size from \(b \times 256\) to \(b \times 1\) to produce the predicted GDT-TS score.
Except for the sigmoid activation function used in the two SE blocks and the ReLU activation used in the last output layer, all the other layers use the LeakyReLU activation function where applicable. The deep learning network above was trained with the Smooth L1 loss function [27]. Compared with the mean squared error (MSE) loss, the Smooth L1 loss is less sensitive to outliers, and unlike the standard L1 loss, it is differentiable at 0. Eq. 1 is the formula of the Smooth L1 loss, where \(x\) denotes the difference between the predicted and true GDT-TS scores. It is a combination of the MSE loss and the L1 loss. The derivative of the Smooth L1 loss is given by Eq. 2. The derivative is \(x\) when \(x\) is in the range [-1, 1], which is linear. Otherwise, it is a constant (1 or -1). This property helps the deep network train stably and converge fast.
\[\mathrm{smooth}_{L1}\left( x \right) = \begin{cases} 0.5x^{2}, & \left| x \right| < 1 \\ \left| x \right| - 0.5, & \text{otherwise} \end{cases} \quad (1)\]
\[\frac{d\,\mathrm{smooth}_{L1}\left( x \right)}{dx} = \begin{cases} x, & \left| x \right| < 1 \\ \mathrm{sign}\left( x \right), & \text{otherwise} \end{cases} \quad (2)\]
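The behaviour of the Smooth L1 loss and its derivative (Eq. 1 and Eq. 2) can be checked numerically with a short plain-Python sketch. It assumes the standard form of the loss with the transition point at |x| = 1; the function names are hypothetical.

```python
def smooth_l1(x):
    """Smooth L1 loss for a prediction error x: quadratic near 0, linear elsewhere."""
    if abs(x) < 1.0:
        return 0.5 * x * x   # MSE-like region
    return abs(x) - 0.5      # L1-like region, shifted so the two pieces join at |x| = 1


def smooth_l1_grad(x):
    """Derivative of the Smooth L1 loss: x inside (-1, 1), otherwise the sign of x."""
    if abs(x) < 1.0:
        return x
    return 1.0 if x > 0 else -1.0


# Loss and gradient for a few hypothetical errors (predicted minus true score).
errors = [-2.0, -0.5, 0.0, 0.5, 2.0]
loss_values = [smooth_l1(e) for e in errors]
grad_values = [smooth_l1_grad(e) for e in errors]
```

The gradient is bounded by 1 in absolute value, which is why a single badly predicted model cannot dominate a parameter update the way it can under the MSE loss.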
For the convolutional and linear layers, we used Kaiming initialization [28] to initialize the weights. We implemented the deep network with PyTorch [29]. It was trained with the Adam optimization method [30] with \(\beta_{1} = 0.9\) and \(\beta_{2} = 0.999\). The learning rate was set to a constant value of 0.00005 and the batch size to 16.
Datasets and evaluation metrics
We generated the difference map for each structural model predicted for CASP8-13 targets by CASP8-13 structure prediction servers. Each CASP protein target may have up to a few hundred structural models (decoys). The true GDT-TS scores of these models were calculated as labels to train the deep learning method. In total, 120,064 structural models were used for training and validation, and 14,580 structural models of CASP13 were used as the test dataset. The CASP8-12 targets used for training have different sequence lengths (see Fig. 7 for the length distribution). To improve the effectiveness of training on models of different lengths, the CASP8-12 models were divided into many batches, each of which consisted of 16 structural models of the same length. In each batch, 80% of the models were randomly selected and pooled together to form the training dataset, and the remaining models were used as the validation dataset. Figure 8 illustrates the GDT-TS score distribution of the training dataset. The density plot shows that it is approximately a mixture of two Gaussian components.
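The length-based batching and the per-batch 80/20 split described above could be organized roughly as in the following plain-Python sketch. The function names, the record format (model name, sequence length), the handling of leftover models that do not fill a batch of 16, and the rounding of the 80% cut are all assumptions, since the text does not specify them.

```python
import random
from collections import defaultdict


def make_batches(models, batch_size=16):
    """Group (name, length) records into batches whose members share one length."""
    by_length = defaultdict(list)
    for name, length in models:
        by_length[length].append(name)
    batches = []
    for length in sorted(by_length):
        names = by_length[length]
        # Assumption: a final, smaller batch is kept when a length group is not a multiple of 16.
        for start in range(0, len(names), batch_size):
            batches.append(names[start:start + batch_size])
    return batches


def split_batches(batches, train_fraction=0.8, seed=0):
    """Randomly send about 80% of each batch to training and the rest to validation."""
    rng = random.Random(seed)
    train, valid = [], []
    for batch in batches:
        shuffled = list(batch)
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))  # assumed rounding rule
        train.extend(shuffled[:cut])
        valid.extend(shuffled[cut:])
    return train, valid


# Hypothetical models: 16 of length 100 and 8 of length 50.
models = [(f"model_{i}", 100) for i in range(16)] + [(f"short_{i}", 50) for i in range(8)]
batches = make_batches(models)
train_set, valid_set = split_batches(batches)
```

Keeping one sequence length per batch means every difference map in a batch has the same \(L \times L\) size, so the maps can be stacked into a single tensor without padding.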
The prediction performance of DISTEMA and other methods for a target was evaluated by the GDT-TS score loss of ranking the models of the target, which is defined as the absolute difference between the true GDT-TS score of the best model of a target and that of the top model selected by the predicted GDT-TS scores of the models of the target. A ranking loss of 0 means that the best model for a target has been selected by the predicted GDT-TS scores. The average GDT-TS loss of ranking models over all the targets in the test dataset was used to evaluate the performance of the EMA methods. Moreover, the Pearson’s correlation coefficient between the predicted GDT-TS scores of the models of a target and their true GDT-TS scores was calculated. The average Pearson’s correlation coefficient over all the targets in the test dataset was also used to evaluate the performance of the EMA methods.
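Both evaluation metrics defined above are computed per target and then averaged over targets. A minimal plain-Python sketch follows; the function names, target names, and scores are hypothetical and are not CASP13 data.

```python
import math


def ranking_loss(true_scores, predicted_scores):
    """GDT-TS loss for one target: best true score minus true score of the top-ranked model."""
    top = max(range(len(predicted_scores)), key=lambda i: predicted_scores[i])
    return abs(max(true_scores) - true_scores[top])


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical targets: (true GDT-TS scores, predicted GDT-TS scores) per model.
targets = {
    "target_a": ([0.80, 0.75, 0.40], [0.70, 0.78, 0.35]),
    "target_b": ([0.55, 0.60, 0.20], [0.50, 0.65, 0.30]),
}
losses = [ranking_loss(true, pred) for true, pred in targets.values()]
mean_loss = sum(losses) / len(losses)
mean_pcc = sum(pearson(true, pred) for true, pred in targets.values()) / len(targets)
```

For the first hypothetical target the top-ranked model has a true score of 0.75 while the best model scores 0.80, giving a loss of 0.05; for the second target the best model is selected and the loss is 0.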
Availability of data and materials
Availability: https://github.com/jianlincheng/DISTEMA.
Abbreviations
DISTEMA: Distance-based estimation of protein model accuracy
CASP: Critical assessment of techniques for protein structure prediction
CNN: Convolutional neural network
GDT-TS: Global distance test total score
References
1. Cozzetto D, Kryshtafovych A, Ceriani M, Tramontano A. Assessment of predictions in the model quality assessment category. Prot Struct Funct Bioinf. 2007;69(S8):175–83.
2. McGuffin LJ, Buenavista MT, Roche DB. The modfold4 server for the quality assessment of 3d protein models. Nucleic Acids Res. 2013;41(W1):368–72.
3. Cao R, Bhattacharya D, Hou J, Cheng J. Deepqa: improving the estimation of single protein model quality with deep belief networks. BMC Bioinf. 2016;17(1):495.
4. Karasikov M, Pages G, Grudinin S. Smooth orientation-dependent scoring function for coarse-grained protein quality assessment. Bioinformatics. 2019;35(16):2801–8.
5. Hou J, Wu T, Cao R, Cheng J. Protein tertiary structure modeling driven by deep learning and contact distance prediction in casp13. Prot Struct Funct Bioinf. 2019;87(12):1165–78.
6. Chen X, Akhter N, Guo Z, Wu T, Hou J, Shehu A, Cheng J. Deep ranking in template-free protein structure prediction. In: Proceedings of the 11th ACM international conference on bioinformatics, computational biology and health informatics, pp. 1–10 (2020).
7. Wang Z, Tegge AN, Cheng J. Evaluating the absolute quality of a single protein model using structural features and support vector machines. Prot Struct Funct Bioinf. 2009;75(3):638–47.
8. Cheng J, Choe MH, Elofsson A, Han KS, Hou J, Maghrabi AH, McGuffin LJ, Menendez Hurtado D, Olechnovic K, Schwede T, et al. Estimation of model accuracy in casp13. Prot Struct Funct Bioinf. 2019;87(12):1361–77.
9. Uziela K, Shu N, Wallner B, Elofsson A. Proq 3: Improved model quality assessments using rosetta energy terms. Sci Rep. 2016;6(1):1–10.
10. Uziela K, Menendez Hurtado D, Shu N, Wallner B, Elofsson A. Proq3d: improved model quality assessments using deep learning. Bioinformatics. 2017;33(10):1578–80.
11. Hurtado DM, Uziela K, Elofsson A. Deep transfer learning in the assessment of the quality of protein models. arXiv preprint arXiv:1804.06281 (2018).
12. Mariani V, Biasini M, Barbato A, Schwede T. lddt: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics. 2013;29(21):2722–8.
13. Sanyal S, Anishchenko I, Dagar A, Baker D, Talukdar P. Proteingcn: protein model quality assessment using graph convolutional networks. bioRxiv (2020).
14. Baldassarre F, Menendez Hurtado D, Elofsson A, Azizpour H. GraphQA: protein model quality assessment using graph convolutional networks. Bioinformatics (2020). https://doi.org/10.1093/bioinformatics/btaa714
15. Igashov I, Olechnovic K, Kadukova M, Venclovas C, Grudinin S. Vorocnn: deep convolutional neural network built on 3d voronoi tessellation of protein structures. bioRxiv (2020).
16. Jing X, Xu J. Improved protein model quality assessment by integrating sequential and pairwise features using deep learning. bioRxiv (2020).
17. Hiranuma N, Park H, Baek M, Anishchenko I, Dauparas J, Baker D. Improved protein structure refinement guided by deep learning based accuracy estimation. Nat Commun. 2021;12(1):1340. https://doi.org/10.1038/s41467-021-21511-x.
18. Ray A, Lindahl E, Wallner B. Improved model quality assessment using proq2. BMC Bioinf. 2012;13(1):224.
19. Olechnovič K, Venclovas Č. Voromqa: assessment of protein structure quality using interatomic contact areas. Prot Struct Funct Bioinf. 2017;85(6):1131–45.
20. Shuvo MH, Bhattacharya S, Bhattacharya D. QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks. Bioinformatics. 2020;36(Supplement 1):285–91. https://doi.org/10.1093/bioinformatics/btaa455.
21. https://www.predictioncenter.org/download_area/CASP13/targets/casp13.targets.T.4public.tar.gz Date of access: 2018-11-20 23:59.
22. Zagoruyko S, Komodakis N. Wide residual networks. arXiv preprint arXiv:1605.07146 (2016).
23. Wu T, Guo Z, Hou J, Cheng J. Deepdist: real-value inter-residue distance prediction with deep residual convolutional network. BMC Bioinf. 2021;22(1):30. https://doi.org/10.1186/s12859-021-03960-9.
24. Zemla A. Lga: a method for finding 3d similarities in protein structures. Nucleic Acids Res. 2003;31(13):3370–4.
25. Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Prot Struct Funct Bioinf. 2004;57(4):702–10.
26. Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132–7141 (2018).
27. Ren S, He K, Girshick R, Sun J. Faster r-cnn: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp. 91–99 (2015).
28. He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE international conference on computer vision, pp. 1026–1034 (2015).
29. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, et al. Pytorch: an imperative style, high-performance deep learning library. In: Advances in neural information processing systems, pp. 8026–8037 (2019).
30. Kingma DP, Ba J. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
Acknowledgements
We thank CASP for providing the data for public use.
About this supplement
This article has been published as part of BMC Bioinformatics Volume 23 Supplement 3, 2022: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM 2021): bioinformatics. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-23-supplement-3.
Funding
The project is partially supported by two NSF Grants (DBI-1759934 and IIS-1763246), one NIH Grant (GM093123), two DOE Grants (DE-SC0020400 and DE-SC0021303), and the computing allocation on the Summit supercomputer provided by Oak Ridge Leadership Computing Facility (DOE Grant: DE-AC05-00OR22725). The funders do not play a role in conducting this research. The publication cost is funded by an NSF Grant (IIS-1763246).
Author information
Contributions
JC and XC designed this project. XC implemented it and collected the results. XC and JC wrote the manuscript. Both authors read and approved the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Chen, X., Cheng, J. DISTEMA: distance map-based estimation of single protein model accuracy with attentive 2D convolutional neural network. BMC Bioinformatics 23 (Suppl 3), 141 (2022). https://doi.org/10.1186/s12859-022-04683-1
DOI: https://doi.org/10.1186/s12859-022-04683-1
Keywords
 Protein quality
 Distance map
 Deep learning