 Research
 Open access
Simcryocluster: a semantic similarity clustering method of cryo-EM images by adopting contrastive learning
BMC Bioinformatics volume 25, Article number: 77 (2024)
Abstract
Background
Cryo-electron microscopy (cryo-EM) plays an increasingly important role in determining the three-dimensional (3D) structure of macromolecules. To achieve 3D reconstructions close to atomic resolution, 2D single-particle image classification is not only conducive to single-particle selection but is also a key step that affects 3D reconstruction. Its main task is to cluster and align 2D single-particle images into homogeneous groups and to obtain sharper single-particle images by averaging. The main difficulties are that cryo-EM single-particle images have a low signal-to-noise ratio (SNR), the data cannot be labeled manually, and the projection directions are random with an unknown distribution. Therefore, how to extract effective particle features at low SNR, improve clustering accuracy, and thereby improve reconstruction accuracy is a key problem in the 2D analysis of cryo-EM single-particle images.
Results
To address these problems, we propose a learnable deep clustering method and a fast alignment and weighted averaging method based on frequency-domain space, which together improve the class averages and the reconstruction accuracy. The feature extraction and dimensionality reduction module is particularly effective. Compared with classification methods based on Bayesian inference and maximum likelihood, which require a large amount of single-particle data to estimate the relative angular orientation of macromolecular single particles in the 3D structure, the proposed clustering method shows good results.
Conclusions
SimCryoCluster uses contrastive learning to perform well on the unlabeled, high-noise cryo-EM single-particle image classification task, making it an important tool for cryo-EM protein structure determination.
Background
In the life sciences, the structure of biological molecules determines their function, and their three-dimensional structure is becoming increasingly important for basic research and applications. Structural biology methods mainly include X-ray crystallography [1], nuclear magnetic resonance (NMR) spectroscopy [2] and cryo-electron microscopy (cryo-EM). In recent years, technological advances in sample preparation, computation, and especially instrumentation have made single-particle cryo-EM increasingly important in structural biology. To build a high-resolution 3D reconstruction of a protein structure with cryo-EM, hundreds of thousands of extracted single-particle images must be accurately classified in 2D [3]. 2D classification is an important intermediate stage in cryo-EM 3D reconstruction [4], and the class averages obtained at this stage can be used both as templates for single-particle selection and as the basis for constructing subsequent 3D initial models [4, 5]. During imaging, to prevent high-dose electron beams from causing radiation damage to the sample and breaking covalent bonds, low-dose electron beam imaging is generally used, which results in a very low SNR in the obtained micrographs [6]. In addition, single particles imaged in the free state at low dose cannot be labeled manually by projection angle, so data with ground-truth labels cannot be obtained. As a result, deep learning models that currently perform well on supervised classification tasks cannot be applied directly to cryo-EM single-particle image classification, and the quality of the classification results is difficult to evaluate.
Given these problems, finding a high-precision and effective 2D classification method is of great significance for the quality of 3D reconstruction.
Over the past few decades, many different approaches have been proposed for 2D classification of cryo-EM single-particle images. The main unsupervised 2D classification methods currently in use are: cross-correlation (CC) and multivariate statistical analysis (MSA), which enable K-means clustering with reference-free alignment [7, 8]; unsupervised maximum likelihood (ML) or maximum a posteriori (MAP) classification [9]; the statistical manifold learning algorithm (ROME) [10] for unsupervised single-particle deep clustering; and classification based on variational autoencoders (VAE) [11,12,13,14] and multi-reference alignment (MRA) [15, 16]. The first two are traditional unsupervised classification methods, and the latter two are reference-free clustering methods based on deep learning. In the first method, classification accuracy is affected by noise-induced misalignment resulting from false peaks in the cross-correlation calculation. Noise in single-particle images also affects the distance calculation in K-means clustering, so the performance of this method degrades as the SNR decreases. Compared with the K-means method, the ML-based method searches for the optimal probability when measuring image similarity and is robust in noisy single-particle image alignment tasks. Its key problem is that likelihood matching insufficiently differentiates structural heterogeneity among similar but critically different views. Because of this structural heterogeneity, the number of valid classes in each set of classification results is low, and more cycles of ML optimization are required. To overcome the shortcomings of these two traditional reference-free classification methods, Jiayi Wu et al. [17] proposed a statistical manifold learning algorithm for unsupervised single-particle deep clustering. The algorithm can effectively detect structural differences between classes and improves detection accuracy.
However, there is still considerable room for improvement in accuracy on high-noise images. Subsequently, Guowei Ji et al. [11] proposed a 2D classification method based on variational autoencoders (VAE) and multi-reference alignment (MRA). This method first uses a VAE for noise reduction and then uses an MRA-based K-means algorithm for unsupervised clustering, which can effectively process electron microscopy single-particle images at low SNR. Vignesh Prasad et al. [12] proposed an image clustering method using a VAE with GMM priors, which jointly learns the prior and the posterior and in turn learns a latent-space representation for accurate clustering. This method does not require pretraining and is the first fully unsupervised VAE image clustering method. Nina Miolane et al. [13] combined a VAE and a GAN to learn the latent space of cryo-EM images: the encoder encodes the image into a latent variable, the decoder decodes it into a reconstructed image, and the discriminator estimates the probability that the input image is real. This method can compute the orientation and camera parameters of a given image. Alireza Nasiri et al. [14] proposed a translation and rotation group-equivariant variational autoencoder architecture, which learns translation- and rotation-invariant object representations in images in an unsupervised manner. However, the performance of this method on real cryo-EM particle images still needs to be strengthened.
To address the shortcomings of the above methods, we propose a low-SNR single-particle image classification method based on contrastive learning, which performs well on both simulated and real datasets. The main contributions of this article are as follows: (1) We propose a cryo-EM clustering model based on contrastive learning, which extracts features from unlabeled cryo-EM images, computes feature similarity to generate pseudo-label data, and then completes the classification. (2) We propose a frequency-domain interpolation method for efficient alignment within each class after classification, followed by the class average calculation.
In this study, we divide the task into three steps: preprocessing, training a deep learning model as a feature extractor, and learnable clustering of the extracted features. The method was trained and tested on three datasets: the 80S ribosome, the hyperpolarization-activated cyclic nucleotide-gated channel HCN1, and the TRPV1 complex with a polypeptide toxin and resiniferatoxin. In the 80S ribosome verification experiment, the ACC still reached 78.59 at SNR = 0.1, demonstrating the effectiveness of this method.
Methods
Our SimCryoCluster framework for 2D classification of cryo-EM single-particle images is shown in Fig. 1. The framework has two main parts: data preprocessing and image clustering. For the clustering part we propose a new method that uses a contrastive learning model to extract features from single-particle images and cluster them without supervision, and that does not require knowledge of the data distribution. The whole process consists of four main stages: data preprocessing, feature extraction based on contrastive learning, learnable clustering, and within-class reference alignment in frequency-domain space. In the preprocessing stage, several image processing methods are applied to enhance the input cryo-EM single-particle images, such as GAN-based denoising and Contrast Enhancement Correction (CEC). Feature extraction uses a contrastive learning network to extract feature vectors and thereby reduce the dimensionality of the data, similar to sPCA. Label-free classification maximizes the similarity of the feature vectors extracted by the network, computes the k nearest neighbors of each sample, and stores the neighbor features of each sample for the subsequent learnable clustering task. Within-class alignment in frequency-domain space estimates rotation-invariant features of an image and performs in-plane rotational and translational alignment of the classified data in Fourier space.
Feature extraction and contrast clustering
Because cryo-EM single-particle images have random orientations, heavy noise, and an unknown distribution, methods based on maximum likelihood do not yield good results. In such a low-SNR scenario, how to obtain effective features and improve clustering accuracy is a key issue in the analysis of cryo-EM single-particle data. To address it, we construct a label-free classification network based on contrastive learning, which in turn accomplishes 2D classification. Before the single-particle images are fed into the feature extraction network, data augmentation is applied to the input as a pretext task. For each image in the dataset, two random combinations of augmentations are applied (e.g., crop, resize, and recolor; resize and recolor; crop and recolor; and many other combinations). The two augmented images are essentially different versions of the same image. Both are fed into the feature extraction network, and each produces a corresponding feature vector; the goal is to train the model to output similar representations for similar images. Details are shown in Fig. 2.
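The two-view augmentation step can be illustrated with a minimal NumPy sketch. The specific transformations, the crop fraction, and the gain and offset ranges used as a stand-in for "recolor" are illustrative assumptions, as are the function names `random_view` and `two_views`; the text does not specify them.

```python
import numpy as np

def random_view(img, rng, crop_frac=0.8):
    """Return one randomly augmented copy of a square 2D image."""
    n = img.shape[0]
    c = int(round(n * crop_frac))
    # Random crop of size c x c ...
    top = rng.integers(0, n - c + 1)
    left = rng.integers(0, n - c + 1)
    patch = img[top:top + c, left:left + c]
    # ... resized back to n x n by nearest-neighbour index lookup.
    idx = np.arange(n) * c // n
    view = patch[np.ix_(idx, idx)]
    # Random horizontal flip and random multiple-of-90-degree rotation.
    if rng.random() < 0.5:
        view = view[:, ::-1]
    view = np.rot90(view, k=int(rng.integers(0, 4)))
    # Random intensity gain and offset (assumed stand-in for "recolor").
    gain = rng.uniform(0.8, 1.2)
    offset = rng.uniform(-0.1, 0.1)
    return gain * view + offset

def two_views(img, seed=0):
    """Produce two independently augmented versions of the same image."""
    rng = np.random.default_rng(seed)
    return random_view(img, rng), random_view(img, rng)
```

Both views keep the input size, so they can be passed through the same encoder as one positive pair.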
Each original image X is transformed by data augmentation into \(X_i\) and \(X_j\). The base encoder network \(f(\cdot )\) is trained to produce the feature vectors of the augmented images, and a small neural-network projection head \(g(\cdot )\) maps these representations to the space in which the contrastive loss is applied. The contrastive loss is minimized, which maximizes the agreement between the two views and ensures consistent features after training. After training is complete, we discard the projection head \(g(\cdot )\) and use the encoder \(f(\cdot )\) and the representation h for the downstream task. Here \({\varvec{h}_{i}=f\left( \tilde{\varvec{x}}_{i}\right) =ResNet({\tilde{\varvec{x}}_{i})}}\), where \({\varvec{h}_{i} \in R^{d}}\) is the output after the average pooling layer; \(h_i\) is the sample feature output by the backbone network f, and \(z_i\) is the feature output by the projection head g.
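The text does not write out the contrastive loss. As a hedged illustration, the sketch below implements the normalized temperature-scaled cross-entropy (NT-Xent) loss commonly used with this encoder-plus-projection-head design; the choice of NT-Xent, the temperature value, and the function name `nt_xent_loss` are assumptions, not details confirmed by the paper.

```python
import numpy as np

def nt_xent_loss(z_i, z_j, temperature=0.5):
    """NT-Xent loss for N positive pairs; z_i and z_j have shape (N, d)."""
    z = np.concatenate([z_i, z_j], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit length: dot = cosine
    sim = z @ z.T / temperature                        # (2N, 2N) similarity logits
    np.fill_diagonal(sim, -np.inf)                     # a sample is not its own negative
    n = z_i.shape[0]
    # Row r's positive partner is the other view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

The loss falls as the projected features \(z_i\) and \(z_j\) of the two views move closer together relative to all other samples in the batch.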
Because the images are to be assigned to 11 classes, the 512-dimensional feature is reduced to an 11-dimensional feature vector by a linear layer. A softmax layer converts this vector into a probability vector, and the cosine similarity between probability vectors is then used to obtain the final clustering result. The probability vectors also serve to generate pseudo-labels: when the highest class probability of a sample exceeds 0.95, that class is taken as its pseudo-label, and the pseudo-labeled samples are fed back to train the network again and update its weights.
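The pseudo-label rule can be sketched as follows. The 0.95 threshold comes from the text; the function names and the use of raw logits as input are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def select_pseudo_labels(logits, threshold=0.95):
    """Return indices and labels of samples whose top probability exceeds the threshold."""
    probs = softmax(logits)
    confidence = probs.max(axis=1)   # highest class probability per sample
    labels = probs.argmax(axis=1)    # class attaining that probability
    keep = confidence > threshold
    return np.flatnonzero(keep), labels[keep]
```

Only the confidently assigned samples are returned, so the retraining step sees a smaller but cleaner labeled subset.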
Alignment based on frequency domain space
Image alignment is a basic and essential step in the 2D classification of cryo-EM single-particle images [18, 19]. Its purpose is to estimate three alignment parameters: the rotation angle and the translations along the x- and y-axes. Rotational and translational alignment is also often performed in the spatial domain, but there it is usually carried out by rotating in fixed steps and matching, so several iterations are needed to compute the alignment parameters, and the result is an integer [20]. In frequency-domain space, the alignment parameters can be calculated directly without enumeration. On this basis, we use an alignment algorithm based on two-dimensional nearest-neighbor interpolation in the frequency domain of the image, which improves the accuracy of the estimated parameters. The procedure is divided into rotational alignment and translational alignment. For rotational alignment, the two images in the class are first transformed by the parallel fast Fourier transform (PFFT), and the cross-correlation matrix of the two images is calculated. The maximum value in the matrix is located, two-dimensional interpolation is performed around it, and the rotation angle between the two images is determined directly from the position of the maximum. For translational alignment, only a fast Fourier transform (FFT) of the two images is required. During particle picking, particles are usually selected with a circle of a given radius; at the extraction step, the extraction box is usually a square whose side is no smaller than the diameter of that circle. The single-particle images we handle can therefore be taken to have size \(n\times n\), and the rotational alignment used in this article assumes square images.
Rotational alignment consists of three steps: calculating the cross-correlation matrix, performing two-dimensional nearest-neighbor interpolation, and calculating the rotation angle. The process is shown in Fig. 3.
First, let the two input images be \(N_i\) and \(N_j\). The parallel fast Fourier transform yields their spectra \(F_i\) and \(F_j\), each of size \((n/2)\times 360\), from which the cross-correlation matrix P is computed as in Eq. 1.
where \(conj(\cdot )\) denotes the complex conjugate, \(ifft(\cdot )\) denotes the two-dimensional inverse fast Fourier transform, and \(abs(\cdot )\) denotes the absolute value; all three functions are available in MATLAB. The values in the cross-correlation matrix P need to be cyclically shifted by m/4 positions to bring the maximum value to the horizontal center; this shift can be implemented with the \('circshift'\) function in MATLAB.
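Given the three functions named above, the cross-correlation matrix plausibly has the form \(P = abs(ifft(F_i \cdot conj(F_j)))\); this form is an assumption inferred from the definitions. A minimal NumPy sketch of it follows, assuming the inputs are real-valued maps already sampled on an \((n/2)\times 360\) radius-by-angle grid, with their 2D FFTs standing in for \(F_i\) and \(F_j\). The function names are hypothetical.

```python
import numpy as np

def angular_cross_correlation(polar_i, polar_j):
    """Cross-correlation matrix P of two polar-sampled maps (radius x angle)."""
    F_i = np.fft.fft2(polar_i)
    F_j = np.fft.fft2(polar_j)
    # Correlation theorem: inverse FFT of F_i times the conjugate of F_j.
    return np.abs(np.fft.ifft2(F_i * np.conj(F_j)))

def integer_rotation(polar_i, polar_j):
    """Integer angular shift s (in grid columns) such that polar_i ~ roll(polar_j, s)."""
    P = angular_cross_correlation(polar_i, polar_j)
    _, col = np.unravel_index(np.argmax(P), P.shape)
    return int(col)
```

With 360 angular samples, one column corresponds to one degree. `np.roll` is the NumPy counterpart of MATLAB's `circshift` if the peak needs to be recentred before interpolation.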
Two-dimensional interpolation is then performed near the maximum of the cross-correlation matrix. The rotation angle \(\omega\) of image \(N_j\) relative to \(N_i\) can be determined from the position of the maximum of P along the x-axis; the value of \(\omega\) obtained in this way is an integer. To refine it, we first locate the maximum of P, extract the sub-matrix \(\widehat{P}\) centered on that maximum (the red line in Fig. 9), and perform two-dimensional interpolation within \(\widehat{P}\); see the \('interp2'\) function in MATLAB.
Finally, the rotation angle is calculated. From the position of the maximum in the interpolated matrix along the x-axis, the rotation angle \(\omega\) can be computed directly, where \(\omega \in [-\pi , \pi ]\). For a more convenient representation, the rotation angle is adjusted to a positive integer, as shown in Eq. 2.
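The refinement idea can be illustrated in one dimension. The sketch below is a simplification, not the paper's procedure: it replaces the two-dimensional `interp2` interpolation with a three-point parabolic fit along the angular axis, and it assumes that Eq. 2 amounts to wrapping negative angles into the range [0, 360). Both function names are hypothetical.

```python
import numpy as np

def refine_peak_1d(values, k):
    """Sub-sample peak position from a parabola through samples k-1, k, k+1 (circular)."""
    n = len(values)
    y0, y1, y2 = values[(k - 1) % n], values[k], values[(k + 1) % n]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return float(k)  # flat neighbourhood: keep the integer position
    return k + 0.5 * (y0 - y2) / denom

def to_positive_degrees(omega_deg):
    """Wrap an angle in degrees into [0, 360)."""
    return omega_deg % 360.0
```

For a cross-correlation row whose true maximum lies between two grid points, the parabolic fit recovers the fractional offset that an integer `argmax` would miss.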
Class averaging
To further improve the SNR of cryo-EM single-particle images, the images in each class are averaged after classification to obtain the class average. To improve the class averages, we replaced the traditional direct average with a weighted average based on the probability vector \([p_1, p_2, \ldots , p_i]\) output by the softmax layer in Fig. 8b. This vector determines the class of each image, and the probability value of the assigned class is saved. All probability values within a class are normalized to obtain the corresponding weights w, where \(0 \le w < 1\); each image \(I_j\) in the class is multiplied by its weight w, and the weighted images are summed to obtain the final class average. The final result is shown in Fig. 6b.
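The weighted average described above can be written in a few lines of NumPy. This is a minimal sketch that assumes the images of one class are already aligned and that normalization means dividing the saved probabilities by their sum; the function name is hypothetical.

```python
import numpy as np

def weighted_class_average(images, confidences):
    """Weighted mean of aligned particles.

    images: array of shape (m, n, n), the aligned particles of one class.
    confidences: array of shape (m,), the softmax probability of each particle.
    """
    images = np.asarray(images, dtype=float)
    w = np.asarray(confidences, dtype=float)
    w = w / w.sum()                          # normalise so the weights sum to 1
    return np.tensordot(w, images, axes=1)   # sum_j w_j * I_j, shape (n, n)
```

With equal confidences this reduces to the plain mean, so the weighting only matters when the classifier is more certain about some particles than others.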
Results
Construct simulation dataset
Because raw cryo-EM single-particle images carry no labels and cannot be labeled by experts, evaluating 2D classification results in 3D reconstruction is a major challenge. To address this, we constructed a simulated cryo-EM dataset by using Scipion to compute 2D projections of particles and adding real noise to the resulting images [21]. To obtain the projections, we first selected three protein macromolecules with PDB structures from the Electron Microscopy Data Bank (EMDB) [22], with IDs 3j7a, 5u6o and 5irx; their parameters are shown in Table 1. We then used the XMIPP [23] software package to simulate the effects of a real microscope. The projections are generated mainly by varying the rotation angle (rot) and the tilt angle, both with a step size of 5\(^{\circ }\); the rotation angle is sampled according to Eq. 3, and the tilt angle is varied in the same way.
where \({rot}_0\), \({rot}_F\) and \({rot}_{Step}\) denote the minimum, maximum and step value of the rotation angle. The rotation angle ranges from 0\(^{\circ }\) to 360\(^{\circ }\) and the tilt angle from 0\(^{\circ }\) to 180\(^{\circ }\), both in degrees; \(tilt=0\) corresponds to the top view and \(tilt=90\) to the side view. We generated 1100 projected images for each protein, covering 11 different horizontal rotation angles (grouped within 5\(^{\circ }\)), i.e. 100 projected images per rotation angle.
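A regular angular sampling of this kind can be enumerated as below. The sketch assumes that Eq. 3 describes an arithmetic progression \({rot}_0, {rot}_0+{rot}_{Step}, \ldots\) up to \({rot}_F\), that 360\(^{\circ }\) is excluded because it duplicates 0\(^{\circ }\), and that the tilt endpoint of 180\(^{\circ }\) is included; none of these details is stated in the text.

```python
import numpy as np

def angle_grid(start, stop, step):
    """Angles start, start + step, ... strictly below stop, in degrees."""
    return np.arange(start, stop, step, dtype=float)

# Rotation: 0, 5, ..., 355 degrees (360 would repeat 0).
rot = angle_grid(0, 360, 5)
# Tilt: 0, 5, ..., 180 degrees (the upper bound is pushed one step to include 180).
tilt = angle_grid(0, 180 + 5, 5)
```

Under these assumptions the grid has 72 rotation values and 37 tilt values, with the side view at `tilt[18]`.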
Fusing simulated noise
In previous methods of constructing simulation datasets, researchers typically used Gaussian white noise to simulate cryo-EM noise. However, the noise in real cryo-EM data is difficult to isolate, and its distribution is difficult to derive, so cryo-EM images with a known noise distribution are generally unavailable. The noise in real images is complex and of unknown distribution, so existing models trained on one particular kind of noise do not yield good results [24]. In this paper, a U-Net network is used to separate noise blocks from particle (pixel) blocks. The extracted noise blocks are superimposed on the projection maps to construct simulated noisy single-particle images corresponding to the clean particles, which can be used for subsequent network training [21]. A visual comparison of the dataset construction is shown in Fig. 4.
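Superimposing an extracted noise block at a chosen noise level can be sketched as follows. The text does not define SNR; the sketch assumes the common variance-ratio definition, SNR = var(signal) / var(noise), and the function name is hypothetical.

```python
import numpy as np

def add_noise_at_snr(clean, noise, snr):
    """Superimpose a noise patch on a clean projection at a target SNR.

    SNR is taken here as var(signal) / var(noise), which is an assumption.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    noise = noise - noise.mean()                         # zero-mean noise block
    scale = np.sqrt(clean.var() / (snr * noise.var()))   # rescale to the target variance
    return clean + scale * noise
```

Calling the function with `snr=0.1` and `snr=0.6` on the same clean projection produces the two noise levels used in the experiments, under the stated definition.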
Performance evaluation metrics
To better evaluate the results of this method on the simulated dataset, we used accuracy (ACC), given in Eq. 4, to evaluate the results of the feature extraction phase and the Fowlkes-Mallows index (FMI) to evaluate clustering performance [25].
where \(\omega =\left\{ I_1,I_2,\ldots ,I_j\right\}\) is the set of clusters obtained by k-nearest-neighbor classification of the extracted features, and \(\textbf{C}=\left\{ c_1,c_2,\ldots {,c}_k\right\}\) is the set of ground-truth label classes. The cluster indices produced after feature extraction may not coincide with the original label indices at verification time, which would lower the measured ACC. When validating the results, this paper therefore uses the Kuhn-Munkres algorithm to compute the maximum matching between clusters and labels. The FMI is given in Eq. 5.
where TP is the number of correctly classified particles among all single-particle images, FP is the number of misclassified particles (false positives), and FN is the number of particles incorrectly rejected (false negatives). Because the dataset is simulated, the ground-truth labels can be recorded reliably from the projection angle of each image.
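Both metrics can be sketched with the standard library alone. Two assumptions apply. First, the optimal cluster-to-label matching is found here by exhaustive search over permutations, which gives the same optimum as the Kuhn-Munkres algorithm but is only practical for a handful of classes. Second, Eq. 5 is assumed to have the usual form \(FMI = TP/\sqrt{(TP+FP)(TP+FN)}\), evaluated here in its standard pair-counting version.

```python
from itertools import permutations
from math import sqrt

def clustering_accuracy(true_labels, cluster_ids):
    """Best accuracy over all one-to-one maps from cluster IDs to class labels."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    # Pad the shorter side so every cluster can be mapped to something.
    size = max(len(classes), len(clusters))
    classes += [None] * (size - len(classes))
    clusters += [None] * (size - len(clusters))
    best = 0
    for perm in permutations(classes):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[c] == t for t, c in zip(true_labels, cluster_ids))
        best = max(best, hits)
    return best / len(true_labels)

def fowlkes_mallows(true_labels, cluster_ids):
    """Pair-counting Fowlkes-Mallows index."""
    tp = fp = fn = 0
    n = len(true_labels)
    for a in range(n):
        for b in range(a + 1, n):
            same_class = true_labels[a] == true_labels[b]
            same_cluster = cluster_ids[a] == cluster_ids[b]
            tp += same_class and same_cluster        # pair kept together, correctly
            fp += same_cluster and not same_class    # pair merged, wrongly
            fn += same_class and not same_cluster    # pair split, wrongly
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))
```

For the 11-class setting of the paper the permutation search would be far too slow, which is why a polynomial-time assignment solver such as Kuhn-Munkres is the appropriate choice in practice.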
Data preprocessing
Step 1: Voxel image conversion
The acquired raw cryo-EM images are stored in the MRC format (named after the Medical Research Council), which defines a three-dimensional grid (array) of voxels, each with a value corresponding to the electron density or potential. To facilitate preprocessing and improve the SNR, we converted the cryo-EM single-particle images from MRC format to the commonly used 16-bit PNG format. We preprocess the simulated dataset and the real dataset separately. To simplify the handling of real datasets, all files are named in ascending order during format conversion, so that the classification results can be output according to the file name index. Our goal is to extract the feature information of the single-particle images with the contrastive learning network, compare the features by similarity, and obtain the clustering results. Therefore, to help the network learn better, we use a denoising method based on a generative adversarial network (GAN) to improve the quality of the cryo-EM single-particle images. In addition, we perform contrast adjustment on the converted images.
Step 2: Contrast adjustment
Because low-dose cryo-EM imaging records the particle region under defocus, the resulting single-particle images have low contrast and are difficult to identify. Histogram equalization based on a uniform distribution can be used to spread the intensity values of image pixels [26]. It improves global image contrast by mapping the original image histogram to a uniform histogram. Therefore, to help subsequent network models learn better, we perform contrast adjustment on the images.
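Global histogram equalization can be written compactly with NumPy. This is a minimal sketch for 16-bit integer images, matching the PNG format mentioned earlier; the function name and the lookup-table formulation are illustrative choices.

```python
import numpy as np

def equalize_histogram(img, levels=65536):
    """Global histogram equalisation of a non-negative integer image (up to 16 bit)."""
    img = np.asarray(img)
    hist = np.bincount(img.ravel(), minlength=levels)   # grey-level histogram
    cdf = np.cumsum(hist).astype(float)                 # cumulative distribution
    cdf_min = cdf[cdf > 0][0]                           # CDF at the darkest used level
    denom = max(cdf[-1] - cdf_min, 1.0)
    # Map each grey level through the normalised CDF onto the full output range.
    lut = np.clip(np.round((cdf - cdf_min) / denom * (levels - 1)), 0, levels - 1)
    return lut.astype(np.uint16)[img]
```

A narrow band of input intensities is stretched over the whole 16-bit range, and the mapping is monotonic, so the ordering of pixel intensities is preserved.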
Step 3: Single-particle image denoising
In cryo-EM imaging the electron dose is small, so the contrast between protein and solvent is low and the images are noisy. Image restoration techniques are commonly used to denoise cryo-EM single-particle images: based on prior knowledge of the degradation process, image restoration recovers and improves image quality by identifying the type of noise and then removing it. We therefore chose the improved generative adversarial network (GAN) [27] denoising algorithm proposed by Tang et al. [21], which preserves as much of the internal conformational information of the protein as possible while reducing noise. In this GAN architecture, the generator network adopts a symmetric structure that consists of three blocks: the convolution block, the residual block, and the sub-pixel convolution block. The discriminator is a five-layer convolutional network that includes batch normalization layers and LReLU layers. The network is trained on a simulated dataset constructed from various particle images and tested on the EMD-0406 and EMD-23579 datasets, each with added noise blocks of varying levels. We used RELION to pick particles from the Plasmodium falciparum 80S ribosome and EMD-3347 datasets and conducted denoising experiments on the picked particles; the denoising improves the quality of the cryo-EM single-particle images, as shown in Fig. 5.
Construct network training and test
To evaluate the network model used in this paper during feature extraction and clustering, we construct a simulation dataset to verify the proposed method. We assume that all particles in the single-particle images of the simulation dataset are valid particles, select 11 directions with different projection angles, divide all particles into 11 classes, and randomly rotate each class to obtain a dataset of 2900 single-particle images. We split the dataset into about \(70\%\) for training and validation (2000 particle images, 1800 for training and 200 for validation) and about \(30\%\) for testing (900 particle images). In addition, to avoid overfitting, the input images were horizontally flipped, randomly cropped, rotated, padded, and so on to expand the dataset. The batch size was set to 128, the number of iterations was set to 500, and the model was optimized with stochastic gradient descent (SGD). The learning rate follows a dynamic adjustment strategy with an initial value of 0.4. We added different levels of noise to the training and validation sets, constructing new datasets with signal-to-noise ratios of SNR = 0.1 and SNR = 0.6.
Experiments on testing contrast learning classification models
Step 1: Experiments on feature extraction
In the first stage, we build the labeled training set and validation set into binary files, storing the training set randomly in 5 \(trainbatch\) files and the test set in the \(testbatch\) file. Labels are ignored during training of the contrastive learning network, and the network is trained with feedback from the augmented versions of each image. Input images of any size are converted to 128 \(\times\) 128 before feature extraction. ResNet18 [28] is used as the backbone; after training, each image is converted into a 512-dimensional feature vector, which is then fed into a multilayer perceptron (MLP) [29] and output as a normalized 128-dimensional feature vector. To verify the effectiveness of the feature extraction, the K-nearest-neighbor (KNN) method is used to find the neighbors of the normalized feature vectors, and the corresponding labels are looked up by index, which gives the verification accuracy after KNN classification. The K neighbors of each feature vector are stored in a temporary library and used as input to the second stage. The loss curve during training and the KNN verification accuracy curve are shown in Fig. 6a, b. Figure 6a shows that when the SNR is relatively high, the loss decreases overall and the feature information is learned effectively. When SNR = 0.1, training converges slowly: the network learns slowly during the first 200 epochs, after which the loss begins to converge gradually and the curve flattens near 500 epochs. Figure 6b shows that the final verification accuracy differs with the SNR.
As training proceeds, the learning and representation ability of the network improves and effective feature information is obtained, leading to better verification accuracy. When SNR = 0.6, the top-5 accuracy approaches the value of about \(93.1\%\) obtained for noise-free particle images. When SNR = 0.1, the top-5 accuracy reaches \(87.92\%\). We also ran several experiments on the choice of K and found that the verification accuracy is highest when K = 5, as shown in Fig. 7. This validation accuracy demonstrates the validity of the first-stage feature extraction. In addition, the meaningful nearest neighbors found in the first stage can be integrated into the second-stage learnable clustering method as prior knowledge.
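The KNN check on normalized feature vectors can be sketched as below. The sketch uses cosine similarity and a majority vote among the K neighbors, which are assumptions; the text reports top-5 accuracy and does not spell out the voting rule. The function name is hypothetical, and the returned neighbor indices correspond to what the text stores for the second stage.

```python
import numpy as np

def knn_predict(train_feats, train_labels, query_feats, k=5):
    """Majority-vote label among the k most cosine-similar training features."""
    a = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    b = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sim = b @ a.T                                  # (n_query, n_train) cosine similarities
    nn_idx = np.argsort(-sim, axis=1)[:, :k]       # indices of the k nearest neighbours
    nn_labels = np.asarray(train_labels)[nn_idx]
    n_classes = int(np.max(train_labels)) + 1
    # Count votes per class for every query sample.
    votes = np.stack([np.bincount(row, minlength=n_classes) for row in nn_labels])
    return votes.argmax(axis=1), nn_idx
```

Comparing the predicted labels with the held-out ground truth gives the validation accuracy, and repeating the call for several values of `k` reproduces the kind of sweep summarized in Fig. 7.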
Step 2: Experiments on contrastive clustering
The weights trained in the first stage serve as the prior input to the second stage. The network is then retrained, its output features are fed into the contrastive loss function, and the model is optimized continuously to form a learnable clustering network that completes the clustering task. The backbone is the standard ResNet18 [28]. For each sample, 10 nearest neighbors were identified by the instance discrimination task based on noise-contrastive estimation (NCE), and fusing the nearest-neighbor features from the first stage effectively improves the clustering performance of the network. The method is verified experimentally on the EMPIAR-10028 dataset at different SNRs, and the clustering results are shown in Table 2.
We compare our method with CL2D [32] and EMAN2 [33] on the simulated data at SNR = 0.1 and SNR = 0.6. In addition to these traditional methods, we also compare it with the classic convolutional autoencoder (CAE) and the improved iterative encoding method (IterVM); the results are shown in Table 3.
Experiments on alignment and class averaging
A high-precision class-average image can be obtained from the classification result by computing the pixel mean, which is also one way to improve the SNR of the image. However, the images need to be aligned before the pixel average is calculated. This paper proposes a fast frequency-domain method based on nearest-neighbor interpolation around the maximum, which estimates the rotation angle and the translational alignment parameters along the x-axis and y-axis. The implementation steps are given in the "Alignment based on frequency domain space" subsection of the Methods. We randomly selected one class from the clustering results for EMPIAR-10028, randomly selected a particle from that class as the reference for alignment, and then manually removed the wrong particles according to their index values. This manual intervention effectively improves the class average. We used the traditional cryo-EM single-particle reconstruction software RELION for the experiments because it is simple to operate, has a clear workflow, and offers high reconstruction accuracy. Figure 8a-d shows our class average results using RELION, together with a visualization of the rotational and translational alignment. The simulation experiments for alignment and class averaging in this subsection were run in MATLAB R2019b on a six-core system with 24 GB of RAM under Windows 10.
Performance comparison
Our SimCryoCluster classification model is compared with two mainstream 2D classification software packages, EMAN2 [33] and RELION 3 [34]. Because multiple noise sources in the cryo-EM imaging process contaminate the images and the SNR of the resulting single-particle images is low, the comparison is carried out at SNR = 0.1 and SNR = 0.6, with the 80S ribosome as the dataset; the results are shown in Table 4. Measured by running time and accuracy at the different SNRs, SimCryoCluster achieved better classification results in the experiment, taking the least time to classify the single-particle projection images and reaching an accuracy of up to 94.20, which indicates the effectiveness of our method.
Experiments on real datasets
To further verify the effectiveness of the proposed deep clustering method on cryoEM single-particle images, we took 5000 single-particle images picked automatically from the original cryoEM images and, after noise reduction, fed them into the network for clustering and alignment. We compared CL2D from the Scipion software with the Relion classification experiment that incorporates our denoising results. The particles were divided into 20 classes, the single-particle images in each class were averaged with weights according to SNR, and the class averages were sorted. The first five valid classes are visualized in Fig. 9b, and Fig. 9a shows the class averages obtained with CL2D. The visualization shows that the class averages after noise reduction effectively retain important particle information and that their noise is significantly lower than that of CL2D clustering.
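The SNR-weighted averaging step can be sketched as follows. The exact weighting scheme is not given in this section, so the code assumes weights directly proportional to each particle's estimated SNR; the function name `snr_weighted_average` is hypothetical, and the particles are assumed to be aligned already.

```python
import numpy as np


def snr_weighted_average(images, snrs):
    """Average aligned particle images with weights proportional to SNR.

    Assumption: weight_i = snr_i / sum(snr); the paper may use another scheme.
    """
    images = np.asarray(images, dtype=float)  # shape (n, height, width)
    weights = np.asarray(snrs, dtype=float)   # shape (n,)
    if images.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one SNR value per image")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("SNR values must be non-negative and not all zero")
    weights = weights / weights.sum()
    # Contract the particle axis: sum_i weights[i] * images[i].
    return np.tensordot(weights, images, axes=1)
```

Compared with a plain mean, this lets cleaner particles contribute more to the class average, while a particle with an SNR estimate of zero is ignored entirely.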
Reconstruction experiment
To further verify the effectiveness of the proposed method, class averages were generated with the classification method and alignment algorithm of this paper and then used for an initial three-dimensional reconstruction. The experiment used simulated cryoEM single-particle images from EMD-5785 and real cryoEM projection images from EMPIAR-10028, and was carried out with the ASPIRE software package (http://spr.math.princeton.edu/). The class averages were reconstructed into an initial three-dimensional structure with a common-line-based reconstruction method [35], implemented in the ASPIRE software package by the function cryo_estimate_mean. A projection matching algorithm was used to estimate the projection directions of the cryoEM images, and the common lines between classes were estimated with the weighted voting algorithm we propose. All cryoEM 3D structures were visualized with UCSF Chimera. The visualization results are shown in Fig. 10. The comparison shows that the proposed method reconstructs 70S ribosomes effectively.
Discussion
Our method tackles significant challenges faced by other 2D classification approaches: the difficulty of processing low-SNR micrographs, classification results that leave room for improvement, and the difficulty of calculating projection orientation information. To address the low SNR of micrographs, we apply noise reduction before clustering and set a mask, which effectively reduces the loss of delocalized information around the particles and retains more feature information while improving the SNR as much as possible. To address the low accuracy of classification results, we propose a deep clustering network based on contrastive learning that works in two stages: the first stage learns features with the contrastive learning network, and the second stage uses the features and weights from the first stage for clustering. Experiments show that it gives results superior to current deep learning methods on both the simulated and the real datasets. To address the difficulty of calculating projection orientation information, we propose a fast nearest-neighbor interpolation method based on the maximum value in frequency domain space, which effectively estimates the rotation angle and the translation alignment parameters along the x-axis and y-axis and plays an important role in evaluating data of the same class. However, the proposed method performs poorly on the classification of highly symmetrical structures, for which the rotation angle and orientation information are difficult to estimate.
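To make the first, contrastive stage more concrete, the sketch below computes the NT-Xent loss used in SimCLR-style contrastive learning, written in NumPy. It is a generic illustration and not the SimCryoCluster network: the function name `nt_xent_loss`, the temperature value, and the choice of NT-Xent itself are assumptions.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for paired embeddings.

    Row i of `z_a` and row i of `z_b` are embeddings of two augmented
    views of the same particle image; all other rows act as negatives.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a view is never its own negative
    # Index of the positive partner for each of the 2n rows.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(
        np.exp(sim - sim_max).sum(axis=1, keepdims=True)
    )
    return -log_prob[np.arange(2 * n), positives].mean()
```

The loss is small when the two views of each particle are closer to each other than to any other particle in the batch, which is what lets the second stage cluster the learned features without labels.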
Conclusions
Our method performs best on ribosomes, for which the angular search in Fourier space is easy because the ribosome structure lacks high symmetry and has a small molecular weight. Because the training dataset cannot be labeled, the unlabeled cryoEM single particles are classified with the improved contrastive learning clustering method, and pseudo-labels for self-supervised training are obtained from the probability vector of the second stage; this is unaffected by molecular symmetry and gives better classification results. Because the projection directions of single particles of biological macromolecules are difficult to estimate, a fast nearest-neighbor interpolation method based on the maximum value in frequency domain space is applied to the sample data within each class, which effectively estimates the rotation angle and translation parameters. Finally, the aligned single-particle images are averaged with weights according to SNR, which improves the class-average result. The method can also be applied to real datasets, and experimental results show that SimCryoCluster performs as well as the most advanced single-particle 2D classification methods.
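One common way to turn second-stage probability vectors into pseudo-labels is to keep only confident predictions. The sketch below shows that idea under stated assumptions: the confidence threshold of 0.9 and the function name `select_pseudolabels` are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_pseudolabels(probabilities, threshold=0.9):
    """Pick confident samples and their predicted classes.

    `probabilities` has shape (n_samples, n_classes), one probability
    vector per particle. Returns (indices_kept, labels_for_kept).
    """
    probabilities = np.asarray(probabilities, dtype=float)
    confidence = probabilities.max(axis=1)
    labels = probabilities.argmax(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]


# Example: only the first and third particles are confident enough.
probs = [[0.95, 0.03, 0.02], [0.40, 0.35, 0.25], [0.05, 0.92, 0.03]]
kept, pseudo = select_pseudolabels(probs)
```

Particles below the threshold are simply left out of the self-supervised training round rather than being forced into a class.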
Availability of data and materials
The datasets used in this study and their sources are available at https://www.ebi.ac.uk/empiar/.
Abbreviations
CryoEM: Cryo-electron microscopy
Micrograph: Digital image taken through a microscope
MRC: Medical Research Council
PNG: Portable network graphic
References
1. Rupp B. Biomolecular Crystallography: Principles, Practice, and Application to Structural Biology. Garland Science (2009).
2. Wüthrich K. NMR with proteins and nucleic acids. Europhys News. 1986;17(1):11–3.
3. Carroni M, Saibil HR. Cryo electron microscopy to determine the structure of macromolecular complexes. Methods. 2016;95:78–85.
4. Sieben C, Banterle N, Douglass KM, Gönczy P, Manley S. Multicolor single-particle reconstruction of protein complexes. Nat Methods. 2018;15(10):777–80.
5. Singer A, Shkolnisky Y. Three-dimensional structure determination from common lines in cryoEM by eigenvectors and semidefinite programming. SIAM J Imag Sci. 2011;4(2):543–72.
6. Milne JL, Borgnia MJ, Bartesaghi A, Tran EE, Earl LA, Schauder DM, Lengyel J, Pierson J, Patwardhan A, Subramaniam S. Cryo-electron microscopy – a primer for the non-microscopist. FEBS J. 2013;280(1):28–45.
7. Frank J. Three-dimensional Electron Microscopy of Macromolecular Assemblies: Visualization of Biological Molecules in Their Native State. Oxford University Press (2006).
8. Van Heel M, Frank J. Classification of particles in noisy electron micrographs using correspondence analysis. Pattern Recognit Pract. 1980;1:235–43.
9. Scheres SH. A Bayesian view on cryoEM structure determination. J Mol Biol. 2012;415(2):406–18.
10. Wu J, Ma YB, Congdon C, Brett B, Chen S, Xu Y, Ouyang Q, Mao Y. Massively parallel unsupervised single-particle cryoEM data clustering via statistical manifold learning. PLoS ONE. 2017;12(8):0182130.
11. Ji G, Yang Y, Shen HB. Itervm: an iterative model for single-particle cryoEM image clustering based on variational autoencoder and multireference alignment. In: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018;999–1002. IEEE.
12. Prasad V, Das D, Bhowmick B. Variational clustering: Leveraging variational autoencoders for image clustering. In: 2020 International Joint Conference on Neural Networks (IJCNN), 2020;1–10. IEEE.
13. Miolane N, Poitevin F, Li YT, Holmes S. Estimation of orientation and camera parameters from cryo-electron microscopy images with variational autoencoders and generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 970–971 (2020).
14. Nasiri A, Bepler T. Unsupervised object representation learning using translation and rotation group equivariant VAE. arXiv preprint arXiv:2210.12918 (2022).
15. Ma C, Bendory T, Boumal N, Sigworth F, Singer A. Heterogeneous multireference alignment for images with application to 2D classification in single particle reconstruction. IEEE Trans Image Process. 2019;29:1699–710.
16. van Heel M, Harauz G, Orlova EV, Schmidt R, Schatz M. A new generation of the IMAGIC image processing system. J Struct Biol. 1996;116(1):17–24.
17. Wu J, Ma YB, Congdon C, Brett B, Chen S, Ouyang Q, Mao Y. Unsupervised single-particle deep clustering via statistical manifold learning. arXiv preprint arXiv:1604.04539 (2016).
18. Joyeux L, Penczek PA. Efficiency of 2D alignment methods. Ultramicroscopy. 2002;92(2):33–46.
19. Yang Z, Penczek PA. CryoEM image alignment based on nonuniform fast Fourier transform. Ultramicroscopy. 2008;108(9):959–69.
20. Wang X, Lu Y, Liu J. A fast image alignment approach for 2D classification of cryoEM images using spectral clustering. Curr Issues Mol Biol. 2021;43(3):1652–68.
21. Huanrong Tang SW, Ouyang J, Liu T. A noise extraction method for cryoEM single-particle denoising. J Big Data. 2022;4(1):61–76.
22. Goodsell DS, Burley SK. RCSB protein data bank resources for structure-facilitated design of mRNA vaccines for existing and emerging viral pathogens. Structure. 2022;30(1):55–68.
23. De la Rosa-Trevín J, Otón J, Marabini R, Zaldívar A, Vargas J, Carazo J, Sorzano C. Xmipp 3.0: an improved software suite for image processing in electron microscopy. J Struct Biol. 2013;184(2):321–328.
24. Chen J, Chen J, Chao H, Yang M. Image blind denoising with generative adversarial network based noise modeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018;3155–3164.
25. Fowlkes EB, Mallows CL. A method for comparing two hierarchical clusterings. J Am Stat Assoc. 1983;78(383):553–69.
26. Al-Azzawi A, Ouadou A, Tanner JJ, Cheng J. Autocryopicker: an unsupervised learning approach for fully automated single particle picking in cryoEM images. BMC Bioinf. 2019;20(1):1–26.
27. Gupta H, Phan TH, Yoo J, Unser M. Multicryogan: Reconstruction of continuous conformations in cryoEM using generative adversarial networks. In: European Conference on Computer Vision, 2020;429–444. Springer.
28. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016;770–778.
29. Ramchoun H, Ghanou Y, Ettaouil M, Janati Idrissi MA. Multilayer perceptron: architecture optimization and training (2016).
30. Chen T, Kornblith S, Swersky K, Norouzi M, Hinton GE. Big self-supervised models are strong semi-supervised learners. Adv Neural Inf Process Syst. 2020;33:22243–55.
31. Van Gansbeke W, Vandenhende S, Georgoulis S, Proesmans M, Van Gool L. Scan: Learning to classify images without labels. In: European Conference on Computer Vision, 2020;268–285. Springer.
32. Sorzano COS, Bilbao-Castro J, Shkolnisky Y, Alcorlo M, Melero R, Caffarena-Fernández G, Li M, Xu G, Marabini R, Carazo J. A clustering approach to multireference alignment of single-particle projections in electron microscopy. J Struct Biol. 2010;171(2):197–206.
33. Tang G, Peng L, Baldwin PR, Mann DS, Jiang W, Rees I, Ludtke SJ. EMAN2: an extensible image processing suite for electron microscopy. J Struct Biol. 2007;157(1):38–46.
34. Zivanov J, Nakane T, Forsberg BO, Kimanius D, Hagen WJ, Lindahl E, Scheres SH. New tools for automated high-resolution cryoEM structure determination in RELION-3. eLife. 2018;7:42166.
35. Park W, Madden DR, Rockmore DN, Chirikjian GS. Deblurring of class-averaged images in single-particle electron microscopy. Inverse Prob. 2010;26(3):035002.
Acknowledgements
Not applicable.
Funding
This research was supported by Key Projects of the Ministry of Science and Technology of the People's Republic of China (2018AAA0102301).
Author information
Contributions
Complete experiments and paper writing
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Tang, H., Wang, Y., Ouyang, J. et al. Simcryocluster: a semantic similarity clustering method of cryoEM images by adopting contrastive learning. BMC Bioinformatics 25, 77 (2024). https://doi.org/10.1186/s12859-023-05565-w
DOI: https://doi.org/10.1186/s12859-023-05565-w