Novel group-based QSAR and combinatorial design of CK-1δ inhibitors as neuroprotective agents

Joshi, Kopal; Goyal, Sukriti; Grover, Sonam; Jamal, Salma; Singh, Aditi; Dhar, Pawan; Grover, Abhinav

doi:10.1186/s12859-016-1379-9

Volume 17 Supplement 19

15th International Conference On Bioinformatics (InCoB 2016): Bioinformatics

Research
Open access
Published: 22 December 2016

Novel group-based QSAR and combinatorial design of CK-1δ inhibitors as neuroprotective agents

Kopal Joshi¹,
Sukriti Goyal²,
Sonam Grover³,
Salma Jamal²,
Aditi Singh⁴,
Pawan Dhar⁵ &
…
Abhinav Grover⁵

BMC Bioinformatics volume 17, Article number: 515 (2016) Cite this article

1978 Accesses
9 Citations
Metrics details

Abstract

Background

Tar DNA binding protein 43 (TDP-43) hyperphosphorylation, caused by Casein kinase 1 (CK-1) protein isoforms, is associated with the onset and progression of Amyotrophic Lateral Sclerosis (ALS). Among the reported isoforms and splice variants of CK-1 protein superfamily, CK-1δ is known to phosphorylate different serine and threonine sites on TDP-43 protein in vitro and thus qualifies as a potential target for ALS treatment.

Results

The developed GQSAR (group based quantitative structure activity relationship) model displayed satisfactory statistical parameters for the dataset of experimentally reported N-Benzothiazolyl-2-Phenyl Acetamide derivatives. A combinatorial library of molecules was also generated and the activities were predicted using the statistically sound GQSAR model. Compounds with higher predicted inhibitory activity were screened against CK-1δ that resulted in to the potential novel leads for CK-1δ inhibition.

Conclusions

In this study, a robust fragment based QSAR model was developed on a congeneric set of experimentally reported molecules and using combinatorial library approach, a series of molecules were generated from which we report two top scoring, CK-1δ inhibitors i.e., CHC (6-benzyl-2-cyclopropyl-4-{[(4-cyclopropyl-6-ethyl-1,3-benzothiazol-2-yl)carbamoyl]methyl}j-3-fluorophenyl hydrogen carbonate) and DHC (6-benzyl-4-{[(4-cyclopropyl-6-ethyl-1,3-benzothiazol-2-yl)carbamoyl]methyl}-2-(decahydronaphthalen-1-yl)-3-hydroxyphenyl hydrogen carbonate) with binding energy of −6.11 and −6.01 kcal/mol, respectively.

Background

Amyotrophic Lateral Sclerosis (ALS) is a progressive neurodegenerative disease which results in paralysis, muscle wasting and death. Death of the motor neurons of the cortex, spinal cord and brain stem is a characteristic of this disease which eventually leads to death of the patient usually resulting from respiratory failure, mostly within 3–5 years from the appearance of symptoms [1]. The term “Amyotrophic” refers to muscle atrophy and weakness which are characteristic to the lower motor neuron disease while “Lateral Sclerosis” occurs due to hardness to palpitation of the lateral columns of spinal cord observed in autopsy specimens [2]. The prevalence or occurrence of the disease is about 3 per 100,000. The ratio of male:female prevalence of ALS is 1.8-2.0:1 [3]. ALS is usually classified as familial (fALS) and sporadic (sALS) where fALS is usually inherited as a dominant trait and is observed in approximately 10% of the total cases of ALS while sALS occurs in people who do not have any apparent history of this disease in their families [4–6].

The first symptoms of ALS include twitching of muscle, stiffness, cramping and weakness later followed by difficulty in chewing, swallowing, slurred speech and difficulty in fast eye movements [7]. Also, there is difficulty in breathing as there is weakening of muscles of respiratory system. Many deaths are caused by respiratory failure within 2–10 years of onset of disease or are caused due to pneumonia [8, 9]. ALS can be caused by mutations in many different genes such as SOD1 (superoxide dismutase1), TARDBP (transactive response DNA binding protein), and C9orf72 amongst many others [10]. Recently, pathological TDP-43 (transactive response DNA binding protein 43kDA) protein, encoded by TARDBP, has been identified in sALS, which gives scope for development of therapeutic agents [1]. The protein binds to both DNA and RNA. By binding to the former, it regulates the process of transcription and is also involved in the splicing and maturation of mRNAs [11]. Neuronal death is known to be caused in mice, zebrafish, worms, flies and monkeys due to the over expression of mutant TDP-43. It is believed the phosphorylation of TDP-43 may cause toxicity of the protein [1]. Casein Kinase-1 (CK-1) was the first kinase that has been reported to cause this pathological phosphorylation of the TDP-43 protein and its activity has also been found upregulated in spinal cord tissue in ALS and other neural disorders [12]. CK-1 is a unique family of serine/threonine protein kinases that express ubiquitously in the eukaryotes. Seven CK-1 isoforms, namely α, β, γ1, γ2, γ3, δ and ɛ and various splice variants have been reported in mammals [13, 14]. CK-1 family member proteins have a very high (53–98%) identity in the kinase domain and differ from other kinases by containing S-I-N sequence instead of an A-P-E in kinase domain number VIII [15]. CK-1 isoforms are involved in regulation of circadian rhythms, Wnt signaling, nucleo-cytoplasmic shuttling of transcription factors, DNA transcription and DNA repair [16, 17]. They have been found upregulated and mutated in various forms of cancer [18] and neurodegenative diseases. Among the known CK-1 variants CK-1δ has been found upregulated in various neurodegenrative diseases and known to phosphorylate TDP-43 at various sites [19], so in recent years, its importance as a leading target has been highlighted through various studies. CK-1δ inhibition was found be beneficial in cancer inducing the DNA damage and also significant role in pathological TDP-43 phosphorylation have been disclosed [20, 21].

Finding brain penetrant inhibitors of CK-1δ could prevent the occurrence of this phenomenon and present a strategy for the effective treatment of ALS.

Ligand based drug designing is one of the in silico based methods which aides in establishing a quantitative relationship between the structures of inhibitors and their inhibitory activities. The quantitative structure and activity relationship (QSAR) approach attempts at identifying and quantifying the relationship between molecular structures and certain physico-chemical structures thereby producing a model for the prediction of the data [22]. The QSAR approach is also described as one in which data analysis methods are applied to develop models for accurate prediction of biological activities of molecules on the basis of their structures [23]. A regression equation is obtained which explains the variation of one or more dependent variables (biological activity) in terms of independent variables (descriptors) [24]. This study makes use of fragment-based GQSAR modeling to correlate the biological activity of the N-Benzothiazolyl-2-Phenyl Acetamide derivatives with certain physico-chemical descriptors. These molecules have been reported to prevent TDP-43 phosphorylation in the in-vitro studies and have the ability to cross the blood-brain barrier. Group-based QSAR or GQSAR is a method which investigates the structure activity relationship based on molecular fragments of the set of molecules [25–32]. GQSAR is an advantageous and more informative approach than other conventional 2D and 3D QSAR methods. Conventional QSAR methods do not make clear exactly which part of the molecules should be substituted or modified in order to increase the activity. Unlike conventional QSAR methods, GQSAR is a recent fragment based approach that provide useful information about the significant substitution sites, their chemical nature as well as overall interaction that effects the activity of molecules [33–35]. The GQSAR model instead of analysing whole molecule, evaluates molecular fragments. The biological activity of molecular fragments and their descriptors are correlated, leading to QSAR model(s) which focuses on important substitution site with their chemical nature and interactions. The information derived from the developed model helps in suggesting significant molecular fragments that can be utilized as the building blocks while designing novel molecules [36].

The focus of this study was to perform fragment based QSAR modeling on a congeneric set of N-Benzothiazolyl-2-Phenyl Acetamide derivatives. This congeneric set of compounds has been developed by Salado et al. [1]. They have developed 55 molecules as N-Benzothiazolyl-2-Phenyl Acetamide derivatives by substituting chemical moieties and changing the linker that shows a increase or decrease in the molecule's inhibitory activity. There are more molecules showing inhibitory effect against CK-1δ but we have taken 37 molecules as these show greater than 60% inhibitory activity as well as have been generated by replacement at similar number of sites i.e., 6 while other compounds have different linker chain and replacement at 3 sites only. In this study, a GQSAR model based on N-Benzothiazolyl-2-Phenyl Acetamide derivatives was built. The model helps in the explanation of the variation of the biological activity of these derivatives as a function of their site-specific molecular fragments. The GQSAR model presented certain important descriptors which were essential for the compounds to exhibit an enhanced inhibitory activity. In this study, using combinatorial library approach we report novel lead compounds with inhibitory properties against CK-1δ [37, 38]. Through this study, it has been attempted to understand how different substituents at the different positions in the representative template structure of the ligands affects its inhibitory properties, in addition to predicting the biological activities of the designed lead compounds generated in the combinatorial library.

Methods

Preparation of the dataset

The structures of the congeneric dataset of 37 N-Benzothiazolyl-2-Phenyl Acetamide derivatives was prepared using MarvinSketch [1, 39]. The 2D structures were converted into 3D structures using the VLifeEngine module of VLifeMDS [40]. Energy minimization of 3D compounds was performed with the help of force field batch minimization module of VLifeEngine. This step is performed to optimize the molecules upto their lowest stable states of energy. The template was also drawn using MarvinSketch, keeping a common structural moiety in congeneric dataset of all N-Benzothiazolyl-2-Phenyl Acetamide derivatives. The template has 6 substitution sites, marked by dummy atoms and depicted as R1–R6 (Fig. 1).

Calculation of descriptors for GQSAR modeling

This step is performed using the GQSAR module of VLifeMDS [40, 41]. The pIC50 values of the compounds were incorporated into VLifeMDS manually which was followed by the calculation of various 2-D physico-chemical descriptors for the different functional groups present at different substitution sites of the compounds (Refer Table 1).

Table 1 Various N-Benzothiazolyl-2-Phenyl Acetamide derived compounds with substitutions at the six substitution sites, percentage inhibition showed at a concentration of 10 μM and their IC50 values

Full size table

Creation of training set and test set

The dataset that was used during the course of this study consisted of a total of 37 molecules. These molecules were manually divided into training and test set so as to keep a uniform distribution of active as well as inactive molecules in both the sets. The selected 37 compounds were divided into test set (30% of dataset) and training set (70% of dataset) to keep a balance ratio as also studied in different GQSAR studies [25, 35, 42]. Molecules 2, 11, 14, 15, 20, 23, 28, 29 and 33 were taken into the test set whereas the others were included in the training set.

Building of the GQSAR model

For building the GQSAR model, various Variable Selection and Model Building methods are used and implemented such as Step-wise Forward/Backward/Forward-Backward, Simulated Annealing, Genetic Algorithm methods for Variable Selection and Multiple Regression, Partial Least Square, Principal Component Regression methods for Model Building. In this study, Stepwise Forward variable selection method was employed in order to choose from the pool of descriptors, a subset of descriptors. The Step-wise Forward selection method begins with developing a trial model one step at a time with only one independent variable. At each step, the independent variables are added one by one and the model is refitted accordingly. This process is terminated if the last variable that enters the model has regression coefficient which is insignificant or if all the variables have been included in the model [43].

For model building, the Partial Least Square method was used. This method relates a matrix, say Y, of dependent variables (like biological activities of the molecule) to another matrix, say X, of independent variables (like physico-chemical descriptors). The two principle objectives of this method are to approximate the two matrices and to reduce correlation between them. In this method, matrix X is decomposed into several latent variables which correlate best with the molecule’s activity [44]. Variable Selection and Model Building wizard in VLifeMDS was utilized for this purpose.

Validation and evaluation of the model

Various statistical parameters such as r², q², predicted r² and F-test were used to analyze goodness of fit of the QSAR model developed. Squared correlation coefficient, r² is the square of the correlation between the response values and the predicted response values. It can take any value between 0 and 1. However, a value closer to 1 indicates that a greater proportion of variance is accounted for by the model. F-test is used for comparing statistical models which have been fitted to a dataset for identifying a model which is best fitted. A high F-test value indicates statistical significance of the model reducing the possibility of failure of the model. Low standard errors, r²_se, q²_se, pred_r²_se hint at lower probability of failure of the model and show that the quality of the model is high [25]. The developed model s considered to be robust it fulfils the following conditions- r ² > 0.6, q² > 0.6, pred_r² > 0.5 [45–47].

Cross-validation of the model

The developed QSAR model can be cross validated by using internal and external validation methods. The internal validation of the model was carried out by using the Leave One Out (LOO) method. The leave one out cross validated correlation coefficient, q², is used as a fitting function for the evaluation of the models. This method uses a single observation as the validation data from the sample and the remaining observations are taken as training set. This procedure is repeated in a way that every observation is used at least once as validation data. For calculating q², each compound in the training set is sequentially removed and the model is refitted using the same descriptors; the biological activity of the removed molecule being predicted with the help of the refit model [25].

The formula which calculates q² is-

$$ \mathrm{pred}\_{\mathrm{r}}^2=1-\frac{{\displaystyle \sum}\left({\mathrm{y}}_{\mathrm{i}}-{\widehat{\mathrm{y}}}_{\mathrm{i}}\right){}^2}{{\displaystyle \sum}\left({\mathrm{y}}_{\mathrm{i}}-{\mathrm{y}}_{\mathrm{mean}}\right){}^2} $$

Where,

$ {\mathrm{y}}_{\mathrm{i}} $ = actual activity of the i^th molecule in the training set
$ {\hat{\mathrm{y}}}_{\mathrm{i}} $ = predicted activity of the i^th molecule in the training set
$ {\mathrm{y}}_{\mathrm{mean}} $ = average activity of all the molecules in the training set

The external validation of the model was carried out by calculating the predicted r². This indicates how well the calculated model predicts responses for new observations. Predicted r² is calculated by removing each molecule/observation from the dataset systematically and then determining how well the removed observation has been predicted by the model. An important benefit of predicted r² is that it helps in the prevention of overfitting the model.

It is calculated by the following formula-

$$ \mathrm{pred}\_{\mathrm{r}}^2=1-\frac{{\displaystyle \sum}\left({\mathrm{y}}_{\mathrm{i}}-{\widehat{\mathrm{y}}}_{\mathrm{i}}\right){}^2}{{\displaystyle \sum}\left({\mathrm{y}}_{\mathrm{i}}-{\mathrm{y}}_{\mathrm{mean}}\right){}^2} $$

Where,

$ {\mathrm{y}}_{\mathrm{i}} $ = actual activity of the i^th molecule in the test set
$ {\hat{\mathrm{y}}}_{\mathrm{i}} $ = predicted activity of the i^th molecule in the test set
$ {\mathrm{y}}_{\mathrm{mean}} $ = average activity of all the molecules in the test set

Generation of combinatorial library

A combinatorial library was created using the LeadGrow module of VLifeMDS [40]. This was done by substituting various groups at the six different substitution sites of the N-Benzothiazolyl-2-Phenyl Acetamide template which represents the common substructure of the experimentally reported dataset (Fig. 1). The library thus created was generated by making different permutations and combinations of the substituents at the substitution sites and it was comprised of a total of 10,000 compounds. The GQSAR model was used to predict the activity of the compounds generated in the library.

Docking of the ligands with CK-1δ protein

The ligands with highest predicted activity were selected for docking studies and were prepared using the LigPrep utility of Schrodinger [48, 49]. With the help of this, energy minimized 3-D structures of the compounds were generated. The protein CK-1δ used in this study was obtained from Protein Data Bank (PDB Id- 3UYS). This protein was prepared for docking using the Protein Preparation Wizard of Glide [50]. CK-1δ protein preparation involved capping of termini, removal of water molecules, treatment of disulfides and addition of explicit hydrogen. This was followed by the optimization of the protein after which a grid was generated around the interacting residues making use of the Receptor Grid Generation utility of Schrodinger [49]. The ligands prepared with the help of LigPrep were docked with the protein, CK-1δ using the GlideXP module which resulted in their binding affinities [51, 52]. The ligands were ranked on the basis of their binding affinities. The one exhibiting the best binding affinity was merged with the protein and the resulting complex was separately analyzed. This step yielded top two compounds with high binding affinities. With the aim of getting better insight into the binding mode, the compound exhibiting the highest activity amongst the N-Benzothiazolyl-2-Phenyl Acetamide derivatives was also docked with CK-1δ protein and its binding affinity was recorded.

Results and discussions

Descriptors calculation and validation of the data in training and test set

The VLifeMDS software calculated a total of 1027 descriptors. These descriptors were preprocessed by the removal of invariable columns which resulted in a total of 373 descriptors. Nine compounds (2, 14, 15, 20, 23, 28, 29 and 33) were incorporated in the test set whereas the remaining compounds were included in the training set. The test set was chosen as maximum pIC50 value of the test set compound was less than or equal to that of the training set and the lowest pIC50 value of the test compound was more than or equal to that of the training set. This confirms that the test set has been derived from the maximum-minimum range of the train set and is interpolative. Unicolumn statistics of the training set and the test set were obtained (Table 2).

Table 2 Unicolumn statistics of the test and training set data for CK-1δ inhibitory activity

Full size table

Analysis of the GQSAR model

Using Stepwise Forward variable selection and Partial Least Square model building method, a robust GQSAR model best explaining the biological activity of the N-Benzothiazolyl-2-Phenyl compounds as a function of certain site-specific physico-chemical descriptors, was obtained. The model can be represented as follows-

$$ \mathrm{pIC}50 = 0.8709\times \left(\mathrm{R}2-\mathrm{slogp}\right)-1.2447\times \left(\mathrm{R}3-\mathrm{P}\mathrm{s}\mathrm{i}1\right)-0.6798\times \left(\mathrm{R}2-\mathrm{SssCH}2\mathrm{Count}\right) + 0.1867\times \left(\mathrm{R}6-\mathrm{HydrogensCount}\right)+5.9327 $$

with n = 28, degree of freedom = 25, r² = 0.90, r2_standard error = 0.24, q² = 0.85, q²_standard error = 0.29, pred_r² = 0.89, pred_r²_standard error = 0.30, F test = 108.59.

Here,

n = no. of compounds in regression
r² = correlation coefficient (squared)
q² = squared correlation coefficient (cross validated)
Pred_r² = predicted squared correlation coefficient
F-test = denotes that the results are not just based on chanced correlations

This model satisfies all the statistical parameters such as r ² > 0.6, q² > 0.6, pred_r² > 0.5. It also exhibits a very high F-test value and very low standard errors supporting the robustness of the model. The GQSAR model also indicates the influence of the four descriptors namely R2_slogp, R3_Psi1, R2_SssCH2Count and R6_HydrogensCount on their respective substitution sites.

The descriptor, R2_slogp, belongs to the sub class Individual. It describes the log of octanol/water partition coefficient and calculates the log p value from the structure which is given. This indicates the contribution of the descriptor at the R2 substitution site (Table 3). This descriptor has a positive contribution of 39.75%, as is evident from the contribution plot (Fig. 2) suggesting that the presence of hydrophobic groups at this position would enhance the inhibitory activity of the compound. The second descriptor, R3_Psi1, is a member of the sub class Extended Topochemical Atom Based Descriptors which gives a measure of the tendency of the molecules for hydrogen bonding or the polar surface area of molecules. It exhibits a negative contribution of 20.65% at the R3 substitution site indicating that an increase in the polar surface area of the molecule or the number of molecules capable of forming hydrogen bond may decrease the inhibitory action of the compound. The third descriptor, R2_SssCH2Count, belongs to the sub class Estate Numbers. It gives an indication about the total number of –CH2 groups which are connected with the help of two single bonds. It is shown to have a negative contribution of 23.28% at R2 substitution site of the compound hinting that a reduction in such groups would be better for the inhibitory activity of the compound. The final descriptor, R6_HydrogensCount, belongs to the sub class Element Count which is an indicator of the number of Hydrogens present in a particular compound. At R6 substitution site, this descriptor effects a positive contribution of 16.32% indicating the importance of hydrogen atoms at this site for a better inhibitory activity.

Table 3 Contribution of various physico-chemical descriptors

Full size table

Minimal difference between the actual and predicted values of the compounds is a measure of high quality of the model [53, 54]. The significance of a model is described by its various statistical parameters. A high value of the squared correlation coefficient, 0.90, along with very low standard error, 0.23, indicates that the model is highly accurate. Good internal predictive power of the model can be judged by a very high value of cross validated correlation coefficient. Similarly, the value of predicted squared correlation coefficient, 0.89, indicates that the model has good external predictive ability. Since a high F-test value was obtained, 108.59, it can be assured that there are very few chances that the model will fail. Low standard error values represent the fact that the quality of the model generated is very high and the model is predictive and reliable. A Fitness Plot (Fig. 3) and Radar Plots (Fig. 4) were obtained which represent and compare the actual and predicted activities of the molecules of the training set and the test set.

Analysis of the combinatorial library generated using the GQSAR model

The combinatorial library was created by substituting the various sites with rings, aromatic rings, alkyl groups and atoms. The inhibitory activities of the compounds generated were predicted using the GQSAR model generated previously. Around 10,000 compounds were generated in the combinatorial library whose predicted activities ranged from 3.83 to 39.44. Out of these 10,000 compounds, 240 compounds had predicted activity more than the highest activity of the compounds of the experimentally reported dataset (pIC50 = 7.8). The substituents in these compounds exhibiting high predictive power were rings at R1 and R4 positions, alkyl groups at R2 position, electronegative atoms such as fluorine and oxygen at R3 position, carbonic acids and acetate esters at R5 and aromatic rings at R6 position. The presence of highly electronegative atoms at R3 plays the most important role in deciding the activities of compounds. The compounds with atoms other than fluorine and oxygen display lower activity values as compared to those with these two atoms.

Docking analysis of the designed lead compounds with CK-1δ

The compounds which exhibited the best predicted inhibitory values, more than the highest value of the experimentally reported dataset, were selected for further docking analysis. Top two compounds were reported as potent lead compounds against CK-1δ.

The first compound, 6-benzyl-2-cyclopropyl-4-{[(4-cyclopropyl-6-ethyl-1,3-benzothiazol-2-yl)carbamoyl]methyl}j-3-fluorophenyl hydrogen carbonate (CHC) (Fig. 5a) consisted cyclopropane at R1, ethyl at R2, fluorine at R3, another cyclopropane at R4, carbonic at R5 and a benzyl group at R6. This compound displayed a binding score of −6.11 Kcal/mol. The other compound, 6-benzyl-4-{[(4-cyclopropyl-6-ethyl-1,3-benzothiazol-2-yl)carbamoyl]methyl}-2-(decahydronaphthalen-1-yl)-3-hydroxyphenyl hydrogen carbonate (DHC) (Fig. 5b) had cyclopropane at R1, ethyl at R2, hydroxyl at R3, cyclobutane at R4, carbonic at R5 and naphthalene moiety at R6. This compound possessed a binding score of −6.01 Kcal/mol. The various components of the Glide score of these two compounds are provided in Table 4. The compound N-[6-(trifluoromethyl)-1,3-benzothiazol-2-yl]-2-(3,4,5-trimethoxyphenyl)Acetamide (BTA), which exhibited the highest inhibitory activity in the experimental dataset with a pIC50 value of 7.8, was also docked with CK-1δ for a comparative analysis. The GlideXP score of this particular compound was −3.78 Kcal/mol, suggesting that the compounds designed (CHC and DHC) had better binding affinities for CK-1δ protein than the experimentally reported compounds. Structure analysis of the novel leads make it clear that both the compounds have cyclopropane ring at R1, ethyl at R2 and carbonic group at R5 in common. CHC is more active displaying higher binding score, having another cyclopropane ring at R4, a fluorine at R3 and another single six membered (benzene) ring at R6 in comparison to DHC that contain butane ring, hydroxyl group and a fused pair of six membered rings (naphthalene) at the respective positions.

Table 4 Glide score XP and its components

Full size table

Interactions of CHC, BHC with CK-1δ protein

Docking analysis of the top scoring compounds provided insights into the mode of interactions of the designed compounds with the protein Casein Kinase-1δ. Molecular binding is a phenomenon that relies on the entropy-enthalpy compensation and contain both entropic and enthalpy components. The binding may be entropy driven in case of the hydrophobic effect or enthalpy driven in case of the dominant non-covalent attractive forces. Fundamentally, both the entropic and enthalpy component must result into a negative Gibbs’ free energy for effective binding. In our study the reported lead molecules show high negative binding free energy in comparison to the in-vitro reported compound which displays a significant binding capability of the lead molecules. The hydrophobic interactions are the most important forces in stabilizing biological structures ranging from native conformations of proteins to cellular membranes. In our study, high negative value of van der Waal energy represents the massive hydrophobic interaction (Table 4) and hydrogen bonds as the non-covalent attractive forces.

The first compound, CHC displayed four hydrogen bond interactions with three residues of CK-1δ- Glu52, Tyr56 and Lys38. The bond with a length of 3.31 Å was formed with Glutamic acid. The second hydrogen bond of bond length 2.49 Å was formed between the same atom of CHC with Tyrosine. Third and the fourth hydrogen bonds were formed between the fourth oxygen of carbonic group of CHC and Lysine (bond length = 2.74 Å) and the second oxygen atom of carbonic group of CHC and Lysine (bond length = 2.88 Å). CHC also exhibited hydrophobic interactions with several residues such as Asp149, Asp91, Leu135, Ile148, Gly86, Leu84, Leu85, Pro87, Ile23 and Met82 (Fig. 6).

The second lead compound DHC exhibited two hydrogen bonds with CK-1δ. The first one was formed between the nitrogen of DHC and Asp91 (bond length = 3.04 Å). The second hydrogen bond was formed between the fifth oxygen of DHC and Lys38 (bond length = 2.74 Å). DHC also exhibited hydrophobic interactions with various residues like Phe95, Lys130, Asn133, Gly21, Ile148, Asp149, Ile23, Met82, Leu85, Leu135, Pro87, Gly86 and Ala36 (Fig. 7). A summary of these interactions is provided in Table 5.

Table 5 Various CK-1δ residues involved in different kinds of interactions with CHC and DHC

Full size table

The interacting residues in case of both the lead molecules lie in common to the reported ATP binding site residues of the CK-1δ protein. This confirms the structural reasons for inhibitory activity of the lead molecules [1].

Conclusions

In this study, an attempt was made at creating a novel GQSAR model for the derivatives of N-Benzothiazolyl-2-Phenyl Acetamide which act as inhibitors of Casein Kinase-1δ protein. This protein causes the phosphorylation of TAR DNA Binding Protein-43 (TDP-43), a phenomenon which is associated with the onset and progression of a neurodegenerative disorder, Amyotrophic Lateral Sclerosis (ALS). A QSAR equation was obtained which constituted four descriptors namely, R2-slogp, R3-Psi1, R2-SssCH2count and R6-HydrogensCount. The first descriptor displayed a positive contribution at the substitution site R2 whereas the second one displayed negative contribution at R3. The third descriptor exhibited a negative contribution at R2 and the last descriptor was shown to contribute positively to R6 substitution site. GQSAR model was analysed on various statistical parameters and found to be robust. Internal validation of the model was carried out by the leave one out method and external validation was carried out by predicting the activity of the test set molecules.

A combinatorial library was prepared and the activities of the compounds were predicted using the developed QSAR model. An analysis of the compounds generated from this library revealed that the presence of cyclic rings at R1 and R4, alkyl groups at R2, electronegative atoms such as fluorine and oxygen at R3 acetate esters at R4 and aromatic rings at R6 were beneficial in enhancing the inhibitory activity of the compounds. This was followed by docking, resulting in the top scoring compounds, CHC and DHC, with highest binding affinities with the protein CK-1δ. This study provides substantial amount of evidence that these compounds can be considered as potential leads against the CK-1δ protein, inhibiting the phosphorylation of TDP-43 and thus preventing ALS. These molecules have been developed on the basis of a highly accurate and validated GQSAR model and have also proved to have high binding affinity towards CK-1δ as displayed through the docking analysis. CHC and DHC can be the good leads for further in-vitro testing as CK-1δ inhibitors and have the potential to be include in the drug development pipeline as CK-1δ antagonists.

References

Salado IG, Redondo M, Bello ML, Perez C, Liachko NF, Kraemer BC, Miguel L, Lecourtois M, Gil C, Martinez A, Perez DI. Protein kinase CK-1 inhibitors as new potential drugs for amyotrophic lateral sclerosis. J Med Chem. 2014;57(6):2755–72. doi:10.1021/jm500065f.
Article CAS PubMed PubMed Central Google Scholar
Rowland LP, Shneider NA. Amyotrophic lateral sclerosis. N Engl J Med. 2001;344(22):1688–700. doi:10.1056/NEJM200105313442207.
Article CAS PubMed Google Scholar
Yasri A, Hartsough D. Toward an optimal procedure for variable selection and QSAR model building. J Chem Inf Comput Sci. 2001;41(5):1218–27.
Article CAS PubMed Google Scholar
Andersen PM, Al-Chalabi A. Clinical genetics of amyotrophic lateral sclerosis: what do we really know? Nat Rev Neurol. 2011;7(11):603–15. doi:10.1038/nrneurol.2011.150.
Article CAS PubMed Google Scholar
Fecto F, Siddique T. Making connections: pathology and genetics link amyotrophic lateral sclerosis with frontotemporal lobe dementia. J Mol Neurosci. 2011;45(3):663–75. doi:10.1007/s12031-011-9637-9.
Article PubMed Google Scholar
Kiernan MC, Vucic S, Cheah BC, Turner MR, Eisen A, Hardiman O, Burrell JR, Zoing MC. Amyotrophic lateral sclerosis. Lancet. 2011;377(9769):942–55. doi:10.1016/S0140-6736(10)61156-7.
Article CAS PubMed Google Scholar
Cohen B, Caroscio J. Eye movements in amyotrophic lateral sclerosis. J Neural Transm Suppl. 1983;19:305–15.
CAS PubMed Google Scholar
Simpson CL, Al-Chalabi A. Amyotrophic lateral sclerosis as a complex genetic disease. Biochim Biophys Acta. 2006;1762(11–12):973–85. doi:10.1016/j.bbadis.2006.08.001.
Article CAS PubMed Google Scholar
Strong MJ, Kesavapany S, Pant HC. The pathobiology of amyotrophic lateral sclerosis: a proteinopathy? J Neuropathol Exp Neurol. 2005;64(8):649–64.
Article CAS PubMed Google Scholar
Vance C, Rogelj B, Hortobagyi T, De Vos KJ, Nishimura AL, Sreedharan J, Hu X, Smith B, Ruddy D, Wright P, Ganesalingam J, Williams KL, Tripathi V, Al-Saraj S, Al-Chalabi A, Leigh PN, Blair IP, Nicholson G, de Belleroche J, Gallo JM, Miller CC, Shaw CE. Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science. 2009;323(5918):1208–11. doi:10.1126/science.1165942.
Article CAS PubMed PubMed Central Google Scholar
Pasinelli P, Brown RH. Molecular biology of amyotrophic lateral sclerosis: insights from genetics. Nat Rev Neurosci. 2006;7(9):710–23. doi:10.1038/nrn1971.
Article CAS PubMed Google Scholar
Perez DI, Gil C, Martinez A. Protein kinases CK1 and CK2 as new targets for neurodegenerative diseases. Med Res Rev. 2011;31(6):924–54.
Article CAS PubMed Google Scholar
Knippschild U, Wolff S, Giamas G, Brockschmidt C, Wittau M, WüRL PU, Eismann T, Stöter M. The role of the casein kinase 1 (CK1) family in different signaling pathways linked to cancer development. Oncol Res Treat. 2005;28(10):508–14.
Article CAS Google Scholar
Cheong JK, Virshup DM. Casein kinase 1: complexity in the family. Int J Biochem Cell Biol. 2011;43(4):465–9.
Article CAS PubMed Google Scholar
Price MA. CKI, there’s more than one: casein kinase I family members in Wnt and Hedgehog signaling. Genes Dev. 2006;20(4):399–410.
Article CAS PubMed Google Scholar
Eide EJ, Virshup DM. Casein kinase I: another cog in the circadian clockworks. Chronobiol Int. 2001;18(3):389–98.
Article CAS PubMed Google Scholar
Etchegaray J-P, Machida KK, Noton E, Constance CM, Dallmann R, Di Napoli MN, DeBruyne JP, Lambert CM, Elizabeth AY, Reppert SM. Casein kinase 1 delta regulates the pace of the mammalian circadian clock. Mol Cell Biol. 2009;29(14):3853–66.
Article CAS PubMed PubMed Central Google Scholar
Schittek B, Sinnberg T. Biological functions of casein kinase 1 isoforms and putative roles in tumorigenesis. Mol Cancer. 2014;13(1):1.
Article Google Scholar
Kametani F, Nonaka T, Suzuki T, Arai T, Dohmae N, Akiyama H, Hasegawa M. Identification of casein kinase-1 phosphorylation sites on TDP-43. Biochem Biophys Res Commun. 2009;382(2):405–9.
Article CAS PubMed Google Scholar
Nonaka T, Masai H, Hasegawa M. Phosphorylation of TDP-43 by casein kinase 1 delta facilitates mislocalization and intracellular aggregate formation of TDP-43. Alzheimers Dement. 2014;10(4):P790.
Article Google Scholar
Greer YE, Gao B, Yang Y, Rubin JS. Lack of casein kinase 1 delta induces DNA damage, inhibition of mTORC1 signaling and nucleophagy. Cancer Res. 2014;74(19 Supplement):1335.
Article Google Scholar
Li J, Gramatica P. The importance of molecular structures, endpoints’ values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders. Mol Divers. 2010;14(4):687–96. doi:10.1007/s11030-009-9212-2.
Article CAS PubMed Google Scholar
Tropsha A. Best practices for QSAR model development, validation, and exploitation. Mol Inf. 2010;29(6–7):476–88. doi:10.1002/minf.201000061.
Article CAS Google Scholar
Van Damme S, Bultinck P. A new computer program for QSAR-analysis: ARTE-QSAR. J Comput Chem. 2007;28(11):1924–8. doi:10.1002/jcc.20664.
Article PubMed Google Scholar
Goyal S, Grover S, Dhanjal JK, Tyagi C, Goyal M, Grover A. Group-based QSAR and molecular dynamics mechanistic analysis revealing the mode of action of novel piperidinone derived protein–protein inhibitors of p 53–MDM2. J Mol Graph Model. 2014;51:64–72.
Article CAS PubMed Google Scholar
Goyal M, Dhanjal JK, Goyal S, Tyagi C, Hamid R, Grover A. Development of dual inhibitors against Alzheimer’s disease using fragment-based QSAR and molecular docking. Biomed Res Int. 2014;2014:979606. doi:10.1155/2014/979606.
Article PubMed PubMed Central Google Scholar
Gupta A, Jain R, Wahi D, Goyal S, Jamal S, Grover A. Abrogation of AuroraA-TPX2 by novel natural inhibitors: molecular dynamics-based mechanistic analysis. J Recept Signal Transduction. 2015:1–8.
Patel K, Tyagi C, Goyal S, Jamal S, Wahi D, Jain R, Bharadvaja N, Grover A. Identification of chebulinic acid as potent natural inhibitor of M. tuberculosis DNA gyrase and molecular insights into its binding mode of action. Comput Biol Chem. 2015;59:37–47.
Article CAS PubMed Google Scholar
Vats C, Dhanjal JK, Goyal S, Gupta A, Bharadvaja N, Grover A. Mechanistic analysis elucidating the relationship between Lys96 mutation in Mycobacterium tuberculosis pyrazinamidase enzyme and pyrazinamide susceptibility. BMC Genomics. 2015;16 Suppl 2:S14.
Article PubMed PubMed Central Google Scholar
Nagpal N, Goyal S, Wahi D, Jain R, Jamal S, Singh A, Rana P, Grover A. Molecular principles behind Boceprevir resistance due to mutations in hepatitis C NS3/4A protease. Gene. 2015;570(1):115–21.
Article CAS PubMed Google Scholar
Goyal S, Jamal S, Shanker A, Grover A. Structural investigations of T854A mutation in EGFR and identification of novel inhibitors using structure activity relationships. BMC Genomics. 2015;16 Suppl 5:S8.
Article PubMed PubMed Central Google Scholar
Tyagi C, Bathke J, Goyal S, Fischer M, Dahse H-M, Chacko S, Becker K, Grover A. Targeting the intersubunit cavity of plasmodium falciparum glutathione reductase by a novel natural inhibitor: computational and experimental evidence. Int J Biochem Cell Biol. 2015;61:72–80.
Article CAS PubMed Google Scholar
Virupaksha B, Alpana G, Prashant K, Deshpande U, Desideri A. Analysis of naphthoquinone derivatives as topoisomerase I inhibitors using fragment based QSAR. J Cheminformatics. 2013;5(S-1):22.
Article Google Scholar
Goyal M, Grover S, Dhanjal JK, Goyal S, Tyagi C, Grover A. Molecular modelling studies on flavonoid derivatives as dual site inhibitors of human acetyl cholinesterase using 3D-QSAR, pharmacophore and high throughput screening approaches. Med Chem Res. 2014;23(4):2122–32.
Article CAS Google Scholar
Singh A, Goyal S, Jamal S, Subramani B, Das M, Admane N, Grover A. Computational identification of novel piperidine derivatives as potential HDM2 inhibitors designed by fragment-based QSAR, molecular docking and molecular dynamics simulations. Struct Chem. 2016;27(3):993–1003.
Article CAS Google Scholar
Ajmani S, Agrawal A, Kulkarni SA. A comprehensive structure-activity analysis of protein kinase B-alpha (Akt1) inhibitors. J Mol Graph Model. 2010;28(7):683–94. doi:10.1016/j.jmgm.2010.01.007.
Article CAS PubMed Google Scholar
Tyagi C, Grover S, Dhanjal J, Goyal S, Goyal M, Grover A. Mechanistic insights into mode of action of novel natural cathepsin L inhibitors. BMC Genomics. 2013;14 Suppl 8:S10. doi:10.1186/1471-2164-14-S8-S10.
Article PubMed PubMed Central Google Scholar
Goyal M, Grover S, Dhanjal JK, Goyal S, Tyagi C, Grover A. Molecular modelling studies on flavonoid derivatives as dual site inhibitors of human acetyl cholinesterase using 3D-QSAR, pharmacophore and high throughput screening approaches. Med Chem Res. 2013:1–11.
MarwinSketch. 5.12.1 edn. 2013.
VLifeMDS. Molecular design suite. 43rd ed. Pune: VLife Sciences Technologies Pvt. Ltd; 2010.
Google Scholar
Ajmani S, Jadhav K, Kulkarni SA. Group‐based QSAR (G‐QSAR): mitigating interpretation challenges in QSAR. QSAR Comb Sci. 2009;28(1):36–51.
Article CAS Google Scholar
Goyal M, Dhanjal JK, Goyal S, Tyagi C, Hamid R, Grover A. Development of dual inhibitors against Alzheimer’s disease using fragment-based QSAR and molecular docking. BioMed Res Int. 2014.
Xu L, Zhang W-J. Comparison of different methods for variable selection. Anal Chim Acta. 2001;446(1–2):475–81. http://dx.doi.org/10.1016/S0003-2670(01)01271-5.
Article Google Scholar
Ajmani S, Kulkarni SA. Application of GQSAR for scaffold hopping and lead optimization in multitarget inhibitors. Mol Inf. 2012;31(6–7):473–90. doi:10.1002/minf.201100160.
Article CAS Google Scholar
Golbraikh A, Tropsha A. Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. Mol Divers. 2002;5(4):231–43.
Article PubMed Google Scholar
Afantitis A, Melagraki G, Sarimveis H, Igglessi-Markopoulou O, Kollias G. A novel QSAR model for predicting the inhibition of CXCR3 receptor by 4-N-aryl-[1,4] diazepane ureas. Eur J Med Chem. 2009;44(2):877–84. doi:10.1016/j.ejmech.2008.05.028.
Article CAS PubMed Google Scholar
Golbraikh A, Tropsha A. Beware of q2! J Mol Graph Model. 2002;20(4):269–76.
Article CAS PubMed Google Scholar
Schrödinger L. Maestro, version 8.5. New York: Schrödinger; 2008.
Google Scholar
Schrödinger L. SCHRODINGER SUITE 2008. Maestro Version 8. 2008.
Sastry GM, Adzhigirey M, Day T, Annabhimoju R, Sherman W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J Comput Aided Mol Des. 2013;27(3):221–34.
Article PubMed Google Scholar
Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley M, Perry JK. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem. 2004;47(7):1739–49.
Article CAS PubMed Google Scholar
Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT, Banks JL. Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J Med Chem. 2004;47(7):1750–9.
Article CAS PubMed Google Scholar
Vittinghoff E, Glidden DV, Shiboski SC, McCulloch CE. Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. Springer Science & Business Media. 2011.
Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology (Cambridge, Mass). 2010;21(1):128.
Article Google Scholar

Download references

Acknowledgments

AG is thankful to Jawaharlal Nehru University for usage of all computational facilities. AG is grateful to University Grants Commission, India for the Faculty Recharge Position.

Declarations

This article has been published as part of BMC Bioinformatics Volume 17 Supplement 19, 2016. 15th International Conference On Bioinformatics (INCOB 2016): bioinformatics. The full contents of the supplement are available online https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-17-supplement-19.

Funding

Publication charges for this article have been funded by Jawaharlal Nehru University.

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article.

Authors’ contributions

KJ, SG and AG designed the methods and experimental setup. KJ, SG and SoG carried out the implementation of various methods and were assisted by SJ and AS. KJ, SG, PD and AG wrote the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

The authors give their consent for publication of this article.

Ethics approval and consent to participate

Not applicable.

Author information

Authors and Affiliations

Amity School of Biotechnology, Amity University, Noida, Uttar Pradesh, 201303, India
Kopal Joshi
Department of Bioscience and Biotechnology, Banasthali University, Tonk, Rajasthan, 304022, India
Sukriti Goyal & Salma Jamal
Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi, 110016, India
Sonam Grover
Department of Biotechnology, TERI University, New Delhi, 110070, India
Aditi Singh
School of Biotechnology, Jawaharlal Nehru University, New Delhi, 110067, India
Pawan Dhar & Abhinav Grover

Authors

Kopal Joshi
View author publications
You can also search for this author in PubMed Google Scholar
Sukriti Goyal
View author publications
You can also search for this author in PubMed Google Scholar
Sonam Grover
View author publications
You can also search for this author in PubMed Google Scholar
Salma Jamal
View author publications
You can also search for this author in PubMed Google Scholar
Aditi Singh
View author publications
You can also search for this author in PubMed Google Scholar
Pawan Dhar
View author publications
You can also search for this author in PubMed Google Scholar
Abhinav Grover
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Abhinav Grover.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Joshi, K., Goyal, S., Grover, S. et al. Novel group-based QSAR and combinatorial design of CK-1δ inhibitors as neuroprotective agents. BMC Bioinformatics 17 (Suppl 19), 515 (2016). https://doi.org/10.1186/s12859-016-1379-9

Download citation

Published: 22 December 2016
DOI: https://doi.org/10.1186/s12859-016-1379-9

15th International Conference On Bioinformatics (InCoB 2016): Bioinformatics

Novel group-based QSAR and combinatorial design of CK-1δ inhibitors as neuroprotective agents

Abstract

Background

Results

Conclusions

Background

Methods

Preparation of the dataset

Calculation of descriptors for GQSAR modeling

Creation of training set and test set

Building of the GQSAR model

Validation and evaluation of the model

Cross-validation of the model

Generation of combinatorial library

Docking of the ligands with CK-1δ protein

Results and discussions

Descriptors calculation and validation of the data in training and test set

Analysis of the GQSAR model

Analysis of the combinatorial library generated using the GQSAR model

Docking analysis of the designed lead compounds with CK-1δ

Interactions of CHC, BHC with CK-1δ protein

Conclusions

References

Acknowledgments

Declarations

Funding

Availability of data and materials

Authors’ contributions

Competing interests

Consent for publication

Ethics approval and consent to participate

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Bioinformatics

Contact us