Strain-specific behavior of Mycobacterium tuberculosis in A549 lung cancer cell line

Background: A growing body of evidence has shown an association between tuberculosis (TB) infection and lung cancer. However, the possible effect of strain-specific behavior of the Mycobacterium tuberculosis (M.tb) population, the etiological agent of TB, on this association has been neglected. This study was therefore conducted to investigate the association while taking the genetic background of strains in the M.tb population into account.
Results: We employed elastic net penalized logistic regression, a statistical-learning algorithm for gene selection, to evaluate this association in 129 genes involved in the TLRs and NF-κB signaling pathways in response to two different M.tb sub-lineage strains (L3-CAS1 and L4.5). Of the 129 genes, 21 were found to be associated with the two studied M.tb sub-lineages. In addition, MAPK8IP3 was identified as a novel gene that has not been reported in previous lung cancer studies and may have potential as a novel biomarker in lung cancer investigation.
Conclusions: This preliminary study provides new insights into the mechanistic association between TB infection and lung cancer. Further mechanistic investigations of this association with a larger number of M.tb strains, encompassing the other main M.tb lineages and using the whole transcriptome of the host cell, are needed.

The inflammatory microenvironment driven by cytokines, chemokines, and inflammatory cells during TB infection has been recognized as a process that can induce genetic and host tissue damage and contribute to carcinogenesis in lung tissue [11].
There is increasing evidence of significant genetic diversity in the M.tb population [12,13], and M.tb strain-specific host-pathogen interactions have been demonstrated in previous studies [14-18]. This diversity may affect the course of M.tb pathogenesis and the molecular mechanism behind the association between TB infection and the risk of lung cancer. However, this association has previously been evaluated without considering the bacterial genotype. In the current study, we aimed to provide new insight into this association by taking the genetic background of strains in the M.tb population into account.
In Iran, remarkable diversity in the M.tb population structure, with a predominance of the L3-CAS1 and L4.5 (NEW1) sub-lineages, has been documented. Epidemiologically, Iran has been identified as the probable origin of L4.5, so the ecological adaptation and nationwide occurrence of this subpopulation are not unexpected. The L3-CAS1 sub-lineage is found mostly around the Indian Ocean, and the influx of Afghan refugees may contribute to its ongoing circulation in Iran. However, genetic variability appears to be the main driver of this epidemiological trend and of the transmission potential of both sub-lineages [13,19,20]. In line with this, our previous study compared how these dominant M.tb sub-lineages interfere with the TLRs and NF-κB signaling pathways in type II alveolar epithelial cells (A549 cell line) and observed strain-specific characteristics in their interactions with host cells [16]. In light of these results, we examined the gene expression profile of a cancerous cell line in response to two M.tb strains with divergent genetic backgrounds by employing a penalized statistical model and systems biology methods.

Discussion
In the present study, we investigated lung cancer-related genes that were differentially regulated by different genotypes of M.tb in a lung adenocarcinoma cell line using a penalized statistical algorithm. In our analyses, we identified 21 potentially lung cancer-related genes during infection with the M.tb L3-CAS1 and L4.5 sub-lineages. Various inflammatory processes and functional pathway-associated genes involved in carcinogenesis have been investigated in different studies. Chemokine secretion is one of the main mechanisms for the recruitment of host cells and the inhibition of antitumor immune responses in cancerous cells [21,22]. MCP-1 is a chemotactic stimulus that is secreted by cancerous cells and induces an immunosuppressive microenvironment [23]. The role of this chemokine in lung cancer pathogenesis remains controversial [24,25]. However, Fridlender et al. found that blockade of MCP-1 could inhibit lung tumorigenesis and could be a promising approach to lung cancer treatment [26]. In addition, overexpression of IL8, another chemokine that typically plays a role in the induction of angiogenesis, has been reported in lung cancer. Overexpression of COX-2, an inflammation-associated gene, has also been found in different stages of lung cancer.
IFN-γ is generally considered a cytokine with antitumor activity. However, its role is controversial. Increasing evidence suggests that IFN-γ may have a dual function and act as both an anti-tumorigenic and a pro-tumorigenic cytokine [28]. The pro-tumorigenic property of IFN-γ is based on the upregulation of immunosuppressive cells such as Treg and Th17 cells [29]. A pivotal role of this cytokine in regulating expression of the programmed death-ligand 1 (PD-L1) gene, a factor with an inhibitory role in cancer immunity, and in promoting immune evasion has been found in tumor cells [30].
Besides, the upregulation of PD-L1 expression and the induction of lung carcinoma by IFN-γ have been reported [30]. Similarly, in vitro and in vivo studies of concomitant H37Rv infection in non-small cell lung cancer showed that lung cancer progression was facilitated by an increased proportion of Treg cells and by upregulation of PD-L1 expression induced by H37Rv, a member of M.tb lineage 4 [31]. In the current study, the expression of IFN-γ in response to infection with the M.tb L4.5 sub-lineage, compared to the L3-CAS1 sub-lineage, suggests an expression profile that might favor better control of the M.tb L4.5 sub-lineage strain compared to the L3-CAS1 sub-lineage strain.
Table 3 Confirmation of the associations of the 21 selected genes with lung cancer and/or lung function by literature review in PubMed with the keywords (("Lung Cancer" OR "Lung Function") AND "name of each selected gene") in the title and abstract fields
Deregulation of the apoptosis and cell proliferation pathways is a key mechanism in cancer pathogenesis [33]. The blockade of apoptosis can be mediated by the overexpression of anti-apoptotic proteins such as FLIP, and upregulation of FLIP has been detected in lung carcinoma [34,35]. In line with these lung cancer studies, the expression of FLIP was upregulated in response to the L4.5 sub-lineage when compared with the L3-CAS1 sub-lineage in our study (p < 0.05). The overexpression of FLIP induced by the M.tb L4.5 sub-lineage may contribute to the exacerbation of lung cancer during infection with this strain. This overexpression and the resulting inhibition of apoptosis also favor M.tb pathogenesis.
In addition, it has been demonstrated that blockade of the Rho/Rho-kinase pathway, which is involved in cancer proliferation and invasion, inhibits tumor migration and invasion [36,37]. Knockdown of RhoA, a member of the Rho family, inhibits lung cancer cell proliferation and induces apoptosis [38]. In our study, the expression of RhoA did not change in response to the L4.5 sub-lineage when compared with the L3-CAS1 sub-lineage. The deregulation of the HRAS gene, which is involved in the proliferation of different cancers, has also been reported [39,40]. Overexpression of this Ras oncogene family member was identified in response to the L4.5 sub-lineage when compared with the L3-CAS1 sub-lineage. This upregulation can contribute to the progression of cancerous cells.
Among the 21 selected genes, BCL3 has an inhibitory function. The deregulation of this gene, an atypical member of the IκB family, has been shown in different solid tumors [41]. In addition, Dimitrakopoulos et al. described the role of this gene in lung carcinogenesis [42]. They reported an increase in BCL3 expression in lung cancer. This overexpression could be directly related to an increased level of EGFR expression. Aberrant EGFR expression is implicated in the progression of malignant cells [43,44]. EGFR can promote angiogenesis by upregulating the main angiogenesis mediators such as vascular endothelial growth factor (VEGF). Angiogenesis plays an important role in the growth and metastatic spread of solid tumors [45]. Moreover, the expression of HDM2, a negative regulator of the tumor suppressor gene p53, was induced by the upregulation of BCL3 [46]. In our analysis, based on the expression of BCL3 and EGFR in response to the L4.5 sub-lineage when compared to the L3-CAS1 sub-lineage, we hypothesize that infection by the L4.5 sub-lineage strain may worsen lung carcinoma by promoting tumor growth and angiogenesis.
Besides, it has been proposed that HSPA1A (HSP70), a chaperone molecule, is strongly involved in the promotion and development of different tumor cells, and overexpression of this heat-shock protein has been shown to be associated with the progression of several tumors, including lung cancer [47]. The current findings were consistent with these previous studies. However, the level of HSP70 expression in response to infection with the M.tb L4.5 sub-lineage was lower than that in response to the L3-CAS1 sub-lineage.
It is noteworthy that some inconsistent results were found in the current study. TLR pathway molecules such as IRAK1 and TLR2 have been shown to play important roles in neoplastic diseases [48], and significant upregulation of IRAK1 and its involvement in the development of solid tumors, including lung cancer, have been reported [49,50]. Besides, high expression of TIFA [51] and of IL1A, a gene that regulates tumor growth, angiogenesis, and metastasis, has been reported in lung carcinoma cells [52]. In contrast, no changes in the expression of any of these genes were observed during infection with the M.tb L4.5 sub-lineage compared to the L3-CAS1 sub-lineage. Although the expression of these genes is inconsistent with previous reports, the expression profile of the other genes suggests that infection with the M.tb L4.5 sub-lineage strain may drive cancer cells toward progression. In other words, the risk of progression might be higher in lung cancer patients whose lungs are infected by the M.tb L4.5 sub-lineage strain than in those infected by the M.tb L3-CAS1 sub-lineage strain. These patients are also more prone to secondary infections.
In contrast to our results, Mvubu et al. [32] showed that the expression levels of IRAK1 and IL1A were increased in response to infection with LAM sub-lineages (F15/LAM4/KZN, F11), the S sub-lineage (F28), and the Beijing sub-lineage. This increase was higher in response to the LAM sub-lineages than to the other sub-lineages. It is possible that infection with LAM sub-lineages, similar to the L4.5 sub-lineage, is more likely to drive cancer cells toward progression.
In our analysis, we also identified MAPK8IP3 as a novel potential target that has not been reported in previous lung cancer studies. MAPK8IP3, also known as JIP3, is a scaffold gene that functions in the JNK pathway [53]. Overexpression of this gene has been shown in different tumor cells [53,54]. Therefore, MAPK8IP3 may have potential as a novel biomarker in lung cancer investigation.
Previous studies have demonstrated that elastic net penalized logistic regression frequently performs better than Ridge, LASSO, and some other statistical learning algorithms in model selection consistency and prediction accuracy [55]. The use of this modern and accepted computational method on high-dimensional gene expression data is therefore a strength of the current study. The validation of all results by literature review, the use of an appropriate cross-validation method (repeated fivefold cross-validation), the consideration of potential sources of bias, and the use of STRING networks are other strengths of the present study. However, the main limitations of our study are that the cell line selection was confined to a lung adenocarcinoma cell line and that the protein levels of the selected genes were not assessed.

Conclusions
The epidemiological association between TB infection and lung cancer is well established. This preliminary study provides new insights into the mechanistic association between TB infection and lung cancer. The two studied M.tb sub-lineages promoted cancer development by creating an inflammatory environment through differential down- and up-regulation of genes involved in the TLRs and NF-κB signaling pathways. This environment has a crucial impact on cell proliferation, apoptosis, and angiogenesis. Given the significant strain-specific behavior of the M.tb population in host-pathogen interactions and according to our findings, investigating the link between TB infection and lung cancer in the context of the genetic background of M.tb strains might be a more effective way to better understand this association, to identify M.tb strain-specific behavior, and to guide therapeutic intervention. Further investigations with a larger number of M.tb strains, encompassing the other main M.tb lineages and using the whole transcriptome of the host cell, are needed. In addition, high-throughput methods are needed to provide further information for fully understanding the significant M.tb strain-specific behavior related to lung cancer progression and to minimize bias.

Study design
The study was designed in accordance with our previous study [16], which investigated the gene expression profile of the infected A549 cell line (ATCC CCL-185) in response to the dominant genotypes of M.tb. Briefly, the dominant genotypes of M.tb in the capital of Iran (L3-CAS1 and L4.5 strains) were identified based on 24-locus MIRU-VNTR typing and spoligotyping [19] and confirmed by whole genome sequencing. Then, the A549 cell line (maintained in antibiotic-free media) was infected in triplicate with the dominant genotypes at a multiplicity of infection (MOI) of ~50:1 (50 bacteria per cell) for 72 h in supplemented Dulbecco's modified Eagle medium (DMEM). After this time, the cellular response involving the TLRs and NF-κB signaling pathways was evaluated by qRT-PCR. RT² Profiler™ PCR Array kits (QIAGEN), namely the RT² Profiler™ PCR Array Human Toll-Like Receptor Signaling Pathway (QIAGEN, Cat. No. PAHS-018ZF-2) and the RT² Profiler™ PCR Array Human NF-κB Signaling Pathway (QIAGEN, Cat. No. PAHS-025YF-2), were used to perform qRT-PCR according to the manufacturer's instructions. The expression of 168 pathway-specific genes was evaluated; 39 genes were shared between the two pathways, leaving 129 unique genes. The secretion level of 12 cytokines/chemokines was assessed with an ELISA array kit (QIAGEN). The viability of infected and mock cells was evaluated by the trypan blue exclusion test according to the manufacturer's instructions (Sigma Aldrich, Germany). In addition, an intracellular growth assay was carried out and the intracellular internalization index was determined [16].

Gene expression analysis
The comparative cycle threshold (Ct) method ($2^{-\Delta Ct} \times 10^{3}$) was used to express the relative gene expression across the samples, and the fold change was calculated using the $2^{-\Delta\Delta Ct}$ method [56]. Next, the primary gene expression data were quality-checked and normalized. Linear modeling for statistical comparison was performed with the "limma" R package [57]. The false discovery rate cutoff for the statistical comparison between the control and TB groups was set at 0.10.
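The arithmetic behind the comparative Ct method can be illustrated with a minimal Python sketch. This is not the authors' analysis code (the study reports using R), and the function names and Ct values below are hypothetical. ΔCt is taken as the Ct of the target gene minus the Ct of a reference gene, which is the usual convention for this method.

```python
def delta_ct(ct_target, ct_reference):
    """ΔCt: Ct of the target gene minus Ct of the reference gene."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference, scale=1e3):
    """Relative expression 2^(-ΔCt) x 10^3, the scaling quoted in the text."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference)) * scale


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Fold change 2^(-ΔΔCt), where ΔΔCt = ΔCt(test) - ΔCt(control)."""
    ddct = delta_ct(ct_target_test, ct_ref_test) - delta_ct(ct_target_ctrl, ct_ref_ctrl)
    return 2.0 ** (-ddct)


# Hypothetical Ct values (illustrative only, not data from the study):
# control cells: target Ct 24, reference Ct 20 -> ΔCt = 4
# infected cells: target Ct 22, reference Ct 20 -> ΔCt = 2
print(relative_expression(24.0, 20.0))      # 62.5
print(fold_change(22.0, 20.0, 24.0, 20.0))  # 4.0
```

A ΔΔCt of −2 corresponds to a fourfold higher expression in the test sample, because each cycle represents an approximate doubling of the amplicon.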

Gene selection model
Elastic net regularization produces a sparse model with good prediction accuracy and good grouping capability. It was introduced as a compromise between the Ridge and LASSO penalized regressions, combining the strengths of both [59], and it has frequently performed better than Ridge, LASSO, and many other statistical learning algorithms in gene selection consistency and prediction accuracy in gene datasets [55,58]. The elastic net penalized logistic regression was performed with the "glmnet" R package (https://cran.r-project.org/web/packages/glmnet). The two M.tb sub-lineages were treated as the dependent variable and the expression levels of the 129 genes as the independent (predictive) variables in the elastic net regularized logistic regression for gene selection. The importance value of each selected gene was calculated using the "varImp" function in the "caret" R package. An interactive agglomerative hierarchical clustering heatmap was produced with the "heatmaply" R package in order to draw the co-expression heatmap of the selected genes (https://cran.r-project.org/web/packages/heatmaply). Statistical significance was set at the level of 0.05 in all statistical methods.
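To illustrate how an elastic net penalty selects variables in a logistic regression, the following is a minimal pure-Python sketch. It is not the authors' code and does not reproduce glmnet: glmnet fits the model by coordinate descent, whereas this sketch uses a simpler proximal gradient update. The penalty follows the glmnet parameterization, λ[α‖β‖₁ + (1−α)/2 ‖β‖₂²]. The function names, step size, and toy data are assumptions made for illustration only.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_elastic_net_logistic(X, y, lam, alpha, step=0.1, n_iter=2000):
    """Minimize mean log-loss + lam * (alpha*|b|_1 + (1-alpha)/2 * |b|_2^2).

    The intercept b0 is left unpenalized. Returns (b0, beta).
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label (0 or 1).
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        # Gradient of the smooth part (log-loss plus ridge term).
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                + lam * (1.0 - alpha) * beta[j] for j in range(p)]
        b0 -= step * sum(resid) / n
        # Gradient step followed by soft-thresholding (the LASSO part).
        beta = [soft_threshold(bj - step * gj, step * lam * alpha)
                for bj, gj in zip(beta, grad)]
    return b0, beta


# Toy data (hypothetical): feature 0 separates the two classes,
# feature 1 is balanced noise that carries no class information.
X = [[-2.0, 0.5], [-2.0, -0.5], [-1.0, 0.0],
     [2.0, 0.5], [2.0, -0.5], [1.0, 0.0]]
y = [0, 0, 0, 1, 1, 1]

b0, beta = fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.5)
print(beta)  # feature 0 keeps a positive coefficient, feature 1 stays at zero
```

Predictors whose coefficients are shrunk exactly to zero are dropped from the model, which is the mechanism by which a subset of genes is retained in this kind of analysis. With a sufficiently large λ, every coefficient is set to zero.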

Cross-validation and literature validation
To validate the performance of the elastic net penalized regression, repeated fivefold cross-validation was used. The dataset was split by repeated random sub-sampling with 100 repetitions of the fivefold cross-validation, permuting the sample labels each time. The cross-validated performance was summarized by the observed misclassification error rate. In addition, to validate each result against the literature, a literature search was carried out in PubMed using the search strategy ("Lung Cancer" OR "Lung Function") AND ("name of each selected gene") and related MeSH terms in the title and abstract fields.
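The repeated fivefold cross-validation described above can be sketched in plain Python as follows. This is an illustrative outline rather than the authors' procedure: it assumes that each repetition draws a fresh random split of the samples, and the helper names, the toy classifier, and the toy data are hypothetical.

```python
import random


def repeated_kfold_indices(n_samples, k=5, repeats=100, seed=0):
    """Yield (train, test) index lists for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # assumption: a new random split per repetition
        folds = [order[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [i for f, fold in enumerate(folds) if f != held_out
                     for i in fold]
            yield train, test


def cv_misclassification_rate(X, y, fit, predict, k=5, repeats=100, seed=0):
    """Fraction of held-out samples misclassified over all folds and repeats."""
    errors, total = 0, 0
    for train, test in repeated_kfold_indices(len(y), k, repeats, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict(model, X[i]) != y[i] for i in test)
        total += len(test)
    return errors / total


# Hypothetical toy classifier: threshold halfway between the class means.
def fit_midpoint(X, y):
    mean0 = sum(x[0] for x, lab in zip(X, y) if lab == 0) / y.count(0)
    mean1 = sum(x[0] for x, lab in zip(X, y) if lab == 1) / y.count(1)
    return (mean0 + mean1) / 2.0


def predict_midpoint(threshold, x):
    return int(x[0] > threshold)


# Toy data (hypothetical): two well-separated groups of five samples each.
X_toy = [[0.0], [1.0], [2.0], [3.0], [1.5],
         [10.0], [11.0], [12.0], [13.0], [11.5]]
y_toy = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

rate = cv_misclassification_rate(X_toy, y_toy, fit_midpoint, predict_midpoint,
                                 k=5, repeats=10)
print(rate)
```

Averaging the error over many random splits reduces the dependence of the estimate on any single partition, which matters most when the number of samples is small.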