Five crucial prognostic-related autophagy genes stratified female breast cancer patients aged 40–60 years

Background Autophagy is closely related to the progression of breast cancer. The aim at this study is to establish a prognostic-related model comprised of hub autophagy genes (AGs) to assess patient prognosis. Simultaneously, the model can guide clinicians to make up individualized strategies and stratify patients aged 40–60 years based on risk level. Methods The hub AGs were identified with univariate COX regression and LASSO regression. The functions and alterations of these selected AGs were analyzed as well. Moreover, the multivariate COX regression and correlation analysis between hub AGs and clinicopathological parameters were done. Results Totally, 33 prognostic-related AGs were obtained from the univariate COX regression (P < 0.05). SERPINA1, HSPA8, HSPB8, MAP1LC3A, and DIRAS3 were identified to constitute the prognostic model by the LASSO regression. The survival curve of patients in the high-risk and low-risk groups was statistically significant (P < 0.05). The 3-year and 5-year ROC displayed that their AUC value reached 0.762 and 0.825, respectively. Stage and risk scores were independent risk factors relevant to prognosis. RB1CC1, RPS6KB1, and BIRC6 were identified as the most predominant mutant genes. It was found that AGs were mainly involved in regulating the endopeptidases synthesis and played important roles in the ErbB signal pathway. SERPIN1, risk score was closely related to the stage (P < 0.05); HSPA8, risk score were closely related to T stag (P < 0.05); HSPB8 was closely related to N stag (P < 0.05). Conclusions Our prognostic model had the relatively robust predictive ability on prognosis for patients aged 40–60 years. If the stage was added into the prognostic model, the predictive ability would be more powerful. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04503-y.


Introduction
Autophagy is a natural phenomenon that regulates cell metabolism inside. Under normal physiological conditions, immune cell autophagy process could eliminate senile organelles and abnormal long-lived proteins in the body, which will be conducive to maintaining immune cell homeostasis. However, a stress stimuli, could result in cell autophagy working in preventing the accumulation of toxic or carcinogenic damage proteins and organelles, and inhibiting simultaneously cell cancelation. If it was occurred dysfunctional autophagy process, tumor could be coming soon [1]. Therefore, during oncogenesis process , autophagy has double function with promotion and suppression.It has been implicated that tumorigenesis is closely related to cell autophagy [2].
Breast Cancer(BC) is one of the most common malignancies for women. The 5-year survival rate of patients is low and it is easy to turn out especially on middle-aged 40-60 years women.Given it is prone to distant metastases, early diagnosis is of great significance of the prognosis of BC patients [3]. So far, conventional clinical evaluation means, such as TNM stag and pathological classification, are tough to precisely speculate patients prognosis in fact and offer clinical doctors more accurate treatment choices. In our study, analyzing the RNA-sequence data onto the AGs from female BC patients at a high incidence age, a prognostic model composed of several signatures was conceived, more accurately dividing BC patients into high-risk and low-risk groups, which will benefit to clinicians to target Patients in various risk levels and to adopt corresponding individualized strategies.

Download data
The Illumina HiSeq data and clinical data of female BC patients, whose ages were 40-60 years and whose pathological types were ductal and lobular tumors,were was performed to screen prognostic AGs by R software package"survival" [4].,while the Least Absolute Shrinkage and Selection Operator ( LASSO regression) was to construct the model of autophagic gene prognosis and to calculate patients riskscore through R software package"glmnet"..By means of including clinical and pathological factors, the multivariate COX regression could filter out independent risk factors,which were jointly incorporated into the model construction.

The various indexes evaluated the practicality of this predictive model
The performance of the AGs model was assessed by plotting survival curves, risk curves, Receiver Operating Characteristic Curve (ROC), calibration curves.We analyzed the main functions of prognostic-related AGs by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene alterations were detected in cBioportal database(http://www.cbioportal.org/). Simultaneously, the correlation analysis was done between hub AGs and clinical factors.

Externally independent validation
The generalization ability of our risk score model was verified in GEO [5].

1.Screening prognostic-related AGs by univariate COX regression
To download the RNA-sequence data and survival data onto 501 female BC patients whose ages were 40-60 years old from TCGA database, the expression data of the whole 194 AGs were extracted,in terms of AGs list in Human Autophagy Database: HADb (http://autophagy.lu/). The univariate cox regression analysis was performed on 501 BC samples, and 33 prognostic-related AGs were obtained, in which 23 up-regulatedand ones and 10 down-regulated ones ( Figure 1A). Simultaneously, there were significant differences in these genes expressions of healthy women and female BC patients (P <0.05) ( Figure 1B).

2.The five hub AGs identified by LASSO regression
The five AGs were selected from 33 prognostic-related AGs by LASSO regression, whose risk score were calculated based on their respective correlation coefficients for risk stratification of BC prognosis ( Figure 1C). At lambda.1se ,there were five AGs screened, Serpin family A member 1 (SERPINA1), Heat shock protein family A weakly expressed in tumor and normal tissues ( Figure S1).

The differentiated survival curve depended on five-AGs-riskscore between the high-risk group and the low-risk group
At the risk score cut-off value -0.2, all patients were entered into high risk group (> -0.2) and low risk group (<=-0.2). The difference in survival curves between the high-risk group and the low-risk group was statistically significant (P =6.639e-06).
The survival rate of patients with high risk score was significantly lower than that of patients with low risk score ( Figure 1E). SERPINA1, MAP1LC3A, DIRAS3 were down-regulated in high risk group and up-regulated in low risk group, which were potentially tumor suppressor genes; HSPA8, HSPB8 were up-regulated in high risk group and down-regulated in low risk group, which were probably promoting oncogenes. The number of died patients in high risk group was significantly more than that in low risk group ( Figure 1F).

Identification of the independent risk prognostic factors
The  Figure 2C , Figure 2D). It was implicated that stage and risk score was more accurate than alone risk score on predictive ability.

5.Correlation analysis between the five hub AGs expression and clinicopathological parameters
We attempted to discover whether the five key AGs and their risk score were correlated with these clinicopathological factors. SERPIN1, risk score and stage were closely related (t = 9.268, P = 0.026; t = 11.79, P = 0.008); HSPA8, risk score and T stag is closely related (t = 17.535, P = 0.484e-04;t = 8.185, P = 0.042); HSPB8 is closely related to N stag (t = 8.136, P = 0.043) (Table 1, Figure 2E).

The performance of clinical prognostic model based on five hub AGs
Double independent prognostic risk factors, stage and risk score, were included in an established nomogram ( Figure 2F). C-index reached 0.745 (95% CI, 0.709-0.778).
At the same time, we plotted the calibration curve and further evaluated the accuracy of our model. Its results were showed that there was no statistically significant difference between model predicted value and the true one ( Figure 2G, Figure 2H).
The multi-factor ROC illuminated that risk score has the most robust predictive ability for BC prognosis among all clinicopathological factors.

The function analysis of prognostic-related AGs
The prognosis-related AGs of BC selected by univariate COX regression were performed GO and KEGG function enrichment analysis. It was found that up-regulated AGs were mainly involved in the regulation of endopeptidases in vivo ( Figure 3A), while down-regulated AGs were mainly involved in adjusting ErbB signaling ( Figure 3B). We have also taken the cBioPortal database (11%), and BIRC6 (6%) had the most genetic variety ( Figure 3C). There was a significant discrepancy in survival rate of for the genetically altered group and the unaltered group (P = 5.242e-3), and the prognosis of patients in the genetically modified group was worse ( Figure 3D).

8.The external validation of the AGs prognostic models on GEO database
To confirm the practicality of the five-hub-AGs prognostic model, we downloaded GSE96058, with platform GPL11154, Series Matrix File (s) and RNA sequence data from GEO.Randomly select 200 patients aged 40-60 years from 3273 samples, and calculate five-AGs-riskscore. At cut-off value risk score -0.2, patients were divided into a high-risk group and a low-risk group. The evident difference was existed on both groups, and the survival rate of high-risk patients was significantly lower than that of low-risk groups (P = 4.383e-04) ( Figure 3E). The ROC curve of the verification group showed that risk score has relatively good generalization ability ( Figure 3F).

Discussion
Autophagy is a complex immune process regulated genetically, in which autophagy-related genes are involved. It can supervise genome alteration and to impair genomic instability [6]. Carcinogenesis is subjective to genetic mutations.
The role of AGs in tumorigenesis and progression is extremely complicated and its function is likely to be multiple in diverse stages of cancer progression [7]. In the early stage, autophagy might preserve cell survival through cellular catabolism.  [20] found that HSPB8 was highly expressed in triple-positive hormone-sensitive BC cells (MCF-7) and was involved in regulating cell cycle and cell migration. Othman EQet al. [21] uncovered that, compared with normal tissues, a large number of BC tissues expressed the MAP1LC3A protein with strong immunoreactivity. It was indicated that MAP1LC3A was likely to play an indispensable role in the progression of BC. Nowak EM et al. [22] revealed that We also performed the functional analysis of prognostic-related AGs and found that they were mainly engaged in the regulation of endopeptidase and ERBB2 signaling pathways. Sisinni L et al. [23] focused on endoplasmic reticulum stress and the relationship between apoptosis and autophagy in human BC. They thought that endoplasmic reticulum stress resulted in levels and activities alteration of key regulators about cell survival and autophagy, promoting to increase endopeptidase synthesis. Moreover,the ERBB2 signaling pathway was a critical pathway to regulating BC autophagy [16].
We also compared other BC prediction models to ours. It was implicated that our model was simpler and easier to operate than theirs, and the AUC value of reflecting predictive ability was relatively higher. Shen Y et al. [24] investigated that 11-lncRNA signature was a novel and significant prognostic element independent among multiple clinical and pathological parameters. The TIMER database showed that the the  its functions remains to be investigated, which has great significance for the treatment and prognosis of BC patients. The functional status of autophagy and its pharmacological manipulation of BC treatment will become novel targeted treatment options.

Conclusions
we have constructed a prognostic model for predicting the survival rate of BC patients composed of five essential AGs, attempting to provide guidance of clinicians on making up personalized treatment strategies corresponded to different risk level patients. The autophagy signature represents a promising biomarker for assessing the overall survival rates of BC patients.

Ethics statement
The informed consent was obtained from all subjects. Our study has been approved by medical ethics committee of Xuchang Central Hospital.Number:2020-002

Methods statement
All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication
All authors have seen the manuscript and approved of publication.

Conflict of interest statement
The authors have declared that there is no interest conflict .

Data availability statement
The data that support the findings of this study are available in TCGA and GEO at

Funding statement
No available funding supported this research.

Author Contributions
Conceive and design the article: XL L and HC ZH. Collect the patients information: JJ L. Statistics data: P L. Write the paper: XL L.Surpervision:Y S.

Additional File
Supplementary file.doc Figure S1 The expression of the five hub prognostic AGs on the protein level The database provided immunohistochemistry (IHC) results using a tissue microarray (TMA)-based analysis of the corresponding proteins in PC patients and adjacent normal tissues. IHC staining for each gene was done using the same antibodies in tumor tissues as in normal tissues. However, the estimation of protein expression could not be performed. Table 1 The correlation coefficients between key autophagy genes and clinicopathological factors (r2, P)