Relating mutational signature exposures to clinical data in cancers via signeR 2.0

Abstract

Background

Cancer is a collection of diseases caused by the deregulation of cell processes, which is triggered by somatic mutations. The search for patterns in somatic mutations, known as mutational signatures, is a growing field of study that has already become a useful tool in oncology. Several algorithms have been proposed to perform one or both of the following tasks: (1) de novo estimation of signatures and their exposures, and (2) estimation of the exposures of each one of a set of pre-defined signatures.

Results

Our group developed signeR, a Bayesian approach to both of these tasks. Here we present a new version of the software, signeR 2.0, which extends the possibilities of previous analyses to explore the relation of signature exposures to other data of clinical relevance. signeR 2.0 includes a user-friendly interface developed using the R-Shiny framework and improvements in performance. This version allows the analysis of submitted data or public TCGA data, which is embedded in the package for easy access.

Conclusion

signeR 2.0 is a valuable tool to generate and explore exposure data, from both de novo and fitting analyses. It is an open-source R package available through the Bioconductor project (https://doi.org/10.18129/B9.bioc.signeR).

Background

DNA mutations accumulate throughout an individual’s life and may result in the deregulation of metabolic processes observed in tumor cells [1]. Specific patterns of somatic mutations are characteristic of exposure to particular carcinogens and are found more frequently in some tumor types. The study of these ‘mutational signatures’ has become an established field of research in oncology and has advanced significantly in recent years [2, 3]. Its importance is clear: mutation patterns are related to cancer etiology, diagnosis and prognosis, appear to predict response to therapy [4, 5] and may reflect genomic alterations induced by chemotherapy, making them valuable tools for most aspects of cancer research [3].

The first method to extract mutational signatures from somatic mutation counts was based on non-negative matrix factorization (NMF) techniques applied to single nucleotide variant (SNV) counts [2]. Since then, several methods for mutational signature extraction have emerged, most of them based on variations of the NMF algorithm [6]. Our group developed signeR, a Bayesian approach to the NMF paradigm for mutational signature extraction [7]. A key idea that led to the development of signeR is that the signature extraction problem can be treated as an inferential task subject to statistical modeling. signeR is able to extract the underlying signatures by estimating both the number of signatures present in the data and the relative contribution of each signature to the total number of observed mutations.

The relative contribution of a signature to the total number of mutation counts is known as a signature exposure. signeR can also be used to estimate the sample exposure levels of known mutational signatures, such as those described by the COSMIC consortium [8] or the Signal initiative [9]. This functionality follows a trend observed in the literature: as signatures have become known and well determined by the study of extensive datasets, algorithms capable of fitting mutation samples to available signatures started to emerge (e.g. deconstructSigs [10]).

Mutational signatures have recently been proposed as markers for cancer prognosis or drug sensitivity [11, 12]. Available evidence suggests that the estimation of exposure levels to mutational processes may be incorporated within the cancer diagnostic workflow, which may improve diagnosis in the future [13]. As an example, our group recently used signeR to stratify gastric cancer patients for therapeutic intervention [14]. Those results highlight the scientific potential of relating mutational signatures to other relevant features in cancer, such as clinical or molecular data.

In this article we describe an enhanced version of signeR that is computationally more efficient and has several new functionalities. A major contribution of signeR 2.0 is that it allows users to study the relation of each signature exposure to almost any other feature of interest, such as overall survival, tumor staging or cancer subtypes. These features may be categorical (e.g. cancer molecular subtypes), continuous (e.g. gene expression) or survival data. Such additional information is nowadays present in several databases, for instance in The Cancer Genome Atlas (TCGA) [15]. The decomposition of mutation data may lead to multiple similarly suitable solutions, so the estimation of signatures and exposures is not exact. Most publications use bootstrap methods to evaluate the robustness of results obtained from mutation data decomposition [16, 17]. Our method instead employs a Gibbs sampler to generate a sample from the posterior distribution of signatures and exposures, and the clustering or machine learning algorithms used to relate exposures to clinical features are applied repeatedly to the exposure matrices in this sample.

The utility of signeR 2.0 is demonstrated here by considering TCGA data obtained from stomach adenocarcinoma. Mutational signatures previously identified by the COSMIC consortium [8] for this type of cancer were used as templates to correlate their observed exposures to several other clinical data of interest. These analyses include the clustering of samples according to signature exposures, the search for signatures showing significant differences in distribution among tumor subtypes and the evaluation of how exposure levels affect patients’ overall survival.

The software interface is user-friendly and intuitive, facilitating the estimation of mutational signatures and further extending the study of their relation to other clinical data to users with little programming background. We hope that this version of signeR will aid in subsequent genome based studies of cancers, eventually leading to new insights and discoveries.

Implementation

Database content

The new version of signeR described here provides a query interface, signeRFlow, to explore the interplay of mutational signature exposures and several other features present in clinical data. To do so, signeR 2.0 embeds into its framework the most recently processed molecular and clinical dataset of The Cancer Genome Atlas (TCGA) consortium (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga) along with a catalog of mutational signatures (COSMIC Single Base Substitution signatures v3.2, the latest version at the time the software was built, https://cancer.sanger.ac.uk/cosmic/signatures/SBS).

Interface design

The signeRFlow app was developed using shiny, an R package for building interactive web apps [18]. It is implemented as an open-source R package available along with signeR 2.0 through the Bioconductor project at https://doi.org/10.18129/B9.bioc.signeR.

Algorithm

signeR 2.0 presents an updated version of the signeR Bayesian approach [7]. The functionalities and enhancements of signeR 2.0 are the following:

  1. It provides a novel statistical framework for considering several downstream analyses, allowing the analysis of the relation between clinical features and signature exposures;

  2. It presents parallel computation capabilities;

  3. The estimation of hyper-hyper-parameters, necessary to start the estimation of signatures, may be avoided by considering pre-defined values, which can save a great deal of computational time.

A comparison of the computational efficiency brought by these enhancements is described in Additional file 1. The rest of this section explains in some detail the new functionalities related to the first item.

To further explore the genotype-phenotype relationships between mutational signatures and other data of interest, signeR 2.0 provides a unified data modeling toolkit. If additional sample information is available, including molecular and clinical data such as cancer sub-type or overall survival, signeR 2.0 is able to evaluate how this information relates to the estimated exposures to mutational signatures. When the additional data is categorical, differences in exposures among groups can be analyzed and, if some of the samples are unlabeled, they can be labeled based on the similarity of their exposure profiles to those of labeled samples. In the case of a continuous additional feature, its correlation to estimated exposures can be evaluated. Survival data can also be analyzed by estimating the relation of survival to mutational signature exposure. We briefly describe each of these new features next.

signeR takes as input a matrix \(M = (M_{ij})\) of mutation counts found in a set of genome samples. Each column of M, denoted hereafter as \(M_j\), corresponds to a genome sample and each row to a given mutation type. As output, signeR can estimate two matrices, P of mutational signatures and E of signature exposures, such that \(M \approx PE\). Alternatively, signeR can estimate only the exposures E to known signatures. In both cases, the algorithm estimates exposures by drawing a sample \(E^{(1)}\), \(E^{(2)}\), \(\ldots\), \(E^{(R)}\) of exposure matrices, approximately distributed according to our model posterior distribution [7]. All subsequent analyses described here are based on the repeated application of statistical or learning algorithms to the matrices \(E^{(r)}\), \(1 \le r\le R\). After each of the sampled matrices \(E^{(r)}\) is analyzed, results are joined and findings are considered significant if they are consistent throughout most of these analyses. A general description of this procedure is shown by the pseudo-code presented in Algorithm 1.

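Algorithm 1 (figure not reproduced here): repeated application of a statistical or learning algorithm to each sampled exposure matrix \(E^{(r)}\), \(1 \le r \le R\), followed by aggregation of the results.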
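The repeated-analysis procedure of Algorithm 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not signeR code: the function names `permutation_p_value` and `repeated_analysis`, the significance level of 0.05 and the rule that "most" means more than half of the sampled matrices are all choices made for this example, and an exact permutation test stands in for whichever test the user selects.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(values, group_a):
    """Exact two-sided permutation test on the difference of group means.

    `values` holds one signature's exposures (one per genome sample) and
    `group_a` the indices of the samples in the first category.
    """
    indices = range(len(values))

    def abs_mean_diff(members):
        inside = [values[i] for i in indices if i in members]
        outside = [values[i] for i in indices if i not in members]
        return abs(mean(inside) - mean(outside))

    observed = abs_mean_diff(set(group_a))
    splits = [set(c) for c in combinations(indices, len(group_a))]
    # Small tolerance so that ties are not lost to floating-point rounding.
    extreme = sum(abs_mean_diff(s) >= observed - 1e-12 for s in splits)
    return extreme / len(splits)


def repeated_analysis(exposure_samples, analysis, alpha=0.05, min_fraction=0.5):
    """Apply `analysis` to each sampled matrix E^(r) and pool the p values.

    A finding is called consistent when more than `min_fraction` of the
    sampled matrices give p < alpha (both thresholds are assumptions).
    """
    p_values = [analysis(E) for E in exposure_samples]
    fraction = sum(p < alpha for p in p_values) / len(p_values)
    return p_values, fraction > min_fraction


# Toy posterior sample: R = 3 matrices, 1 signature (row) x 8 genome samples.
E_samples = [
    [[9.0, 8.5, 9.5, 10.0, 1.0, 1.5, 2.0, 0.5]],
    [[8.0, 9.0, 10.5, 9.5, 2.0, 1.0, 0.5, 1.5]],
    [[5.0, 1.0, 6.0, 2.0, 5.5, 1.5, 2.5, 6.5]],
]
group_a = [0, 1, 2, 3]  # the first four samples form one category

p_values, consistent = repeated_analysis(
    E_samples, lambda E: permutation_p_value(E[0], group_a)
)
print(p_values, consistent)
```

In this toy sample two of the three matrices separate the groups completely, so the difference is flagged as consistent even though the third matrix shows no contrast. Pooling over sampled matrices in this way lets the uncertainty of the exposure estimates carry through to the downstream conclusion.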

If estimated exposures are compared against a categorical feature, signeR 2.0 uses non-parametric tests (Wilcoxon–Mann–Whitney or Kruskal–Wallis tests) to assess the enrichment of exposures in any of the categories. For each signature, the tests are applied on each \(E^{(r)}\), and the obtained p values are inverted and log-transformed for visualization purposes. The resulting values are called Differential Exposure Scores and can be visualized as a boxplot [7]. signeR 2.0 is also able to evaluate the ability of exposure levels to discriminate samples among categories. Several classification algorithms are available for this purpose. signeR currently includes: (1) k-nearest neighbors, (2) learning vector quantization, (3) logistic regression, (4) linear discriminant analysis, (5) least absolute shrinkage and selection operator (lasso), (6) naive Bayes, (7) support vector machines, and (8) random forests. In all cases, given a genome sample \(M_j\) and an exposure matrix \(E^{(r)}\), the chosen classifier is used to assign \(M_j\) to one of the categories. The final label for \(M_j\) is the most frequent label obtained over all \(E^{(r)}\), \(1 \le r \le R\).
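As a rough illustration of the two summaries described above, the sketch below converts a set of per-matrix p values into Differential Exposure Scores and derives a final label by majority vote. It assumes that "inverted and log-transformed" means \(-\log_{10}(p)\); the function names `differential_exposure_scores` and `consensus_label`, the use of the median score as a summary and the `min_agreement` threshold below which a sample is left as "undefined" are hypothetical choices for this example, not part of the signeR interface.

```python
import math
from collections import Counter
from statistics import median


def differential_exposure_scores(p_values):
    """Invert and log-transform p values (assumed here to be -log10(p))."""
    return [-math.log10(p) for p in p_values]


def consensus_label(labels, min_agreement=0.5):
    """Most frequent label over the sampled matrices E^(r).

    The label is kept only if it was assigned in more than `min_agreement`
    of the runs; otherwise the sample is reported as "undefined".
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_agreement else "undefined"


# One p value per sampled exposure matrix, for a single signature.
p_values = [0.001, 0.01, 0.0001, 0.2]
scores = differential_exposure_scores(p_values)
cutoff = -math.log10(0.05)  # the 0.05 significance level on the score scale
print(median(scores) > cutoff)

# One predicted label per sampled exposure matrix, for a single genome sample.
print(consensus_label(["MSI", "MSI", "CIN", "MSI"]))
print(consensus_label(["MSI", "CIN", "GS", "EBV"]))
```

On the score scale larger values indicate stronger evidence, so a boxplot of the scores that lies mostly above the cutoff corresponds to a difference found in most sampled matrices.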

When a continuous feature is considered, such as gene expression, the correlation of each signature’s exposure to this feature can be assessed. A correlation test is applied to each \(E^{(r)}\) and the found p values, inverted and log-transformed, are shown as a boxplot. A similar approach, considering all signatures together, is used by signeR 2.0 to evaluate whether the feature can be linearly modeled based on exposures.

Survival data, often present in cancer studies, can also be related to exposures. For each signature and each \(E^{(r)}\), signeR 2.0 stratifies patients according to exposure levels and applies logrank tests to compare obtained groups. The impact of exposures on survival can also be quantified via Cox proportional hazard models [19]. Again, all tests are applied to each \(E^{(r)}\) and results are summarized by taking the median of all the obtained statistics (p values and hazard ratios).
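The survival analysis described above can be sketched as follows. This is a simplified illustration rather than the package's implementation: it writes out a two-group log-rank test by hand, splits patients at the median exposure (an assumption made here for brevity) and summarizes the p values by their median. The names `logrank_p_value` and `survival_summary` are hypothetical.

```python
import math
from statistics import median


def logrank_p_value(times, events, in_group):
    """Two-group log-rank test; returns the chi-square (1 df) p value.

    `events[i]` is 1 if the event was observed for patient i and 0 if the
    follow-up was censored; `in_group[i]` flags membership of the first group.
    """
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, ev in zip(times, events) if ev}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk if times[i] == t and events[i] and in_group[i])
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0  # no informative event times
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def survival_summary(exposure_samples, signature, times, events):
    """Median log-rank p value over the sampled exposure matrices E^(r)."""
    p_values = []
    for E in exposure_samples:
        row = E[signature]
        cutoff = median(row)  # assumed stratification rule
        high = [x > cutoff for x in row]
        p_values.append(logrank_p_value(times, events, high))
    return median(p_values)


# Toy data: 8 patients, all events observed, 3 sampled exposure matrices.
times = [2, 3, 4, 5, 10, 12, 14, 16]
events = [1] * 8
E_samples = [
    [[9.0, 8.0, 9.5, 10.0, 1.0, 2.0, 1.5, 0.5]],
    [[8.0, 9.0, 10.5, 9.5, 2.0, 1.0, 0.5, 1.5]],
    [[5.0, 1.0, 6.0, 2.0, 5.5, 1.5, 2.5, 6.5]],
]
print(survival_summary(E_samples, 0, times, events))
```

Taking the median over the sampled matrices makes the reported statistic robust to a few matrices in which the stratification happens to differ.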

Finally, when no additional data is available, signeR 2.0 includes unsupervised methods such as hierarchical and fuzzy clustering to discover sample sub-groups based entirely on the estimated exposures. Several options are available for the required distance measure (see R function dist documentation) or the agglomerative procedure (see R function hclust documentation). If hierarchical clustering is used, the algorithm is applied to each exposure matrix \(E^{(r)}\), \(1 \le r \le R\), as mentioned in the pseudo-code Algorithm 1. The obtained dendrograms are compared and shown on a final chart, where the relative frequency with which each branch was found is displayed. If the user chooses fuzzy clustering, the fuzzy C-means algorithm is applied to each \(E^{(r)}\), thus generating matrices of membership grades of each genome sample to each cluster. Those grades are averaged to yield the final result. For visualization purposes a hierarchical procedure is applied to the mean membership grades so that similar samples are displayed together on the final chart.
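The averaging step of the fuzzy clustering analysis can be illustrated with a small sketch. It assumes that the membership matrices have already been computed and that clusters are labeled consistently across runs (aligning cluster labels is not shown); the names `mean_membership` and `hard_assignment` are hypothetical.

```python
def mean_membership(membership_samples):
    """Average membership grades element-wise over the R clustering runs.

    Each element of `membership_samples` is a matrix with one row per genome
    sample and one column per cluster, obtained from one matrix E^(r).
    """
    n_runs = len(membership_samples)
    n_samples = len(membership_samples[0])
    n_clusters = len(membership_samples[0][0])
    return [
        [sum(U[i][k] for U in membership_samples) / n_runs for k in range(n_clusters)]
        for i in range(n_samples)
    ]


def hard_assignment(membership):
    """Index of the cluster with the highest mean grade for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in membership]


# Two runs, three genome samples, two clusters.
U1 = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
U2 = [[0.7, 0.3], [0.4, 0.6], [0.5, 0.5]]
averaged = mean_membership([U1, U2])
print(averaged)
print(hard_assignment(averaged))
```

Because each input row sums to one, the averaged rows do as well, so the result can still be read as a membership grade for each sample and cluster.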

Tests and learning algorithms available in signeR 2.0 are obtained from specialized R packages (e.g. pvclust or survival). Their complete list can be found in the package documentation and is included as Additional file 1. A few examples of the application of these functionalities to a dataset from the TCGA database are presented in the Results section.

signeRFlow

The signeRFlow app includes three major components and consists of a pipeline that allows: (i) data input and pre-processing; (ii) mutational signature estimation or fitting and (iii) exposure data modeling. A schematic overview of signeRFlow is shown in Fig. 1.

Fig. 1

General overview of signeRFlow. Starting from the top, clinical and molecular features from publicly available TCGA databases or the user's own data can be loaded. After pre-processing, a friendly interface provides options to set up de novo, fitting and analytical methods for downstream analyses

The flexible input interface was designed to allow users to upload their own data either as a VCF, MAF or SNV matrix file (an example of the file structure can be found within the interface). Alternatively, the user can select a previous cancer study of interest from the TCGA database available in the TCGA Explorer module. In the former case, users can add clinical information, while available clinical data for TCGA samples are already organized within signeRFlow and easily accessible through the interface.

Upon data upload completion, the mutational signature analysis is ready to commence. In this step, the user can take advantage of a Bayesian approach to perform de novo identification of mutational signatures. signeR 2.0 provides flexible options for choosing the number of searched signatures or optimizing it, within a fixed range, according to the Bayesian Information Criterion (BIC). In addition, signeRFlow is able to fit the mutational spectra of studied genome samples to known mutational signatures, thus estimating the samples’ exposure levels to related mutational processes. Single Base Substitution (SBS) signatures from COSMIC are available within signeRFlow for fitting analysis, although users can upload other signatures as well. Whenever a mutational signature analysis is performed, signeRFlow offers several plot options to visualize estimated signatures and their exposures, as well as the convergence of the MCMC model used to estimate them (Additional file 1: Figure S5). For the fitting to known signatures, exposure plots are available (Fig. 2A).

Finally, signeRFlow provides a toolbox of state-of-the-art learning algorithms for exposure data analysis (Algorithm 1). For example, hierarchical and fuzzy clustering can be used to explore the qualitative differences among samples evidenced by signature exposures. Furthermore, to unveil the interplay of mutational signatures with clinical or genomic features, signeRFlow provides comprehensive options for covariate analysis considering either categorical or numerical features. For categorical features, the signeR 2.0 Differential Exposure Score (DES) can highlight signatures that are differentially active among previously defined groups of samples, while the function ExposureClassify evaluates the assignment of samples to groups according to exposure profiles. For numerical features, sample correlation and linear regression can be performed. Lastly, the effect of exposure levels on prognosis can be investigated by comparing the survival distributions of sample groups with contrasting exposure levels or by Cox regression analysis. The next section presents a concrete example of these possibilities.

Results

Although standard histological classification techniques are fundamental for dividing cancer into sub-types and for disease stratification, the exposure to mutational processes may provide additional information that extends this characterization. We illustrate this here by using the differential exposure score (DES, [7]) estimated while analyzing a data set with 439 samples selected from the stomach adenocarcinoma (STAD) cohort from TCGA. The mutational spectra of these samples were fitted to known signatures previously reported to be characteristic of STAD [16]. According to COSMIC nomenclature, the signatures included in this analysis are Single Base Substitutions (SBS) numbers 1, 2, 3, 5, 10b, 13, 15, 17a, 17b, 18, 20, 21, 26, 28, 34, 40, 41, 44, and 93. The estimated exposures (i.e. the empirical average of the realizations obtained by signeR for the exposure matrix) are shown in Fig. 2A.

As an exploratory approach, a fuzzy clustering algorithm was applied to the exposures found by signeR. Results are shown in Fig. 2B. Interestingly, 3 of the 6 groups found via fuzzy clustering (Fig. 2B, clusters 1, 4 and 5) are mainly composed of samples characterized by high microsatellite instability (MSI), an important marker for tumor prognosis [20].

Fig. 2

(A) Heatmap of estimated exposures obtained by fitting 19 COSMIC signatures to the STAD dataset. Genome samples are displayed as columns of the heatmap while COSMIC signatures are arranged as rows and estimated (log-transformed) exposure levels are shown by the colour scale. (B) Fuzzy clustering of samples according to estimated exposures, compared to known classifications by molecular profiles. Clusters are organized in columns and for each sample (row) the colour code indicates the membership grade of each cluster. Following the fuzzy clustering approach, a hierarchical clustering algorithm was applied to the membership grades (dendrogram at left), enabling better visualization of results and making it possible to establish a relation to molecular sub-types and MSI status (annotation columns at right side). (C) p values found by the Kruskal–Wallis test for differences in exposures among the four sample groups. For comparison and display purposes, the p values were inverted and log-transformed. Box-plots of obtained scores are displayed and the significance cutoff of 0.05 is indicated by the red line. The labels at the x-axis correspond to the ID of each signature and, for those showing significant differences, the group characterized by higher exposure levels. (D) ROC curve of the exposure-based classification of samples according to their MSI status and related confusion matrix. (E) Kaplan–Meier curves showing the overall survival of STAD patients after stratification by the exposures obtained while fitting COSMIC signature SBS26. The displayed p value was found by application of the log-rank test for defined sample groups

Motivated by the clustering results, we considered several supervised approaches available on signeR 2.0. The sample molecular sub-types proposed in [20], namely Epstein–Barr virus (EBV)-positive tumors, tumors characterized by microsatellite instability (MSI), genomically stable (GS) tumors and tumors showing chromosomal instability (CIN), were adopted as targets to evaluate how the exposures of individual signatures correlate to them. For each signature, differences in exposures among STAD sample groups were evaluated by the Kruskal–Wallis test (Differential Exposure Scores). Results are shown in Fig. 2C. Thirteen COSMIC signatures show different levels of activity in sample subtypes. Among signatures with higher exposures in MSI samples we found SBS1, a clock-like signature that in most cancers correlates with the age of the individual, and five mutational signatures associated with defective DNA mismatch repair and microsatellite instability: SBS15, SBS20, SBS21, SBS26 and SBS44 (COSMIC consortium).

The potential of exposure data to classify cancer samples was also tested in signeRFlow, based on the microsatellite instability (MSI) status also described by [20]. According to clustering and DES results, exposure data seem adequate to identify samples with high microsatellite instability. Thus, the original classification of samples as MSI-High, MSI-Low or MSStable was collapsed into MSI-High versus others, and the classification algorithm adopted this grouping as target. A k-fold cross-validation approach (\(k=8\)) was adopted, producing a ROC curve for the classification found, as well as the related confusion matrix (Fig. 2D). It is worth noting that, as shown in the last column of the confusion matrix, a few samples are not consistently classified by signeR 2.0 and are therefore considered undefined. Although the fraction of these samples is small (\(< 0.69\%\)), assigning them to a group could be spurious; our approach avoids this because it incorporates the variability of exposure data.

Finally, we considered the impact of signature exposure levels on disease prognosis. For each signature, samples were stratified by their exposure levels, after searching for the cutoff value leading to the strongest contrast in overall survival between the resulting strata (function maxstat.test, R package maxstat). The survival contrast among the resulting groups was evaluated by the log-rank test, repeatedly applied to the realizations of the exposure matrix. Signatures SBS1, SBS5, SBS15, SBS21 and SBS26 were reported as significant in prognosis. According to COSMIC, the first two are clock-like signatures, which correlate with the age of the individual, while the last three are associated with MSI samples. As an example, Kaplan–Meier survival curves for signature SBS26 can be found in Fig. 2E.

The results presented in this section are consistent with previous knowledge about STAD. They exemplify how the new signeR functionalities described here can be used to gain further insights into the molecular nature of cancers.

Discussion

signeR 2.0 is a software suite devoted to exploring the information contained in data on exposure to mutational processes. It offers an updated version of the signeR Bayesian approach, with parallel computation functionalities and pre-computed hyper-hyper-parameters, which save computational time. It is presented in a user interface, signeRFlow, which brings, in a ready-to-use form, methods to estimate exposure data from mutation counts and to relate them to available clinical data from the genome samples under study. The results of previous applications of signeR to the TCGA datasets, both de novo and fitting analyses, are available for exploration with signeR 2.0 tools, accompanied by related clinical data. To this end, signeR 2.0 offers a collection of established data analysis methods (classifiers, linear models, survival analysis, etc.) and interfaces to apply them to generated samples of the exposure matrix, outputting summary statistics of individual results.

Results found on the gastric adenocarcinoma dataset (TCGA-STAD) show the software’s potential for exploring available data, hopefully leading to further insights and new discoveries. The observed relation of exposures to some signatures and MSI status or age is in accordance with the literature [20] and demonstrates the potential of this tool to identify patterns of interest in cancer samples. The provided algorithms can be valuable tools to improve patient stratification or prognosis. Thanks to its software interface, signeRFlow, the use of signeR 2.0 does not require extensive computational training, which makes the tool accessible to a wider audience. signeR 2.0 is available as a Bioconductor package. A detailed explanation of how to use its interface is provided as Additional file 1 (S1) and also in the package documentation. signeR is an ongoing project and new versions and functionalities will be released soon.

Conclusions

signeR 2.0 is a valuable tool to generate and explore exposure data, both from de novo and fitting analyses. The software interface, signeRFlow, makes it accessible for a large audience, since its use does not require programming experience. signeR is still in development and new versions and functionalities will be released soon. We hope this software will help both researchers and students develop projects focused on the mutational spectra in cancer samples.

Data availability

Project name: signeR

Project home page: https://doi.org/10.18129/B9.bioc.signeR

Operating system(s): Platform independent

Programming language: R

Other requirements: R 4.3 or higher

License: GNU General Public License v3.0

Any restrictions to use by non-academics: license needed for commercial use

The datasets used or analyzed in this study are available from the cancer genome database (https://portal.gdc.cancer.gov) and COSMIC v3.2 (https://cancer.sanger.ac.uk/cosmic/signatures/SBS).

Code availability

signeR 2.0 is an open-source R package available through the Bioconductor project at https://doi.org/10.18129/B9.bioc.signeR.

References

  1. Stratton MR. Exploring the genomes of cancer cells: progress and promise. Science. 2011;331(6024):1553–8.

  2. Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 2013;3(1):246–59.

  3. Koh AG, Degasperi A, Zou X, Momen S, Nik-Zainal S. Mutational signatures: emerging concepts, caveats and clinical applications. Nat Rev Cancer. 2021;21(10):619–37.

  4. Liu M, Xia S, Zhang X, Zhang B, Yan L, Yang M, et al. Development and validation of a blood-based genomic mutation signature to predict the clinical outcomes of atezolizumab therapy in NSCLC. Lung Cancer. 2022;170:148–55.

  5. Liu Z, Lin G, Yan Z, Li L, Wu X, Shi J, et al. Predictive mutation signature of immunotherapy benefits in NSCLC based on machine learning algorithms. Front Immunol. 2022;13:989275.

  6. Kim YA, Leiserson MDM, Moorjani P, Sharan R, Wojtowicz D, Przytycka TM. Mutational signatures: from methods to mechanisms. Annu Rev Biomed Data Sci. 2021;4(1):189–206. https://doi.org/10.1146/annurev-biodatasci-122320-120920.

  7. Rosales RA, Drummond RD, Valieris R, Dias-Neto E, da Silva IT. signeR: an empirical Bayesian approach to mutational signature discovery. Bioinformatics. 2017;33(1):8–16.

  8. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2018;47(D1):D941–7. https://doi.org/10.1093/nar/gky1015.

  9. Degasperi A, Amarante TD, Czarnecki J, Shooter S, Zou X, Glodzik D, et al. A practical framework and online tool for mutational signature analyses show inter-tissue variation and driver dependencies. Nat Cancer. 2020;1(2):249–63.

  10. Rosenthal R, McGranahan N, Herrero J, Taylor BS, Swanton C. DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 2016;17:31.

  11. Brady SW, Gout AM, Zhang J. Therapeutic and prognostic insights from the analysis of cancer mutational signatures. Trends Genet. 2022;38(2):194–208. https://doi.org/10.1016/j.tig.2021.08.007.

  12. Levatic J, Salvadores M, Fuster-Tormo F, Supek F. Mutational signatures are markers of drug sensitivity of cancer cells. Nat Commun. 2022;13(1):2926.

  13. Van Hoeck A, Tjoonk NH, Van Boxtel R, Cuppen E. Portrait of a cancer: mutational signature analyses for cancer diagnostics. BMC Cancer. 2019;19(1):457.

  14. Buttura JR, Provisor SMN, Valieris R, Drummond RD, Defelicibus A, Lima JP, et al. Mutational signatures driven by epigenetic determinants enable the stratification of patients with gastric cancer for therapeutic intervention. Cancers. 2021;13(3):490.

  15. Zhang Z, Hernandez K, Savage J, Li S, Miller D, Agrawal S, et al. Uniform genomic data analysis in the NCI Genomic Data Commons. Nat Commun. 2021;12(1):1226.

  16. Alexandrov LB, Nik-Zainal S, Siu HC, Leung SY, Stratton MR. A mutational signature in gastric cancer suggests therapeutic strategies. Nat Commun. 2015;6:1–7. https://doi.org/10.1038/ncomms9683.

  17. Huang X, Wojtowicz D, Przytycka TM. Detecting presence of mutational signatures in cancer with confidence. Bioinformatics. 2018;34(2):330–7. https://doi.org/10.1093/bioinformatics/btx604.

  18. Chang W, Cheng J, Allaire J, Sievert C, Schloerke B, Xie Y, et al. shiny: web application framework for R. R package version 1.7.3.9001. https://shiny.rstudio.com/

  19. Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer; 2000.

  20. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature. 2014;513(7517):202–9.

Acknowledgements

Not applicable.

Funding

This work was partially supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Grant 2022/12991-0. ED-N is a research fellow from Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil (CNPq) and acknowledges the support received from Associação Beneficente Alzira Denise Hertzog Silva (ABADHS) and FAPESP Grant 14/26897-0.

Author information

Contributions

RD, RR and IT wrote the main manuscript text. AD developed the shiny app. ED-N, RV and MM reviewed methodology and visualization. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Rafael A. Rosales or Israel Tojal da Silva.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Tutorial of signeR 2.0 and performance comparison against earlier version of signeR.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Drummond, R.D., Defelicibus, A., Meyenberg, M. et al. Relating mutational signature exposures to clinical data in cancers via signeR 2.0. BMC Bioinformatics 24, 439 (2023). https://doi.org/10.1186/s12859-023-05550-3
