Genome-wide analysis of retention of U2 and U12-type introns
To demonstrate the application of IntEREst in comparing retention levels of various types of introns across several samples, we reanalyzed the RNAseq data from myelodysplastic syndrome (MDS) patients and control subjects included in Madan et al. [17] study. Specifically, we compared the genome-wide retention levels of U12-type introns vs U2-type across the MDS samples. This disease is caused by mutations in the ZRSR2 gene that encodes an integral protein component of the minor spliceosome. Moreover, the original analysis of the dataset reported that the ZRSR2 mutations in the patient samples led to increased retention of primarily U12-type introns while the U2-type introns were reported to be less affected [17]. The dataset represents 16 individuals: 8 were diagnosed with MDS and featured mutations in the ZRSR2 gene (referred to as ZRSR2mut), 4 were diagnosed with MDS but lacked the ZRSR2 mutations (referred to as ZRSR2wt), and 4 were healthy individuals (HEALTHY).
We ran genome-wide retention comparison of U12-type introns to U2-type introns. To carry out the analysis, we used RefSeq as a reference and identified and annotated 510 U12-type introns using the annotateU12() function that uses Position Weigh Matrices (PWM) extracted from the U12DB database [9]. Next we performed the differential IR analysis using the DESeq2-based function of IntEREst (comparing the ZRSR2mut samples vs ZRSR2wt and HEALTHY). The DESeq2 test was run by considering both results from intron retention and exon-exon junction runs of interest() function. Initially, by using the interestResultIntEx() function a result object was built that includes information of both intron retention and exon-exon junction levels (see Additional file 1 for more details).
The results show an increased retention of U12-type introns in the ZRSR2mut samples as opposed to U2-type introns. Specifically, after the low retention filtering and using a 0.01 adjusted p-value cutoff on the DESeq2 results, we identified 1521 introns representing either the U12- or U2- type that displayed higher retention levels in the ZRSR2mut samples compared to the controls (i.e. ZRSR2wt and HEALTHY samples). Of the 510 U12-type introns in the data, 269 (i.e. 52.7% of the U12-type introns) showed significant up-regulation of IR in the ZRSR2mut samples when compared to the controls, while none of the U12-type introns showed a significant reduction in IR (see Fig. 1a). In contrast, only 1252 of the 228524 (~0.54%) of U2-type introns analyzed showed a significant increase of IR and 89 (~0.03%) showed a significant decrease (see Fig. 1b). Our analysis also confirmed the earlier observation of increased intron retention levels with U12-type introns compared to U2-type introns [6, 11, 31] since we observed that the overall FPKM retention values (formula 1) of U12-type introns were higher than that of U2-type in all the samples of the MDS study, including the ZRSR2mut, ZRSR2wt and HEALTHY samples (Fig. 2a, b). However, this effect was more prominent in the ZRSR2mut samples, suggesting that the ZRSR2 mutations were exacerbating the IR of the U12-type introns. Similar increase in IR was not observed with the U2-type introns between ZRSR2mut and controls, regardless of whether they were located in the genes containing U12-type introns or other genes, or in the close proximity of U12-type introns (immediately up- or downstream position). Rather, the median log2 fold-change of the U2-type introns was approximately zero whereas the median log2 fold-change of U12-type introns was ~1.5 (see Fig. 2c). Moreover, the Jonckheere Trend test [32, 33] with 10000 number of permutations, under the null hypothesis that the values are similar (and with the alternative that the values for the U12-type introns are higher) returned a highly significant p-value of 0.0001. In line with these results, the median of ΔΨ values (i.e. the increase of percentage spliced in when comparing ZRSR2mut samples to the controls) for all U12-type introns was about 1% as compared to 0.6% with U2-type introns (see Fig. 2d). Moreover, the average ΔΨ values for introns showing a significant increase in IR were ~33% and 23% for U12-type and U2-type introns, respectively.
To further evaluate the validity and generality of our results, we compared the MDS results to the similar results that we obtained from analyzing an additional Maize data [34] (see Additional file 1 for more details). The Maize data is constructed of 6 samples (i.e. 3 roots and 3 shoots referred to as RGH3mut) that feature mutations in the gene RGH3 (ortholog of Human ZRSR2 gene) and 6 samples (3 roots and 3 shoots referred to as RGH3wt) that lack the mutation. The results of the Maize data analysis mirror our findings with the MDS data. Analogous to MDS data, the RGH3mut samples showed increased IR with ~46% of U12-type introns, while only a ~0.46% of the U2-type introns showed an increase in IR (see Additional file 1: Figure S7).
Together, our results suggest that IntEREst provides reliable quantification of differential IR events; Specifically, our results are not only consistent with the well-documented increased retention levels of U12-type introns [6, 11, 31], but are also in concordance with the molecular function of the ZRSR2 protein (and its Maize ortholog, i.e. RGH3) in the recognition of U12-type introns [17, 34].
Benchmarking and comparison to other methods
We evaluated the performance of the IntEREst in two ways using the MDS benchmark dataset. First, we carried out internal analysis comparing IntEREst results in conjunction with different statistical analysis packages implemented in IntEREst. Subsequently, we carried out comparison with both, the published results of the MDS analysis [17] and IRfinder [23], i.e. dedicated software for IR analysis. Note that all comparisons described in the following are based on the introns that were available in the both references used by the compared counterparts.
Differential up- and down–regulated introns in methods implemented in IntEREst
We compared the three methods implemented in IntEREst for differential intron retention analysis, i.e. DESeq2, GLM function of edgeR and DEXSeq, referred hereafter as IntEREst-DESeq2, IntEREst-edgeR and IntEREst-DEXSeq, respectively. The DESeq2 and edgeR have been previously reported to result in somewhat dissimilar results in differential gene expression analysis [35]. In contrast, DEXseq method differs in its application (see above). For IntEREst-DESeq2 and IntEREst-edgeR comparison, we first merged the intron-exon and the exon-exon junction results (obtained by running interest() in its two running modes) using interestResultIntEx(). Subsequently, we used deseqInterest() and glmInterest() functions (i.e. the IntEREst functions based on DESeq2 and edgeR-GLM) to analyze the change of IR relative to the change of the junction levels of their flanking exons. We used an adjusted p-value (Benjamini and Hochberg [36]) threshold cutoff of 0.01 to identify introns that are retained at significantly higher or lower level in the ZRSR2mut samples compared to controls (see Additional file 1 for more details).
We found that there is a significant overlap both with upregulated and downregulated introns between the IntEREst-DESeq2 and IntEREst-edgeR (Fig 3a, b), with a bias towards upregulated introns. Furthermore, of the introns not shared between the two methods, the IntEREst-DESeq2 identified more introns with an increase in IR, while the IntEREst-edgeR identified more downregulated IR events. The majority of the IR events not shared by the two methods (specifically those discovered by IntEREst-DESeq2 and missed by IntEREst-edgeR) display a weaker IR fold-change compared to those in the shared intron group (See Additional file 1: Figure S3). The observed differences between the two methods are in line with the recent DEG analysis results [35] and are due to the variability of the methods used and an extra filtering step based on Cook’s distance which is used in DESeq2 by default.
Comparison of the IntEREst-DESeq2 to IntEREst-DEXSeq revealed a considerable overlap between the two methods (Fig. 3 c). However, IntEREst-DEXSeq identified a large number of significantly less retained introns not identified by the IntEREst-DESeq2 (Fig. 3 d). This outcome reflects the gene-wise method adapted in DEXSeq where the variation in the retention levels of each intron is compared to the relative retention variation of all other introns within the same gene, rather than solely comparing the genome-wide changes of IR levels. This results in a more symmetric distribution of up/down regulated intron retention signals (Fig. S4). As a consequence, the significantly more and less retained introns discovered by IntEREst-DEXSeq were more than twice more frequently observed in the same genes compared to those identified by IntEREst-DESeq2. Furthermore, the IntEREst-DEXSeq only consider the reads that map to either introns or exons (here the intron read counts were used) and does not support the usage of both intron retention and exon-exon junction information.
IntEREst-DESeq2 and IRFinder show extensive overlap
We next compared the IntEREst-DESeq2 to IRFinder, a dedicated IR analysis software, which also uses DESeq2 package in its downstream analysis [27]. Since IntEREst-DESeq2 counts reads that map to the exons, we used the mean of the number of reads mapping to the 5’ and 3’ flanking exons. In contrast, the IRFinder counts the junction reads that map across the flanking exons. Running IRFinder with the default parameters extracted 250 introns showing significantly increased IR in ZRSR2mut samples, most of which (i.e. 235) overlapped with the introns discovered by the IntEREst-DESeq2 (Fig. 3 e). Note that IntEREst utilized more intron/exon-mapped reads compared to IRFinder. This was particularly evident with introns with lower retention levels, thus providing better-supported fold-change estimates for such introns (Additional file 1: Figure S5).
Enhanced discovery of IR events in MDS samples
We further compared our IR results with the original analysis of the MDS dataset by Madan et al. [17]. We found that IntEREs-DESeq2 was able to identify most (i.e. 177 out of 205 introns) of the significant IR events reported by Madan et al. [17], but it also discovered a large number of additional events not reported in the original study (Fig. 3f), representing both U12-type (149) and U2-type (1195) introns. On the contrary, the events that were reported in the original study, but missed in our analysis all represent borderline cases featuring low fold-changes and statistical significance (Additional file 1: Figure S6).
Together, our results revealed that the different methods implemented in IntEREst are able to identify a highly overlapping set of high-confidence differentially retained introns. Additionally, each method also identified IR events that are unique to a particular method. This provides the flexibility to select an approach best fitting to the particular research questions.
Sample size and sequencing library size sensitivity
Finally, we studied the effect of the number of biological replications and the intron read coverage levels again using the MDS dataset. To investigate the effect of biological replication, we randomly picked 2 to 8 MDS, ZRSR2mut and control samples for our analysis with IntEREst-DESeq2 (i.e. DESeq2Interest() ) and repeated this 10 times. As expected, the results reveal that increasing numbers of biological replicates lead to a discovery of an increased number of statistically significant IR events (Fig. 4a, b). This observation is similar to what has been observed earlier with gene expression analyses [35].
A similar trend was also observed when analyzing the effect of intron/exon read coverage levels. Here we distributed 5-50 million reads according to the relative retention levels of the introns and exon-exon junction levels (based on the complete data) in each sample, followed by analysis with IntEREst-DESeq2. In our analyses we assumed that the quality and read coverage is equal in all the individual MDS datasets. As a result, we observed that an increase in the sequencing library size leads to a discovery of increasing numbers of introns showing statistically significant deviation in the IR levels. However, the slope of increase of the number of discovered IR events decreases and levels off at the highest library sizes (more than 35M; Fig. 4c).