Measure transcript integrity using RNA-seq data
- Liguo Wang†1,
- Jinfu Nie†1,
- Hugues Sicotte1,
- Ying Li1,
- Jeanette E. Eckel-Passow1,
- Surendra Dasari1,
- Peter T. Vedell1,
- Poulami Barman1,
- Liewei Wang3,
- Richard Weinshiboum3,
- Jin Jen4,
- Haojie Huang5,
- Manish Kohli2 and
- Jean-Pierre A. Kocher1
© Wang et al. 2016
Received: 2 October 2015
Accepted: 29 January 2016
Published: 3 February 2016
Stored biological samples with pathology information and medical records are invaluable resources for translational medical research. However, RNAs extracted from the archived clinical tissues are often substantially degraded. RNA degradation distorts the RNA-seq read coverage in a gene-specific manner, and has profound influences on whole-genome gene expression profiling.
We developed the transcript integrity number (TIN) to measure RNA degradation. When applied to 3 independent RNA-seq datasets, we demonstrated TIN is a reliable and sensitive measure of the RNA degradation at both transcript and sample level. Through comparing 10 prostate cancer clinical samples with lower RNA integrity to 10 samples with higher RNA quality, we demonstrated that calibrating gene expression counts with TIN scores could effectively neutralize RNA degradation effects by reducing false positives and recovering biologically meaningful pathways. When further evaluating the performance of TIN correction using spike-in transcripts in RNA-seq data generated from the Sequencing Quality Control consortium, we found TIN adjustment had better control of false positives and false negatives (sensitivity = 0.89, specificity = 0.91, accuracy = 0.90), as compared to gene expression analysis results without TIN correction (sensitivity = 0.98, specificity = 0.50, accuracy = 0.86).
TIN is a reliable measurement of RNA integrity and a valuable approach for neutralizing in vitro RNA degradation effects and improving differential gene expression analysis.
Keywords: Transcript integrity number, TIN, RNA-seq quality control, Gene expression
In vitro RNA degradation occurs in most isolated RNA samples, and the degree of degradation depends on specimen collection and storage conditions, such as formalin-fixed, paraffin-embedded (FFPE) versus fresh frozen [1–3]. This is a major issue especially for clinical tissues collected in surgery suites, because optimal storage of collected specimens is often not the primary focus in that setting. Multiple studies have shown that in vitro degradation of RNA impairs accurate measurement of in vivo gene expression [4, 5]. RNA degradation was not a major problem until recently because it has only a minor influence on gene expression measured with hybridization-based microarray platforms, in which the expression of each gene is measured by only a few short, discrete probes. For example, a previous study found that only 0.67 % (275 out of 41,000) of the probes were significantly affected by in vitro RNA degradation. However, in recent years, more studies, including The Cancer Genome Atlas consortium (TCGA), have switched to sequencing-based RNA-seq to profile gene expression. RNA-seq works under the assumption that every nucleotide of a transcript has an equal chance of being sequenced and that the number of reads produced from a transcript is proportional to the abundance and length of the transcript. However, if RNA molecules are partially or completely degraded, the corresponding read yield is distorted accordingly. Hence, in vitro RNA degradation introduces a major source of variation when measuring gene expression via RNA-seq. In support of this hypothesis, a recent study found that up to 56 % of genes were differentially expressed due to in vitro RNA degradation.
RNA Integrity Number (RIN) is the most widely used approach to assess in vitro RNA degradation [1–3, 7]. However, the RIN metric has several weaknesses that limit its application in both pre-sequencing RNA sample screening and post-sequencing RNA-seq data analysis. First, the RIN score relies heavily on the amount of 18S and 28S ribosomal RNAs; the four main features used by the RIN algorithm are the “total RNA ratio”, “28S-region height”, “28S area ratio” and the “18S:28S ratio”. While this metric accurately captures the integrity of ribosomal RNAs, it fails to directly measure the integrity of mRNA, which is the main input for RNA sequencing. Second, RNA decay rate is transcript specific and is modulated by several endogenous and exogenous factors, as well as other factors including “AU-rich” sequences, transcript length, GC content, secondary structure and RNA-protein complexes [4, 5]. RNA decay rate was found to vary between functional groups [6, 8] and between transcripts by up to ten-fold [5, 9, 10]. Third, RIN is an overall assessment of RNA quality and cannot be used as a co-factor to adjust for differential RNA degradation between transcripts in downstream gene expression analysis. Finally, it has been reported that RIN is not a sensitive measure of RNA quality for substantially degraded samples (https://www.illumina.com/content/dam/illumina-marketing/documents/products/technotes/evaluating-rnaquality-from-ffpe-samples-technical-note-470-2014-001.pdf). Illumina® proposed the DV200 metric (the percentage of RNA fragments > 200 nucleotides) to assess RNA quality. However, like RIN, DV200 is an overall measurement and fails to determine RNA degradation at the transcript level.
The reduction of sequencing cost has opened doors for large-scale, RNA-seq-based gene expression profiling studies (like TCGA) that use clinical specimens with rich outcomes data. At the same time, the RNA quality of these clinical samples can vary significantly, which poses a great challenge to gene expression analysis. Here we developed a novel algorithm, the transcript integrity number (TIN), to evaluate RNA integrity from RNA-seq data. We applied our TIN algorithm to RNA-seq data generated from 12 human glioblastoma (GBM) cell line samples, 20 human peripheral blood mononuclear cell (PBMC) samples, and 120 metastatic castration resistant prostate cancer (mCRPC) samples. Our results showed that the TIN metric accurately measured mRNA integrity at the transcript level, as demonstrated by high concordance with the RNA fragment size estimated from RNA-seq read pairs. We also demonstrated that the median TIN score (medTIN) across all transcripts is an accurate and reliable measurement of RNA integrity at the transcriptome (or “sample”) level. More importantly, the TIN computed for each transcript can be used to adjust gene expression and improve differential expression analysis by reducing the false positives ascribed to in vitro RNA degradation.
Results and discussion
Measuring sample level RNA integrity
The 3′ bias observed in RNA-seq data could arise from RNA degradation by 5′ exonuclease [11, 12], and the commonly used polyA enrichment approach would lead to an even stronger 3′ bias, particularly in degraded RNA samples, because oligo(dT) selection will only isolate the most 3′ portion of the transcript. Consistent with this hypothesis, we found that samples with lower medTIN scores usually had more skewed gene body coverage (Fig. 1c-d). The PBMC dataset was excluded from further analysis because its single-end sequencing design prevents the estimation of RNA fragment size.
Measuring transcript level RNA integrity
As the overall RNA quality decreased, the concordance between TIN and fragment size also decreased (Additional file 8: Figure S5). For example, Pearson’s r was 0.88, 0.89, and 0.88 for three samples with a RIN score of 10, whereas it was 0.66, 0.61 and 0.63 for three samples with a RIN score of 6 (Additional file 7: Figure S4 and Additional file 8: Figure S5). This is because of the non-linear relationship between TIN and RNA fragment size (Fig. 3): the correlation was mainly determined by those transcripts whose TINs were smaller than the saturation point.
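The concordance values quoted above are Pearson correlations between per-transcript TIN and per-transcript fragment size. A minimal sketch of that calculation in plain Python follows; the function name and the example values are hypothetical and do not come from the study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up per-transcript values, for illustration only
tin_scores = [35.0, 48.0, 60.0, 71.0, 80.0]
fragment_sizes = [120.0, 140.0, 155.0, 170.0, 172.0]
r = pearson_r(tin_scores, fragment_sizes)
```

Because Pearson's r only captures linear association, a saturating relationship such as the one described above lowers r even when the two quantities remain monotonically related.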
Effects of transcript features on TIN score
Using TIN to adjust for RNA degradation in gene differential expression analysis
Table 1 Functional annotation analysis using DAVID (http://david.abcc.ncifcrf.gov/) for 4 lists of differentially expressed genes (DEGs)
Enriched pathways for the 665 differentially expressed genes in mCRPC samples (without TIN correction):
- structural constituent of ribosome
- large ribosomal subunit
- cytosolic large ribosomal subunit
Enriched pathways for the top 500 differentially expressed genes in human brain glioblastoma cell line data (without TIN correction):
- structural constituent of ribosome
- structural molecule activity
- large ribosomal subunit
Enriched pathways for the 289 differentially expressed genes in mCRPC samples (after TIN correction):
- icosanoid metabolic process
- unsaturated fatty acid metabolic process
- fatty acid metabolic process
- prostaglandin metabolic process
- prostanoid metabolic process
- Arachidonic acid metabolism
- PPAR signaling pathway
Enriched pathways for the 117 differentially expressed genes in mCRPC samples (using 3′ tag counting method):
- protein complex assembly
- protein complex biogenesis
- macromolecular complex assembly
- macromolecular complex subunit organization
- cellular macromolecular complex subunit organization
Evaluate TIN correction using SEQC RNA-seq data with spike-in controls
The quality of the commercially available reference RNA samples used in the SEQC project was presumably high. Therefore, the improvement from TIN correction is unlikely to be explained by the mitigation of RNA quality differences. However, in addition to RNA degradation, RNA-seq has many other inherent biases (such as GC content, polyA selection, mappability, etc.) that can also produce non-uniform coverage, which could partially explain the improvement after TIN correction.
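Because the spike-in transcripts have known expression differences between groups, each differential expression call can be scored as a true or false positive or negative. The sketch below shows how sensitivity, specificity and accuracy follow from those four counts. The function name and the example counts are hypothetical and are not the counts behind the values reported in this article.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: truly differential spike-ins that were / were not called.
    tn/fp: non-differential spike-ins that were not / were called.
    """
    sensitivity = tp / (tp + fn)          # fraction of true DEGs recovered
    specificity = tn / (tn + fp)          # fraction of non-DEGs correctly left uncalled
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Illustrative counts only (not taken from the SEQC analysis)
sens, spec, acc = classification_metrics(tp=8, fp=1, tn=9, fn=2)
```

A method that calls nearly everything differential reaches high sensitivity at the cost of specificity, which is the pattern described for the uncorrected analysis.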
Comparing TIN correction to 3′ tag counting method
For the 3TC method, deciding the size of N is not straightforward: to retain statistical power, N should be as large as possible; however, coverage bias cannot be effectively removed if N is too large. To determine the proper N size, we generated read coverage profiles for 20 mCRPC samples with all expressed transcripts aligned to the 3′ end (i.e. transcription end site) (Fig. 7c). Based on Fig. 7c, we set N to 250 and then performed gene expression analysis using the same procedure (see Methods). As we expected, the 3TC method detected 117 DEGs (Additional file 23: Table S8), a much smaller number than the 289 DEGs detected with TIN correction and the 665 DEGs detected without TIN correction. Although 29 genes were detected by both the 3TC and TIN correction methods (Fig. 7d), no prostate or prostate cancer relevant pathways were enriched for the 117 DEG list (Table 1).
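The counting step of a 3′ tag approach can be sketched as follows. This is a simplified illustration, not the published 3TC implementation: it assumes reads are already expressed in transcript coordinates (0-based, half-open, 5′ to 3′) and treats any read overlapping the last N nucleotides as a 3′ tag; the function name is hypothetical.

```python
def three_prime_tag_count(read_intervals, transcript_length, n=250):
    """Count reads overlapping the last `n` nucleotides of a transcript.

    read_intervals: iterable of (start, end) in transcript coordinates,
                    0-based and half-open, oriented 5' -> 3'.
    """
    # Transcripts shorter than n are counted over their full length
    window_start = max(0, transcript_length - n)
    return sum(
        1
        for start, end in read_intervals
        if end > window_start and start < transcript_length
    )


# Four reads on a 1000-nt transcript; the window is positions 750-1000
reads = [(0, 100), (700, 800), (760, 860), (900, 1000)]
count = three_prime_tag_count(reads, transcript_length=1000, n=250)
```

The trade-off described above is visible here: a larger `n` admits more reads per gene, but extends the window into the region where coverage is already depleted in degraded samples.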
Comparing TIN to mRIN
Although TIN and Agilent’s RIN are highly concordant, there are three major differences between them. First, RIN is a valuable approach for pre-sequencing sample screening, while TIN scores can only be calculated after RNA-seq data are produced. Second, when using RNA fragment size as a surrogate for RNA integrity to compare RIN and medTIN, we found that Agilent’s RIN only worked well for samples with relatively high RNA integrity, as evidenced by the spread of the distribution of blue circles in Fig. 2c. In contrast, medTIN was more sensitive for samples with low integrity, as demonstrated by the wider spread of the distribution of red circles in Fig. 2d. Third, TIN provides RNA quality measurements at the transcript level, which not only enables transcript level quality control, but also helps improve gene expression analysis. This is particularly useful given that different genes are usually degraded differently.
Since RNA fragment size can be directly estimated from paired-end RNA-seq data, one might question the need for TIN. There are several drawbacks for measuring the RNA integrity using RNA fragment size alone. First, it can only be estimated from paired-end RNA-seq data. Second, RNA fragment size is influenced by other confounding factors such as the fragmentation and size selection steps during library preparation.
We chose 10 mCRPC samples with lower RIN/medTIN scores (low RIN group) and another 10 samples with higher RIN/medTIN scores (high RIN group) with the primary purpose of comparing the “RNA degradation effect” on gene expression analysis. Unlike the GBM and PBMC datasets, which were generated from cell lines, the mCRPC dataset was generated from real clinical tissues and represented genuine RNA degradation complexity and inter-tumor heterogeneity. However, this was a less than ideal dataset for two reasons. 1) These 20 clinical samples were not exact biological replicates, and their pathology characteristics differed slightly (Additional file 14: Table S4). For example, Gleason scores were slightly lower in the “low RIN group” (mean = 6.9, median = 7) than in the “high RIN group” (mean = 7.3, median = 8), even though the difference was not statistically significant (P = 0.28, two-sided Wilcoxon rank sum test). These pathological differences between the low and high RIN groups also explained the detection of prostate cancer related DEGs. 2) Unlike SEQC, which had spike-in transcripts with predetermined expression values, there were no “true DEGs” available to accurately test the performance of TIN correction. However, we demonstrated through pathway analysis that TIN correction could remove ribosome genes and identify DEGs related to prostate cancer.
It is known that oligo(dT) is not an ideal choice for isolating mRNA from degraded samples. Other protocols such as exome capture have been demonstrated to greatly improve performance. However, using oligo(dT) to isolate polyadenylated mRNA is the most widely used RNA-seq protocol, especially at the early stage when more advanced protocols were not available. For example, BrainSpan (Atlas of the Developing Human Brain, http://www.brainspan.org/) used oligo(dT) to deplete rRNA during RNA-seq library preparation for RNA samples collected from post-mortem tissues. Being designed to correct non-uniform coverage derived from RNA degradation as well as other biases, our TIN algorithm would be a useful approach to reanalyze or meta-analyze such RNA-seq data available from public repositories. On the other hand, even for samples with reasonable RNA integrity (e.g. RIN = 8), 3′ bias still persists (Fig. 1c), and we have demonstrated using the SEQC dataset that TIN could improve gene expression analysis even when the RNA quality is high.
In this study, we developed TIN as a novel metric to measure RNA integrity, and demonstrated with multiple datasets that the TIN metric is not only a reliable measurement of RNA integrity at both the transcriptome and transcript levels, but also a valuable metric for neutralizing in vitro RNA degradation effects and improving differential gene expression analysis.
Human U-251 MG brain glioblastoma cell lines (GBM). This dataset has 12 paired-end RNA-seq data files available under SRA accession SRP023548. Samples in this dataset have a wide range of RIN values: three samples with a RIN value of 10 (SRR873838, SRR873834 and SRR873822), two samples with a RIN value of 8 (SRR879615 and SRR879800), three samples with a RIN value of 6 (SRR880232, SRR881272 and SRR880070), three samples with a RIN value of 4 (SRR881852, SRR881451, and SRR881672) and one sample with a RIN value of 2 (SRR881985). Additional file 1: Table S1 presents the details of this dataset.
Human peripheral blood mononuclear cells (PBMC). This dataset has 20 single-end RNA-seq data files available under SRA accession SRP041955. This dataset was developed to estimate the in vitro degradation at 12 h, 24 h, 48 h and 84 h. Additional file 2: Table S2 presents the details of the samples along with their associated RIN values (which varied from 2.8 to 9.4).
Sequencing quality control consortium data set (SEQC). The Sequencing Quality Control Consortium analyzed samples containing reference RNA. This dataset was downloaded from NCBI Gene Expression Omnibus (GEO) with accession number GSE49712. This SEQC subset has a total of 10 samples. Group A contains 5 replicates (SRR950078, SRR950080, SRR950082, SRR950084 and SRR950086) of the Stratagene Universal Human Reference RNA (UHRR) and Group B has 5 replicates (SRR950079, SRR950081, SRR950083, SRR950085 and SRR950087) of the Ambion Human Brain Reference RNA (HBRR). ERCC (External RNA Controls Consortium) control mix was spiked into both groups at 2 % by volume. This control mixture contains 92 synthetic polyadenylated oligonucleotides of 250-2000 nucleotides in length, which were meant to resemble human transcripts.
Human prostate cancer tissue samples (mCRPC). This study was approved by the Mayo Clinic Institutional Review Board and conducted in accordance with the Declaration of Helsinki. We obtained a total of 120 samples from 46 castration-resistant prostate cancer patients. Of the 120 collected samples, 62 were blood samples, 18 were metastatic rib lesion biopsies and 40 were metastatic bone tissue biopsies. Tissues were snap frozen in liquid nitrogen and RNA was harvested using the RNeasy Plus Mini Kit (Qiagen). RNA libraries were prepared according to the manufacturer’s instructions for the TruSeq RNA Sample Prep Kit v2 (Illumina, San Diego, CA). Briefly, poly-A mRNA was purified from total RNA using oligo dT magnetic beads. The purified mRNA was fragmented at 95 °C for 8 min and eluted from the beads. Double stranded cDNA was made using SuperScript III reverse transcriptase, random primers (Invitrogen, Carlsbad, CA), DNA polymerase I and RNase H. The cDNA ends were repaired and an “A” base was added to the 3′ ends. TruSeq paired end index DNA adaptors (Illumina, San Diego, CA) with a single “T” base overhang at the 3′ end were ligated and the resulting constructs were purified using AMPure SPRI beads from Agencourt. The adapter-modified DNA fragments were enriched by 12 cycles of PCR using Illumina TruSeq PCR primers. The concentration and size distribution of the libraries were determined on an Agilent Bioanalyzer DNA 1000 chip and by Qubit fluorometry (Invitrogen, Carlsbad, CA). Paired-end RNA sequencing was performed using an Illumina HiSeq 2500. Additional file 3: Table S3 presents the details of this dataset.
Determine the RNA integrity number (RIN)
All mCRPC RNA samples were analysed by Agilent Bioanalyzer 2100 before sequencing. Based on the recorded electropherograms, RIN values were calculated according to the algorithm considering four features: “total RNA ratio” (i.e. the fraction of the area in the region of 18S and 28S compared to the total area under the curve), 28S-region height, 28S area ratio and the 18S:28S ratio. RIN values of GBM, PBMC and SEQC RNA samples were obtained from the original publications.
Algorithm for computing the transcript integrity number (TIN)
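As a rough illustration of a coverage-evenness score of this kind, the sketch below assumes the Shannon-entropy formulation: the relative coverage at each nucleotide position is treated as a probability, the entropy H of that distribution is converted to an effective number of evenly covered positions exp(H), and the result is expressed as a percentage of the positions examined. The function names are hypothetical, every position is scored rather than a sampled subset, and this is not the tin.py implementation.

```python
import math
from statistics import median


def transcript_integrity_number(coverage):
    """Score the evenness of per-position read coverage on a 0-100 scale.

    coverage: list of read depths, one per nucleotide position examined.
    Assumes score = 100 * exp(H) / k, with H the Shannon entropy of the
    relative coverage and k the number of positions.
    """
    total = sum(coverage)
    if total == 0:
        return 0.0  # no reads: nothing to score
    # Relative coverage of positions with at least one read
    probs = [c / total for c in coverage if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    effective_positions = math.exp(entropy)
    return 100.0 * effective_positions / len(coverage)


def median_tin(tin_scores):
    """Sample-level summary (medTIN): the median score across transcripts."""
    return median(tin_scores)


uniform = transcript_integrity_number([5] * 10)        # perfectly even coverage
collapsed = transcript_integrity_number([10, 0, 0, 0])  # all reads at one position
```

Under this formulation, perfectly uniform coverage scores 100, and the score falls as reads pile up on a shrinking portion of the transcript.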
Calculating library RNA fragment size
RNA fragment size is the natural measure of the in vitro RNA degradation. Since read pairs were sequenced from both ends of RNA (actually cDNA) fragments, the size of each RNA fragment in the sequencing library can be directly estimated from the distance between read pairs after mapping them to the reference genome. We used uniquely mapped high quality (mapq ≥ 30) read pairs to estimate the RNA fragment size. When a read pair was mapped to the same exon, the fragment size is defined as the genomic distance covered by the two reads (i.e. distance between the “start” of the first read and “end” of the second read). When a read pair was mapped to different exons of the same gene, introns lying between the two reads were subtracted from the genomic distance covered by the read pair. We considered the longest RNA isoform when multiple splicing isoforms (exon skipping, intron retention, alternative donor/acceptor sites, etc.) exist. We removed transcripts with <30 mapped read-pairs to improve the reliability of library fragment size estimation. The “sample level” RNA fragment size was estimated by taking the average of fragment sizes calculated from all read pairs that uniquely mapped to the reference genome. Similarly, the “transcript level” RNA fragment size was estimated from all read pairs that specifically mapped to a transcript.
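The fragment size rule described above (genomic span of the read pair minus any intervening introns) can be sketched as follows. The sketch assumes 0-based, half-open genomic coordinates and an exon list for the longest isoform; the function names are hypothetical, and filtering for unique mapping and mapq ≥ 30 is assumed to have happened upstream.

```python
def fragment_size(pair_start, pair_end, exons):
    """Exonic length spanned by a read pair.

    pair_start: genomic start of the leftmost read.
    pair_end:   genomic end of the rightmost read.
    exons:      list of (start, end) intervals of the longest isoform.
    Summing exon overlap is equivalent to subtracting introns from the span.
    """
    size = 0
    for exon_start, exon_end in exons:
        overlap_start = max(pair_start, exon_start)
        overlap_end = min(pair_end, exon_end)
        if overlap_end > overlap_start:
            size += overlap_end - overlap_start
    return size


def transcript_fragment_size(pairs, exons, min_pairs=30):
    """Mean fragment size for one transcript, or None if too few read pairs."""
    if len(pairs) < min_pairs:
        return None  # transcripts with <30 read pairs are excluded
    sizes = [fragment_size(start, end, exons) for start, end in pairs]
    return sum(sizes) / len(sizes)


exons = [(0, 100), (200, 300)]
same_exon = fragment_size(10, 90, exons)     # both reads in the first exon
spliced = fragment_size(50, 250, exons)      # pair spans the 100-nt intron
```

In the spliced case the genomic span is 200 nt, and removing the 100-nt intron leaves a 100-nt fragment.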
Normalizing gene level read counts using TIN metric
where y_i′ denotes the normalized read count of gene i and ŷ_i denotes the fitted value.
Differential expression analysis
We applied the same procedure to the mCRPC dataset (comparing 10 samples with lower RIN/TIN values to 10 samples with higher RIN/TIN values), the GBM dataset (comparing three samples with RIN = 10 to three samples with RIN = 4) and the SEQC dataset (comparing group A to group B). This method used edgeR (version 3.6.8) to perform differential expression analysis. The software was configured to use the TMM (trimmed mean of M values) method to normalize library depth differences between samples. Differential expression p-values were FDR corrected using the Benjamini-Hochberg method. Genes with an FDR of ≤ 0.01 were considered differentially expressed between groups.
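The Benjamini-Hochberg adjustment and the FDR ≤ 0.01 cutoff can be illustrated with a short standalone sketch. It is a generic implementation of the procedure, not edgeR's code; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four genes
pvals = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(pvals)
# Genes passing the FDR <= 0.01 threshold used in this study
called = [i for i, q in enumerate(fdr) if q <= 0.01]
```

With only four tests, none of these made-up genes passes the 0.01 threshold after adjustment, even though the smallest raw p-value is 0.005.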
Availability of supporting data
Twenty RNA-seq datasets generated from metastatic prostate cancer tissues were submitted to Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE70285 (reviewers’ link: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=knchmaksrfqfnov&acc=GSE70285). Python code to calculate the TIN score (tin.py) is freely available in the RSeQC package (http://rseqc.sourceforge.net).
Abbreviations
- CDS: coding DNA sequence
- DEG: differentially expressed gene
- ERCC: External RNA Controls Consortium
- FPKM: fragments per kilobase of transcript per million mapped reads
- GEO: Gene Expression Omnibus
- HBRR: Ambion Human Brain Reference RNA
- mCRPC: metastatic castration resistant prostate cancer
- PBMC: peripheral blood mononuclear cell
- RIN: RNA integrity number
- SEQC: Sequencing Quality Control consortium
- SRA: Sequence Read Archive
- TES: transcription end site
- TIN: transcript integrity number
- TSS: transcript start site
- UHRR: Stratagene Universal Human Reference RNA
This work is supported by the Mayo Clinic Center for Individualized Medicine; A.T. Suharya and Ghan D.H.; Joseph and Gail Gassner; and Mayo Clinic Schulze Cancer for Novel Therapeutics in Cancer Research [grant number MC1351 to M.K]; National Institutes of Health [grant numbers CA134514, CA130908 to H.H.]. Other contributing groups include the Mayo Clinic Cancer Center and the Pharmacogenomics Research Network (PGRN).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- von Ahlfen S, Missel A, Bendrat K, Schlumpberger M. Determinants of RNA quality from FFPE samples. PLoS One. 2007;2:e1261.
- Masuda N, Ohnishi T, Kawamoto S, Monden M, Okubo K. Analysis of chemical modification of RNA from formalin-fixed samples and optimization of molecular biology applications for such samples. Nucleic Acids Res. 1999;27:4436–43.
- Botling J, Edlund K, Segersten U, Tahmasebpoor S, Engström M, Sundström M, et al. Impact of thawing on RNA integrity and gene expression analysis in fresh frozen tissue. Diagn Mol Pathol. 2009;18:44–52.
- Gallego Romero I, Pai AA, Tung J, Gilad Y. RNA-seq: impact of RNA degradation on transcript quantification. BMC Biol. 2014;12:42.
- Sigurgeirsson B, Emanuelsson O, Lundeberg J. Sequencing degraded RNA addressed by 3′ tag counting. PLoS One. 2014;9:e91851.
- Opitz L, Salinas-Riester G, Grade M, Jung K, Jo P, Emons G, et al. Impact of RNA degradation on gene expression profiling. BMC Med Genomics. 2010;3:36.
- Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, et al. The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol. 2006;7:3.
- Yang E, van Nimwegen E, Zavolan M, Rajewsky N, Schroeder M, Magnasco M, et al. Decay rates of human mRNAs: correlation with functional characteristics and sequence attributes. Genome Res. 2003;13:1863–72.
- Beelman CA, Parker R. Degradation of mRNA in eukaryotes. Cell. 1995;81:179–83.
- van Hoof A, Parker R. The exosome: a proteasome for RNA? Cell. 1999;99:347–50.
- Houseley J, Tollervey D. The many pathways of RNA degradation. Cell. 2009;136:763–76.
- Garneau NL, Wilusz J, Wilusz CJ. The highways and byways of mRNA decay. Nat Rev Mol Cell Biol. 2007;8:113–26.
- Adiconis X, Borges-Rivera D, Satija R, DeLuca DS, Busby MA, Berlin AM, et al. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat Methods. 2013;10:623–9.
- Brisco MJ, Morley AA. Quantification of RNA integrity and its use for measurement of transcript number. Nucleic Acids Res. 2012;40:e144.
- Bauer M, Polzin S, Patzelt D. Quantification of RNA degradation by semi-quantitative duplex and competitive RT-PCR: a possible indicator of the age of bloodstains? Forensic Sci Int. 2003;138:94–103.
- Gong X, Tao R, Li Z. Quantification of RNA damage by reverse transcription polymerase chain reactions. Anal Biochem. 2006;357:58–67.
- Duan J, Shi J, Ge X, Dölken L, Moy W, He D, et al. Genome-wide survey of interindividual differences of RNA stability in human lymphoblastoid cell lines. Sci Rep. 2013;3:1318.
- Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4:44–57.
- Nie D, Che M, Grignon D, Tang K, Honn KV. Role of eicosanoids in prostate cancer progression. Cancer Metastasis Rev. 2001;20:195–206.
- Liu Y. Fatty acid oxidation is a dominant bioenergetic pathway in prostate cancer. Prostate Cancer Prostatic Dis. 2006;9:230–4.
- Baron A, Migita T, Tang D, Loda M. Fatty acid synthase: a metabolic oncogene in prostate cancer? J Cell Biochem. 2004;91:47–53.
- Moreno J, Krishnan AV, Swami S, Nonn L, Peehl DM, Feldman D. Regulation of prostaglandin metabolism by calcitriol attenuates growth stimulation in prostate cancer cells. Cancer Res. 2005;65:7917–25.
- Wierenga RK, Hol WG. Predicted nucleotide-binding properties of p21 protein and its cancer-associated variant. Nature. 1983;302:842–4.
- Fukumoto M, Amanuma T, Kuwahara Y, Shimura T, Suzuki M, Mori S, et al. Guanine nucleotide-binding protein 1 is one of the key molecules contributing to cancer cell radioresistance. Cancer Sci. 2014;105:1351–9.
- Matthews JM, Lester K, Joseph S, Curtis DJ. LIM-domain-only proteins in cancer. Nat Rev Cancer. 2013;13:111–22.
- Feng H, Zhang X, Zhang C. mRIN for direct assessment of genome-wide and gene-specific mRNA integrity from large-scale RNA-sequencing data. Nat Commun. 2015;6:7816.
- Cieslik M, Chugh R, Wu Y-M, Wu M, Brennan C, Lonigro R, et al. The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing. Genome Res. 2015;25:1372–81.
- SEQC/MAQC-III Consortium. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat Biotechnol. 2014;32:903–14.
- Jost L. Entropy and diversity. Oikos. 2006;113(2):363–75.
- Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–40.
- Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11:R25.
- Wang L, Wang S, Li W. RSeQC: quality control of RNA-seq experiments. Bioinformatics. 2012;28:2184–5.