Addressing the Challenge of Defining Valid Proteomic Biomarkers and Classifiers
© Dakna et al; licensee BioMed Central Ltd. 2010
Received: 3 November 2010
Accepted: 10 December 2010
Published: 10 December 2010
The purpose of this manuscript is to provide, based on an extensive analysis of a proteomic data set, suggestions for proper statistical analysis for the discovery of sets of clinically relevant biomarkers. As a tractable example we define the measurable proteomic differences between apparently healthy adult males and females. We chose urine as the body fluid of interest and CE-MS, a thoroughly validated platform technology that allows routine analysis of a large number of samples. The second urine of the morning was collected from apparently healthy male and female volunteers (aged 21-40) in the course of the routine medical check-up before recruitment at the Hannover Medical School.
We found that the Wilcoxon-test is best suited for the definition of potential biomarkers. Adjustment for multiple testing is necessary. Sample size estimation can be performed based on a small number of observations via resampling from pilot data. Machine learning algorithms appear ideally suited to generate classifiers. Assessment of any results in an independent test-set is essential.
Valid proteomic biomarkers for diagnosis and prognosis can only be defined by applying proper statistical data mining procedures. In particular, a justification of the sample size should be part of the study design.
Is the change (frequency or abundance) of a certain molecule observed in a proteomics study of disease the result of the disease, or does it merely reflect an artefact due to technical variability in the pre-analytical steps or in the analysis, biological variability, or bias introduced in the study (e.g. due to lifestyle, age, and gender)?
How should we estimate the number of samples required for the definition of likely valid biomarkers?
Which algorithms can be employed to combine biomarkers into a multi-marker classifier, and how can the validity of a multi-marker classifier be assessed? Is validation in an independent test set necessary?
In an effort to investigate these issues and propose answers to these questions, we have employed different analysis and statistical strategies towards biomarker definition and validation using a set of data obtained from real samples. While technical differences do exist between proteomics and peptidomics, these approaches investigate highly similar chemical entities, and the problems and challenges associated with the identification of potential proteomic and peptidomic biomarkers (features significantly associated with the studied physiological or pathophysiological condition) are essentially identical. Therefore, we feel it is appropriate not to distinguish between peptidomics and proteomics throughout this manuscript. Several platforms for proteomics or peptidomics are currently being used in biomarker discovery studies.
We have chosen data from CE-MS as one representative example, for the following reasons: a) CE-MS is being used in clinical trials and data from CE-MS are applied in clinical decision-making, b) sufficient CE-MS datasets were available to us, and c) the analytical performance characteristics of the CE-MS platform are well documented [10, 11].
In order to permit a rigorous and realistic assessment of the methodology, the study must (i) represent a real proteomic dataset acquired using the same technologies and experimental design as a biomarker study; (ii) be a classification problem of "typical" complexity, yet simple enough to be tractable by standard methods; and (iii) permit the deployment of commonly used statistical analysis strategies in order to benchmark them against an unequivocal outcome. Based on these considerations we chose as an example the definition of proteomic differences between apparently healthy adult males and females. This avoids any bias due to a non-verifiable physiological condition in the subjects, since gender can be assessed with close to 100% confidence. This design also avoids an important problem in biomarker discovery pipelines: the so-called verification bias. This bias occurs if subjects are not equally likely to have the diagnosis verified by a gold-standard test and if selection for further evaluation depends on the diagnostic test result. Of course, the clinical situation will generally not allow for such a sharp definition as in the male-female case, but standard methods exist for accounting for the verification bias if the clinical readout cannot be assigned with 100% confidence [13–15]. We also used a cohort of subjects with diabetes type II, either with normal kidney function (controls, CD) or with diabetic nephropathy (cases, DN), to demonstrate the applicability of the methods to a case where the clinical readout may not be verified with 100% confidence. The male-female difference turned out to be more subtle than the CD-DN difference: the proteomic profiles of males and females differ less markedly than those of CD and DN subjects.
As the body fluid to be analysed we chose urine. The urinary proteome/peptidome is highly stable, which reduces pre-analytical variability. CE-MS was chosen as the technology because it allows routine analysis of a large number of samples and has been thoroughly validated as a platform technology for proteomic biomarker studies. As a result of the current study we demonstrate the importance of a strict and correct use of statistics, especially adjustment for multiple testing. We further describe algorithms that enable prediction of the number of samples required for the definition of biomarkers with high confidence. The results presented here also show that different machine learning algorithms perform similarly (and very well) in establishing discriminatory multi-marker models. However, it is equally evident that these only lead to meaningful results if the number of data points employed is sufficient to learn the difference between the groups, and that the performance of such models can only be assessed on an independent test set. Although our results have been obtained with a particular proteomic technology, CE-MS, the principal conceptual considerations, and hence also the conclusions, are independent of the technology used. Therefore, the results reported here should also be applicable to datasets generated using alternative standard proteomics technologies such as LC-MS or MALDI. Unfortunately, to the best of our knowledge, there is currently no similar dataset publicly available for MALDI or for LC-MS. Hence, we cannot report on the application of the proposed methods to either platform.
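The adjustment for multiple testing emphasized above can be illustrated with a minimal sketch of the Benjamini-Hochberg (BH) step-up adjustment, which controls the false discovery rate. This is only an illustration under stated assumptions: the function name `bh_adjust` and the example p-values are hypothetical and are not taken from the authors' scripts.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four peptides.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = bh_adjust(raw)
significant = [i for i, p in enumerate(adjusted) if p <= 0.05]
```

A peptide would then be reported as a candidate marker only if its adjusted p-value, rather than its raw p-value, falls below the chosen FDR level.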
Results and Discussion
The number of significant markers depends on the statistical test used
Resampling as means to define "better biomarkers"
The concordance of the markers
[Table: concordance of the markers in the test set with a 30% data hold-out, for N = 0, 2, 10, 20, 30, 40, 50 and 100. Table values are not available in this text.]
Estimation of the sample sizes
An important question in the design of clinical proteomics studies is the selection of an appropriate sample size. The number of units to be included in the study should typically address two issues. First, the differential sample size Ndiff should allow the identification of putative biomarkers that are differentially expressed between two conditions (e.g. disease versus control). Second, the discriminative sample size Ndisc of the training data should allow the learning of a confident rule for classifying blinded items.
Estimation of the differential sample size
This resulted in an estimate π0 = 0.5652831 which, plugged into Equation 1, leads to αave = 0.00223.
In the above considerations we opted, for simplicity, for the standard definition of the sample size as the minimum number of samples necessary to achieve a specified power. Alternatively, the "confidence probability formulation" may also be used, as it relies on the permutation of pilot study data of small sample sizes.
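The idea of estimating the differential sample size by resampling from pilot data can be sketched as follows. This is a hedged illustration, not the authors' procedure: it bootstraps groups of a candidate size from the pilot measurements of one peptide, applies a Wilcoxon rank-sum test (normal approximation with tie correction, no continuity correction), and reports the smallest size whose estimated power reaches a target. The function names, the 80% target power and the candidate sizes are assumptions; the significance level would be something like the αave quoted above.

```python
import math
import random


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i                             # size of this tie block
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n + 1) / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        return 1.0                            # all values identical
    z = (rank_sum_x - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def estimated_power(pilot_x, pilot_y, n_per_group, alpha, n_boot=500, seed=0):
    """Fraction of bootstrap resamples in which the test is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        bx = rng.choices(pilot_x, k=n_per_group)
        by = rng.choices(pilot_y, k=n_per_group)
        if rank_sum_pvalue(bx, by) < alpha:
            hits += 1
    return hits / n_boot


def smallest_n(pilot_x, pilot_y, alpha, target_power=0.8, candidates=range(5, 101, 5)):
    """Smallest candidate group size reaching the target power, or None."""
    for n in candidates:
        if estimated_power(pilot_x, pilot_y, n, alpha) >= target_power:
            return n
    return None
```

For a panel of peptides, one would repeat this per peptide and take a size that serves the markers of interest; that choice is left open here.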
Estimation of the discriminative sample size
In practice it is impossible to reach Γ, and only upper-bound estimates of it can be obtained. The aim is to find the discriminative sample size Ndisc that guarantees that Γ(Ndisc) of the classifier is within some threshold (e.g. ϵ = 5%) of the optimal Bayes classifier obtained for infinite Ndisc (that is, Γ(∞) - Γ(Ndisc) ≤ ϵ). Ndisc may then be obtained by solving the equation Γ(∞) - Γ(Ndisc) = ϵ. Interestingly, here again the effect size δ turns out to be the parameter that determines Ndisc. In the classification context, the effect size measures the distance between the classes. If the pilot study shows a small effect size, then it is unlikely that a good discriminator will be easily obtained. The required Ndisc that maximizes Γ(Ndisc) implicitly depends on the false positive rate α [39, 40]. Consequently, using those markers that control the FDR should generally produce a good classifier [39, 40]. For the 67 male and 67 female profiles, controlling the FDR at 0.05, we were able to define 78 significant peptide markers requiring an Ndiff < 67. With their calculated effect sizes we found that Ndisc = 48 is required to obtain a classifier whose performance falls 10% short of that of the optimal Bayes classifier. The analytical method described in [39, 40] relies on strong distributional assumptions and seems to be less conservative than the learning curve estimation of Ndisc.
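As a rough illustration of the learning curve route to Ndisc, suppose the test error follows an inverse power law, e(n) = a·n^(-b) + c, where c is the asymptotic error. This functional form and all names below are assumptions made for the sketch. With Γ(n) = 1 - e(n), the condition Γ(∞) - Γ(Ndisc) = ϵ reduces to a·Ndisc^(-b) = ϵ, which can be solved in closed form.

```python
import math


def fit_power_law(sizes, errors, asymptote):
    """Fit log(error - asymptote) = log(a) - b*log(n) by least squares.

    The asymptote (c) is assumed to be known or estimated separately.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e - asymptote) for e in errors]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(my - slope * mx)
    return a, -slope          # (a, b) with b > 0 for a decaying curve


def discriminative_sample_size(a, b, eps):
    """Smallest integer n with a * n**(-b) <= eps."""
    return math.ceil((a / eps) ** (1.0 / b))


# Made-up learning curve points generated from a = 2, b = 0.5, c = 0.1.
sizes = [10, 20, 40, 80]
errors = [2.0 * n ** -0.5 + 0.1 for n in sizes]
a_hat, b_hat = fit_power_law(sizes, errors, asymptote=0.1)
```

In a real analysis the error at each training size would come from repeated random splits of the data, and the asymptote would have to be estimated together with a and b rather than assumed.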
Test errors for different classifiers
Applications to the CD-DN case study
To further test the applicability of the reported methods we investigated the difference between CD and DN patients using a data set of 120 CD and 120 DN subjects randomly split into 2 × 60 training and 2 × 60 test datasets (data available as Additional file 3 and Additional file 4). The differences in this dataset are much more pronounced than in the male-female case (Additional file 5). Using the 2 × 60 training data and 10 different random splits we found that on average 447 peptides may be declared differentially expressed using the adjusted WT. 65% of those markers could be validated in the test data (Additional file 6). The observation that a larger pilot study results in more markers being declared significant clearly applies here too, as readily seen from the figure in Additional file 7. The learning curve of this dataset also clearly shows the inverse power law behaviour (figure in Additional file 8) and suggests that for the CD-DN case fewer subjects than in the male-female comparison may be required to obtain a classification of comparable performance.
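The split-and-validate scheme described above can be sketched in a few lines. This is an illustrative outline only. The helper names are hypothetical, subjects are represented by plain identifiers, and "validated" is taken here to mean that a marker found in the training data is also declared significant in the test data.

```python
import random


def split_half(subject_ids, seed):
    """Randomly split one group of subjects into equal training and test halves."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def concordance(training_markers, test_markers):
    """Fraction of training-set markers that are also found in the test set."""
    training = set(training_markers)
    if not training:
        return 0.0
    return len(training & set(test_markers)) / len(training)


# Each group of 120 subjects is split separately, giving 2 x 60 training
# and 2 x 60 test subjects; the split is repeated for 10 seeds.
cd_ids = ["CD%03d" % i for i in range(120)]
dn_ids = ["DN%03d" % i for i in range(120)]
splits = [(split_half(cd_ids, s), split_half(dn_ids, s)) for s in range(10)]
```

Averaging the concordance over the repeated splits gives a figure comparable in spirit to the percentage of validated markers quoted above.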
Patients, Procedures and Demographics
Second morning urine samples were collected from apparently healthy volunteers in the course of the medical examination prior to employment at the Hannover Medical School. Consent was given by all participants. Samples were collected in 10 ml Sarstedt urine monovettes and frozen immediately after collection without the addition of any preservatives. All samples were collected anonymously; only age and gender were recorded. All samples were collected in Germany, and under German law this study does not require IRB approval.
Sample preparation and CE-MS analysis
Urine samples were stored at -20°C for up to 3 years until analysis. For proteomic analysis, a 0.7 ml aliquot of urine was thawed immediately before use and diluted with 0.7 ml of 2 M urea, 10 mM NH4OH containing 0.02% SDS. To remove higher molecular mass proteins, samples were filtered using Centrisart ultracentrifugation filter devices (20 kDa molecular weight cut-off; Sartorius, Goettingen, Germany) at 3,000 rcf until 1.1 ml of filtrate was obtained. This filtrate was applied onto a PD-10 desalting column (Amersham Bioscience, Uppsala, Sweden) equilibrated in 0.01% NH4OH in HPLC-grade H2O (Roth, Germany) to remove urea, electrolytes, and salts. Finally, all samples were lyophilized, stored at 4°C, and suspended in HPLC-grade H2O shortly before CE-MS analysis. CE-MS analysis was performed as described [45, 46] using a P/ACE MDQ capillary electrophoresis system (Beckman Coulter, Fullerton, USA) coupled on-line to a Micro-TOF MS (Bruker Daltonic, Bremen, Germany). Data acquisition and MS acquisition methods were automatically controlled by the CE via contact-close relays. The ESI spectra were accumulated every 3 s, over a range of m/z 350 to 3000 Th. An accumulation time of 3 s was chosen because, at a peak width of ca. 15 s at half peak height, essentially no resolution is lost when accumulating signal for 3 s. Faster sampling would not result in any additional gain, but in a loss of sensitivity and an increase in the size of the data file. Accuracy, precision, selectivity, sensitivity, reproducibility, and stability are described in detail elsewhere [10, 17, 45]. In short, the detection limit is in the range of 1 fmol, depending on the ionization properties of the individual peptide. This corresponds to 100 - 1000 fmol/ml in a crude urine sample (before processing).
Mass spectral ion peaks representing identical molecules at different charge states were deconvoluted into single masses using MosaiquesVisu software. Migration time and ion signal intensity (amplitude) were normalized based on 29 collagen fragments that serve as internal standards. These internal polypeptide standards are the result of normal biological processes and have proven to be unaffected by any disease state studied to date (more than 10,000 samples analysed). The resulting peak list characterizes each peptide by its molecular mass [Da], normalized migration time [min], and normalized signal intensity. All detected peptides were deposited, matched, and annotated in a Microsoft SQL database, allowing further analysis and comparison of multiple samples (patient groups). To establish the identity of peptides observed in different samples, a linear function was employed that allowed, depending on the mass of the polypeptide, a 50 ppm absolute mass deviation for peptides of 800 Da, increasing linearly to a 100 ppm absolute mass deviation for peptides with a maximum mass of 20 kDa. These values have been found appropriate in several recent studies [11, 49, 50], as a compromise between avoiding erroneous assignment of the same identity to two different peptides and avoiding assignment of two different identities to the same peptide in different analyses due to mass deviation, especially at low abundance. A similar linear function was used when comparing CE migration times, allowing a 4% absolute deviation. CE-MS data of all individual samples can be accessed in Additional files 1, 2.
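The mass-dependent matching window described above can be written down as a small sketch. The linear interpolation between 50 ppm at 800 Da and 100 ppm at 20 kDa follows the text. Several details are assumptions made for illustration: the function names, clamping outside the 800 Da to 20 kDa range, using the mean of the two masses and times as the reference, and treating the migration time tolerance as a constant relative 4% (the text only says a "similar linear function" was used).

```python
def mass_tolerance_ppm(mass_da):
    """Allowed mass deviation in ppm: 50 ppm at 800 Da, 100 ppm at 20 kDa."""
    low_mass, high_mass = 800.0, 20000.0
    low_ppm, high_ppm = 50.0, 100.0
    # Assumption: clamp masses outside the stated range to its limits.
    clamped = min(max(mass_da, low_mass), high_mass)
    fraction = (clamped - low_mass) / (high_mass - low_mass)
    return low_ppm + (high_ppm - low_ppm) * fraction


def same_peptide(mass_a, time_a, mass_b, time_b, time_tolerance=0.04):
    """Decide whether two (mass [Da], migration time [min]) pairs match."""
    ref_mass = (mass_a + mass_b) / 2.0
    ref_time = (time_a + time_b) / 2.0
    mass_dev_ppm = abs(mass_a - mass_b) / ref_mass * 1e6
    time_dev = abs(time_a - time_b) / ref_time
    return mass_dev_ppm <= mass_tolerance_ppm(ref_mass) and time_dev <= time_tolerance


# Two made-up signals about 40 ppm and 2.5% apart are treated as one peptide.
match = same_peptide(1000.00, 20.0, 1000.04, 20.5)
```

Tightening the window reduces the risk of merging two different peptides, while widening it reduces the risk of splitting one peptide into two identities, which is the compromise the text describes.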
Statistical methods, definition of biomarkers and sample classification
All statistical analyses were implemented with in-house scripts, using the R core software as well as the contributed CRAN packages ada, kernlab, randomForest, rpart, WilcoxCV, multtest, and ROCR, available at http://cran.us.r-project.org.
Joost P Schanstra, Antonia Vlahou and Harald Mischak are all members of EUROKUP
List of abbreviations used
- 1) AUC: area under the ROC curve
- 2) AAC: area above the ROC curve
- 3) BH: Benjamini-Hochberg
- 4) CE-MS: capillary electrophoresis coupled mass spectrometry
- 5) CD: diabetes type II with normal kidney function
- 6) DN: diabetic nephropathy
- 7) ESI: electrospray ionization
- 8) FDR: false discovery rate
- 9) GLM: generalized linear model
- 10) LC-MS: liquid chromatography coupled mass spectrometry
- 11) LOD: limit of detection
- 12) LOOCV: leave-one-out cross validation
- 13) MALDI: matrix assisted laser desorption ionization
- 14) MER: misclassification error rate
- 15) Ndiff: differential sample size
- 16) Ndisc: discriminative sample size
- 17) ROC: receiver operating characteristic
- 18) SQL: structured query language
- 19) SVM: support vector machine
- 20) WT: Wilcoxon rank sum test.
This work was funded in part by grants from the European Union through InGenious HyperCare (LSHM-C7-2006-037093) and Geninca (HEALTH-F2-2008-202230) to HM and the EuroKUP COST Action (BM0702) and AV from the FP7 DECanBio (201333) and by the European Community's 7th Framework Programme, grant agreement HEALTH-F2-2009-241544 (SysKID). JPS acknowledges financial support from the Agence Nationale pour la Recherche (ANR-07-PHYSIO-004-01), and support by Inserm, the "Direction Régional Clinique" (CHU de Toulouse, France) under the Interface program. WK is supported by the Science Foundation Ireland under Grant No. 06/CE/B1129.
- Rifai N, Gillette MA, Carr SA: Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat Biotechnol 2006, 24(8):971–83. doi:10.1038/nbt1235
- Listgarten J, Emili A: Practical proteomic biomarker discovery: taking a step back to leap forward. Drug Discov Today 2005, 10(23–24):1697–702. doi:10.1016/S1359-6446(05)03645-7
- Petricoin EF, Ardekani AM, Hitt BA, Levine PJ, Fusaro VA, Steinberg SM, Mills GB, Simone C, Fishman DA, Kohn EC, Liotta LA: Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002, 359(9306):572–7. doi:10.1016/S0140-6736(02)07746-2
- McLerran D, Grizzle WE, Feng Z, Thompson IM, Bigbee WL, Cazares LH, Chan DW, Dahlgren J, Diaz J, Kagan J, Lin DW, Malik G, Oelschlager D, Partin A, Randolph TW, Sokoll L, Srivastava S, Thornquist M, Troyer D, Wright GL, Zhang Z, Zhu L, Semmes OJ: SELDI-TOF MS whole serum proteomic profiling with IMAC surface does not reliably detect prostate cancer. Clin Chem 2008, 54: 53–60. doi:10.1373/clinchem.2007.091496
- Diamandis EP: Point: Proteomic patterns in biological fluids: do they represent the future of cancer diagnostics? Clin Chem 2003, 49(8):1272–5. doi:10.1373/49.8.1272
- Ransohoff DF: Bias as a threat to the validity of cancer molecular-marker research. Nat Rev Cancer 2005, 5(2):142–9. doi:10.1038/nrc1550
- Mischak H, Apweiler R, Banks RE, Conaway M, Coon J, Dominiczak A, Ehrich JHH, Fliser D, Girolami M, Hermjakob H, Hochstrasser D, Jankowski J, Julian BA, Kolch W, Massy ZA, Neusuess C, Novak J, Peter K, Rossing K, Schanstra J, Semmes OJ, Theodorescu D, Thongboonkerd V, Weissinger EM, Van Eyk JE, Yamamoto T: Clinical proteomics: A need to define the field and to begin to set adequate standards. PROTEOMICS - Clinical Applications 2007, 1(2):148–156. doi:10.1002/prca.200600771
- Decramer S, Gonzalez de Peredo A, Breuil B, Mischak H, Monsarrat B, Bascands JL, Schanstra JP: Urine in clinical proteomics. Mol Cell Proteomics 2008, 7(10):1850–62. doi:10.1074/mcp.R800001-MCP200
- Fliser D, Novak J, Thongboonkerd V, Argiles A, Jankowski V, Girolami MA, Jankowski J, Mischak H: Advances in Urinary Proteome Analysis and Biomarker Discovery. J Am Soc Nephrol 2007, 18(4):1057–1071. [http://jasn.asnjournals.org/cgi/content/abstract/18/4/1057] doi:10.1681/ASN.2006090956
- Haubitz M, Good DM, Woywodt A, Haller H, Rupprecht H, Theodorescu D, Dakna M, Coon JJ, Mischak H: Identification and validation of urinary biomarkers for differential diagnosis and evaluation of therapeutic intervention in anti-neutrophil cytoplasmic antibody-associated vasculitis. Mol Cell Proteomics 2009, 8(10):2296–307. doi:10.1074/mcp.M800529-MCP200
- Good DM, Zürbig P, Argilés A, Bauer HW, Behrens G, Coon JJ, Dakna M, Decramer S, Delles C, Dominiczak AF, Ehrich JHH, Eitner F, Fliser D, Frommberger M, Ganser A, Girolami MA, Golovko I, Gwinner W, Haubitz M, Herget-Rosenthal S, Jankowski J, Jahn H, Jerums G, Julian BA, Kellmann M, Kliem V, Kolch W, Krolewski AS, Luppi M, Massy Z, Melter M, Neusüss C, Novak J, Peter K, Rossing K, Rupprecht H, Schanstra JP, Schiffer E, Stolzenburg JU, Tarnow L, Theodorescu D, Thongboonkerd V, Vanholder R, Weissinger EM, Mischak H, Schmitt-Kopplin P: Naturally Occurring Human Urinary Peptides for Use in Diagnosis of Chronic Kidney Disease. Molecular and Cellular Proteomics 2010, 9(11):2424–2437. [http://www.mcponline.org/content/9/11/2424.abstract] doi:10.1074/mcp.M110.001917
- Mischak H, Allmaier G, Apweiler R, Attwood T, Baumann M, Benigni A, Bennett SE, Bischoff R, Bongcam-Rudloff E, Capasso G, Coon JJ, D'Haese P, Dominiczak AF, Dakna M, Dihazi H, Ehrich JH, Fernandez-Llama P, Fliser D, Frokiaer J, Garin J, Girolami M, Hancock WS, Haubitz M, Hochstrasser D, Holman RR, Ioannidis JPA, Jankowski J, Julian BA, Klein JB, Kolch W, Luider T, Massy Z, Mattes WB, Molina F, Monsarrat B, Novak J, Peter K, Rossing P, Sanchez-Carbayo M, Schanstra JP, Semmes OJ, Spasovski G, Theodorescu D, Thongboonkerd V, Vanholder R, Veenstra TD, Weissinger E, Yamamoto T, Vlahou A: Recommendations for Biomarker Identification and Qualification in Clinical Proteomics. Science Translational Medicine 2010, 2(46):46ps42. [http://stm.sciencemag.org/content/2/46/46ps42.abstract] doi:10.1126/scitranslmed.3001249
- Alonzo TA, Kittelson JM: A novel design for estimating relative accuracy of screening tests when complete disease verification is not feasible. Biometrics 2006, 62(2):605–12. doi:10.1111/j.1541-0420.2005.00445.x
- Buzoianu M, Kadane JB: Adjusting for verification bias in diagnostic test evaluation: a Bayesian approach. Stat Med 2008, 27: 2453–2473. doi:10.1002/sim.3099
- Page JH, Rotnitzky A: Estimation of the disease-specific diagnostic marker distribution under verification bias. Computational Statistics and Data Analysis 2009, 53(3):707–717. [http://www.sciencedirect.com/science/article/B6V8V-4SX9FTT-1/2/a708b210a358c83a359bd1c2ca7bef7f] doi:10.1016/j.csda.2008.06.021
- Mischak H, Coon JJ, Novak J, Weissinger EM, Schanstra JP, Dominiczak AF: Capillary electrophoresis-mass spectrometry as a powerful tool in biomarker discovery and clinical diagnosis: an update of recent developments. Mass Spectrom Rev 2009, 28(5):703–24. doi:10.1002/mas.20205
- Jantos-Siwy J, Schiffer E, Brand K, Schumann G, Rossing K, Delles C, Mischak H, Metzger J: Quantitative urinary proteome analysis for biomarker evaluation in chronic kidney disease. J Proteome Res 2009, 8: 268–81. doi:10.1021/pr800401m
- Wang P, Tang H, Zhang H, Whiteaker J, Paulovich AG, Mcintosh M: Normalization regarding non-random missing values in high-throughput mass spectrometry data. Pac Symp Biocomput 2006, 315–326.
- Helsel R: Nondetects and data analysis: statistics for censored environmental data. New York: Wiley-Interscience; 2005.
- Taylor S, Pollard K: Hypothesis tests for point-mass mixture data with application to 'omics data with many zero values. Stat Appl Genet Mol Biol 2009, 8: Article 8.
- Broadhurst D, Kell D: Statistical strategies for avoiding false discoveries in metabolomics and related experiments. Metabolomics 2006, 2(4):171–196. doi:10.1007/s11306-006-0037-z
- Dakna M, He Z, Yu WC, Mischak H, Kolch W: Technical, bioinformatical and statistical aspects of liquid chromatography-mass spectrometry (LC-MS) and capillary electrophoresis-mass spectrometry (CE-MS) based clinical proteomics: a critical assessment. J Chromatogr B Analyt Technol Biomed Life Sci 2009, 877: 1250–1258. doi:10.1016/j.jchromb.2008.10.048
- Oberg AL, Vitek O: Statistical Design of Quantitative Mass Spectrometry-Based Proteomic Experiments. Journal of Proteome Research 2009, 8(5):2144–2156. doi:10.1021/pr8010099
- Benjamini Y, Hochberg Y: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological) 1995, 57: 289–300. [http://vorlon.case.edu/~sray/mlrg/controlling_fdr_benjamini95.pdf]
- Hemelrijk J: Note on Wilcoxon's Two-Sample Test when Ties are Present. Annals of Mathematical Statistics 1952, 23: 133–135. doi:10.1214/aoms/1177729491
- Soares AJ, Santos M, Trugilho M, Neves-Ferreira A, Perales J, Domont G: Differential proteomics of the plasma of individuals with sepsis caused by Acinetobacter baumannii. Journal of Proteomics 2009, 73(2):267–278. [http://www.sciencedirect.com/science/article/B8JDC-4X9NVD1-1/2/e97759e56b52f471a9361b9d05d3072b] doi:10.1016/j.jprot.2009.09.010
- Matsubara J, Ono M, Honda K, Negishi A, Ueno H, Okusaka T, Furuse J, Furuta K, Sugiyama E, Saito Y, Kaniwa N, Sawada J, Shoji A, Sakuma T, Chiba T, Saijo N, Hirohashi S, Yamada T: Survival Prediction for Pancreatic Cancer Patients Receiving Gemcitabine Treatment. Molecular and Cellular Proteomics 2010, 9(4):695–704. [http://www.mcponline.org/content/9/4/695.abstract] doi:10.1074/mcp.M900234-MCP200
- Ma Y, Peng J, Huang L, Liu W, Zhang P, Qin H: Searching for serum tumor markers for colorectal cancer using a 2-D DIGE approach. Electrophoresis 2009, 30(15):2591–2599. doi:10.1002/elps.200900082
- Altman DG, Machin D, Bryant TN, Gardner MJ: Statistics with Confidence: Confidence intervals and statistical guidelines. 2nd edition. London: BMJ Books; 2000.
- Cairns DA, Barrett JH, Billingham LJ, Stanley AJ, Xinarianos G, Field JK, Johnson PJ, Selby PJ, Banks RE: Sample size determination in clinical proteomic profiling experiments using mass spectrometry for class comparison. Proteomics 2009, 9: 74–86. doi:10.1002/pmic.200800417
- Jackson D, Herath A, Swinton J, Bramwell D, Chopra R, Hughes A, Cheeseman K, Tonge R: Considerations for powering a clinical proteomics study: Normal variability in the human plasma proteome. PROTEOMICS - Clinical Applications 2009, 3(3):394–407. doi:10.1002/prca.200800066
- Efron B, Tibshirani R: An Introduction to the Bootstrap. Boca Raton: Chapman & Hall/CRC; 1993.
- Strimmer K: A unified approach to false discovery rate estimation. BMC Bioinformatics 2008, 9: 303. doi:10.1186/1471-2105-9-303
- Lesaffre E, Scheys I, Frohlich J, Bluhmki E: Calculation of power and sample size with bounded outcome scores. Stat Med 1993, 12: 1063–1078.
- Walters SJ: Sample size and power estimation for studies with health related quality of life outcomes: a comparison of four methods using the SF-36. Health Qual Life Outcomes 2004, 2: 26. doi:10.1186/1477-7525-2-26
- Lin WJ, Hsueh HM, Chen JJ: Power and sample size estimation in microarray studies. BMC Bioinformatics 2010, 11: 48. doi:10.1186/1471-2105-11-48
- Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, Golub TR, Mesirov JP: Estimating dataset size requirements for classifying DNA microarray data. J Comput Biol 2003, 10(2):119–42. doi:10.1089/106652703321825928
- Hess KR, Wei C: Learning Curves in Classification With Microarray Data. Seminars in Oncology 2010, 37: 65–68. doi:10.1053/j.seminoncol.2009.12.002
- Dobbin KK, Zhao Y, Simon RM: How large a training set is needed to develop a classifier for microarray data? Clin Cancer Res 2008, 14: 108–14. doi:10.1158/1078-0432.CCR-07-0443
- Dobbin KK, Simon RM: Sample size planning for developing classifiers using high-dimensional DNA microarray data. Biostatistics 2007, 8: 101–117. doi:10.1093/biostatistics/kxj036
- Braga-Neto UM, Dougherty ER: Is cross-validation valid for small-sample microarray classification? Bioinformatics 2004, 20(3):374–80. doi:10.1093/bioinformatics/btg419
- Molinaro AM, Simon R, Pfeiffer RM: Prediction error estimation: a comparison of resampling methods. Bioinformatics 2005, 21(15):3301–7. doi:10.1093/bioinformatics/bti499
- Dudoit S, van der Laan M: Multiple Testing Procedures with Applications to Genomics. New York: Springer; 2008.
- Hogg R, Tanis E: Probability and Statistical Inference. 8th edition. Prentice Hall: Pearson; 2010.
- Theodorescu D, Wittke S, Ross MM, Walden M, Conaway M, Just I, Mischak H, Frierson HF: Discovery and validation of new protein biomarkers for urothelial cancer: a prospective analysis. Lancet Oncol 2006, 7(3):230–40. doi:10.1016/S1470-2045(06)70584-8
- Wittke S, Mischak H, Walden M, Kolch W, Radler T, Wiedemann K: Discovery of biomarkers in human urine and cerebrospinal fluid by capillary electrophoresis coupled to mass spectrometry: towards new diagnostic and therapeutic approaches. Electrophoresis 2005, 26(7–8):1476–87. doi:10.1002/elps.200410140
- Neuhoff N, Kaiser T, Wittke S, Krebs R, Pitt A, Burchard A, Sundmacher A, Schlegelberger B, Kolch W, Mischak H: Mass spectrometry for the detection of differentially expressed proteins: a comparison of surface-enhanced laser desorption/ionization and capillary electrophoresis/mass spectrometry. Rapid Commun Mass Spectrom 2004, 18(2):149–56. doi:10.1002/rcm.1294
- Coon JJ, Zurbig P, Dakna M, Dominiczak AF, Decramer S, Fliser D, Frommberger M, Golovko I, Good DM, Herget-Rosenthal S, Jankowski J, Julian BA, Kellmann M, Kolch W, Massy Z, Novak J, Rossing K, Schanstra JP, Schiffer E, Theodorescu D, Vanholder R, Weissinger EM, Mischak H, Schmitt-Kopplin P: CE-MS analysis of the human urinary proteome for biomarker discovery and disease diagnostics. Proteomics Clin Appl 2008, 2: 964. doi:10.1002/prca.200800024
- Alkhalaf A, Zürbig P, Bakker SJL, Bilo HJG, Cerna M, Fischer C, Fuchs S, Janssen B, Medek K, Mischak H, Roob JM, Rossing K, Rossing P, Rychlík I, Sourij H, Tiran B, Winklhofer-Roob BM, Navis GJ, for the PREDICTIONS Group: Multicentric Validation of Proteomic Biomarkers in Urine Specific for Diabetic Nephropathy. PLoS ONE 2010, 5(10):e13421. doi:10.1371/journal.pone.0013421
- Maahs DM, Siwy J, Argilés A, Cerna M, Delles C, Dominiczak AF, Gayrard N, Iphöfer A, Jänsch L, Jerums G, Medek K, Mischak H, Navis GJ, Roob JM, Rossing K, Rossing P, Rychlík I, Schiffer E, Schmieder RE, Wascher TC, Winklhofer-Roob BM, Zimmerli LU, Zürbig P, Snell-Bergeon JK: Urinary Collagen Fragments Are Significantly Altered in Diabetes: A Link to Pathophysiology. PLoS ONE 2010, 5(9):e13051. doi:10.1371/journal.pone.0013051
- R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2010. [http://www.R-project.org] [ISBN 3-900051-07-0]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.