Archived Comments for: Mining gene expression data by interpreting principal components

  1. PCA provides a really systemic appreciation

    Alessandro Giuliani, Istituto Superiore di Sanita

    11 April 2007

    This paper is, in my opinion, by far the clearest and most to-the-point paper I have ever read on the analysis of microarray data. It allowed me to make PCA understood by biologists after many years of strenuous but more or less useless efforts. Statistics is the most neglected (yet by far the most useful) field for approaching so-called Systems Biology, and PCA is the best technique to analyse biological systems without imposing unjustified mathematical assumptions. The problem is that PCA needs a TRULY SYSTEMIC mind, and this is lacking in many scientists who look for 'the specific gene that does the work' or, in a complementary but very similar mood, 'the deterministic model that exactly explains by mathematics what I have observed'. PCA sits in the middle between pure post-hoc statistics like the t-test (I need confirmation that a difference is big enough to be published) and differential-equation-style modeling (if gene A is activated by transcription factor B, then I can write an equation that...). That middle way is where biology is (the supersensitivity of hard mathematical models makes them ridiculous for biology, and the classical inferential tests, when applied to thousands of variables, are devoid of any sense, no matter what Bonferroni-like medicine is applied), and this is probably the reason why PCA has been around in biomedical science for more than a century (since the pioneering work by Spearman in 1904). But PCA requires the scientist to acquire a very unusual (for scientists, not for normal people) attitude towards his ideas: he must dare to admit that THE SINGLE MEASURES HE PERFORMS ARE NOT SO CRUCIAL AND IMPORTANT FOR THE SYSTEM AT HAND AND THEY ARE ONLY PALE SHADOWS OF THE REAL THING.
    An appreciation of the real thing comes only from the correlation structure of the measures and implies new constructs (the components) that are fundamentally different from the ingredients he puts in (the variables, be they genes or anything else). This is still disturbing, and this is why this paper is beautiful.

    Competing interests

    No Competing Interests