Thermodynamic scaling behavior in genechips
© Ferrantini et al; licensee BioMed Central Ltd. 2009
Received: 18 August 2008
Accepted: 06 January 2009
Published: 06 January 2009
Affymetrix Genechips are characterized by probe pairs, a perfect match (PM) and a mismatch (MM) probe differing by a single nucleotide. Most of the data preprocessing algorithms neglect MM signals, as it was shown that MMs cannot be used as estimators of the non-specific hybridization as originally proposed by Affymetrix. The aim of this paper is to study in detail on a large number of experiments the behavior of the average PM/MM ratio. This is taken as an indicator of the quality of the hybridization and, when compared between different chip series, of the quality of the chip design.
About 250 different GeneChip hybridizations performed at the VIB Microarray Facility for Homo sapiens, Drosophila melanogaster, and Arabidopsis thaliana were analyzed. The investigation of such a large set of data from the same source minimizes systematic experimental variations that may arise from differences in protocols or from different laboratories. The PM/MM ratios are derived theoretically from thermodynamic laws and a link is made with the sequence of PM and MM probe, more specifically with their central nucleotide triplets.
The PM/MM ratios subdivided according to the different central nucleotides triplets follow qualitatively those deduced from the hybridization free energies in solution. It is shown also that the PM and MM histograms are related by a simple scale transformation, in agreement with what is to be expected from hybridization thermodynamics. Different quantitative behavior is observed on the different chip organisms analyzed, suggesting that some organism chips have superior probe design compared to others.
In Affymetrix Genechips the DNA sequences attached to the chip surface are synthesized in situ by photolitography . The technology is limited to 25-mers, which are rather short oligos to provide a reliable measurement of the abundance of the complementary sequence in solution. Fortunately the same technology allows to place many more different sequences on a same given surface area compared to spotted arrays. Hence the loss of specificity due to the short length of the oligos can be compensated by the large number of sequences. For expression measurements typically a given mRNA transcript is interrogated by a probe set constituted by 10 to 20 probe pairs. Each probe pair is composed of two short oligonucleotides (25 pb). One matches with a part of the given mRNA transcript and it is called perfect match (PM). The mismatch probe (MM) has the same sequence, except at position 13, i.e. exactly in the middle nucleotide of the 25-mer. We use here the standard nomenclature of probes for the single stranded sequences attached to the chip surface and of targets for the transcript in solution.
The original purpose of the PM/MM design was to use the differences I PM - I MM as estimators of the probe specific binding. This is however problematic for two reasons: 1) For a fair amount of probes I MM > I PM (bright mismatches, which can occur in 30% of the probes in some chips, as discussed by ) and 2) in spike-in experiments, in which some target sequences are added in solution at well-known concentrations, I MM increases with increasing target concentration. Concerning point 1), Binder and Preibisch  showed that bright MM are predominantly observed at low intensities, i.e. for weakly expressed genes. Hence their origin is most likely correlated to non-specific binding favoring MM compared to PM. Point 2) reveals a more fundamental problem: the increase of the MM signal following an increase of target concentration implies that hybridization of the almost complementary sequence to a target cannot be neglected. Therefore MMs cannot be used as estimators of non-specific hybridization.
In this paper we demonstrate that MMs still hold some valuable information. In particular, we will focus on the average ratio of background subtracted PM and MM intensities. On general grounds, one expects that this ratio is constant and independent of the details of the type of experiment (e.g., tissue or organism analyzed). Our study focuses on three organisms: H. sapiens, A. thaliana, and D. melanogaster, for which in total more than 250 chips were analyzed, by means of the Langscal package presented here. The data were obtained over a period of about 4 years at the Microarray Facility of the Flemish Institute for Biotechnology (VIB). By investigating a large set of data from the same source we minimize the systematic variations between experiments that may arise from either slight differences in protocols or from different laboratories. The results presented here extend the findings of a previous study on some randomly selected experiments from the Gene Expression Omnibus server (, http://www.ncbi.nlm.nih.gov/projects/geo/). Indeed, the data from the same laboratory show a much more coherent behavior compared to the previous analysis, although systematic deviations for different organisms are clearly observed.
The Langmuir model, which has been used also in previous studies of Affymetrix Genechips (see e.g. [2, 5]), predicts that the background subtracted PM/MM ratio depends only on the difference in chemical affinities (ΔΔG in Eq. (2)) between the transcript sequence binding to a perfect matching probe and the same sequence binding to a probe with an internal mismatch. This quantity is related to the hybridization chemistry and depends only on the probe sequence. Therefore the ratio is expected to be independent on the biological details, as the organism or tissue analyzed. Our study shows that this is only partially true. Through a scaling analysis of PM and MM histograms we show that experiments in same organism have a quite stable PM/MM ratios. However, when different chips are compared, this ratio differs considerably.
We argue that these differences are due to the chip design. In the three organisms analyzed we found that the human chip had the smallest PM/MM ratio, and this ratio varies in two different human chips considered: HGU95 and HGU133, where the latter is based on a more recently annotated genome. In addition, the analysis of the PM/MM ratio and thermodynamic scaling can be used as a quality control for a global check of the performance of an experiment. Langscal is an R-package designed for this kind of analysis . It computes average PM/MM ratios and investigates scaling properties of intensity histograms. This information provides insights about the agreement of the experimental data with hybridization thermodynamics and also a global check of the quality of an experiment. Langscal is freely available at http://www.microarrays.be/Langscal.htm.
Results and discussion
There has been some discussion about the variability of the saturation threshold (A in Eq. (1)) for different probes  and about the differences in the PM and MM saturation, possibly due to post-hybridization effects . The sudden drop at high intensity in the histograms analyzed in our study, however, suggests a probe independent threshold as expected from the Langmuir isotherm. In addition, the analysis of Affymetrix spike-in data  shows a good agreement with the Langmuir model with a unique probe-independent saturation threshold A. It is difficult to assign a threshold to the MM histograms as they decay rapidly at high intensity: no MM probes have an intensity above 10, 000 in the A. thaliana chip of Fig. 1, while just a few MM probes are above this value in D. melanogaster. However the data are not inconsistent with a common saturation value for PM and MM histograms.
After having presented some global features of the histograms we proceed with the illustration of the results of the computations of the PM/MM ratios, which are based on Eq. (5) and (6) in Methods. These equations, deduced from the Langmuir model far from saturation, link the PM and MM histograms to each other by a simple scale transformation. The link holds for probes with a similar type of mismatch. As explained in Methods, this boils down to separate the PM/MM pairs to those having a common central triplet. There are in total 64 different triplets, which we label with the symbol α. The starting point of the analysis is the calculation of the background level I0.
The Langscal package estimates I0 as the crossing point between the global PM and MM histograms of intensities, as it is shown in Fig. 1. The user can eventually refine this estimate by providing his own value of I0. Afterwards the 64 ratios a α are calculated from the median of the background subtracted PM/MM ratios, following Eq. (5).
Figure 2 also shows the MM histograms in which the background subtracted intensity is multiplied by a factor α s , which corresponds to a horizontal shift in the log-log scale. The fact that the PM and shifted MM histograms overlap is a proof of the validity of Eq. (6), which shows that the experimental data are consistent with the linearized Langmuir model. The accuracy of the overlap can be interpreted as an estimator of the quality of the hybridization and its consistency with equilibrium thermodynamics. Sometimes the histogram rescaling may be not as good as in Fig. 2, as shown in few a examples in Additional file 1. The poor overlap is a sign of an inconsistency with the scaling form predicted from Eq. (6) and can have different origins, which will be discussed in the next section.
To summarize the triplet analysis with a single chip parameter, we also performed the rescaling of Eq. (6) for the histogram over the whole chip, i.e. ignoring the differences between triplets. The global factor a is then found by requiring the overlap between PM and shifted MM histograms. An example of such global rescaling is shown in Fig. 3 for an experiment on D. melanogaster. In this case the average ratio between the background subtracted PM and MM intensities is a ≃ 6.5. Note that the histograms are smoother than those shown in Fig. 2, as they are extended to the intensities of the whole chip. The rescaled MM histogram (dashed line) overlaps well with the PM histogram in most of the intensity range. A deviation is observed in the high intensities regime where the PM histogram drops and splits from the rescaled MM histogram. This behavior is described by the denominator in the Langmuir isotherm in Eq. (1). As mentioned before, the histogram rescaling (Eq. (6)) is only valid not too close to the saturation level, a regime in which the denominator of Eq. (1) can be neglected. As also stated before, the saturation regime only involves a few hundred PM probes, a small fraction of the total number of probes.
Very few outliers are observed outside the boxplot range. One example is a human chip with a ≃ 1, a value which implies overlapping PM and MM histograms. This is a chip with an anomalously high intensity background signal and it is as well flagged as unsatisfactory hybridization based on the Affymetrix guidelines and from image inspection (http://www.affymetrix.com, see Table 1 in the Additional file 1).
Three other Drosophila chips were found to have a substantially bigger value of the PM/MM ratio, compared to the average Drosophila value a ≃ 5. The analysis of the standard quality checks according to the Affymetrix guidelines for these three chips does not show any kind of low quality behavior. These chips come all from the same experimental batch, which suggests that the sample extraction, preparation and amplification may influence the value of a. In general (see Conclusions) one expects that a large value of the PM/MM ratio is a signature of good hybridization quality. Thus these 3 Drosophila chips with high a's are probably the result of an experiment of particularly good quality.
One of the first concerns raised by DNA microarrays was that of quality control and check of the reliability of the technology. A microarray experiment consists of many biochemical steps, which contain each many potential sources of variations, influencing the overall performance of the experiment. Therefore it is extremely important to have tools for the quality assessment in order to be able to eventually discard experiments in which the quality does not meet certain standards. Some of these tools indeed exist, as e.g. the Bioconductor package affyPLM [11, 12], which is based on probe level data analysis, and simpleAffy , which allows to assess the Affymetrix quality standards.
This paper presents another tool for the quality control of hybridization in Affymetrix Genechips, which relies on the analysis of PM and MM histograms. Through this analysis one can check the agreement of the data with the hybridization thermodynamics, which predicts that the intensity is linked to the target concentration and hybridization free energy according to the Langmuir isotherm of Eq. (1). A consequence of the Langmuir isotherm is that the background subtracted PM/MM ratio of Eq. (2) is universal, i.e. it should depend only on the nature of the nucleotides neighboring the mismatch site and not on the biological details of the experiment, as organism or tissue investigated. The Langmuir isotherm predicts that the histograms of background subtracted PM and MM intensities should scale on top of each other when a shift of the MM intensity scale is applied.
The Langscal package presented here has been designed with the purpose of checking the validity of these hypothesis and to determine the PM/MM ratios for all different mismatches which are organized in triplets. In agreement with previous results  the values of a α follow the hierarchy expected from experimental studies involving DNA/RNA hybridization in solution [9, 10]. The previous study , done on a series of randomly selected experiments, showed a large variability from experiment to experiment of the parameters a α . In the data presented here, and when focusing on the same organism, one finds stable values of the a α , except for a few outliers. This shows that the different protocols used, as hybridization characteristics or the stringency of washing may influence considerably the PM/MM ratio.
Quantitatively there is, however, a difference of at least one order of magnitude between the values of a α deduced from hybridization studies in solution and those which are actually measured on the microarray.
There are several reasons for these differences. For instance, hybridization in solution is different from microarray hybridization which occurs in the vicinity of a solid surface and the value of ΔΔG obtained from hybridization in solution  may not adequately describe hybridization in a chip. The fact that a α is lower than its solution counterpart is in agreement with the idea that hybridization is impeded by the vicinity of the surface, leading to an overall lowering of ΔG. A second effect is that target sequences are typically fragmented before hybridization. Such fragmentation may lead to hybridization to the MM probe of two fragments that skip the 13th mismatching nucleotide. If such events are frequent, then the estimate of Eq. (2), which links the PM/MM ratio to ΔΔG is clearly inadequate. Irrespectively of the origin of these differences, one could anyhow take the parameters a α as measurements of the hybridization quality.
High values for a α suggest that the hybridization took place in conditions close to the thermodynamic equilibrium in which both strands are in solution.
In order to summarize the analysis with a single quality parameter we have used the global PM/MM ratio over the whole chip, a, where no distinction is made between triplets. The boxplot of the a measured in different experiments for the three different organisms analyzed is shown in Fig. 4. The first surprising result is that the global PM/MM ratio depends on the organism studied, while hybridization thermodynamics (see Eq.(1) and (2)) suggests that it should only depend on the chemistry of the hybridization process and not on any other biological details as the tissue, organism, or the biological conditions used. This is however only true in the case of an optimal probe design in which stringent criteria for probe uniqueness would ensure a dominant specific hybridization with the desired target sequence. When these criteria are not met, other target sequences may bind significantly to the probe and they can do so more efficiently to a MM than to a PM and therefore modify the PM/MM ratio.
Using a as a measurement for the hybridization quality and according to the large amount of data analyzed, we found that the best performance (highest a value) is that of A. thaliana chips, with an average ratio of about 6.5, while the poorest is that of the human chips (average a ≃ 2.5). It is maybe not surprising to find the lowest values for a for the human chips. Within the three organisms investigated, the human genome is by far the most complex one in terms, for instance, of alternative splicing events or of Single Nucleotide Polymorphisms. These complications may lead to PM/MM pairs which do not perform as expected from hybridization thermodynamics.
The correlation between the average ratio a and the probe design can be clearly seen in Fig. 6. This shows an increase from a ≃ 1.4 for HGU95 to a ≃ 2.3 for HGU133. The difference between these two human chipsets is that the HGU133 is more recent and has a more up to date annotation compared to the older chipset HGU95. An improvement in the chip design leads thus to an increase in the parameter a, consistent with the idea that this parameter can be used as a measure of the quality of the hybridization.
There are several possible problems in an experiment as excessive target fragmentation or sample contamination. In addition, a chip can contain annotation errors, which could lead for instance for a MM sequence to be perfect matching to other transcripts. These effects, which are not considered in the present analysis, would typically lead to lower the intensity differences between PM and MM pairs hence to a lower than average PM/MM ratio. When comparing experiments from different laboratories one may find a wide range of values for the average PM/MM ratio, because the experimental protocols and sample preparation can considerably influence the ratio. Nonetheless, a systematic study of this ratio in different chip organisms and experimental designs could be very useful for a better understanding of the origin of the lab-to-lab variability, which is a weakness to current microarrays.
In conclusion our work started from some theoretical predictions on the scaling behavior of PM and MM histograms. On the view of the fact that the analysis is based on a simple model, with only one adjustable parameter (the average background intensity value), we can conclude that the scaling hypothesis is satisfactorily verified in most of the experiments. Also the hierarchy of values of triplet-dependent PM/MM ratios follows the same inequalities as expected from equilibrium thermodynamics . These ratios are however rather constant only for experiments within the same chip, and not for different organisms chips. This result is at odds with thermodynamics and it is probably due to problems with the chip design. So far the analysis of the shape of microarray intensity histograms has received little attention in the literature. We think that such analysis provides interesting insights on the global performance of an experiment and on a better understanding to the relationship between PM and MM. The Langscal package, designed to perform such an analysis, is useful for a global quality check, together with other quality control methods [11–13] prior to expression level estimation.
Langmuir hybridization thermodynamics
where I0 is the background intensity level, that one would measure in absence of a target (limit c → 0).
The factor A is a proportionality constant, R the gas constant, and T the temperature. The hybridization free energy ΔG η , which is sequence-dependent, measures the affinity of the target-probe binding. Note that the specific bindings to a PM or MM probe are treated on equal footing, but with different hybridization free energies ΔG PM and ΔG MM . DNA hybridization thermodynamics  predicts the highest affinity for perfect matching hybridization compared to hybridization with one or more internal mismatches, hence ΔG PM > ΔG MM .
where ΔΔG = ΔG PM - ΔG MM is the difference between the hybridization free energies of a PM and a MM. Note that in the ratio of Eq. (2) the target concentration c (expression level) cancels out. Hence, according to the model, the background subtracted ratio for a given PM/MM pair should be constant across different experiments, as this quantity depends only on ΔΔG, which in turn depends only on the sequence composition. Individual PM and MM signals may of course vary because of changes of expression levels c in different experiments, as it is clear from Eq. (1). Actually, as discussed below, ΔΔG is expected to be dependent only on the central nucleotide triplet. A consequence of the hybridization thermodynamics is that the ratio of Eq. (2) is constant throughout experiments for PM/MM pairs with the same central nucleotide triplet.
The use of a constant background value I0 is an approximation. Algorithms for the computations of the expression level typically use background estimators which are probe dependent (see e.g. ). In the present work, however Equation (2) will be tested on an average of typically several hundred probe pairs sharing the same central nucleotide triplet. As robust estimator the median over individual ratios will be taken. We verified that a change of 10% in Eq. (2) leads to a change of about 5% of the average ratios in Fig. 4, while a smaller change of 5% of the background would lead to a change of less than 2%. We conclude that results on average ratios do not critically depend on the choice of background value. In order to keep the analysis simple a single constant background value is chosen.
where d and r indicate the DNA probe and RNA target strands, respectively. The dots indicate the flanking nucleotides: as these are identical in the two stretches of the sequence they give equal contribution to both ΔG's and hence cancel out in ΔΔG. The result is thus that ΔΔG depends only on the identity of the central nucleotide triplet.
The substitution rule A ↔ T and C ↔ G implies that 4 types of mismatches can be generated (i.e., AA, CC, GG, and TU). Taking into account the two flanking nucleotides, one ends up with 64 possible central triplets. Using Eq. (4) and the free energies given in  and  one can find estimates for ΔΔG of 8 triplets, as experimental values are available only for these. From the given ΔΔG's and setting T = 45°C in Eq. (2), one finds typical values of a in the range of 40 to 3, 000. These values are obtained from estimates of ΔG parameters for hybridizations in which both strands are in solution. They will be compared with actual PM/MM ratio determined from the analysis of the experimental data in the next section.
Scaling of PM and MM histograms
where ⟨⟩ α denotes the median over all probe pairs at fixed central triplet α. If the triplets are roughly equally frequent, a chip with 500, 000 sequences will approximately have about 8, 000 PM/MM pairs with a given α. In the previous equation we take for simplicity I0 constant in the whole chip and further restrict to probe pairs that satisfy the inequalities: I PM > I0 and I MM > I0.
In practice, one can check the validity of Eq. (6) by plotting the PM and MM histograms on the same graph, but with two different scales on the horizontal axis: the MM background subtracted intensities should be rescaled by a factor a α . The overlap of the two histograms rescaled in this way is a direct proof of the validity of the Langmuir model and also of the consistency of the experimental data with thermodynamics. The package Langscal produces these rescaling plots for all the 64 triplets.
Finally, to characterize the PM/MM ratios of one experiment with a single parameter, the rescaling of the histograms of Eq. (6) can also be extended to all intensities on the chip.
The experiments analyzed
We demonstrate the usage of the Langscal package on a randomly chosen set of Affymetrix hybridizations, performed at the VIB MicroArray Facility. The data set comprises 91 Affymetrix chips of array design ATH1-121501, designed for A. thaliana, 48 hybridizations on the Drosophila_2 chip, and 135 hybridizations of HG-U133_Plus_2 chip design, for H. sapiens. The arrays were processed by different lab-technicians over a period of 4 years.
The procedures, including amplification, labeling, and fragmentation, were performed manually or were automated by a Biomek-3000 ArrayPlex work station (Beckman-Coulter, Analysis Belgium). However, the procedures followed a standardized protocol with possibly small variations due to changes in kits provided by Affymetrix or in enzymes or other ingredients. In general, total RNA was controlled for its integrity and purity using Agilent Bioanalyzer and Nanodrop spectrophotometer http://www.nanodrop.com, respectively. Only RNA without signs of degradation or impurities (260/280 and 260/230 > 1.8) was used for further processes. Probes were prepared from 2–5 μg total RNA according to Affymetrix's guidelines. Briefly, from total RNA, poly-A RNA was reversed transcribed using a poly dT-T7 primer and labeled during a T7 in-vitro transcription reaction using the Affymetrix IVT Labeling Kit (Affymetrix, High Wycombe, UK). The probes were purified (GeneChip Sample Cleanup Module, Affymetrix, UK) and analyzed again for yield (30–120 μg) and purity (260/280 and 260/230 > 1.8). 20 μg from the resulting aRNA was fragmented with alkaline hydrolysis and resuspended with control spikes in 300 μl hybridization buffer (Eukaryotic Hybridization Control Kit, Affymetrix, High Wycombe, UK). The Genechips were hybridized in a rotisseri oven at 45°C and washed and stained in the GeneChip Fluidics Station-400 or -450 (Affymetrix, UK) using EukGE-WS2 protocol. Arrays were scanned with the GeneChip Scanner 3000 (Affymetrix, UK) and image analysis was performed in GCOS.
The R-package Langscal which performs the above analysis is freely available at http://www.microarrays.be/Langscal.htm.
We acknowledge financial support from FWO (Research Foundation – Flanders) grants n. G.0297.06 and n. G.0311.08.
- Lipshutz RJ, Fodor SPA, Gingeras TR, Lockhart DJ: High density synthetic oligonucleotide arrays. Nat Genet 1999, 21: 20. 10.1038/4447View ArticlePubMedGoogle Scholar
- Naef F, Magnasco MO: Solving the riddle of the bright mismatches: Labeling and effective binding in oligonucleotide arrays. Phys Rev E 2003, 68: 011906. 10.1103/PhysRevE.68.011906View ArticleGoogle Scholar
- Binder H, Preibisch S: Specific and nonspecific hybridization of oligonucleotide probes on microarrays. Biophys J 2005, 89: 337. 10.1529/biophysj.104.055343PubMed CentralView ArticlePubMedGoogle Scholar
- Ferrantini A, Carlon E: On the relationship between perfect matches and mismatches in Affymetrix Genechips. Gene 2008.Google Scholar
- Held GA, Grinstein G, Tu Y: Modeling of DNA microarray data by using physical properties of hybridization. Proc Natl Acad Sci 2003, 100: 7575. 10.1073/pnas.0832500100PubMed CentralView ArticlePubMedGoogle Scholar
- R Development Core Team: R: A Language and Environment for Statistical Computing.R Foundation for Statistical Computing, Vienna, Austria; 2008. [http://www.R-project.org]Google Scholar
- Burden CJ, Pittelkow Y, Wilson SR: Adsorption models of hybridization and post-hybridization behaviour on oligonucleotide microarrays. J Phys: Cond Matt 2006, 18(23):5545. 10.1088/0953-8984/18/23/024Google Scholar
- Carlon E, Heim T: Thermodynamics of RNA/DNA hybridization in high-density oligonucleotide microarrays. Physica A 2006, 362: 433. 10.1016/j.physa.2005.09.067View ArticleGoogle Scholar
- Sugimoto N, Nakano S, Katoh M, Matsumura A, Nakamuta H, Ohmichi T, Yoneyama M, Sasaki M: Thermodynamic Parameters To Predict Stability of RNA/DNA Hybrid Duplexes. Biochemistry 1995, 34: 11211. 10.1021/bi00035a029View ArticlePubMedGoogle Scholar
- Sugimoto N, Nakano M, Nakano S: Thermodynamics-Structure Relationship of Single Mismatches in RNA/DNA Duplexes. Biochemistry 2000, 39: 11270. 10.1021/bi000819pView ArticlePubMedGoogle Scholar
- Bolstad BM, Irizarry RA, Astrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003, 19: 185. 10.1093/bioinformatics/19.2.185View ArticlePubMedGoogle Scholar
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003, 4: 249. 10.1093/biostatistics/4.2.249View ArticlePubMedGoogle Scholar
- Wilson CL, Miller CJ: Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis. Bioinformatics 2005, 21(18):3683–3685. 10.1093/bioinformatics/bti605View ArticlePubMedGoogle Scholar
- Zhang L, Miles MF, Aldape KD: A model of molecular interactions on short oligonucleotide microarrays. Nature Biotech 2003, 21: 818. 10.1038/nbt836View ArticleGoogle Scholar
- Bloomfield VA, Crothers DM, Tinoco I Jr: Nucleic Acids Structures, Properties and Functions. University Science Books, Mill Valley; 2000.Google Scholar
- Kroll KM, Barkema GT, Carlon E: Modeling background intensity in DNA microarrays. Phys Rev E 2008, 77: 061915. 10.1103/PhysRevE.77.061915View ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.