Efficient discovery of abundant post-translational modifications and spectral pairs using peptide mass and retention time differences
© Fu et al; licensee BioMed Central Ltd. 2009
- Published: 30 January 2009
Peptide identification via tandem mass spectrometry is the basic task of current proteomics research. Due to the complexity of mass spectra, the majority of mass spectra cannot be interpreted at present. The existence of unexpected or unknown protein post-translational modifications is a major reason.
This paper describes an efficient and sequence database-independent approach to detecting abundant post-translational modifications in high-accuracy peptide mass spectra. The approach is based on the observation that the spectra of a modified peptide and its unmodified counterpart are correlated with each other in their peptide masses and retention time. Frequently occurring peptide mass differences in a data set imply possible modifications, while small and consistent retention time differences provide orthogonal supporting evidence. We propose to use a bivariate Gaussian mixture model to discriminate modification-related spectral pairs from random ones. Due to the use of two-dimensional information, accurate modification masses and confident spectral pairs can be determined as well as the quantitative influences of modifications on peptide retention time.
Experiments on two glycoprotein data sets demonstrate that our method can effectively detect abundant modifications and spectral pairs. By including the discovered modifications into database search or by propagating peptide assignments between paired spectra, an average of 10% more spectra are interpreted.
- Peptide Propagation
- Spectral Pair
- Retention Time Shift
- Cation Modification
- Paired Spectrum
Identification of peptides, especially post-translationally modified peptides, using liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is the basic task of current proteomics research [1–3]. Database search is the most widely used computational approach to peptide identification from mass spectra [4–9]. Other approaches include de novo sequencing [10–13] and tag-based approaches [14–17]. However, due to the complexity of mass spectra, the majority (70–90%) of them cannot be interpreted at present. Among the many reasons for the low interpretation rate of mass spectra, unexpected or unknown peptide modifications are a major one [19, 20].
Identification of modified peptides is usually conducted in a restrictive manner; that is, a set of variable modifications is specified before database search. However, there are hundreds of known natural or artificial modifications (563 entries in the Unimod database up to July 28, 2008). Most of them have multiple specific sites. Therefore, it is not practical to select all the modifications for database search, since this would lead to a combinatorial explosion of the search space as well as an increased chance of random matches. In current popular search engines, such as SEQUEST and Mascot, no more than ten variable modification types are allowed. The problem is that in most cases we know little about which modifications occur in the protein sample and are present in the mass spectra at hand. Most of the time, oxidation on methionine is the only variable modification specified for database search. As a result, a large number of spectra from modified peptides have not been interpreted in the past.
To address the above problem, unrestrictive approaches to modification identification have been proposed in recent years [22–25]. MS-Alignment, the first algorithm for unrestrictive identification of modifications [22, 26], aligns the experimental spectrum against the theoretical spectrum predicted from a peptide in the database in a modification-tolerant manner, much like sequence alignment in genomics. In this way, any modification can be identified as long as the two spectra compared are similar enough. SPIDER formulates modification identification as a dynamic programming problem, searching for a modified peptide that minimizes the difference between the de novo sequenced and the database peptides.
Although the above approaches to unrestrictive identification of modifications are useful and attractive, they involve time-consuming database search processes or rely on good spectrum quality. If the actual types of modifications present in the spectra were known prior to database search, the time spent would be reduced significantly. In fact, several methods have been proposed to detect modifications independently of sequence databases [19, 27, 28]. Due to the dynamic nature of modifications, the modified and unmodified forms of the same peptide often exist simultaneously in a protein sample. Mass spectra of modified and unmodified peptides are correlated with each other in their peptide masses, LC retention time and fragment peaks. Savitski et al. proposed to use the peptide mass difference histogram constructed from paired spectra to detect modifications. However, their method builds on the complementary use of CAD and ECD fragmentation modes in the mass spectrometer, and thus is not applicable to current common proteomic experiments where only CAD or ECD is used. Potthast et al. described a method called mass distance fingerprint to detect modification types from common proteomic mass spectra. Despite their rather complicated statistical model of mass distribution, they used the peptide mass information only, which limits the confidence of discoveries. Bandeira et al. proposed to detect modification-related spectral pairs by comparing the peptide fragmentation data. Their method is applicable to both abundant and low-concentration modifications, but at the cost of computational efficiency.
This paper describes a simple yet efficient approach to detecting abundant modifications in high-accuracy peptide mass spectra using both peptide mass and retention time information. Each pair of spectra is represented by a two-dimensional (2-D) vector composed of the mass difference and the retention time difference between their precursor ions. A bivariate Gaussian mixture model is used to discriminate modification-related spectral pairs from random ones. In this way, accurate modification masses and confident spectral pairs can be obtained. We also use a peptide propagation method to assign peptides to modification spectra at a given false discovery rate (FDR) without searching a database. Experiments on two glycoprotein data sets demonstrate the effectiveness of our method.
Although our method is at present unable (not designed) to find low-concentration modifications, it possesses several advantages compared to previous methods to detect modifications and spectral pairs:
Only the peptide mass and retention time information is used. Computing the spectra similarity based on peptide fragmentation data enables detection of low-concentration modifications but is very time-consuming. Clustering spectra according to the 2-D peptide mass and retention time data is demonstrated to be very fast. Abundant modifications and corresponding spectral pairs can be efficiently detected in this way, as shown in the Results section of the paper.
Peptide retention time is used in addition to peptide mass. Although using peptide masses alone can also reveal modifications, retention time provides an independent source of supporting evidence. Two features are far more discriminative in detecting modification-related spectral pairs than only one feature. Therefore, modification masses and identities can be more accurately determined.
Our approach to modification and spectral pair discovery is independent of peptide fragmentation data. It is known that modified peptides often have complex fragmentation patterns. For example, phosphorylated peptides often undergo insufficient fragmentation, resulting in dominant neutral-loss precursor peaks. According to our observation, peptides with cation modifications have even poorer fragmentation spectra with reduced fragment signals. Therefore, methods that use peptide fragmentation data to measure spectra similarity are unreliable for detecting pairs of poorly fragmented spectra. The method proposed in this paper is naturally immune to this problem.
Roughly speaking, we use two steps to look for modifications. In the first step, the peptide masses are used alone to obtain a list of candidate modification masses. In the second step, for each candidate modification mass, the retention time is added to validate the modification. Each possible pair of spectra is represented by a 2-D vector of peptide mass difference and retention time difference. Then, a bivariate Gaussian mixture model is used to differentiate modification-related spectral pairs from random ones. By doing this, the accurate modification mass values can be estimated and the influence of the modifications on retention time can be characterized. Before all of these analyses, the spectra data set is first reduced to remove redundancy, as described below.
Removal of spectra redundancy
In tandem mass spectrometry, abundant peptides produce many duplicate spectra, leading to data redundancy. Spectra redundancy has two disadvantages for our problem. First, redundancy brings a high computational burden. Since we are dealing with the peptide mass and retention time differences between all pairs of spectra, the size of the problem increases quadratically with the number of spectra. Second and no less important, spectra redundancy may have unexpected effects on the distributions of peptide mass and retention time differences. The retention times of abundant peptides may be distributed across a large range, which is undesirable for our analysis. To remove the spectra redundancy, among all the spectra whose peptide mass values are within a small window (e.g. 5 ppm for FT instruments), only one is retained as the representative spectrum and all others are removed. Of course, spectra of close peptide masses may not be produced by the same peptide, and some spectra may be falsely removed as redundant. However, this hardly affects our analysis: we find that a representative subset of spectra is already enough to reveal the dominant modifications.
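The redundancy removal described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the tuple layout and the greedy choice of keeping the lowest-mass spectrum in each window are assumptions, and only the 5 ppm window comes from the text.

```python
def remove_redundant_spectra(spectra, ppm_window=5.0):
    """Keep one representative per group of spectra with near-identical peptide mass.

    spectra: list of (peptide_mass_da, retention_time_min) tuples (assumed layout).
    ppm_window: mass tolerance in ppm; the text suggests about 5 ppm for FT data.
    """
    representatives = []
    last_kept_mass = None
    for mass, rt in sorted(spectra):
        # Keep a spectrum only if it lies outside the window of the last kept one.
        if last_kept_mass is None or (mass - last_kept_mass) / last_kept_mass * 1e6 > ppm_window:
            representatives.append((mass, rt))
            last_kept_mass = mass
    return representatives
```

Because the spectra are sorted by mass first, a single pass suffices, so this step costs far less than the quadratic pairwise analysis it precedes.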
Detection of candidate modification masses
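The first step outlined earlier, in which frequently occurring peptide mass differences are taken as candidate modification masses, can be sketched as a histogram over pairwise mass differences. This is a hedged illustration only: the bin width, the maximum mass difference and the count threshold are hypothetical parameters and are not taken from the paper.

```python
from collections import Counter

def candidate_mass_differences(masses, bin_width=0.001, max_delta=200.0, min_count=3):
    """Histogram pairwise peptide mass differences and return the frequent bins.

    bin_width, max_delta and min_count are illustrative assumptions.
    Returns a list of (approximate_delta_mass_da, count), most frequent first.
    """
    masses = sorted(masses)
    counts = Counter()
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            delta = masses[j] - masses[i]
            if delta > max_delta:
                break  # masses are sorted, so later differences are larger still
            counts[round(delta / bin_width)] += 1
    frequent = [(index * bin_width, n) for index, n in counts.items() if n >= min_count]
    return sorted(frequent, key=lambda item: -item[1])
```

A mass difference that recurs across many unrelated peptide masses, such as one near 15.995 Da, would surface at the top of the returned list and then be passed to the validation step.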
Validation of candidate modification masses
After a list of candidate modification masses has been found through the above steps, the next task is to filter out false positives and select a high-confidence subset, since the high frequencies of some Δm found in step 1 may be due to chance rather than real modifications. Given that we do not want to use the fragment peak information (for simplicity and efficiency), it is necessary to look for other types of evidence to verify the candidate modification masses.
In LC-MS/MS experiments, in addition to the peptide masses, the retention time of peptides is another type of information readily available. Since the peptide retention time is orthogonal to the peptide mass, it is an important source of evidence to validate the modifications. The modified and the unmodified forms of a peptide share the same amino acid sequence and differ by a modification group only. As a result, they are similar in physical and chemical properties and thus in LC behaviour, i.e. retention time. A modification usually has a relatively small and consistent effect on the retention time of peptides. Therefore, we can expect that if a high-frequency Δm is truly due to a modification, spectral pairs of this Δm will show consistent differences in peptide retention time. For example, a modification may tend to increase or decrease the retention time of peptides to some extent, or simply have no significant influence at all. If a candidate modification mass is accompanied by a widely scattered distribution of retention time differences, this candidate is very likely to be a false alarm.
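As a rough illustration of this consistency check, the sketch below collects the retention time differences of all spectral pairs separated by a candidate Δm and reports their mean and standard deviation. The function name, the data layout and the absolute mass tolerance are assumptions made for the example; the paper itself uses a mixture model rather than a plain standard deviation.

```python
from statistics import mean, stdev

def retention_shift_summary(spectra, delta_mass, mass_tol=0.005):
    """Summarize retention time differences for pairs separated by delta_mass.

    spectra: list of (peptide_mass_da, retention_time_min) tuples (assumed layout).
    delta_mass: candidate modification mass in Da; must exceed mass_tol so that
        a spectrum is never paired with itself.
    mass_tol: hypothetical absolute tolerance in Da on the mass difference.
    Returns (number_of_pairs, mean_delta_rt, stdev_delta_rt).
    """
    delta_rts = [
        rt_b - rt_a
        for mass_a, rt_a in spectra
        for mass_b, rt_b in spectra
        if abs((mass_b - mass_a) - delta_mass) <= mass_tol
    ]
    if len(delta_rts) < 2:
        return len(delta_rts), None, None
    return len(delta_rts), mean(delta_rts), stdev(delta_rts)
```

A small standard deviation relative to the length of the LC gradient would support the candidate, whereas a spread comparable to the whole gradient would suggest chance matches.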
Estimation of mixture distribution
Each pair of spectra is represented by a 2-D vector (denoted by Δ) composed of the peptide mass difference Δm and the retention time difference ΔRt between the two spectra, i.e.,
Δ = ⟨Δm, ΔRt⟩.
The vector Δ is modeled by a two-component mixture distribution,
p(Δ) = α_Rand G(μ_Rand, Σ_Rand) + α_Mod G(μ_Mod, Σ_Mod),
where G(μ, Σ) is the Gaussian distribution with mean μ and covariance matrix Σ, subscripts Rand and Mod denote random and modification-related spectral pairs, respectively, and α_Rand and α_Mod are mixing coefficients. The parameters of the mixture distribution can be estimated using the Expectation-Maximization (EM) algorithm.
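The EM estimation mentioned above can be sketched for a two-component bivariate Gaussian mixture in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the initialization (random component fitted to all points, modification component started from a caller-supplied guess), the fixed iteration count and the small ridge added to the covariances are choices made for the example.

```python
import math

def gaussian_pdf(x, mean, cov):
    """Density of a bivariate Gaussian; cov is ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form with the inverse covariance, written out for the 2x2 case.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def weighted_gaussian(points, weights):
    """Weighted mean and covariance of 2-D points (M-step for one component)."""
    total = max(sum(weights), 1e-12)
    mx = sum(w * p[0] for w, p in zip(weights, points)) / total
    my = sum(w * p[1] for w, p in zip(weights, points)) / total
    sxx = sum(w * (p[0] - mx) ** 2 for w, p in zip(weights, points)) / total
    syy = sum(w * (p[1] - my) ** 2 for w, p in zip(weights, points)) / total
    sxy = sum(w * (p[0] - mx) * (p[1] - my) for w, p in zip(weights, points)) / total
    eps = 1e-9  # small ridge (an assumption) so the covariance stays invertible
    return (mx, my), ((sxx + eps, sxy), (sxy, syy + eps))

def fit_two_component_mixture(points, init_mod_mean, init_mod_cov, n_iter=50):
    """EM for p(delta) = a_rand * G(rand) + a_mod * G(mod), with a_rand = 1 - a_mod."""
    rand_mean, rand_cov = weighted_gaussian(points, [1.0] * len(points))
    mod_mean, mod_cov = init_mod_mean, init_mod_cov
    a_mod = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is modification-related.
        resp = []
        for p in points:
            pm = a_mod * gaussian_pdf(p, mod_mean, mod_cov)
            pr = (1.0 - a_mod) * gaussian_pdf(p, rand_mean, rand_cov)
            resp.append(pm / (pm + pr) if pm + pr > 0.0 else 0.0)
        # M-step: re-estimate the mixing coefficient, means and covariances.
        a_mod = sum(resp) / len(points)
        mod_mean, mod_cov = weighted_gaussian(points, resp)
        rand_mean, rand_cov = weighted_gaussian(points, [1.0 - r for r in resp])
    return {"a_mod": a_mod, "mod": (mod_mean, mod_cov), "rand": (rand_mean, rand_cov)}
```

In this sketch the fitted mean of the modification component gives both an estimate of the modification mass (first coordinate) and of the typical retention time shift (second coordinate).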
Detection of spectral pairs
Spectral pairs with posterior probability larger than the given threshold are accepted as true modification-related spectral pairs.
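Given fitted mixture parameters, the posterior probability that a pair is modification-related follows from Bayes' rule. The sketch below assumes the parameters are already available as plain tuples (a hypothetical layout), and uses the 0.98 threshold that appears in the result tables as its default.

```python
import math

def bivariate_pdf(x, mean, cov):
    """Density of a bivariate Gaussian; cov is ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def posterior_mod(delta, a_mod, mod, rand):
    """P(Mod | delta) for the two-component mixture; mod and rand are (mean, cov)."""
    p_mod = a_mod * bivariate_pdf(delta, *mod)
    p_rand = (1.0 - a_mod) * bivariate_pdf(delta, *rand)
    total = p_mod + p_rand
    return p_mod / total if total > 0.0 else 0.0

def select_pairs(pairs, a_mod, mod, rand, threshold=0.98):
    """pairs: list of (id_a, id_b, (delta_m, delta_rt)); keep those above threshold."""
    return [(a, b) for a, b, delta in pairs
            if posterior_mod(delta, a_mod, mod, rand) > threshold]
```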
Propagation of peptide assignments
After high-confidence modification-related spectral pairs have been found, the peptide assigned to one spectrum can be propagated to its paired spectrum. It is expected that many spectra that cannot be assigned peptides via standard database search can be identified in this way. This is especially useful for modification spectra of low signal/noise ratios. For example, according to our observation, cations usually prevent peptides from fragmenting sufficiently, resulting in few signal fragment peaks. In addition, some modifications can occur at many sites. For example, according to the annotations in the Unimod database, carbamidomethylation can occur at the N-terminus of any peptide and on five amino acids at any position. It is not practical to specify too many variable modifications in current database search engines, since this would greatly increase the time of database search and the chance of false positive identifications. Therefore, direct propagation of peptide assignments among paired spectra is a promising and efficient approach to identifying modified peptides and increasing the interpretation rate of spectra, a major expectation in the proteomics community.
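Propagation of peptide assignments can be sketched as a lookup across the accepted pairs. The data structures and function names below are hypothetical, chosen only for illustration; the second helper reflects the use of the maximum posterior error probability as a conservative FDR estimate, as described in the Results.

```python
def propagate_assignments(identified, pairs, threshold=0.98):
    """Copy peptide assignments across high-confidence spectral pairs.

    identified: dict spectrum_id -> peptide from a first-round database search.
    pairs: list of (unmodified_id, modified_id, delta_mass, posterior).
    Returns dict spectrum_id -> (peptide, mass_shift, posterior) for spectra
    that had no assignment before.
    """
    propagated = {}
    for unmod_id, mod_id, delta_mass, posterior in pairs:
        if posterior <= threshold:
            continue
        if unmod_id in identified and mod_id not in identified:
            source, target, shift = unmod_id, mod_id, delta_mass
        elif mod_id in identified and unmod_id not in identified:
            source, target, shift = mod_id, unmod_id, -delta_mass
        else:
            continue  # neither spectrum, or both, already identified
        best = propagated.get(target)
        # If several pairs reach the same target, keep the most confident one.
        if best is None or posterior > best[2]:
            propagated[target] = (identified[source], shift, posterior)
    return propagated

def conservative_fdr(propagated):
    """Maximum posterior error probability (1 - posterior) among the assignments."""
    return max((1.0 - entry[2] for entry in propagated.values()), default=0.0)
```

No database search is involved in this step, which is why its cost is negligible compared with adding further variable modifications to a search engine.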
Inference of modification types
[Table: Modifications and spectral pairs found in data set 2. Columns: # of spectral pairs (pp > 0.98), Unimod AC #. Row: Water-loss/Glu → pyro-Glu*.]
We used two data sets of mass spectra to test our algorithm. Both data sets were produced from glycoprotein samples, because identifying glycosylated peptides requires complex sample treatments that can introduce chemical modifications. We show that modifications can be efficiently discovered using our algorithm, and that either including the modifications in the database search or directly propagating peptides between paired spectra can significantly increase the number of spectra interpreted.
Data set 1
IgG-depleted human plasma from a healthy donor was mixed with WGA, Con A and JAC lectins to enrich most glycoproteins. The glycoproteins were then reduced, alkylated and digested by trypsin and PNGase F, followed by LC-MS2 analysis. After treatment of the glycopeptides with PNGase F, the deglycosylated peptides carried a +0.984 Da mass shift from the conversion of asparagine to aspartate. LC-MS2 experiments were performed on an LTQ-FT mass spectrometer operated in the data-dependent mode. A full-scan survey MS experiment was acquired in the FT-ICR mass spectrometer, and the five most abundant ions detected in the full scan were analyzed by MS2 scan events. This resulted in a total of 8,654 MS2 mass spectra.
Data set 2
IgG-depleted human plasma from a healthy donor was mixed with LCH lectins to enrich core fucosylated glycoproteins; the glycoproteins were then reduced, alkylated and digested by trypsin and Endo F3 (treatment with Endo F3 released part of the oligosaccharide chain and left the fucosyl GlcNAc residue on the peptides). Samples were separated by SCX and RP HPLC and then sent to an LTQ-FT mass spectrometer. A full-scan survey MS experiment was acquired in the FT-ICR mass spectrometer, and the five most abundant ions detected in the full scan were analyzed by MS2 scan events. An MS3 spectrum was automatically collected when one of the three most intense peaks in the MS2 spectrum corresponded to a neutral loss of 73.0290 m/z, 48.6860 m/z or 36.5145 m/z. Among the resulting MS3 mass spectra, those confidently attributed to core fucosylated glycopeptides were selected in a post-processing step. The final data set consists of a total of 1,528 MS3 mass spectra considered to have been generated from core fucosylated glycopeptides.
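The three neutral-loss values can be checked with simple arithmetic. The sketch below assumes, since the text does not say so explicitly, that they correspond to the loss of one fucose residue (deoxyhexose, C6H10O4, monoisotopic mass 146.0579 Da) from precursors of charge 2+, 3+ and 4+.

```python
# Monoisotopic mass of a deoxyhexose (fucose) residue, C6H10O4, in Da.
FUCOSE_RESIDUE_MASS = 146.0579

def neutral_loss_mz(neutral_mass, charge):
    """m/z shift observed when an ion of the given charge loses a neutral group."""
    return neutral_mass / charge

# Print the m/z shift for each assumed precursor charge state.
for charge in (2, 3, 4):
    print(charge, neutral_loss_mz(FUCOSE_RESIDUE_MASS, charge))
```

Under this assumption the computed shifts agree with the quoted trigger values to within about 0.0001 m/z.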
Discovered modifications and spectral pairs
[Table: Modifications and spectral pairs found in data set 1. Columns: # of spectral pairs (pp > 0.98)*, Theoretical mass (Da), Unimod AC #. Rows: Ammonia-loss/Gln → pyro-Glu, Methylation/Asp → Glu.]
Among the discovered modifications, some are common in LC-MS/MS experiments, e.g., oxidation, carbamidomethylation and water-/ammonia-loss. Carbamidomethylation, an artifact of sample preparation, is usually specified as a fixed modification on cysteine for database search. However, we show here that carbamidomethylation may also occur abundantly at other sites in a variable manner. Modifications with important biological functions are also found. Methylation is an in vivo post-translational modification, while deamidation can occur both in vivo and in vitro. Other modifications may have been introduced by special sample treatments or by the samples themselves, e.g., the sodium cation coming from the digestion buffer of glycosidase or the lectin binding buffer commonly used in glycoproteomics research, and the iron cation coming from hemoglobin in plasma.
[Table: Peptide identification and propagation results on data set 1 (FDR < 2%). Columns: # of peptides identified, # of spectra interpreted.]
Improvements on peptide identification
There are two ways to use the discovered modification types to improve peptide identification. The most common way is to include them as variable modifications in the database search. The other is to directly propagate peptides between paired spectra without re-searching the database. Both experiments are conducted. Tables 3 and 4 give the peptide identification results on data sets 1 and 2, respectively. Database search is performed using the pFind search engine [7, 31], and the target-decoy database method is used to estimate the FDR of search results. For data set 1, we include deamidation (at N), introduced by sample preparation, and the common oxidation (at M) as variable modifications in the first round of database search. Then, the most abundant modification, carbamidomethylation (at N-term and K), is added in the second round of database search. As a result, 301 more spectra are interpreted, although the increase in the number of peptide identifications is trivial. Based on the first-round database search results, peptide propagation is carried out. It turns out that 537 spectra with the carbamidomethylation modification are assigned peptides at the same FDR (2%) as database search (we use the maximum posterior error probability as a conservative estimate of FDR, which in theory is larger than the actual FDR). Moreover, after peptide propagation between spectra paired by the cation:Na modification, 134 more spectra are assigned peptides. On data set 2, inclusion of oxidation in the database search increases the number of interpreted spectra by 22, while peptide propagation between spectral pairs related by cation:Fe(II), oxidation and deamidation increases the number of interpreted spectra by 143, 33 and 19, respectively. Note that cation modifications, e.g. Na and Fe adducts, cannot be added to the database search as variable modifications. This is because it is not clearly known which amino acids the cations are attached to, and the spectra with cation modifications usually have very low signal/noise ratios according to our observation. On average, an increase of 10% in spectra interpretation rate is obtained on these two data sets after considering the discovered modifications.
To sum up, experimental results demonstrate that including the detected modification types into database search can increase the number of interpreted spectra to some extent, while direct peptide propagation between spectral pairs related by modifications can interpret even more spectra. Moreover, it seems that spectra with cation modifications can only be effectively assigned with peptides by peptide propagation.
[Table: Peptide identification and propagation results on data set 2 (FDR < 2%). Columns: # of peptides identified, # of spectra interpreted.]
This work was supported by the National High Technology Research and Development Program (863) of China under Grant Nos. 2007AA02Z315, 2007AA02Z326 and 2006AA02A308, the National Key Basic Research & Development Program (973) of China under Grant Nos. 2002CB713807 and 2006CB910801, the CAS Knowledge Innovation Program under Grant No. KGGX1-YW-13 and the National Natural Science Foundation of China under Grant No. 20735005.
This article has been published as part of BMC Bioinformatics Volume 10 Supplement 1, 2009: Proceedings of The Seventh Asia Pacific Bioinformatics Conference (APBC) 2009. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/10?issue=S1
- [1] Aebersold R, Mann M: Mass spectrometry-based proteomics. Nature. 2003, 422: 198-207. 10.1038/nature01511.
- [2] Mann M, Jensen ON: Proteomic analysis of post-translational modifications. Nat Biotechnol. 2003, 21 (3): 255-261. 10.1038/nbt0303-255.
- [3] Witze ES, Old WM, Resing KA, Ahn NG: Mapping protein post-translational modifications with mass spectrometry. Nat Methods. 2007, 4 (10): 798-806. 10.1038/nmeth1100.
- [4] Eng JK, McCormack AL, Yates JR: An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom. 1994, 5: 976-989. 10.1016/1044-0305(94)80016-2.
- [5] Perkins DN, Pappin DJ, Creasy DM, Cottrell JS: Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis. 1999, 20: 3551-3567. 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2.
- [6] Craig R, Beavis RC: TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004, 20: 1466-1467. 10.1093/bioinformatics/bth092.
- [7] Fu Y, Yang Q, Sun R, Li D, Zeng R, Ling CX, Gao W: Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry. Bioinformatics. 2004, 20: 1948-1954. 10.1093/bioinformatics/bth186.
- [8] Colinge J, Masselot A, Giron M, Dessingy T, Magnin J: OLAV: towards high-throughput tandem mass spectrometry data identification. Proteomics. 2003, 3: 1454-1463. 10.1002/pmic.200300485.
- [9] Geer LY, Markey SP, Kowalak JA, Wagner L, Xu M, Maynard DM, Yang X, Shi W, Bryant SH: Open mass spectrometry search algorithm. J Proteome Res. 2004, 3 (5): 958-964. 10.1021/pr0499491.
- [10] Dancik V, Addona TA, Clauser KR, Vath JE, Pevzner PA: De novo peptide sequencing via tandem mass spectrometry. J Comput Biol. 1999, 6: 327-342. 10.1089/106652799318300.
- [11] Frank A, Pevzner P: PepNovo: de novo peptide sequencing via probabilistic network modeling. Anal Chem. 2005, 77: 964-973. 10.1021/ac048788h.
- [12] Ma B, Zhang KZ, Hendrie C, Liang CZ, Li M, Doherty-Kirby A, Lajoie G: PEAKS: powerful software for peptide de novo sequencing by MS/MS. Rapid Commun Mass Spectrom. 2003, 17: 2337-2342. 10.1002/rcm.1196.
- [13] Taylor JA, Johnson RS: Implementation and uses of automated de novo peptide sequencing by tandem mass spectrometry. Anal Chem. 2001, 73: 2594-2604. 10.1021/ac001196o.
- [14] Hernandez P, Gras R, Frey J, Appel RD: Popitam: towards new heuristic strategies to improve protein identification from tandem mass spectrometry data. Proteomics. 2003, 3: 870-878. 10.1002/pmic.200300402.
- [15] Mann M, Wilm M: Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal Chem. 1994, 66: 4390-4399. 10.1021/ac00096a002.
- [16] Tabb DL, Saraf A, Yates JR: GutenTag: high-throughput sequence tagging via an empirically derived fragmentation model. Anal Chem. 2003, 75: 6415-6421. 10.1021/ac0347462.
- [17] Tanner S, Shu H, Frank A, Wang LC, Zandi E, Mumby M, Pevzner PA, Bafna V: InsPecT: identification of posttranslationally modified peptides from tandem mass spectra. Anal Chem. 2005, 77 (14): 4626-4639. 10.1021/ac050102d.
- [18] Keller A, Nesvizhskii AI, Kolker E, Aebersold R: Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database Search. Anal Chem. 2002, 74: 5383-5392. 10.1021/ac025747h.
- [19] Savitski MM, Nielsen ML, Zubarev RA: ModifiComb, a new proteomic tool for mapping substoichiometric post-translational modifications, finding novel types of modifications, and fingerprinting complex protein mixtures. Mol Cell Proteomics. 2006, 5 (5): 935-948. 10.1074/mcp.T500034-MCP200.
- [20] Bandeira N: Spectral networks: a new approach to de novo discovery of protein sequences and posttranslational modifications. Biotechniques. 2007, 42 (6): 687, 689, 691 passim.
- [21] UNIMOD. [http://www.unimod.org/]
- [22] Tsur D, Tanner S, Zandi E, Bafna V, Pevzner PA: Identification of post-translational modifications by blind search of mass spectra. Nat Biotechnol. 2005, 23 (12): 1562-1567. 10.1038/nbt1168.
- [23] Tanner S, Pevzner PA, Bafna V: Unrestrictive identification of post-translational modifications through peptide mass spectrometry. Nat Protoc. 2006, 1 (1): 67-72. 10.1038/nprot.2006.10.
- [24] Searle BC, Dasari S, Turner M, Reddy AP, Choi D, Wilmarth PA, McCormack AL, David LL, Nagalla SR: High-throughput identification of proteins and unanticipated sequence modifications using a mass-based alignment algorithm for MS/MS de novo sequencing results. Anal Chem. 2004, 76 (8): 2220-2230. 10.1021/ac035258x.
- [25] Han Y, Ma B, Zhang K: SPIDER: software for protein identification from sequence tags containing de Novo sequencing error. IEEE 2004 Computational Systems Bioinformatics Conference: 2004. 2004, 206-215.
- [26] Pevzner PA, Mulyukov Z, Dancik V, Tang CL: Efficiency of database search for identification of mutated and modified proteins via mass spectrometry. Genome Res. 2001, 11: 290-299. 10.1101/gr.154101.
- [27] Potthast F, Gerrits B, Hakkinen J, Rutishauser D, Ahrens CH, Roschitzki B, Baerenfaller K, Munton RP, Walther P, Gehrig P: The Mass Distance Fingerprint: a statistical framework for de novo detection of predominant modifications using high-accuracy mass spectrometry. J Chromatogr B Analyt Technol Biomed Life Sci. 2007, 854 (1–2): 173-182.
- [28] Bandeira N, Tsur D, Frank A, Pevzner P: A new approach to protein identification. 10th Annual International Conference on Research in Computational Molecular Biology. 2006.
- [29] Jia W, Lu Z, Fu Y: A Strategy for Precise and Large-Scale Identification of Core Fucosylated Glycoproteins. 2008.
- [30] Medzihradszky KF, Spencer DI, Sharma SK, Bhatia J, Pedley RB, Read DA, Begent RH, Chester KA: Glycoforms obtained by expression in Pichia pastoris improve cancer targeting potential of a recombinant antibody-enzyme fusion protein. Glycobiology. 2004, 14 (1): 27-37. 10.1093/glycob/cwh001.
- [31] Wang LH, Li DQ, Fu Y, Wang HP, Zhang JF, Yuan ZF, Sun RX, Zeng R, He SM, Gao W: pFind 2.0: a software package for peptide and protein identification via tandem mass spectrometry. Rapid Commun Mass Spectrom. 2007, 21 (18): 2985-2991. 10.1002/rcm.3173.
- [32] Elias JE, Gygi SP: Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Methods. 2007, 4 (3): 207-214. 10.1038/nmeth1019.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.