Progressive search in tandem mass spectrometry

Joh, Yoonsung; Lee, Kangbae; Kim, Hyunwoo; Park, Heejin

doi:10.1186/s12859-023-05222-2

Software
Open access
Published: 14 March 2023

Yoonsung Joh¹,
Kangbae Lee¹,
Hyunwoo Kim² &
…
Heejin Park ORCID: orcid.org/0000-0002-8608-5994¹

BMC Bioinformatics volume 24, Article number: 94 (2023)

Abstract

Background

High-throughput proteomics has been accelerated by (tandem) mass spectrometry. However, the slow speed of mass spectra analysis prevents the analysis results from being kept up-to-date. Tandem mass spectrometry database search requires O(|S||D|) time, where S is the set of spectra and D is the set of peptides in a database. With usual values of |S| and |D|, database search is quite time-consuming. Meanwhile, the database used for search is usually updated every month, with 0.5–2% changes. Although the change in the database is usually very small, it may cause extensive changes in the overall analysis results because individual PSM scores such as deltaCn and E-value depend on the entire search results. Therefore, to keep the search results up-to-date, one needs to perform the database search from scratch every time the database is updated, which is very inefficient.

Results

Thus, we present a very efficient method to keep the search results up-to-date where the results are the same as those achieved by the normal search from scratch. This method, called progressive search, runs in O(|S||ΔD|) time on average where ΔD is the difference between the old and the new databases. The experimental results show that the progressive search is up to 53.9 times faster for PSM update only and up to 16.5 times faster for both PSM and E-value update.

Conclusions

Progressive search is a novel approach to efficiently obtain analysis results for an updated database in tandem mass spectrometry. Compared to performing a normal search from scratch, progressive search achieves the same results much faster. Progressive search is freely available at: https://isa.hanyang.ac.kr/ProgSearch.html.

Background

Database search in tandem mass spectrometry, usually done by tools such as Sequest [1], Tide [2], Comet [3], Mascot [4], Maxquant [5], MS-GF [6], and MSFragger [7], is quite time-consuming, especially when the number of spectra is large (for example, more than 10 million spectra [8, 9]) and/or the search space is wide, as in open search [7].

Meanwhile, protein databases used for search are updated frequently. For example, the most widely used database, Uniprot [10], is updated monthly with 0.5 to 2% changes: newly identified protein sequences are inserted and some incorrect sequences are deleted. Although the change is very small, it may cause changes in the overall analysis results because the score of each spectrum is calculated relative to the entire search results. Therefore, to keep the search results up-to-date, one needs to perform the database search from scratch every time the database is updated, which is very inefficient.

Thus, we present a very efficient method to keep the search results up-to-date where the results are the same as those achieved by the normal search from scratch. This method, called progressive search, is much faster than the normal search from scratch. In this study, we applied progressive search to Comet, which is not only a stand-alone open-source tandem mass spectrometry database search engine but is also incorporated into widely used proteomics pipelines such as the Trans-Proteomic Pipeline [11] and Crux [12]. Our experimental results in Figs. 1 and 2 show that progressive search is 16.5–53.9 times faster than the normal search when the database change is 0.16%, the number of tryptic termini is 1, and the number of missed cleavages is 2.

Implementation

Database separation

First, we compare the old database D_old and the new database D_new to identify D_srd, D_del, and D_ins where D_srd contains the proteins shared by both D_old and D_new, D_del contains the proteins stored in only D_old, and D_ins contains the proteins stored in only D_new. Let R_old, R_new, and R_srd denote the PSM results for D_old, D_new, and D_srd, respectively. Figure 3 shows the case that D_old is the set of proteins {A, B, C, D, E} and D_new is the set of proteins {B, C, D, E, F}. Thus, D_srd is the set of proteins {B, C, D, E}, D_del is {A}, and D_ins is {F}.
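The separation above amounts to three set operations. The following is a minimal sketch, assuming each protein can be represented by a hashable identifier (here, single letters as in Fig. 3); the function name `separate_databases` is hypothetical and is not part of the released tool.

```python
def separate_databases(d_old, d_new):
    """Split two protein collections into shared, deleted, and inserted sets."""
    d_old, d_new = set(d_old), set(d_new)
    d_srd = d_old & d_new  # proteins present in both releases
    d_del = d_old - d_new  # proteins stored only in the old release
    d_ins = d_new - d_old  # proteins stored only in the new release
    return d_srd, d_del, d_ins


# Example of Fig. 3: D_old = {A, B, C, D, E}, D_new = {B, C, D, E, F}
d_srd, d_del, d_ins = separate_databases("ABCDE", "BCDEF")
print(sorted(d_srd), sorted(d_del), sorted(d_ins))
```

In practice the comparison key (accession, sequence, or both) is a design choice that this sketch leaves open.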

For the experimental results, we used the UniProtKB databases released from January to June 2020 (Fig. 4). On average, 0.07% and 0.67% of the proteins were deleted and inserted every month, respectively. In addition, 0.09% and 0.70% of the amino acids were deleted and inserted every month on average, respectively.

Workflow

Progressive search consists of four steps called “deletion”, “insertion”, “score calculation”, and “E-value calculation” (Fig. 5). Progressive search runs in O(|S||ΔD|) time on average, where S is the set of spectra, ΔD is the difference between the old and the new databases, and |X| denotes the number of elements in X.

(1) Deletion: This is the process of obtaining R_srd from R_old. R_srd is the same as R_old except for the PSMs whose peptide sequences come only from D_del. Those PSMs are deleted and replaced by PSMs obtained by searching D_srd for the spectra in the deleted PSMs (S_del). For example, the PSMs of scans 1, 3 and 6 are updated after deletion in Fig. 5.
(2) Insertion: This is the process of obtaining R_new from R_srd. We search D_ins for all the spectra to find PSMs. Then the found PSMs are compared with the PSMs in R_srd. The PSMs with better scores are selected and stored in R_new. For example, the PSMs of scans 2, 3 and 6 are updated after insertion in Fig. 5.
(3) Score calculation: This is the process of calculating the deltaCn values in R_new. deltaCn is a score representing the difference between Xcorr values. Since the Xcorr values of R_new are obtained in the previous steps, the deltaCn values of R_new can be calculated in this step.
(4) E-value calculation: This is the process of calculating the E-values in R_new. Note that the E-value of every PSM may become invalid even if only one PSM has changed. Since E-value calculation requires PSM information that is not output by the original Comet, we built “Comet-E”, a modified version of Comet, to address the E-value correction.

Detailed explanations are given in the following subsections: Deletion, Insertion, Score calculation, and E-value calculation.

Deletion

Algorithm description: The main purpose of deletion is to convert R_old into R_srd. Each spectrum in R_old was identified by either D_del or D_srd (Fig. 6, Composition of database). Recall that S_del denotes the set of spectra identified by only D_del, and let S_srd denote the set of spectra identified by D_srd. While the PSMs for S_srd remain as they are, the PSMs for S_del should be replaced by the PSMs obtained by searching D_srd for S_del. In the example in Fig. 6, among the PSMs for scans 1–7, only the PSMs for scans 1 and 6 are identified with only D_del (= {A}). Thus, S_del consists of the spectra in scans 1 and 6. (Note that the PSM for scan 3 belongs to S_srd because its peptide AARASLIEQ exists in both proteins A and C, and thus all we have to do is delete A from the protein list of scan 3.) We search D_srd (= {B, C, D, E}) for S_del, and the new results replace the old results of S_del.
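The deletion step can be sketched as follows. This is an illustrative sketch only: it keeps just the top-ranked PSM per scan, the `PSM` record and `delete_step` function are hypothetical names, and `search_srd` stands in for a real Comet run against D_srd. The peptide names and Xcorr values in the usage example are placeholders, except for the peptide AARASLIEQ and the Xcorr values 2.03 and 0.83, which appear in the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PSM:
    peptide: str
    xcorr: float
    proteins: frozenset  # proteins that contain the peptide


def delete_step(r_old, d_del, search_srd):
    """Derive R_srd from R_old.

    r_old:      dict scan -> PSM (top hit per scan, a simplification)
    d_del:      set of protein ids that were removed from the database
    search_srd: hypothetical callable; takes a list of scans and returns
                dict scan -> PSM found by searching D_srd
    """
    r_srd, s_del = {}, []
    for scan, psm in r_old.items():
        remaining = psm.proteins - d_del
        if remaining:
            # Peptide still exists in D_srd: only the protein list shrinks.
            r_srd[scan] = replace(psm, proteins=remaining)
        else:
            # Peptide came only from D_del: the scan must be re-searched.
            s_del.append(scan)
    r_srd.update(search_srd(s_del))
    return r_srd, s_del


# Placeholder stand-in for searching D_srd; values are illustrative.
def fake_search(scans):
    return {s: PSM("PLACEHOLDER", 0.83, frozenset("B")) for s in scans}


r_old = {
    3: PSM("AARASLIEQ", 2.03, frozenset("AC")),
    6: PSM("PLACEHOLDER2", 1.00, frozenset("A")),
}
r_srd, s_del = delete_step(r_old, {"A"}, fake_search)
print(s_del, sorted(r_srd[3].proteins))
```

Only the scans collected in `s_del` are re-searched, which is why the cost of this step depends on the deleted part of the database rather than on the whole database.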

Insertion

The main purpose of insertion is converting R_srd into R_new. Each spectrum in R_new is identified by either D_ins or D_srd. First, we search D_ins for the set S. Let R_ins denote the search result. For each spectrum, we replace its PSM in R_srd by its PSM in R_ins if the Xcorr of the PSM in R_ins is higher than that in R_srd. In Fig. 7, we search D_ins (= {F}) for all the spectra and get R_ins. Since only the PSMs of scans 3 and 6 in R_ins have higher Xcorr values (2.91 and 1.51) than those in R_srd (2.03 and 0.83), R_new is obtained by replacing the PSMs of scans 3 and 6 in R_srd with those in R_ins. (Note that the scan 2 result of R_ins is the same as R_srd because its peptide LGGLWSAV exists in both proteins B and F and thus all we have to do is to add F to the protein list of scan 2.) Since the main part of insertion is to search D_ins for the set S, the time complexity is O(|S|∙|D_ins|).
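The merge rule of the insertion step can be sketched as below. This is a simplified illustration that keeps one PSM per scan as a `(peptide, xcorr, proteins)` tuple; the function name `insert_step` is hypothetical, and the peptide names for the hits from F as well as the Xcorr value for scan 2 are placeholders. The Xcorr values for scans 3 and 6 are those quoted from Fig. 7.

```python
def insert_step(r_srd, r_ins):
    """Derive R_new from R_srd and R_ins (both map scan -> PSM tuple)."""
    r_new = dict(r_srd)
    for scan, (peptide, xcorr, proteins) in r_ins.items():
        old = r_new.get(scan)
        if old is None or xcorr > old[1]:
            # The hit from D_ins scores higher: it replaces the old PSM.
            r_new[scan] = (peptide, xcorr, proteins)
        elif peptide == old[0]:
            # Same peptide found in an inserted protein: extend the protein list.
            r_new[scan] = (old[0], old[1], old[2] | proteins)
    return r_new


r_srd = {
    2: ("LGGLWSAV", 1.00, frozenset("B")),   # Xcorr is a placeholder
    3: ("AARASLIEQ", 2.03, frozenset("C")),
    6: ("PLACEHOLDER", 0.83, frozenset("B")),
}
r_ins = {
    2: ("LGGLWSAV", 1.00, frozenset("F")),
    3: ("PLACEHOLDER3", 2.91, frozenset("F")),
    6: ("PLACEHOLDER6", 1.51, frozenset("F")),
}
r_new = insert_step(r_srd, r_ins)
for scan in sorted(r_new):
    print(scan, r_new[scan][0], r_new[scan][1], sorted(r_new[scan][2]))
```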

Score calculation

After the deletion and insertion, all PSMs with their Xcorr scores have been updated for D_new. Now, the deltaCn values which are defined as follows should be recalculated.

$$\text{deltaCn}(i) = 1 - \frac{\text{Xcorr}(i+1)}{\text{Xcorr}(i)}$$

where Xcorr(i) denotes the i-th largest PSM score for a spectrum. Thus, recalculating deltaCn takes O(|S|) (= O(|PSM|)) time in the worst case. In addition, when deltaCn is updated, there are two subtleties to consider as follows.

(i) Increment of the parameter num_output_lines by 1

In order to calculate deltaCn(i), not only Xcorr(i) but also Xcorr(i + 1) is required. Since the parameter num_output_lines of Comet determines the number of Xcorr values in the output, num_output_lines should be n + 1 if the values deltaCn(i) for $i\le n$ are to be calculated by progressive search (Comet-P or Comet-E). Even though Comet outputs only n lines, it always calculates the Xcorr values for PSMs of all ranks, and thus incrementing num_output_lines by 1 has almost no effect on the total running time.

(ii) Xcorr precision refinement in the output

In Comet, the internal data type of Xcorr is double but the Xcorr values in the output of Comet are rounded to the fourth decimal place as shown in Table 1. Thus, the deltaCn(1) calculated by Comet is different from the deltaCn(1) calculated by Xcorr(1) and Xcorr(2) values from the output of Comet as explained in the legend of Table 1. Hence, the Xcorr values in the output of Comet-P/Comet-E are rounded to the seventh decimal place so that the deltaCn calculated by the output of Comet-P/Comet-E is the same as that calculated by Comet.
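The effect of output precision on deltaCn can be illustrated numerically. The sketch below uses made-up Xcorr values (they are not taken from Table 1) and Python's built-in `round`, which is only an approximation of how Comet formats its output; it compares deltaCn(1) computed from full-precision values with deltaCn(1) recomputed from values rounded to four and to seven decimal places.

```python
def delta_cn(xcorr_i, xcorr_next):
    """deltaCn(i) = 1 - Xcorr(i+1) / Xcorr(i)."""
    return 1.0 - xcorr_next / xcorr_i


# Illustrative Xcorr(1) and Xcorr(2); not values from the paper.
xcorr1, xcorr2 = 1.23456789, 1.11111111

exact = delta_cn(xcorr1, xcorr2)
from_4dp = delta_cn(round(xcorr1, 4), round(xcorr2, 4))
from_7dp = delta_cn(round(xcorr1, 7), round(xcorr2, 7))

err_4dp = abs(from_4dp - exact)  # error when only 4 decimals are stored
err_7dp = abs(from_7dp - exact)  # error when 7 decimals are stored
print(f"error from 4 decimals: {err_4dp:.2e}")
print(f"error from 7 decimals: {err_7dp:.2e}")
```

With more stored digits, the recomputed deltaCn stays much closer to the internally computed value, which is the motivation for the seven-decimal output described above.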

Table 1 Xcorr precision refinement

E-value calculation

The purpose of “E-value calculation” is to convert R_new into R_new-E-value. As an example, we explain how the E-values of Comet are calculated. We built “Comet-E”, a modified version of Comet, to address the E-value correction. Comet-E has two more features than the original Comet. First, it can output the histogram of Xcorr values, which is only an intermediate data structure used to calculate E-values in Comet. Second, it can take a histogram of Xcorr values as input and calculate E-values based on that histogram. Let His(R) denote the histogram for a result set R. We calculate His(R_new) as follows: First, we run Comet-E to acquire the histograms His(R_del) and His(R_ins). Then, His(R_new) is calculated by “His(R_old) − His(R_del) + His(R_ins)”, where His(R_old) was already produced earlier by Comet-E. Finally, His(R_new) is given as input to Comet-E, which recalculates the E-values. R_new is thereby converted into R_new-E-value. Detailed explanations are given in the following subsections i), ii), and iii). Subsection i) explains the E-value calculation by Comet, and subsections ii) and iii) explain the two new features of Comet-E.

(i) E-value calculation by Comet

Comet calculates the E-value for each spectrum based on the Xcorr values for all candidate peptides (Fig. 8). Comet needs at least 3000 Xcorr values for each spectrum to calculate its E-value. If the number of Xcorr values is less than 3000, Comet uses decoy peptides predefined in Comet. Then, Comet calculates the histogram of the Xcorr values for each spectrum. The histogram is used to calculate the E-value of each spectrum by the internal scoring function of Comet.

(ii) Comet-E (output)

Comet-E can output the histogram of Xcorr values for each spectrum (Fig. 9). The histogram consists of Xcorr values for the sequences in the database only, excluding decoy sequences. Unlike Comet, Comet-E outputs the histogram table for every spectrum as a .txt file. Histograms are created with a bin width of 0.1 and have an average of 10 bin counts per spectrum. Thus, the histogram information (Xcorr counts for all bins) for each spectrum can be represented using only about 20 numbers.

(iii) Comet-E (E-value recalculation)

Given His(R_new) as input, Comet-E can calculate the E-values of R_new (Fig. 10). Note that His(R_new) is calculated by “His(R_old) − His(R_del) + His(R_ins)”. This calculation is performed for each bin. If a histogram has no output for a bin, the frequency of that bin is set to 0. Note that His(R_old) was produced by Comet-E when R_old was generated, and His(R_del) and His(R_ins) are produced by Comet-E when R_del and R_ins are generated, respectively. The time complexity of E-value calculation is O(|S|) because it is independent of the size of the database difference and proportional only to the number of spectra.
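The per-bin update can be sketched for a single spectrum as follows. The representation of a histogram as a dictionary from bin index to count, the function name `update_histogram`, and the example counts are assumptions made for illustration; the actual file format written by Comet-E is not reproduced here.

```python
def update_histogram(his_old, his_del, his_ins):
    """Per-bin His(R_old) - His(R_del) + His(R_ins) for one spectrum.

    Each histogram maps a bin index to a count; a missing bin counts as 0.
    """
    bins = set(his_old) | set(his_del) | set(his_ins)
    his_new = {}
    for b in sorted(bins):
        count = his_old.get(b, 0) - his_del.get(b, 0) + his_ins.get(b, 0)
        if count < 0:
            # Would mean more candidates were removed than ever counted.
            raise ValueError(f"negative count in bin {b}")
        if count:
            his_new[b] = count
    return his_new


# Illustrative counts only.
his_old = {0: 5, 1: 3}
his_del = {1: 1}
his_ins = {2: 2}
print(update_histogram(his_old, his_del, his_ins))
```

Because each spectrum has only a small number of bins, repeating this for every spectrum is linear in the number of spectra.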

Results

We measured and compared the running times of Comet, Comet-P (progressive Comet with PSM update only), and Comet-E (progressive Comet with both PSM and E-value update). The databases used were the SwissProt and TrEMBL human protein databases provided by UniProt. Tandem mass spectrometry (MS/MS) spectra for HEK293 cells [13] were used as input, and the total number of spectra was 1,121,149. We compared the running times under different parameter settings: In subsection i), we show the results when the difference between D_old and D_new is fixed and the numbers of tryptic termini (ntt) and missed cleavages (mc) change. In subsection ii), we show the results when the difference between D_old and D_new changes and ntt and mc are fixed. The search results of Comet, Comet-P, and Comet-E are consistent at both the PSM and peptide levels (Fig. 11).

All experiments were carried out on a Linux PC with an Intel(R) Xeon(R) octa-core CPU E5-2609 v3 @ 1.90 GHz and 36 GB of RAM. The Linux version is Ubuntu 12.04.5 LTS and the compiler is GNU C compiler 6.5.0. All experiments were performed with a single thread.

(i) Changing the numbers of tryptic termini and missed cleavages

Table 2A shows the running time results when D_old and D_new are fixed to Uniprot 2020.01 and Uniprot 2020.02, respectively and ntt changes from 0 to 1 and mc changes from 0 to 2. Note that the difference between D_old (Uniprot 2020.01) and D_new (Uniprot 2020.02) is 0.16% (Fig. 4 #amino acid). Table 2A shows not only the overall running times of Comet, Comet-P, and Comet-E, but also breaks down the overall running times of Comet-E into the running times of individual modules (database separation, deletion, insertion, and E-value calculation). Note that the running time of Comet-P is the sum of the running times of all individual modules except the E-value calculation. For example, look at the leftmost column ntt2mc0. In this case, Comet, Comet-P, and Comet-E take 7846.9, 459.1, and 1900.1 s, respectively. The 459.1 s which is the running time of Comet-P is the sum of 3.7 s (database separation), 244.1 s (deletion), and 211.3 s (insertion). The 1900.1 s which is the running time of Comet-E is the sum of 459.1 s (Comet-P) and 1441.0 s (E-value calculation).

Table 2 Summary of the running times for various search parameter settings

Table 2B shows the statistics of the running time results in Table 2A. The running time ratio rows show the ratios of individual running times to the running time of Comet. Look at the leftmost column ntt2mc0 again. Since the running time of Comet-P is 459.1 s and that of the original Comet is 7846.9 s, the ratio is 459.1/7846.9 = 0.0585 = 5.85%. Since the ratio is 5.85%, Comet-P is 17.09 (= 100/5.85) times faster than Comet, which is shown just below 5.85% in the table. The speedup of Comet-P is between 17.09 (ntt2mc0) and 53.92 (ntt1mc2) and the speedup of Comet-E is between 4.13 (ntt2mc0) and 16.52 (ntt1mc2). Hence, the more nontryptic termini and missed cleavages there are, the bigger the speedup is.
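The arithmetic for the ntt2mc0 column can be checked with a few lines of code. The running times below are the ones quoted in the text; the variable and function names are chosen for illustration only.

```python
# Running times in seconds for the ntt2mc0 column, as quoted in the text.
comet = 7846.9
modules = {
    "database separation": 3.7,
    "deletion": 244.1,
    "insertion": 211.3,
}
e_value = 1441.0

comet_p = sum(modules.values())  # Comet-P: all modules except E-value calculation
comet_e = comet_p + e_value      # Comet-E: Comet-P plus E-value calculation


def speedup(baseline, t):
    """How many times faster a run taking t seconds is than the baseline."""
    return baseline / t


print(f"Comet-P: {comet_p:.1f} s, ratio {comet_p / comet:.2%}, "
      f"speedup {speedup(comet, comet_p):.2f}")
print(f"Comet-E: {comet_e:.1f} s, ratio {comet_e / comet:.2%}, "
      f"speedup {speedup(comet, comet_e):.2f}")
```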

Finally, it should be noted that the E-value calculation time does not change much as the numbers of nontryptic termini or missed cleavages change. It is between 1261.4 and 1459.9 s, as shown in the last row of Table 2A. This may seem strange at first glance, but it is reasonable because the time complexity of E-value calculation is just O(|S|), which means it is independent of the size of the database difference.

(ii) Changing database update interval

Table 3 shows the running times and their statistics for Comet, Comet-P, and Comet-E when ntt and mc are fixed to 1 and 2, respectively, and the database update interval changes from 1 to 5 months. In this experiment, D_old is fixed to Uniprot 2020.01 and D_new changes from Uniprot 2020.02 to Uniprot 2020.06. As the database update interval increases, the ratio |D_del|/|D_new| increases from 0.02% to 0.44% and the ratio |D_ins|/|D_new| increases from 0.14% to 3.48%, as shown in the last two rows of Table 3B. Recall that the time complexities of deletion and insertion are O(|S|∙|D_del|) and O(|S|∙|D_ins|), respectively. Thus, their running times are expected to increase as the database update interval increases. As expected, the measured running time of deletion (resp. insertion) increases from 344.3 to 598.3 s (resp. from 288.6 to 1443.9 s) as the database update interval increases, as shown in Table 3A. As for the E-value calculation, since its time complexity is O(|S|), its running time is independent of the database update interval. It is between 1394.5 and 1496.7 s. In conclusion, the speedup of Comet-P is between 53.92 (1 month) and 17.39 (5 months) and the speedup of Comet-E is between 16.52 (1 month) and 10.23 (5 months), as shown in Table 3B.

Table 3 Summary of the running times for several database update intervals

Conclusions

Progressive search is a novel approach to efficiently obtain analysis results for an updated database in tandem mass spectrometry. Its running time is O(|S||ΔD|) on average, and thus it is up to 53.9 times faster than the normal search from scratch for PSM update only (including the update of PSM scores such as Xcorr and deltaCn) and up to 16.5 times faster for both PSM and E-value update, for intervals of up to 5 months. We also found that progressive search is effective even for longer intervals. Comet-P and Comet-E are 4 and 2.5 times faster than the normal search, respectively, even with an interval of 34 months (July 2019 and May 2022 databases) (data not shown). The PSMs and E-values achieved by progressive search are the same as those achieved by the normal search from scratch. In addition, we verified that repeated use of progressive search does not increase the differences in deltaCn values due to rounding. We compared the results from a search over a 3-month interval (between Jan. 2020 and Apr. 2020) with the results from 3 repeated searches over 1-month intervals (between Jan. 2020 and Feb. 2020, between Feb. 2020 and Mar. 2020, and between Mar. 2020 and Apr. 2020). The deltaCn values were the same in both results although progressive search was used multiple times. This study demonstrates the applicability of progressive search for efficient tandem mass spectrometry database search. Use of this approach can be extended to a variety of public search tools, including Comet.

Availability and requirements

Project name: progressive search.

Project home page: https://isa.hanyang.ac.kr/ProgSearch.html

Operating system(s): Linux.

Programming language: Java, C++

Other requirements: JDK 1.8 or higher.

License: Apache License V2.0

Any restrictions to use by non-academics: as stipulated by Apache License V2.0

Availability of data and materials

Experiments were carried out with the August 2019 version of Comet, which can be obtained from the Comet website: http://comet-ms.sourceforge.net. The databases used in this study are publicly available on the UniProt website: https://www.uniprot.org. The databases used were the SwissProt and TrEMBL human protein databases provided by UniProt. We measured the performance of progressive search using tandem mass spectrometry (MS/MS) spectra for HEK293 cells [13]. The HEK293 24-fraction MS/MS dataset was used in the experiment, and the total number of spectra was 1,121,149.

Abbreviations

S :: Set of spectra
D :: Set of peptides in a database
D_new :: New database
D_old :: Old database
D_srd :: Database which contains the proteins shared by both D_old and D_new
D_del :: Database which contains the proteins stored in only D_old
D_ins :: Database which contains the proteins stored in only D_new
R_new :: PSM results for D_new
R_old :: PSM results for D_old
R_srd :: PSM results for D_srd
R_del :: PSM results for D_del
R_ins :: PSM results for D_ins

References

[1] Eng JK, McCormack AL, Yates JR. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom. 1994;5(11):976–89.
[2] Diament BJ, Noble WS. Faster SEQUEST searching for peptide identification from tandem mass spectra. J Proteome Res. 2011;10(9):3871–9.
[3] Eng JK, Jahan TA, Hoopmann MR. Comet: an open-source MS/MS sequence database search tool. Proteomics. 2013;13(1):22–4.
[4] Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. Probability-based protein identification by searching sequence databases using mass spectrometry data. ELECTROPHORESIS Int J. 1999;20(18):3551–67.
[5] Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008;26(12):1367–72.
[6] Kim S, Gupta N, Pevzner PA. Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases. J Proteome Res. 2008;7(8):3354–63.
[7] Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, Nesvizhskii AI. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nat Methods. 2017;14(5):513–20.
[8] Frank AM, Bandeira N, Shen Z, Tanner S, Briggs SP, Smith RD, Pevzner PA. Clustering millions of tandem mass spectra. J Proteome Res. 2008;7(01):113–22.
[9] Griss J, Perez-Riverol Y, Lewis S, Tabb DL, Dianes JA, Del-Toro N, Rurik M, Walzer M, Kohlbacher O, Hermjakob H. Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Nat Methods. 2016;13(8):651–6.
[10] Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, et al. The Universal protein resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 2006;34:D187-191.
[11] Deutsch EW, Mendoza L, Shteynberg D, Farrah T, Lam H, Tasman N, Sun Z, Nilsson E, Pratt B, Prazen B. A guided tour of the trans-proteomic pipeline. Proteomics. 2010;10(6):1150–9.
[12] McIlwain S, Tamura K, Kertesz-Farkas A, Grant CE, Diament B, Frewen B, Howbert JJ, Hoopmann MR, Käll L, Eng JK. Crux: rapid open source protein tandem mass spectrometry analysis. J Proteome Res. 2014;13(10):4488–91.
[13] Chick JM, Kolippakkam D, Nusinow DP, Zhai B, Rad R, Huttlin EL, Gygi SP. A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides. Nat Biotechnol. 2015;33(7):743–9.

Acknowledgements

Not applicable

Funding

This work was supported by the National Research Foundation of Korea grant funded by the Korea government (Ministry of Science and ICT) (No. 2018R1A5A7059549 and No. 2021M3H9A2030520), and by the Korea Institute of Science and Technology Information (KISTI) and Korea Bio Data Station (K-BDS) with computing resources including technical support.

Author information

Authors and Affiliations

Department of Computer Science, Hanyang University, Seoul, 06978, Republic of Korea
Yoonsung Joh, Kangbae Lee & Heejin Park
Biomedical Informatics Team, Korea Institute of Science and Technology Information, Daejeon, 34141, Republic of Korea
Hyunwoo Kim

Contributions

YJ, KL, HK, and HP designed the study. YJ and KL performed computing experiments. All authors wrote, read and approved the final manuscript.

Corresponding author

Correspondence to Heejin Park.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Joh, Y., Lee, K., Kim, H. et al. Progressive search in tandem mass spectrometry. BMC Bioinformatics 24, 94 (2023). https://doi.org/10.1186/s12859-023-05222-2

Received: 04 August 2022
Accepted: 03 March 2023
Published: 14 March 2023
DOI: https://doi.org/10.1186/s12859-023-05222-2
