Genetic algorithm learning as a robust approach to RNA editing site prediction

Background RNA editing is one of several post-transcriptional modifications that may contribute to organismal complexity in the face of limited gene complement in a genome. One form, known as C → U editing, appears to exist in a wide range of organisms, but most instances of this form of RNA editing have been discovered serendipitously. With the large amount of genomic and transcriptomic data now available, a computational analysis could provide a more rapid means of identifying novel sites of C → U RNA editing. Previous efforts have had some success but also some limitations. We present a computational method for identifying C → U RNA editing sites in genomic sequences that is both robust and generalizable. We evaluate its potential use on the best data set available for these purposes: C → U editing sites in plant mitochondrial genomes. Results Our method is derived from a machine learning approach known as a genetic algorithm. REGAL (RNA Editing site prediction by Genetic Algorithm Learning) is 87% accurate when tested on three mitochondrial genomes, with an overall sensitivity of 82% and an overall specificity of 91%. REGAL's performance significantly improves on other ab initio approaches to predicting RNA editing sites in this data set. REGAL has a comparable sensitivity and higher specificity than approaches which rely on sequence homology, and it has the advantage that strong sequence conservation is not required for reliable prediction of edit sites. Conclusion Our results suggest that ab initio methods can generate robust classifiers of putative edit sites, and we highlight the value of combinatorial approaches as embodied by genetic algorithms. We present REGAL as one approach with the potential to be generalized to other organisms exhibiting C → U RNA editing.


INTRODUCTION
Tinidazole is chemically 1-(2-ethylsulfonyl-ethyl)-2-methyl-5-nitroimidazole (Figure 1). It is active against protozoa and anaerobic bacteria and is used like metronidazole in a range of infections [1]. The drug is reported to hydrolyze quantitatively under alkaline conditions to 2-methyl-5-nitroimidazole; under photolytic conditions, it yields intermediate, rearrangement, and degradation products [2].
Resonance light scattering (RLS) is an elastic scattering process that occurs when the energy of the incident beam is close to an absorption band. Pasternack et al. first established the RLS technique to study biological macromolecules by means of an ordinary fluorescence spectrometer [3][4][5]. Because of their high sensitivity, selectivity, and convenience, RLS studies have attracted great interest among researchers [6][7][8][9][10]. RLS has emerged as a very attractive technique for monitoring molecular assemblies and characterizing extended aggregates of chromophores. In recent years, the RLS technique has been used to determine pharmaceuticals [11,12] and various biological species such as nucleic acids [13,14], proteins [15,16], metal ions [17], and bacteria [18], but the study and determination of tinidazole by RLS have not yet been reported.
Several analytical methods for tinidazole have been developed so far, such as HPLC [19], LC-MS [20], capillary electrophoresis [21], spectrophotometry [22], voltammetry [23], and electrochemical methods [24,25]. Among these, the voltammetric method of the Chinese Pharmacopoeia is popular and regarded as relatively reliable for the determination of tinidazole. Although it often provides very accurate results, it is time-consuming and complex. HPLC is also used in the Chinese Pharmacopoeia to determine tinidazole in drugs, with good results, but it needs tedious pretreatment.
Herein, we report a robust, quick, and simple method for the determination of tinidazole in injections and tablets by the RLS technique with NaB(C₆H₅)₄ as a probe. The results were in close agreement with those obtained by the currently used HPLC method of the Chinese Pharmacopoeia.

Apparatus
RLS spectra were obtained by synchronous scanning in the wavelength region from 250 to 750 nm on a JASCO FP-6500 spectrofluorometer (Tokyo, Japan) using quartz cuvettes (1.0 cm). The excitation and emission slit widths were set at 3.0 nm. HPLC analysis was carried out on an Agilent 1100 HPLC system (USA) equipped with a G1314A isocratic pump, a thermostatted column compartment, a variable-wavelength UV detector (VWD), and Agilent ChemStation software. The pH measurements were carried out on a PHS-3C precision digital pH meter equipped with a Phoenix Ag-AgCl reference electrode (Cole-Parmer Instrument Co., IL, USA), which was calibrated with standard pH buffer solutions.

Reagents
A working solution of tetraphenylboron sodium (10.0 mg mL⁻¹) was prepared with methanol-water solution (20:80, v/v). A stock solution of tinidazole was prepared by dissolving tinidazole (> 99.99%, Sigma) in doubly distilled water. The working solutions of tinidazole were obtained by diluting the stock solution prior to use. Sulfuric acid solution (0.18 mol L⁻¹) was used to control the acidity, while 0.1 mol L⁻¹ NaCl was used to adjust the ionic strength of the aqueous solutions. All other reagents and solvents were of analytical reagent grade and used without further purification unless otherwise noted. All aqueous solutions were prepared using freshly double-distilled water.

Scheme
The composition of the precipitate was determined by the Job-Asmus method [26]. The molar ratio of tinidazole to TPB was found to be 1:1. It is possible that the more strongly basic secondary amine group in the tinidazole molecule was converted to a cation, which then reacted with tetraphenylboron. The precipitation reaction may be as follows.

Standard procedure
An appropriate aliquot of tinidazole working solution was added to a mixture of 1.0 mL of tetraphenylboron sodium solution (10.0 mg mL⁻¹) and 1.0 mL of sulfuric acid (0.18 mol L⁻¹), and the mixture was diluted to 10 mL with water. After standing for five minutes, the solution was scanned on the spectrofluorometer in the region of 250 to 750 nm with Δλ = 0 nm. The RLS spectrum was recorded and its intensity was measured at 569.5 nm. The enhanced RLS intensity of the tinidazole-TPB system was expressed as ΔI = I − I₀, where I and I₀ are the RLS intensities of the system with and without tinidazole. All operations were carried out at room temperature.
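The final reagent concentrations in the 10 mL volume and the signal definition ΔI = I − I₀ can be sketched as follows. This is a minimal illustration, not the authors' software: the function names and the two intensity readings are hypothetical.

```python
def diluted_concentration(stock_conc, aliquot_ml, final_ml=10.0):
    """Concentration after diluting an aliquot to the final volume (C1*V1 = C2*V2)."""
    return stock_conc * aliquot_ml / final_ml


def enhanced_rls(intensity_with, intensity_blank):
    """Delta I = I - I0, the RLS enhancement attributed to tinidazole."""
    return intensity_with - intensity_blank


# Reagent amounts taken from the standard procedure.
tpb_final = diluted_concentration(10.0, 1.0)    # mg/mL tetraphenylboron sodium
acid_final = diluted_concentration(0.18, 1.0)   # mol/L sulfuric acid

# Hypothetical readings at 569.5 nm (arbitrary units), for illustration only.
delta_i = enhanced_rls(intensity_with=412.0, intensity_blank=35.0)
print(tpb_final, acid_final, delta_i)
```

Under these amounts the measured solution contains about 1.0 mg mL⁻¹ tetraphenylboron sodium and 0.018 mol L⁻¹ sulfuric acid.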
The HPLC separation was performed on a Kromasil ODS column (250 mm × 4.6 mm, 5 μm, Hanbon Science & Technology Co., Ltd) connected to a Zorbax SB-C18 guard column (20 mm × 4 mm, 5 μm). The mobile phase consisted of methanol and 0.1% aqueous acetic acid solution (20:80, v/v), and the flow rate was 1.0 mL/min. The injection volume was 20 μL. The monitoring wavelength was 310 nm. The column temperature was set at 25 °C.

Sample
The tinidazole injections were diluted 100- to 200-fold with pure water. The tinidazole tablets were dissolved in 500 mL of pure water and filtered through a 0.45 μm cellulose acetate membrane. A 1.0 mL aliquot of the prepared sample solution was added to a 10 mL volumetric flask in place of the tinidazole standard solution.

Characteristics of the RLS spectra
The RLS spectrum of NaB(C₆H₅)₄ in sulfuric acid solution (0.018 mol L⁻¹) is shown in Figure 2b. The RLS intensity of NaB(C₆H₅)₄ is quite weak over the whole scanning wavelength region. In contrast, upon addition of a trace amount of tinidazole to the NaB(C₆H₅)₄ solution, a remarkably enhanced RLS signal with a maximum peak at 569.5 nm was observed under the same conditions (Figure 2, c-g). Two peaks, located at 452.0 and 569.5 nm, can be clearly observed in the RLS spectrum of the tinidazole-TPB system. Adding increasing amounts of tinidazole to the NaB(C₆H₅)₄ solution leads to a gradual enhancement in RLS intensity, showing a concentration-dependent relationship. The production of RLS and its intensity are correlated with the formation of the aggregate and its particle dimension in solution [3].
As shown in Figures 2a and 2b, the RLS intensities of tinidazole and NaB(C₆H₅)₄ alone were quite weak. It can thus be concluded that the B(C₆H₅)₄⁻ anion reacted with tinidazole to produce a newly formed compound whose RLS intensity was much higher than that of tinidazole or NaB(C₆H₅)₄ on their own. Moreover, the dimension of the tinidazole-TPB particles may be much smaller than the incident wavelength, and thus the enhanced light-scattering signal occurs under the given conditions. In this way, the resonance light scattering formula [26] could be applicable to the tinidazole-TPB system.

Effects of pH values in medium
The formation of the tinidazole-TPB compound may be ascribed to the electrostatic attraction between TPB and tinidazole being stronger than that between TPB and the coexisting sodium ion. Moreover, the RLS signal depends on the dimension of the aggregated species formed. Hence, the pH value may influence the attraction strength and the dimension of the suspended particles, and thus the production of RLS and its intensity. As shown in Figure 3, the RLS intensity of the NaB(C₆H₅)₄ solution did not change as the pH varied in the range of 1.44-6.44, whereas that of the tinidazole-TPB system behaved differently. The RLS intensity of the tinidazole-TPB system decreased from pH 1.44 to 6.44. Acidity strongly affected the formation of the ammonium ion that reacted with B(C₆H₅)₄⁻. A maximum RLS intensity was obtained around pH 1.44, and this value was selected for the subsequent measurements.
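As a rough consistency check, the selected pH of 1.44 is what one would estimate for the 0.018 mol L⁻¹ sulfuric acid present in the measured solution, under the simplifying assumption (not stated in the text) that both protons dissociate completely. The second dissociation of sulfuric acid is in fact incomplete, so a real solution would deviate from this idealized value; the function name is hypothetical.

```python
import math


def ph_fully_dissociated_diprotic(acid_molarity):
    """Idealized pH, assuming both protons of the acid dissociate fully."""
    return -math.log10(2.0 * acid_molarity)


# 1.0 mL of 0.18 mol/L H2SO4 diluted to 10 mL gives 0.018 mol/L.
ph = ph_fully_dissociated_diprotic(0.018)
print(round(ph, 2))  # prints 1.44
```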

Effect of ionic strength
Tinidazole samples such as injections contain a high concentration of sodium chloride (0.9%). Large amounts of Na⁺ and Cl⁻ might affect the RLS spectra of the tinidazole-TPB system by interfering with the electrostatic attraction between TPB and tinidazole. Sodium chloride was therefore used to adjust the ionic strength of the solution. Unexpectedly, the RLS intensities of both TPB-Na and the tinidazole-TPB system hardly changed as the concentration of added NaCl was varied (Figure 4). The method can therefore be applied to solutions of high ionic strength such as injections.

Addition orders
The effect of addition order on the RLS intensity is listed in Table 1. The addition order had no large effect on the RLS intensity. The proposed assay of tinidazole has a wide pH range.

Stability
The formation of tinidazole-TPB particles includes three steps: nucleation, crystal growth, and aggregation, all of which directly affect the particle size. Because particle size is an important factor determining RLS intensity, a stabilizer is usually used in suspension systems to control the size of the particles, prevent their rapid sedimentation, and improve the reproducibility of the RLS intensity. However, the tinidazole-TPB system was very stable within 20 minutes (Figure 5), and the average deviation of the RLS signal was found to be lower than 2.28%.

Tolerance of foreign substances
The influence of cationic and anionic species normally found in injections and tablets was studied by adding them as foreign substances. Their concentrations relative to tinidazole and the corresponding influence on the determination are displayed in Table 2. Table 2 shows that few coexisting ions interfere with the determination of tinidazole. Common ions such as Na⁺, Ca²⁺, Ba²⁺, Mg²⁺, Zn²⁺, Co²⁺, Cu²⁺, Al³⁺, and Pb²⁺ can be tolerated at high concentrations because they did not combine with B(C₆H₅)₄⁻. However, some ions such as K⁺ and NH₄⁺ can only be tolerated at very low concentration (10 μg mL⁻¹). Among the species studied, NH₄⁺ and K⁺ interfered most seriously, which was attributed to the similarity of their ionic radii. However, NH₄⁺ and K⁺ were nearly absent from the samples, so they would not interfere with the determination. Na⁺, the most abundant ion, could be tolerated at concentrations up to 1000 times that of tinidazole. Because Na⁺ was studied by adding NaAc and Ac⁻ has a higher molar mass than Na⁺, the tolerance level of Ac⁻ was larger. The results demonstrated that CO₃²⁻ and PO₄³⁻ added in more than 1000-fold excess relative to tinidazole can induce a moderate RLS signal. This may be due to the formation of extended aggregates around tinidazole-TPB particle cores by the more highly negatively charged CO₃²⁻ and PO₄³⁻ ions. The other ions studied had almost no effect on the determination when their concentration was the same as or higher than that of tinidazole. Owing to the good selectivity of this method, assays can be performed without removing other coexisting ions.

Detection and quantification limits
The detection limit was calculated as s_b + 3s and the quantification limit as s_b + 10s, where s_b is the average RLS signal of ten blank solutions and s is their standard deviation. When the RLS intensity at 569.5 nm was selected, the detection limit and quantification limit were calculated to be 5.0 μg mL⁻¹ and 10.0 μg mL⁻¹, respectively, indicating high sensitivity of this method for the determination of tinidazole. The sensitivity of the RLS method is markedly higher than that of turbidimetry (results not shown).
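The blank-based limits can be sketched as below. The ten blank readings are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and `signal_to_concentration` assumes a linear calibration of the form signal = slope × concentration + intercept.

```python
import statistics


def signal_limits(blank_signals):
    """Return the (detection, quantification) signal thresholds s_b + 3s and s_b + 10s."""
    mean_blank = statistics.mean(blank_signals)   # s_b
    sd_blank = statistics.stdev(blank_signals)    # s (sample standard deviation)
    return mean_blank + 3 * sd_blank, mean_blank + 10 * sd_blank


def signal_to_concentration(signal, slope, intercept):
    """Invert a linear calibration to turn a signal threshold into a concentration."""
    return (signal - intercept) / slope


# Ten hypothetical blank readings at 569.5 nm (arbitrary units).
blanks = [35.2, 34.8, 35.5, 35.0, 34.9, 35.3, 35.1, 34.7, 35.4, 35.1]
lod_signal, loq_signal = signal_limits(blanks)
print(round(lod_signal, 2), round(loq_signal, 2))
```

The thresholds are signals, not concentrations; they become concentration limits only after being passed through a calibration for the same wavelength.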

Detected wavelength and calibration curves
In the RLS spectra (Figure 2) of the tinidazole-TPB system, two peaks are located at 452.0 and 569.5 nm. The maximum RLS peak is located at 569.5 nm. Calibration curves were determined for five different concentrations of tinidazole standard solutions at these two wavelengths. Each calibration sample was measured in triplicate. Following the standard procedure above, the calibration curves were obtained by plotting the RLS intensity at 452.0 and 569.5 nm against the concentration of tinidazole (Figure 6). Table 3 lists the parameters and correlation coefficients of the calibration plots at the two wavelengths. ΔI (y) and the tinidazole concentration (x) were fitted to a linear function. The results of the regression analysis were then used to back-calculate the concentrations from ΔI, and the back-calculated concentrations and appropriate summary statistics (mean, standard deviation (SD), and percent relative standard deviation (RSD)) were calculated and presented in tabular form. Table 3 shows that the detection wavelength has an obvious effect on the linear relationship of this method. The RLS intensity of the system differs between detection wavelengths, which offers a wide detection range for different concentrations of tinidazole in samples. The lowest detection limit and quantification limit were obtained at 569.5 nm, because this is the maximum RLS peak.
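The linear fit and back-calculation described above can be sketched with an ordinary least-squares line. The five concentration and ΔI pairs below are hypothetical and are not the values behind Table 3; ordinary (unweighted) least squares is also an assumption, since the text only says the data were fitted to a linear function.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def back_calculate(delta_i, slope, intercept):
    """Concentration predicted by the calibration line for a measured delta I."""
    return (delta_i - intercept) / slope


# Hypothetical calibration data: concentration (ug/mL) and delta I (arbitrary units).
conc = [10.0, 20.0, 30.0, 40.0, 50.0]
delta_i = [52.0, 101.0, 149.0, 202.0, 251.0]

slope, intercept = fit_line(conc, delta_i)
back = [back_calculate(d, slope, intercept) for d in delta_i]
for nominal, found in zip(conc, back):
    print(nominal, round(found, 2))
```

Comparing each back-calculated value with its nominal concentration gives a direct view of how well the line describes the standards at a given wavelength.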

Precision
The precision study comprised repeatability and reproducibility studies, which were carried out on five different samples. The repeatability was established by analyzing each sample five times. The reproducibility was determined by analyzing each sample on three different days over about one month. The repeatability and the reproducibility were < 2.37% and < 3.96%, respectively. These results indicate that the present method can be used for quantitative analyses of tinidazole.
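Precision figures of this kind are conventionally expressed as percent relative standard deviation; assuming that convention applies here, a minimal sketch is shown below. The five replicate values are hypothetical.

```python
import statistics


def percent_rsd(values):
    """Percent relative standard deviation: 100 * sample SD / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Five hypothetical replicate determinations of one sample (ug/mL).
replicates = [20.1, 19.8, 20.3, 19.9, 20.0]
rsd = percent_rsd(replicates)
print(round(rsd, 2))
```

Repeatability would use replicates measured in one session, and reproducibility would use the values obtained for the same sample on different days.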

Recovery
To establish the accuracy of the method, the procedure was also applied to samples spiked with tinidazole. Table 4 shows the recoveries of tinidazole obtained with this analytical method. Good results were obtained, with recoveries in the range of 95.13-106.76%.
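A spike recovery is conventionally the fraction of the added amount that the method finds again. The sketch below assumes that definition, which the text does not spell out, and uses hypothetical numbers rather than the values in Table 4.

```python
def percent_recovery(found_spiked, found_unspiked, amount_added):
    """Recovery (%) = 100 * (found in spiked sample - found in unspiked sample) / added."""
    return 100.0 * (found_spiked - found_unspiked) / amount_added


# Hypothetical example: 10.0 ug/mL added to a sample that reads 20.0 ug/mL.
recovery = percent_recovery(found_spiked=29.6, found_unspiked=20.0, amount_added=10.0)
print(round(recovery, 1))
```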

Comparison of RLS and HPLC methods
As shown in Table 5, the proposed method was applied to determine the tinidazole concentration in injections and tablets, and the results were compared with those of the HPLC method. The RLS results were in agreement with those of the HPLC method of the Chinese Pharmacopoeia. The RSD of the RLS method ranged from 0.79% to 1.83%, slightly lower than that of the HPLC method (1.22%-2.48%), indicating that the RLS assay of tinidazole in drugs is practical.
In this paper, we compared two methods for analyzing tinidazole in injections and tablets. These two methods, RLS and HPLC, gave similar results for the tinidazole content of drugs (Table 5). However, the operations of the RLS and HPLC methods were significantly different. The HPLC method is complex and time-consuming, whereas the RLS method described here is robust, cost-effective, and simple while retaining sufficient sensitivity. First, an HPLC analysis took more than 10 minutes, but an RLS analysis took only 1 minute. Second, the RLS analysis was not affected by small variations in temperature, so it could be carried out at room temperature, whereas temperature had a significant effect on the HPLC analysis, for which the column had to be held at a fixed temperature. Third, the RLS analysis did not use organic solvent, but toxic acetonitrile is used in the HPLC analysis.

Mechanism discussion
Light scattering is caused by the presence of fine particles. Because the dimension of the tinidazole-TPB particles is much smaller than the incident wavelength, the scattering should follow the resonance light scattering formula [27], referred to here as equation (1). In this formula, R(θ) is the resonance light scattering ratio at the scattering angle θ, which is equal to the ratio of the scattered light intensity I(θ) at the angle θ to the intensity of the incident light I₀; n₁ and n₀ are the refractive indices of the solute and the medium, respectively; N₀ is the number of particles per unit volume; υ is the volume of a particle; and λ is the wavelength of the incident light in the medium. If c is the concentration of the tinidazole-TPB suspension and ρ is the density of each particle, then N₀υ is equal to c/ρ, and substituting this relation into equation (1) gives equation (2). In the experiment, θ is 90°, and υ remained nearly constant because the experimental conditions, such as acidity and the added volumes of stabilizer and other reagents, were kept as identical as possible to obtain particles of the same size; n₁, n₀, λ, and ρ were all constant. According to (2), the RLS intensity is proportional to the concentration of the tinidazole-TPB suspension (c) or the number of particles per unit volume (N₀). Therefore, tinidazole can be determined based on this theory.
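For reference, a commonly quoted form of the Rayleigh scattering formula for particles much smaller than the wavelength, written with the symbols defined above, is given below together with the result of substituting N₀υ = c/ρ. That these are the exact forms intended as equations (1) and (2) is an assumption.

```latex
\begin{align}
R(\theta) &= \frac{I(\theta)}{I_0}
  = \frac{9\pi^{2} N_0 \upsilon^{2}}{2\lambda^{4}}
    \left(\frac{n_1^{2} - n_0^{2}}{n_1^{2} + 2 n_0^{2}}\right)^{2}
    \left(1 + \cos^{2}\theta\right) \tag{1} \\
R(\theta) &= \frac{9\pi^{2} c\,\upsilon}{2\rho\lambda^{4}}
    \left(\frac{n_1^{2} - n_0^{2}}{n_1^{2} + 2 n_0^{2}}\right)^{2}
    \left(1 + \cos^{2}\theta\right) \tag{2}
\end{align}
```

At θ = 90° the angular factor equals 1, so with υ, n₁, n₀, λ, and ρ held constant the expression reduces to R(90°) = kc for a constant k, which is the proportionality used for quantification.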

CONCLUSION
In this contribution, we proposed a resonance light scattering technique for determining tinidazole in drugs. The analytical results showed that the method is rapid, sensitive, and selective, and that it has potential for practical use. This method may also be a valuable starting point for developing the detection of tinidazole in serum.