Figure 2 | BMC Bioinformatics

Figure 2

From: Conformational and functional analysis of molecular dynamics trajectories by Self-Organising Maps

Optimal sampling rate of MD trajectories. Procedure for testing the SOM clustering at different sampling rates. The original data Q (40000 data points for each trajectory) is used to select the optimal sampling rate of MD trajectories from among the following summarization levels: 20000, 10000, 5000, 2500, 1600, 800, 400, 80 and 40. For each summarization level n, k samples Q1(n),...,Qk(n) were extracted. Each sample Qi(n) summarizes a trajectory through n randomly selected data points, one drawn from each of n equal-width trajectory intervals. The k samples Q1(n),...,Qk(n) are used as learning data sets for SOM1(n),...,SOMk(n). Each SOMi(n) is queried with the data sets Qi(n) and Q to obtain the pair of hit counts (hi(n), Hi(n)), which is submitted to a chi-squared goodness-of-fit test. If none of the k goodness-of-fit tests is rejected, the summarization level n is assumed to summarize the original data set without a significant loss of information.
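
The sampling and comparison steps in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `stratified_sample` and `chi_squared_statistic` are hypothetical, SOM training and querying are omitted, and the hit counts are assumed to be plain per-neuron counts. The sketch only shows how one frame could be drawn from each of n equal-width intervals, and how a Pearson chi-squared statistic could compare the sample hits hi(n) with the hits Hi(n) obtained from the full data set.

```python
import random


def stratified_sample(n_points, n, rng):
    """Return n frame indices, one drawn at random from each of n
    equal-width intervals of range(n_points)."""
    if n_points % n != 0:
        raise ValueError("n must divide n_points to get equal-width intervals")
    width = n_points // n
    return [rng.randrange(i * width, (i + 1) * width) for i in range(n)]


def chi_squared_statistic(sample_hits, full_hits):
    """Pearson chi-squared statistic comparing per-neuron hit counts of a
    sample with the counts expected from the full data set.

    Expected counts are the full-data hits rescaled to the sample size.
    Neurons with zero expected hits are skipped (assumption: the sample is
    a subset of the full data, so such neurons also have zero sample hits).
    """
    total_sample = sum(sample_hits)
    total_full = sum(full_hits)
    stat = 0.0
    for observed, full in zip(sample_hits, full_hits):
        expected = total_sample * full / total_full
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat


# Example: summarize a 40000-frame trajectory at level n = 40.
rng = random.Random(0)
indices = stratified_sample(40000, 40, rng)
```

The statistic would then be compared against a chi-squared distribution with an appropriate number of degrees of freedom (for example via `scipy.stats.chi2.sf`, if SciPy is available) to decide whether the sample is rejected; the significance threshold is not stated in the caption and is left open here.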