We implemented our algorithm lsbuild in Matlab and tested it on a set of model problems on an Intel Core 2 Quad Q9550 CPU at 2.83 GHz with 4 GB of RAM, running 32-bit Linux. In all experiments, the parameters of the function $\phi_{\lambda,\tau}$ of the algorithm lsbuild were set to λ = 1.0 and τ = 0.01.
We compared our results with the algorithms dgsol and buildup. The algorithm dgsol, proposed by Moré and Wu in [22], uses a continuation approach based on the Gaussian transformation
of the nonsmooth function
where the potentials $p_{ij}$ are given by
The algorithm dgsol starts with an approximate solution and, given a sequence of smoothing parameters $\lambda_0 > \lambda_1 > \dots > \lambda_p = 0$, it determines a minimizer $x_{k+1}$ of $\langle f \rangle_{\lambda_k}$, using the previous minimizer $x_k$ as the starting point for the search. In this manner, a sequence of minimizers $x_1, \dots, x_{p+1}$ is generated, where $x_{p+1}$ is a minimizer of $f$ and the candidate for the global minimizer. In our experiments, we used the C implementation of the algorithm dgsol downloaded from [23].
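The continuation idea can be illustrated on a one-dimensional toy problem. The sketch below is not dgsol: the objective, the step size, the schedule of smoothing parameters and the use of plain gradient descent are all assumptions chosen for illustration. It also assumes the Gaussian transform is taken with the kernel $\exp(-(y-x)^2/\lambda^2)$, for which the transform of the toy objective has a closed form.

```python
import math

B = 3.0  # frequency of the oscillating term in the toy objective (assumption)

def smoothed_grad(x, lam):
    """Derivative of the Gaussian transform of f(x) = x**2 + 4*(1 - cos(B*x)).

    Under the assumed kernel exp(-(y - x)**2 / lam**2), the cosine term is
    damped by exp(-(B*lam)**2 / 4); lam = 0 recovers the derivative of f.
    """
    damping = math.exp(-((B * lam) ** 2) / 4.0)
    return 2.0 * x + 4.0 * B * damping * math.sin(B * x)

def descend(x, lam, step=0.02, iters=2000):
    """Plain gradient descent on the smoothed objective (stand-in optimizer)."""
    for _ in range(iters):
        x -= step * smoothed_grad(x, lam)
    return x

def continuation(x0, lams=(2.0, 1.0, 0.5, 0.25, 0.0)):
    """Minimize a sequence of smoothed problems, warm-starting each one
    from the minimizer of the previous, more heavily smoothed problem."""
    x = x0
    for lam in lams:
        x = descend(x, lam)
    return x

x_plain = descend(4.0, 0.0)   # descent on f itself from the same start
x_cont = continuation(4.0)    # continuation with decreasing smoothing
```

In this toy setting, descent on $f$ alone stops at a local minimizer close to the starting point, while the continuation run ends at the global minimizer $x = 0$, because the heavily smoothed problem is convex and each later stage starts from the previous minimizer.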
We also compared our results with those obtained by the version of the algorithm buildup proposed by Sit, Wu and Yuan in [8]. The algorithm buildup starts by defining a base set of four points whose pairwise distances are all known (a clique of four points). Then, at each iteration, a new point $x_k$ with known distances to at least four points in the base is selected. To avoid the accumulation of errors, instead of just positioning the new point, the modified version of the algorithm buildup calculates the entire substructure formed by the point $x_k$ and its neighbors in the base by solving the nonlinear system
$$\|x_i - x_j\| = d_{ij}, \quad i, j \in B,$$
with variables $x_i$, $i \in B$, where $B$ is the set formed by the index $k$ and the indices of all neighbors of $x_k$ in the current base set. The parameters $d_{kj}$ are the given distances between the node $x_k$ and its neighbors $x_j$ in the base and, for nodes $x_j$ and $x_i$ already in the base, if the distance between them is unknown, we take $d_{ij} = \|x_i - x_j\|$, computed from their current positions. Once the substructure is obtained, it is inserted into the original structure by an appropriate rotation and translation, and the point $x_k$ is included in the base. This process is repeated until all nodes are included in the base. We implemented the buildup algorithm in Matlab.
Our decision to compare lsbuild with the algorithms dgsol and buildup is mainly motivated by their similarities to our proposal. The algorithm dgsol uses a smoothing technique to avoid local minimizers, and the algorithm buildup solves a sequence of systems that produce partial solutions, from which it iteratively tries to construct a candidate global solution. Our algorithm combines variations of these two ideas: we use a hyperbolic smoothing technique to introduce differentiability into the problem, and a divide-and-conquer approach based on successive solutions of overdetermined linear systems to construct a candidate global solution.
In our experiments, the distance data were derived from real structural data from the Protein Data Bank (PDB) [24]. It should be noted that each of the algorithms considered involves some randomness: the algorithm dgsol takes a random starting point, and the algorithms lsbuild and buildup start with an incomplete random matrix $D = [d_{ij}]$, where $l_{ij} \le d_{ij} \le u_{ij}$. So, in order to make a fair comparison, we ran each test 30 times.
We considered two sets of instances. The first one was proposed by Moré and Wu to validate the algorithm dgsol [22]. This set is derived from the three-dimensional structure of the fragments made up of the first 100 and 200 atoms of chain A of protein PDB:1GPV [25, 26]. For each fragment, we generated a set of constraints considering only atoms in the same residue or in neighboring residues. Formally,
where R(k) represents the k-th residue.
In this set of instances, the bounds $l_{ij}$ and $u_{ij}$ were given by the equations
Here the bounds are built from the real distance between the nodes $x_i$ and $x_j$ in the known structure $x^*$ of protein PDB:1GPV. In this way, all distances between atoms in the same residue or in neighboring residues were considered. We generated two instances for each fragment by taking ε equal to 0.00 and 0.08.
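The construction of these instances can be sketched as follows. This is an illustration only: the function name `residue_instance`, the integer residue labels, and in particular the bound formula (the real distance scaled by $1 - \varepsilon$ and $1 + \varepsilon$) are assumptions made for the sketch, chosen so that ε = 0.00 yields exact distances.

```python
import numpy as np

def residue_instance(coords, residue_of, eps):
    """Build distance bounds for atom pairs in the same or neighboring residues.

    coords     : (n, 3) atom coordinates of the known structure
    residue_of : length-n sequence, residue index of each atom
    eps        : relative width of the bounds (assumed bound model)

    Returns a dict mapping (i, j), i < j, to (lower, upper).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    bounds = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Same residue (difference 0) or neighboring residues (difference 1).
            if abs(residue_of[i] - residue_of[j]) <= 1:
                d = float(np.linalg.norm(coords[i] - coords[j]))
                # Assumed model: symmetric relative perturbation of the real distance.
                bounds[(i, j)] = ((1.0 - eps) * d, (1.0 + eps) * d)
    return bounds
```

Under this assumed model, ε = 0.00 gives lower and upper bounds that coincide with the real distance, and ε = 0.08 gives an interval around it.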
To measure the precision of the solutions with respect to the constraints alone, without using any information about the original structure $x^*$, we use the function
(5)
where
is the error associated with the constraint $l_{ij} \le \|x_i - x_j\| \le u_{ij}$. We also measured the deviation of the solutions generated by each algorithm with respect to the original solution $x^*$ in the PDB files, using the function
(6)
with and orthogonal.
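Both kinds of measurement can be sketched in a few lines. The code below is a hedged illustration rather than the exact functions (5) and (6): the per-constraint error is assumed to be the amount by which a distance falls outside its interval, and the deviation is assumed to be a root-mean-square distance after the best translation and orthogonal transformation, computed with the standard SVD solution of the orthogonal Procrustes problem. The function names are hypothetical.

```python
import numpy as np

def constraint_errors(x, bounds):
    """Violation of each constraint l_ij <= ||x_i - x_j|| <= u_ij.

    x      : (n, 3) coordinates
    bounds : dict mapping (i, j) to (l_ij, u_ij)
    Assumed error model: distance from the computed value to the interval.
    """
    x = np.asarray(x, dtype=float)
    errors = {}
    for (i, j), (lo, hi) in bounds.items():
        d = float(np.linalg.norm(x[i] - x[j]))
        errors[(i, j)] = max(lo - d, 0.0) + max(d - hi, 0.0)
    return errors

def aligned_rmsd(x, x_ref):
    """RMS deviation between x and x_ref after optimal superposition.

    Both structures are centered (optimal translation), then x is mapped
    onto x_ref by the orthogonal matrix Q minimizing ||x Q - x_ref||_F.
    Q may include a reflection, since only orthogonality is imposed.
    """
    x = np.asarray(x, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    xc = x - x.mean(axis=0)
    rc = x_ref - x_ref.mean(axis=0)
    u, _, vt = np.linalg.svd(xc.T @ rc)
    q = u @ vt
    diff = xc @ q - rc
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))
```

A structure that differs from the reference only by a rotation and a translation has zero deviation under `aligned_rmsd`, which is the behavior a comparison against the PDB structure needs, since distance constraints determine a structure only up to rigid motions and reflections.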
In the second experiment, we use a more realistic set of instances with larger proteins, proposed by Biswas in [17]. Typically, only distances below 6 Å (1 Å = $10^{-8}$ cm) between some pairs of atoms can be measured by NMR techniques. So, in order to produce more realistic data, we considered only 70% of the distances shorter than R = 6 Å. To introduce noise in the model, we set the bounds using the equations
(7)
where is the true distance between atom i and atom j and (normal distribution). With this model, we generate a sparse set of constraints and introduce noise in the distances that is less simple than the noise used in the instances proposed by Moré and Wu.
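The sparsification step of this second set can be sketched as below. The sketch covers only the selection of distances, not the noise model of equation (7). The function name `sample_short_distances`, the seed, and the choice to keep exactly 70% of the short distances by uniform sampling without replacement are assumptions.

```python
import numpy as np

def sample_short_distances(coords, cutoff=6.0, keep=0.7, seed=0):
    """Keep a random fraction of the pairwise distances below a cutoff.

    coords : (n, 3) atom coordinates in angstroms
    cutoff : only distances strictly below this value are candidates
    keep   : fraction of candidate distances retained (assumed exact)

    Returns a list of (i, j, distance) triples with i < j.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    short = []
    for i in range(n):
        for j in range(i + 1, n):
            d = float(np.linalg.norm(coords[i] - coords[j]))
            if d < cutoff:
                short.append((i, j, d))
    rng = np.random.default_rng(seed)
    k = int(round(keep * len(short)))
    chosen = rng.choice(len(short), size=k, replace=False)
    return [short[t] for t in sorted(chosen)]
```

The retained triples would then be turned into lower and upper bounds by whatever noise model is applied to the true distances.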