Enumeration method for treelike chemical compounds with benzene rings and naphthalene rings by breadthfirst search order
 Jira Jindalertudomdee^{1},
 Morihiro Hayashida^{1}Email author,
 Yang Zhao^{1} and
 Tatsuya Akutsu^{1}
https://doi.org/10.1186/s1285901609624
© Jindalertudomdee et al. 2016
Received: 27 April 2015
Accepted: 19 February 2016
Published: 1 March 2016
Abstract
Background
Drug discovery and design are important research fields in bioinformatics. Enumeration of chemical compounds is essential not only for the purpose, but also for analysis of chemical space and structure elucidation. In our previous study, we developed enumeration methods BfsSimEnum and BfsMulEnum for treelike chemical compounds using a treestructure to represent a chemical compound, which is limited to acyclic chemical compounds only.
Results
In this paper, we extend the methods, and develop BfsBenNaphEnum that can enumerate treelike chemical compounds containing benzene rings and naphthalene rings, which include benzene isomers and naphthalene isomers such as ortho, meta, and para, by treating a benzene ring as an atom with valence six, instead of a ring of six carbon atoms, and treating a naphthalene ring as two benzene rings having a special bond. We compare our method with MOLGEN 5.0, which is a wellknown general purpose structure generator, to enumerate chemical structures from a set of chemical formulas in terms of the number of enumerated structures and the computational time. The result suggests that our proposed method can reduce the computational time efficiently.
Conclusions
We propose the enumeration method BfsBenNaphEnum for treelike chemical compounds containing benzene rings and naphthalene rings as cyclic structures. BfsBenNaphEnum was from 50 times to 5,000,000 times faster than MOLGEN 5.0 for instances with 8 to 14 carbon atoms in our experiments.
Keywords
Benzene ring Naphthalene ring Enumeration Breadthfirst searchBackground
Enumeration of chemical compounds is important in bioinformatics, and has been adapted to several applications such as drug discovery and design [1–3], structure elucidation [4–6], and analyses of chemical spaces [7–13]. It is defined as a problem of generating all nonredundant chemical structures satisfying some constraints. For example, a chemical formula, which consists of the number of each atom included in the compound, is given as an input. There are several algorithms for enumerating chemical compounds from a chemical formula and most of them use a molecular graph to represent a chemical compound, where the nodes and edges of the graph refer to atoms and bonds of the chemical compound, respectively. Some of those algorithms are claimed to be able to enumerate various chemical structures without restriction of the structure, such as MOLGEN [14] and Open Molecule Generator (OMG) [15]. It was reported that OMG is able to deal with different valences for a kind of atom, and was not efficient for several instances compared with MOLGEN. While the remaining ones, such as EnuMol [16, 17] as well as BfsSimEnum and BfsMulEnum [18], have a limitation of the structure of enumerated compounds, such as acyclic compounds for BfsSimEnum and BfsMulEnum and compounds with no cycle except for benzene rings for EnuMol, the methods consume significantly less computational time. There are also related application softwares, e.g. SmiLib [19] and CLEVER [20], that generate chemical compounds from given fragments. The limitation of these tools is that they require a library of desired chemical fragments, which can be generated by the enumeration tool.
Our previous methods, BfsSimEnum and BfsMulEnum, use a tree structure, instead of a general graph, to represent a chemical compound and call it a molecular tree so they can generate only treelike chemical compounds. In this work, we develop BfsBenNaphEnum, which aims to reduce the limitation of previous methods by extending them such that they can enumerate chemical compounds containing only benzene rings and naphthalene rings as cyclic structures, which are six carbon atoms cyclic structures and ten carbon atoms bicyclic structures, respectively. Pólya proposed a grouptheoretic method for isomer counting of single cyclic structures such as a benzene ring, a naphthalene ring, and an anthracene ring using the cycle index, from which many studies followed [21]. However, structures enumerated by these methods are restricted to certain types. Indeed, Meringer wrote that up to now the only way to calculate the number of isomers belonging to an arbitrary molecular formula is to use structure generators [22]. Suzuki et al. considered the problem of enumerating structures having monocyclic graph structures, each of which has exactly one cycle [23]. An enumeration method for treelike chemical compounds containing only benzene rings as cyclic structures has been implemented on EnuMol web server (http://sunflower.kuicr.kyotou.ac.jp/tools/enumol/). On the other hand, our method can enumerate compounds containing naphthalene rings in addition to benzene rings. Moreover, the proposed algorithm can calculate the number of benzene rings and naphthalene rings from chemical formula, while users have to specify the number of benzene rings in EnuMol.
Chemical structures considered in this study can be represented by a molecular tree, where a benzene ring is converted to a node with valence six and a naphthalene ring is considered as two benzene nodes having a special bond. We name that special bond as a merge bond. Since a merge bond merges two carbon atoms of two benzene rings together, it reduces the number of carbon atoms with free valence electron of two benzene rings by two so we represent a merge bond by a doubleedge. Moreover, benzene nodes cannot have double bonds with other nodes because they bond with other nonbenzene atoms by a single bond [24]. This means that a doubleedge represents a double bond if it connects two nonbenzene nodes, while it represents a merge bond if it connects two benzene nodes. Therefore, bonds in a benzene ring and a naphthalene ring are considered as the same bond and Kekulé representation is not included in this work. Besides, this work uses a twodimensional molecular tree to represent a chemical structure so it cannot deal with stereoisomers. For tautomeric, this work considers two structures in a pair of tautomeric as non redundant compounds and generates both of them.
BfsSimEnum and BfsMulEnum are modified to return a set of molecular trees as the output, given a chemical formula, the number of benzene rings, and the number of naphthalene rings. After that, an attribute called carbon position list is added into benzene nodes in a molecular tree to represent the way that benzene nodes bond with their adjacent nodes. This attribute is important because bonding with different carbon atoms in a benzene ring may result in different chemical structures. Finally, for each molecular tree from BfsSimEnum and BfsMulEnum, we generate a set of molecular trees whose nodes adjacent to benzene nodes are labeled with a carbon position such that all chemical structures are enumerated without redundancy based on normal form rule.
For evaluating our proposed method, we perform computational experiments for several instances, and compare the execution time by our method with that by MOLGEN. We show that our proposed method is efficient for enumerating chemical compounds containing benzene rings and naphthalene rings, and is from 50 times to 5,000,000 times faster than MOLGEN for several instances in our experiments.
Preliminaries
Enumeration problem
Let Σ be a finite set of labels of atoms, for example, Σ={C,N,O,H }, where ‘C’, ‘N’, ‘O’, and ‘H’ denote carbon, nitrogen, oxygen, and hydrogen atoms, respectively. A molecular graph is defined as a multigraph G(V, E), where V is a set of nodes and E is a set of multiedges, also denoted by V(G) and E(G), respectively. Each node is labeled with an atomlabel in Σ, while each edge represents the bond between two atoms and the multiplicity of edge represents the bond type. The degree of each node is equal to the valence of its atom. Let d e g(v) and l(v) be the degree and the label of node v, respectively. Let v a l(l _{ i }) be the valence of the atom represented by label l _{ i } in Σ. It should be noted that there exist different valences for a kind of atom, for example, carbon atoms of CO_{2} and CO. For this case, it is sufficient to put two distinct labels C and C ^{(2)} in Σ, and to define v a l(C)=4 and v a l(C ^{(2)})=2. Let n u m(G,l _{ i }) be the total number of nodes labeled with label l _{ i } in molecular graph G. Then, the enumeration problem is defined as follows.
Problem 1.
Given the numbers \(n_{l_{i}}\) of atoms for all labels l _{ i }∈Σ, the number n _{ b } of benzene rings, and the number n _{ n } of naphthalene rings, enumerate all nonredundant molecular graphs G such that \(num({G,l}_{i}) = n_{l_{i}}\phantom {\dot {i}\!}\) for all l _{ i }∈Σ, d e g(v)=v a l(l(v)) for all nodes v∈V(G), and G includes exactly n _{ b } benzene rings, n _{ n } naphthalene rings, and no other cyclic structures. It must be noted that n _{ b } and n _{ n } can be zero.
In the case that the input chemical formula contains five or less carbon atoms, BfsStructEnum can enumerate only treelike chemical compounds by specifying the number of benzene rings and the number of naphthalene rings to be zero. Because we enumerate molecular trees such that degree of each node equals to valence of atom label of that node, charged molecules cannot be enumerated automatically. However, they can still be enumerated by specifying a charged atom as a new kind of atom type with appropriate valence value.
Centerrooted and leftheavy
In our previous work, we defined the normal form for molecular trees without any cyclic structures using centerrooted and leftheavy to avoid its redundant generation. In this work, we also utilize centerrooted and leftheavy for carbon positionassigned molecular trees, of which properties do not depend on carbon position lists.
Let c h i l d(v)=(v _{1},v _{2},…) be a list of all child nodes of node v in BFS order. It is defined that a molecular tree T is leftheavy if T(v _{ i })≥_{ m } T(v _{ i+1}) holds for all nodes v in T and all i=1,…,c h i l d(v)−1.
It should be noted that centerrooted and leftheavy are different from centroidrooted and leftheavy defined by Fujiwara et al. [16], for example, the molecular tree in Fig. 1 b is centerrooted and is not centroidrooted because the number of nodes in the left subtree by removing the root, 4, is more than (total number of nodes −1)/2=(7−1)/2=3. In addition, their leftheavy is defined using depthfirst search order, not our breadthfirst search order.
Carbon position list
Proposition 1.
For a node v that has the parent node v _{ p } and a child node v _{ c } in a centerrooted molecular tree T, T ^{ s u b }(v _{ p },v)≠_{ m } T ^{ s u b }(v _{ c },v).
Proof.
The height of T ^{ s u b }(v _{ p },v) is larger than that of T ^{ s u b }(v _{ c },v) because T is centerrooted. Hence, T ^{ s u b }(v _{ p },v) is always different from T ^{ s u b }(v _{ c },v).
We define an equality T _{1}=_{ C } T _{2} for two rooted carbonposition assigned trees T _{1} and T _{2} if T _{1}=_{ m } T _{2}, and C v _{1} T _{1}=C v _{2} T _{2} for all benzene nodes v _{1}∈V(T _{1}), where v _{2}∈V(T _{2}) satisfies i n d e x(v _{1},T _{1})=i n d e x(v _{2},T _{2}), and \({C^{T}_{v}}\) is a list of lists, called a carbon position list explained later, for a benzene node v in T. For convenience, we define another equality \(T_{1}=_{\underline {C}}T_{2}\) by removing the condition that C r _{1} T _{1}=C r _{2} T _{2} for the roots r _{1} and r _{2} of T _{1} and T _{2}, respectively, from the conditions of T _{1}=_{ C } T _{2}, if r _{1} and r _{2} are benzene nodes.
For a node v having the parent v _{ p } and a child v _{ c }, T ^{ s u b }(v _{ p },v)≠_{ C } T ^{ s u b }(v _{ c },v) if T ^{ s u b }(v _{ p },v)≠_{ m } T ^{ s u b }(v _{ c },v). Hence, only carbon position lists of descendent benzene nodes are needed to determine whether or not \(T^{sub}(v_{c_{1}},v)=_{C} T^{sub}(v_{c_{2}},v)\) for child nodes \(v_{c_{1}}\) and \(v_{c_{2}}\) of v.
Definition 1.

\({A^{T}_{v}}[\!i]\leq {A^{T}_{v}}[\!i+1]\) for all i,

\(index \left ({A^{T}_{v}}[\!i][\!1]\right) < index \left ({A^{T}_{v}}[\!i+1][\!1]\right)\) if \({A^{T}_{v}}[\!i]={A^{T}_{v}}[\!i+1]\),

\(index \left ({A^{T}_{v}}[\!i][\!j]\right) < index \left ({A^{T}_{v}}[\!i][j+1]\right)\) for all i,j,

\({A^{T}_{v}}[\!i]=(v')\) if (v, v ^{′}) is a merge bond for some i,

\(v'\in {A_{v}^{T}}[\!i]\) if (v, v ^{′}) is not a merge bond, and \(T^{sub}(v',v)=_{C}T^{sub} \left ({A^{T}_{v}}[\!i][\!1],v\right)\).
Proposition 2.
For a benzene node v that has the parent node v _{ p } in a centerrooted molecular tree T, \({A^{T}_{v}}[\!1]=(v_{p})\).
Proof.
If v has no child, it is clear because the adjacent node of v is only v _{ p }. We assume that v has a child v _{ c }. From Proposition 1 and i n d e x(v _{ p })<i n d e x(v _{ c }), \({A^{T}_{v}}[\!1]=(v_{p})\) always holds.
A carbon position list \({C^{T}_{v}}\) of a benzene node v in T is a list of lists, where \({C^{T}_{v}}[\!i]\) is a list of carbon positions of the nodes in \({A^{T}_{v}}[\!i]\). It is sufficient to enumerate \({C^{T}_{v}}[\!i]\) in ascending order because each node in \({A^{T}_{v}}[\!i]\) has the same subtree. If \(\left ({A^{T}_{v}}[\!i][\!1],v\right)\) is a merge bond, \({C^{T}_{v}}[\!i]\) has two carbon positions instead of one as usual. It should be noted that \({C^{T}_{v}}[\!i] \subseteq \{1,\ldots,6\}\) and two carbon positions are assigned for a merge bond because a naphthalene ring shares two carbon atoms between two benzene rings. In the examples of Fig. 6, \(C^{T_{1}}_{v_{1}}=((3),(4),(1,2))\) for \(A^{T_{1}}_{v_{1}}=((v_{4}),(v_{5}),(v_{2},v_{3}))\), \(C^{T_{2}}_{v_{1}}=((1),(4),(2,3))\) for \(A^{T_{2}}_{v_{1}}=((v_{4}),(v_{5}),(v_{2},v_{3}))\), \(C^{T_{3}}_{v_{1}}=((1,2),(3),(5),(4))\) for \(A^{T_{3}}_{v_{1}}=((v_{2}),(v_{3}),(v_{4}),(v_{5}))\).
Definition 2.

\(A^{T}_{(v_{1},v_{2})}[\!i]\leq A^{T}_{(v_{1},v_{2})}[\!i+1]\) for all i,

\(index \left (A^{T}_{(v_{1},v_{2})}[\!i][\!1]\right) < index \left (A^{T}_{(v_{1},v_{2})}[\!i+1][\!1]\right)\) if \(A^{T}_{(v_{1},v_{2})}[\!i]=A^{T}_{(v_{1},v_{2})}[\!i+1]\),

\(index \left (A^{T}_{(v_{1},v_{2})}[\!i][\!j]\right) < index \left (A^{T}_{(v_{1},v_{2})}[\!i][j+1]\right)\) for all i,j,

\(v'\in A^{T}_{(v_{1},v_{2})}[\!i]\) if \(T^{sub}(v',bn(v'))=_{C}T^{sub} \left (A^{T}_{(v_{1},v_{2})}[\!i][\!1],bn(A^{T}_{(v_{1},v_{2})}[\!i][\!1])\right)\), where b n(v) is v _{1} or v _{2} that is adjacent to v.
Consider the case that \(T^{sub}(v_{1},v_{2})=_{\underline {C}}T^{sub}(v_{2},v_{1})\). We can prove that x=1 if T is in normal form (see Proposition 4). Then, a carbon position list \(C^{T}_{(v_{1}, v_{2})}\) of a naphthalene ring consisting of two benzene nodes v _{1}, v _{2} is a list of lists determined from \(C^{T}_{v_{1}}\) and \(C^{T}_{v_{2}}\) according to the following rule, where \(C^{T}_{(v_{1}, v_{2})}[\!i]\) is a list of carbon positions of nodes in \(A^{T}_{(v_{1}, v_{2})}[\!i]\) in ascending order.
Definition 3.

For the benzene ring of v _{1}, positions 1,2 are assigned to carbons of the merge bond in \(C^{T}_{v_{1}}\). Position i (i=3,…,6) in \(C^{T}_{v_{1}}\) corresponds to i−2 in \(C^{T}_{(v_{1}, v_{2})}\).

For the benzene ring of v _{2}, positions 1,2 are assigned to carbons of the merge bond in \(C^{T}_{v_{2}}\). Position i (i=3,…,6) in \(C^{T}_{v_{2}}\) corresponds to i+2 in \(C^{T}_{(v_{1}, v_{2})}\).
Definition 4.

C v T _{1}[ i ^{′}][j ^{′}]=C v T _{2}[ i ^{′}][j ^{′}] for all i ^{′}<i and all \(j' = 1,\ldots,C^{T_{1}}_{v}[\!i']\),

C v T _{1}[ i][j ^{′}]=C v T _{2}[ i][j ^{′}] for all j ^{′}<j,

C v T _{1}[ i][ j]<C v T _{2}[ i][ j].
This definition is applied to comparison of \(C^{T_{1}}_{(v_{1},v_{2})}\) and \(C^{T_{2}}_{(v_{1},v_{2})}\) for a naphthalene ring with v _{1} and v _{2} in the same way.
In the example of Fig. 6, T _{1} and T _{2} have the same tree structure, and C v _{1} T _{2}=((1),(4),(2,3))<((3),(4),(1,2))=C v _{1} T _{1} because C v _{1} T _{2}[ 1][ 1]=1<3=C v _{1} T _{1}[ 1][ 1].
Normal form of a carbon positionassigned molecular tree
In order to prevent generating redundant molecular trees in enumeration, we define a normal form of a carbon positionassigned molecular tree.
Definition 5.

\(T^{sub} \left (v_{\lfloor \frac {n}2\rfloor },v_{\lfloor \frac {n}2\rfloor +1}\right) {=\;}_{m}T^{sub} \left (v_{n\lfloor \frac {n}2\rfloor +1},v_{n\lfloor \frac {n}2\rfloor }\right)\),

\(index \left (v_{i},T^{sub}\left (v_{\lfloor \frac {n}2\rfloor },v_{\lfloor \frac {n}2\rfloor +1}\right)\right) = index \left (v_{ni+1},T^{sub}\left (v_{n\lfloor \frac {n}2\rfloor +1},v_{n\lfloor \frac {n}2\rfloor }\right)\right)\) for all \(i=1,\cdots,\lfloor \frac {n}2\rfloor \), where ⌊x⌋ is the largest integer less than or equal to x,

\({C^{T}_{v}} = C^{T}_{v'}\) for all benzene nodes \(v\in V\left (T^{sub}\left (v_{\lfloor \frac {n}2\rfloor },v_{\lfloor \frac {n}2\rfloor +1}\right)\right) \backslash V \left (T^{sub}(v_{1},v_{2})\right)\), where \(v'\in V \left (T^{sub}\left (v_{n\lfloor \frac {n}2\rfloor +1},v_{n\lfloor \frac {n}2\rfloor }\right)\right)\) satisfies \(index \left (v', T^{sub}\left (v_{n\lfloor \frac {n}2\rfloor +1},v_{n\lfloor \frac {n}2\rfloor }\right)\right) = index \left (v, T^{sub}\left (v_{\lfloor \frac {n}2\rfloor },v_{\lfloor \frac {n}2\rfloor +1}\right)\right)\), and v∈V _{1}∖V _{2} means that v∈V _{1} and v∉V _{2}.
Proposition 3.
For a centerrooted molecular tree, either of \(v_{\frac {n}2}\) and \(v_{\frac {n}2+1}\) is the root if the length of a symmetric path (v _{1},⋯,v _{ n }) is even. Otherwise, the depth of \(v_{\frac {n+1}2}\) is less than that of any node in the path.
Proof.
For a path (v _{1},⋯,v _{ n }), v _{ i+1} and v _{ n−i } must be the parent nodes of v _{ i } and v _{ n−i+1}, respectively, for \(i=1,\cdots,\frac {n1}{2}\) if n is odd and for \(i=1,\cdots,\frac {n}{2}1\) if n is even due to the center rooted property. Therefore, if the length of path is odd, \(v_{\frac {n+1}{2}}\) is the parent node of both \(v_{\frac {n+1}{2}1}\) and \(v_{\frac {n+1}{2}+1}\), which means that the depth of \(v_{\frac {n+1}2}\) is less than that of any node in the path.
In the case that n is even, either \(v_{\frac {n}{2}}\) or \(v_{\frac {n}{2}+1}\) has the least depth among all nodes in the path and another node is the child node of that node. Assume that between these two nodes the parent node is v _{ a } and the child node is v _{ b }. v _{ a } cannot have a parent node because the height of T ^{ s u b }(v _{ p },v _{ a }), where v _{ p } is the parent node of v _{ a }, cannot be equal to the height of T ^{ s u b }(v _{ c },v _{ b }) for any nodes v _{ c } that are adjacent to v _{ b } due the centerrooted condition, which means that T ^{ s u b }(v _{ a },v _{ b })=_{ m } T ^{ s u b }(v _{ b },v _{ a }) cannot be hold and the first condition of symmetric path is violated. In other words, v _{ a }, which is either \(v_{\frac {n}{2}}\) or \(v_{\frac {n}{2}+1}\), is the root node of the tree if n is even.
We say that v _{1} is left of v _{ n } for a symmetric path (v _{1},…,v _{ n }) when \(v_{n\lfloor \frac {n}2\rfloor +1}\) is the root, or i n d e x(v _{1})<i n d e x(v _{ n }).
We define an inequality T _{1}>_{ C } T _{2} for carbon positionassigned molecular trees T _{1} and T _{2} if T _{1}>_{ m } T _{2}, or T _{1}=_{ m } T _{2}, and there exists an integer i such that v _{ i } is a benzene node, C v _{ i } T _{1}>C v i′T _{2}, and C v _{ j } T _{1}=C v j′T _{2} for all benzene nodes v _{ j } with j>i, where i n d e x(v _{ k },T _{1})=i n d e x(v k′,T _{2}) for all k=1,…,V(T _{1}).
Definition 6.
 1.
T is centerrooted and leftheavy.
 2.
T(v)≥_{ m } T ^{ s u b }(r,v) if the center of the longest path in T with the root r is the edge (r,v).
 3.
Positions in each sublist of \({C^{T}_{v}}\) for each benzene node v are in ascending order.
 4.
\({C^{T}_{v}}\leq \phi _{b}\left ({C^{T}_{v}}\right)\) for all benzene nodes v that is not connected by a merge bond with the parent node and all ϕ _{ b }∈A u t _{ b }.
 5.For benzene nodes v _{1},v _{2} connected by a merge bond such that v _{1} is the root of T,
 (a)
\(C^{T}_{(v_{1},v_{2})}\leq \phi _{n} \left (C^{T}_{(v_{1},v_{2})}\right)\) for all ϕ _{ n }∈A u t _{ n } if \(T^{sub}(v_{1},v_{2})=_{\underline {C}}T^{sub}(v_{2},v_{1})\), where \(C^{T}_{(v_{1},v_{2})}\) is related with \(C^{T}_{v_{1}}\) and \(C^{T}_{v_{2}}\) by Definition 3.
 (b)
\(C^{T}_{v_{2}}\leq \phi_{ref} \left (C^{T}_{v_{2}}\right)\) if \(T^{sub}(v_{1},v_{2})\neq _{\underline {C}} T^{sub}(v_{2},v_{1})\) and \(C^{T}_{v_{1}}=\phi_{ref} \left (C^{T}_{v_{1}}\right)\).
 (a)
 6.
T ^{ s u b }(v _{1},v _{2})≥_{ C } T ^{ s u b }(v _{ n },v _{ n−1}) for all pairs v _{1},v _{ n } of nodes such that the path (v _{1},…,v _{ n }) is a symmetric path, v _{1} and v _{ n }(=v _{2}) are not connected by a merge bond, and v _{1} is left of v _{ n }.
We call a tree in normal form a normal tree.
Figure 8 also shows molecular trees in normal form and not in normal form. For condition 4 of the definition, \(C^{T'_{4}}_{v_{1}}=((1,2),(4),(3))\leq \phi _{b} \left (C^{T'_{4}}_{v_{1}}\right)\), \(C^{T^{\prime \prime }_{4}}_{v_{1}} = ((1,2),(4),(5))\leq \phi _{b}\left (C^{T^{\prime \prime }_{4}}_{v_{1}}\right)\). \(T^{\prime }_{4}\) and \(T^{\prime \prime }_{4}\) satisfy conditions 1, 2, 3, and 4. For condition 5, \(C^{T'_{4}}_{(v_{1},v_{2})}=((2,6),(1,7))\leq \phi _{n}\left (C^{T'_{4}}_{(v_{1},v_{2})}\right)\), whereas \(C^{T^{\prime \prime }_{4}}_{(v_{1},v_{2})}=((2,6),(3,5))>((2,6),(1,7))=\phi_{rot}\left (C^{T^{\prime \prime }_{4}}_{(v_{1},v_{2})}\right)\) for rotation ϕ _{ rot } of π radians, and \(T^{\prime \prime }_{4}\) violates the condition. It is noted that \(T^{\prime \prime }_{4}\) is rotated by π radians from \(T^{\prime }_{4}\). For condition 6, v _{1} and v _{2} are connected by a merge bond. Thus, \(T^{\prime }_{4}\) is a normal tree, and \(T^{\prime \prime }_{4}\) is not a normal tree.
Proposition 4.
For a normal tree T with a benzene node v _{1} that is connected by a merge bond with its child node v _{2} and satisfies \(T^{sub}(v_{1},v_{2})=_{\underline {C}}T^{sub}(v_{2},v_{1})\), positions 1,2 are assigned to the merge bond in the benzene ring of v _{1}. Furthermore, if \(C^{T}_{(v_{1},v_{2})}\leq \phi _{n}\left (C^{T}_{(v_{1},v_{2})}\right)\) for all ϕ _{ n }∈A u t _{ n }, then \(C^{T}_{v_{1}}\leq \phi _{b}\left (C^{T}_{v_{1}}\right)\) for all ϕ _{ b }∈A u t _{ b }.
Proof.
We assume that there exists a node v _{ l } as a left sibling of v _{2}, and v _{ l } is the leftmost child of v _{1}. Since T is leftheavy, T(v _{ l })≥_{ m } T(v _{2}), and l(v _{ l })=l(v _{2})=‘b’ is needed. However, T(v _{ l })=_{ C } T(v _{ c }), where v _{ c } is the leftmost child of v _{2}, because \(T^{sub}(v_{1},v_{2})=_{\underline {C}}T^{sub}(v_{2},v_{1})=_{C}T(v_{2})\). Hence, T(v _{ l })<_{ m } T(v _{2}). It contradicts the assumption, and v _{2} is the leftmost child of v _{1}. Therefore, \(A^{T}_{v_{1}}[\!1]=(v_{2})\). From condition 4 of Definition 6, \(C^{T}_{v_{1}}[\!1]=(1,2)\), and positions 1,2 are assigned to the merge bond, that is x=1 in Fig. 7 a.
For a map ϕ _{ b }∈A u t _{ b } other than the identity and reflection map ϕ _{ ref } for a benzene ring, \(C^{T}_{v_{1}}<\phi _{b}\left (C^{T}_{v_{1}}\right)\) because each of ϕ _{ b }(1) and ϕ _{ b }(2) is at least 2. From \(C^{T}_{(v_{1},v_{2})}\leq \phi_{ref}\left (C^{T}_{(v_{1},v_{2})}\right)\) and the correspondence between \(C^{T}_{v_{1}}\) and \(C^{T}_{(v_{1},v_{2})}\), \(C^{T}_{v_{1}}\leq \phi_{ref}\left (C^{T}_{v_{1}}\right)\). Therefore, \(C^{T}_{v_{1}}\leq \phi _{b}\left (C^{T}_{v_{1}}\right)\) for all ϕ _{ b }∈A u t _{ b }.
Proposition 5.
For a benzene node v of a normal tree T, \({C^{T}_{v}}[\!1][\!1]\) is always equal to 1.
Proof.
If v is not connected by a merge bond with the parent node, from condition 4, \({C^{T}_{v}}\) must be the least possible carbon position list. Hence, \({C^{T}_{v}}[\!1][\!1]=1\). Otherwise, from Definition 3, \({C^{T}_{v}}[\!1][\!1]=1\).
Lemma 1.
Given a molecular graph G without cyclic structures except benzene rings and naphthalene rings, G can be represented by a normal tree.
Proof.
We can assign numbers to carbons in benzene rings and naphthalene rings of G such that the conditions of Definition 6 are satisfied.
Lemma 2.
Given two different molecular graphs G _{1} and G _{2}, they cannot be represented by the same normal tree.
Proof.
We can unambiguously obtain a molecular graph from a normal tree by replacing all benzene nodes with benzene rings according to its carbon position lists.
Proposition 6.
For a normal tree T with a path (v _{1},…,v _{ n }), G ^{′} is the molecular graph obtained from the tree T ^{′} by removing T ^{ s u b }(v _{1},v _{2}) and T ^{ s u b }(v _{ n },v _{ n−1}) except v _{1} and v _{ n } from T, where v _{1} is left of v _{ n }. If there is a nonidentity map ϕ of the automorphism group of G ^{′} satisfying ϕ(v _{ i })=v _{ n−i+1} for all i=1,…,n, then T ^{ s u b }(v _{1},v _{2})≥_{ C } T ^{ s u b }(v _{ n },v _{ n−1}), where ϕ in G ^{′} is naturally extended to T.
Proof.
Let u _{ j } and w _{ j } be child nodes of v _{ i+1} and v _{ n−i }, respectively. Then, \(v_{i}=u_{j_{2}\phantom {\dot {i}\!}}\) and \(v_{ni+1}=w_{j_{1}\phantom {\dot {i}\!}}\), where j _{1}=i n d e x(v _{ n−i+1},T ^{ s u b }(v _{ n−i },v _{ n−i−1})) and j _{2}=i n d e x(v _{ i },T ^{ s u b }(v _{ i+1},v _{ i+2})). If v _{ i+1} and v _{ n−i } are benzene nodes, \(T(u_{j_{1}})=_{C}T(v_{i})\), \(T(v_{ni+1})=_{C}T\left (w_{j_{2}}\right)\), and T(v _{ i })=_{ C } T(v _{ n−i+1}) because \(C^{T}_{v_{i+1}}=C^{T}_{v_{ni}}\) and ϕ(v _{ i })=v _{ n−i+1}.
If T ^{ s u b }(v _{ i+1},v _{ i+2})>_{ C } T ^{ s u b }(v _{ n−i },v _{ n−i−1}), then there is an integer j (j _{1}≤j≤j _{2}) such that T(u _{ j })>_{ C } T(w _{ j }), and it contradicts Eq. (2). Therefore, T ^{ s u b }(v _{ i+1},v _{ i+2})=_{ C } T ^{ s u b }(v _{ n−i },v _{ n−i−1}), and T(v _{ i })=_{ C } T(v _{ n−i+1}). Also for the case that (v _{ i+1},…,v _{ n−i }) is a symmetric path for some i and i n d e x(v _{ i },T ^{ s u b }(v _{ i+1},v _{ i+2}))<i n d e x(v _{ n−i+1},T ^{ s u b }(v _{ n−i },v _{ n−i−1})), then T(v _{ i })=_{ C } T(v _{ n−i+1}). Thus, T ^{ s u b }(v _{1},v _{2})≥_{ C } T ^{ s u b }(v _{ n },v _{ n−1}).
Lemma 3.
Given two different normal trees T _{1} and T _{2}, T _{1} does not represent the same molecular graph as T _{2}.
Proof.
We assume that T _{1} represents the same molecular graph as T _{2}. Let G _{1} and G _{2} be molecular graphs transformed from T _{1} and T _{2}, respectively, where each carbon in benzene rings and naphthalene rings is connected with adjacent atoms according to carbon position lists of T _{1} and T _{2}. From the assumption, there is an isomorphism ψ from G _{1} to G _{2}. It means that l(v _{1})=l(ψ(v _{1})) for all v _{1}∈V(G _{1}), (ψ(v _{1}),ψ(v _{2}))∈E(G _{2}) if and only if (v _{1},v _{2})∈E(G _{1}), and m u l(ψ(v _{1}),ψ(v _{2}))=m u l(v _{1},v _{2}).
If v _{1} is not connected by a merge bond with the parent node, there is a nonidentity map ϕ _{ b }∈A u t _{ b } such that \(C^{T_{1}}_{v_{1}}=\phi _{b}\left (C^{T_{2}}_{\psi (v_{1})}\right)\) because T _{1} and T _{2} represent the same molecular graph. It contradicts condition 4 of Definition 6. Suppose that v _{1} is connected by a merge bond with the parent node v _{ p } and C v _{ p } T _{1}=C ψ(v _{ p })T _{2}. If \(T^{sub}\left (v_{p},v_{1}\right) {=}_{\underline {C}}T^{sub}\left (v_{1},v_{p}\right)\), then v _{ p } is the root, and there is a nonidentity map ϕ _{ n }∈A u t _{ n } such that \(C^{T_{1}}_{(v_{p},v_{1})}=\phi _{n}\left (C^{T_{2}}_{(\psi (v_{p}),\psi (v_{1}))}\right)\) because T _{1} and T _{2} represent the same molecular graph. It contradicts condition 5a. Otherwise, \(T^{sub}\left (v_{p},v_{1}\right)\neq _{\underline {C}}T^{sub}\left (v_{1},v_{p}\right)\). If v _{ p } is not the root, then T _{1} does not represent the same molecular graph as T _{2} because T ^{ s u b }(v _{ a },v _{ p }), where v _{ a } is the parent of v _{ p }, is different from other subtrees connected to the naphthalene ring. It contradicts the assumption. If v _{ p } is the root, \(C^{T_{1}}_{v_{p}}=\phi_{ref}\left (C^{T_{1}}_{v_{p}}\right)\) and \(C^{T_{1}}_{v_{1}}=\phi_{ref}\left (C^{T_{2}}_{\psi (v_{1})}\right)\) because T _{1} and T _{2} represent the same molecular graph. It contradicts condition 5b.
because T _{1} and T _{2} are normal trees. There is no carbon position lists that satisfy Eqs. (4), (6) and (7).
Therefore, T _{1} does not represent the same molecular graph as T _{2}.
Methods
We propose an algorithm BfsBenNaphEnum for enumerating chemical compounds containing benzene rings and naphthalene rings as cyclic structures. BfsBenNaphEnum utilizes our previously developed algorithms BfsSimEnum, BfsMulEnum [18], and assigns carbon position lists.
Modification of BfsSimEnum and BfsMulEnum
Suppose that the numbers \(n_{l_{i}}\) of atoms with label l _{ i } for all l _{ i }∈Σ, the numbers n _{ b }, n _{ n } of benzene rings and naphthalene rings are given. BfsBenNaphEnum introduces a special label ‘b’ representing a benzene node to Σ with b>l _{ i }∈Σ and v a l(b)=6, and executes BfsSimEnum to generate all nonredundant molecular trees T such that \(num(T, l_{i})=n_{l_{i}\phantom {\dot {i}\!}}\) for l _{ i }∈Σ except l _{ i }=b,C and n u m(T,b)=n _{ b }+2n _{ n }, n u m(T,C)=n _{ C }−6n _{ b }−10n _{ n }. At this time, all edges of enumerated trees are single because BfsSimEnum generates only simple trees. Then, we modify BfsMulEnum to assign n _{ n } merge bonds to edges between benzene nodes in each tree enumerated by BfsSimEnum in addition to adding \(1+\sum _{l_{i}\in \Sigma, l_{i}\neq b}num(T,{l_{i}})(val(l_{i})2)/2\) bonds to edges between usual nodes. It should be noted that multiple bonds cannot be assigned to edges connected to benzene nodes since a carbon atom in benzene rings and naphthalene rings is connected with another adjacent atom by a single bond.
Assignment of carbon positions for molecular trees
In this algorithm, we traverse along the tree T from the rightmost deepest benzene node to the root in reverse BFS order because an adjacent node list depends on carbon position lists of descendant nodes. For each benzene node v we found, we assign a carbon position list not to violate the conditions of normal form.
Carbon position lists for \({A^{T}_{v}}\), where v is the root, and \({A^{T}_{v}}[\!1]\geq 3\)
\({A^{T}_{v}}[\!1]\)  \({A^{T}_{v}}[\!2]\)  \({C^{T}_{v}}\) 

3  0  ((1,2,3)), ((1,2,4)), ((1,3,5)) 
3  3  ((1,2,3),(4,5,6)), ((1,2,4),(3,5,6)), ((1,3,5),(2,4,6)) 
4  0  ((1,2,3,4)), ((1,2,3,5)), ((1,2,4,5)) 
5  0  ((1,2,3,4,5)) 
6  0  ((1,2,3,4,5,6)) 
For other carbon positions from 2 to 6, we use ASSIGN_CHILD to assign such positions to the remaining adjacent nodes. For example, let T _{1} in Fig. 6 be output without any carbon position list by BfsMulEnum. T _{1} has a benzene node v _{1}, and \(A^{T_{1}}_{v_{1}}=((v_{4}),(v_{5}),(v_{2},v_{3}))\). First, carbon position 1 is assigned to \(A^{T_{1}}_{v_{1}}[\!1][\!1]=v_{4}\), that is, \(C^{T_{1}}_{v_{1}}[\!1][\!1]=1\). Since v _{1} is the root and \(A^{T_{1}}_{v_{1}}[\!1]=1<3\), Table 1 is not used, and the other nodes v _{5},v _{2},v _{3} are assigned by ASSIGN_CHILD. For v _{5}, each carbon position from 2 to 6 is examined (line 26 in ASSIGN_CHILD). For v _{2}, each position from 2 to 6 except the position assigned to v _{5} is examined (line 27). For v _{3}, each position from 2 to 6 that is more than the position assigned to v _{2} except the position assigned to v _{5} is examined (line 27) because v _{2} and v _{3} have the same subtree and condition 3 must be satisfied. Thus, \(C^{T_{1}}_{v_{1}}\!=((1),(2),(3,4)),((1),(2),(3,5)),((1),(2),(3,6)),\dots, ((1), (3),(2,4)),((1),(3),(2,5)),((1),(3),(2,6)),\dots,((1), (6),(4,5))\) are examined, where ((1),(6),(2,3)),((1),(6),(2,4)),((1),(5),(2,3)) and so on are discarded in the next step.
For each benzene node v, after assignment of a carbon position list to \({A^{T}_{v}}\), whether or not \({C^{T}_{v}}\) violates conditions 4, 5 of the normal form is confirmed (lines 5, 11, 14 in ASSIGN_CHILD). After carbon position lists are assigned to all benzene nodes, condition 6 is confirmed (line 4 in ASSIGN).
Since an input of this part, that is, an output of BfsMulEnum, satisfies conditions 1, 2 of the normal form, BfsBenNaphEnum always outputs normal trees. In ASSIGN_CHILD, a distinct carbon position list is always assigned, and all patterns are assigned (line 28). Hence, BfsBenNaphEnum outputs all distinct normal trees.
Theorem 1.
BfsBenNaphEnum outputs all nonredundant molecular graphs that are solutions of Problem 1.
Results
Results on execution time (sec), the number of enumerated structures by BfsBenNaphEnum and MOLGEN, and the number of chemical compounds exist in PubChem database for several instances
Chemical formula  #atoms  #all compounds in PubChem  #enumerated structures  Computational time (sec)  

n  b  C  N  O  H  BfsBenNaphEnum  MOLGEN  
C _{7} O _{2} H _{8}  0  1  1  0  2  8  728  19  0.001  0.053 
C _{8} O _{3} H _{10}  0  1  2  0  3  10  1602  307  0.002  0.124 
C _{9} O _{4} H _{10}  0  1  3  0  4  10  1469  6406  0.010  1.699 
C _{10} N _{2} O _{4} H _{10}  0  1  4  2  4  10  1592  8,333,991  12.260  957.53 
1  0  0  2  4  10  7980  0.031  69.51  
C _{11} N _{2} H _{10}  0  1  5  2  0  10  790  9012  0.021  630.44 
1  0  1  2  0  10  56  0.005  24.061  
C _{12} N _{1} O _{1} H _{11}  0  1  6  1  1  11  1582  80,883  0.155  2,611.57 
0  2  0  1  1  11  33  0.001  98.99  
1  0  2  1  1  11  888  0.009  560.98  
C _{13} O _{2} H _{12}  0  1  7  0  2  12  1239  162,122  0.289  6,497.55 
0  2  1  0  2  12  190  0.002  2,069.3  
1  0  3  0  2  12  2458  0.013  1,731.92  
C _{14} O _{4} H _{12}  0  1  8  0  4  12  1 397  19,514,480  35.655  197,264.54 
0  2  2  0  4  12  15,581  0.021  107,509.42  
1  0  4  0  4  12  337,178  1.061  97,326.71 
Since MOLGEN can enumerate chemical compounds without restriction on the structure, we must specify a benzene ring and a naphthalene ring as a substructure so that the enumerated structures contain only benzene rings and naphthalene rings as cyclic structures. As can be seen from Table 2, where ‘n’ and ‘b’ denote a naphthalene ring and a benzene ring, respectively, BfsBenNaphEnum enumerated chemical compounds much faster than MOLGEN while giving the same number of enumerated structures. BfsBenNaphEnum was from 50 times to 5,000,000 times faster than MOLGEN for instances with 8 to 14 carbon atoms. Table 2 also compares the number of discovered compounds in PubChem, which are not limited to treelike chemical compounds, with the number of compounds enumerated by the proposed algorithm for several chemical formulas. When the number of carbon atoms is large (greater than 8 in this case), the number of discovered compounds is much less than the number of enumerated compounds. This implies that there are still a numerous number of unknown compounds to be discovered, which possibly include some essential compounds. In this study, we examined chemical formulas including up to two benzene rings and one naphthalene ring because MOLGEN was not able to output results in practical time for chemical formulas including more benzene rings and naphthalene rings.
Discussion
Our algorithm is limited to treelike chemical structures without any cyclic structures except benzene rings and naphthalene rings while MOLGEN does not have such limitation. Therefore, in the future, we would like to extend the algorithm such that it can enumerate more complex cyclic structures, such as polycyclic aromatic compounds and nucleotides. Besides, in order to make enumeration tools practical, we need to rank enumerated structures because a large number of structures are usually enumerated. For that purpose, it might be useful to employ drug likeness filters such as Lipinski RO5, and QED score. Incorporation of such filters into our system is also important future work.
Conclusions
We proposed a way to represent a benzene ring in a molecular tree by regarding it as a new defined atom with valence six and introducing a new attribute named carbon position list to benzene nodes. Carbon position of an atom specifies which carbon in a benzene ring that the corresponding atom bonds with. We also proposed a new kind of bond called merge bond that merges two benzene rings together to form a naphthalene ring. With merge bond a molecular tree can represent a structure containing naphthalene rings without defining new kind of atom. Moreover, since a benzene ring and a naphthalene ring are symmetric structures, we defined a rule to assign carbon position lists such that no redundant structures due to the symmetry of a benzene ring and a naphthalene ring are enumerated.
The algorithm of this work consists of two main steps. Given the number of benzene rings, the number of naphthalene rings as well as a chemical formula, BfsSimEnum and BfsMulEnum are applied such that they can enumerate molecular trees with benzene nodes. Next, the new extension BfsBenNaphEnum assigns carbon position lists to benzene nodes in normal molecular trees.
To show the performance of our algorithm, all nonredundant chemical structures were enumerated for several chemical formulas by BfsBenNaphEnum and MOLGEN 5.0, a wellknown general purpose structure generator. It is shown that our algorithm is reliable since it generated the same number of structures as MOLGEN, while expended much less computational time. BfsBenNaphEnum was from 50 times to 5,000,000 times faster than MOLGEN for instances with 8 to 14 carbon atoms in our experiments. This is mainly because the number of nodes decreases from six to one for each benzene ring and from ten to two for each naphthalene ring in a chemical structure and because we enumerate chemical structures in the form of trees instead of graphs.
Declarations
Acknowledgements
This work was partially supported by GrantsinAid #26240034, #24500361, and #252920 from MEXT, Japan.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Ward RA, Kettle JG. Systematic enumeration of heteroaromatic ring systems as reagents for use in medicinal chemistry. J Med Chem. 2011; 54(13):4670–7.View ArticlePubMedGoogle Scholar
 Blum LC, Reymond JL. 970 million druglike small molecules for virtual screening in the chemical universe database gdb13. J Am Chem Soc. 2009; 131(25):8732–3.View ArticlePubMedGoogle Scholar
 Mishima K, Kaneko H, Funatsu K. Development of a new de novo design algorithm for exploring chemical space. Mol Inform. 2014; 33(1112):779–89.Google Scholar
 Funatsu K, Sasaki S. Recent advances in the automated structure elucidation system, chemics. utilization of twodimensional NMR spectral information and development of peripheral functions for examination of candidates. J Chem Inform Comput Sci. 1996; 36(2):190–204.View ArticleGoogle Scholar
 Meringer M, Schymanski EL. Small molecule identification with MOLGEN and mass spectrometry. Metabolites. 2013; 3:440–62.View ArticlePubMedPubMed CentralGoogle Scholar
 Koichi S, Arisaka M, Koshino H, Aoki A, Iwata S, Uno T, Satoh H. Chemical structure elucidation from ^{13}C NMR chemical shifts: Efficient data processing using bipartite matching and maximal clique algorithms. J Chem Inform Model. 2014; 54:1027–35.View ArticleGoogle Scholar
 Bytautas L, Klein DJ, Schmalz TG. All acyclic hydrocarbons: Formula periodic table and property overlap plots via chemical combinatorics. New J Chem. 2000; 24(5):329–36.View ArticleGoogle Scholar
 Faulon J, Visco DP, Roe D. Enumerating molecules. Rev Comput Chem. 2005; 21:209.Google Scholar
 Koch MA, Schuffenhauer A, Scheck M, Wetzel S, Casaulta M, Odermatt A, Ertl P, Waldmann H. Charting biologically relevant chemical space: A structural classification of natural products (sconp). Proc Natl Acad Sci U S A. 2005; 102(48):17272–7.View ArticlePubMedPubMed CentralGoogle Scholar
 Mauser H, Stahl M. Chemical fragment spaces for de novo design. J Chem Inf Model. 2007; 47(2):318–24.View ArticlePubMedGoogle Scholar
 Andricopulo AD, Guido RV, Oliva G. Virtual screening and its integration with modern drug design technologies. Curr Med Chem. 2008; 15(1):37–46.View ArticlePubMedGoogle Scholar
 Reymond JL, van Deursen R, Blum LC, Ruddigkeit L. Chemical space as a source for new drugs. MedChemComm. 2010; 1(1):30–8.View ArticleGoogle Scholar
 Bürgi JJ, Awale M, Boss SD, Schaer T, Marger F, ViverosParedes JM, Bertrand S, Gertsch J, Bertrand D, Reymond JL. Discovery of potent positive allosteric modulators of the α3β2 nicotinic acetylcholine receptor by a chemical space walk in chembl. ACS Chem Neurosci. 2014; 5(5):346–59.View ArticlePubMedPubMed CentralGoogle Scholar
 Gugisch R, Kerber A, Kohnert A, Laue R, Meringer M, Rücker C, Wassermann A. MOLGEN 5.0, a molecular structure generator. Sharjah, United Arab Emirates: Bentham Science Publishers Ltd.; 2012.Google Scholar
 Peironcely JE, RojasChertó M, Fichera D, Reijmers T, Coulier L, Faulon JL, Hankemeier T. OMG: Open Molecule Generator. J Cheminformatics. 2012; 4:21.View ArticleGoogle Scholar
 Fujiwara H, Wang J, Zhao L, Nagamochi H, Akutsu T. Enumerating treelike chemical graphs with given path frequency. J Chem Inf Model. 2008; 48(7):1345–57.View ArticlePubMedGoogle Scholar
 Shimizu M, Nagamochi H, Akutsu T. Enumerating treelike chemical graphs with given upper and lower bounds on path frequencies. BMC Bioinformatics. 2011; 12:14–3.View ArticleGoogle Scholar
 Zhao Y, Hayashida M, Jindalertudomdee J, Akutsu T. Breadthfirst search approach to enumeration of treelike chemical compounds. J Bioinformatics Comput Biol. 2013; 11:1343007.View ArticleGoogle Scholar
 Schüller A, Hähnke V, Schneider G. SmiLib v2.0: A Javabased tool for rapid combinatorial library enumeration. QSAR Comb Sci. 2007; 26(3):407–10.View ArticleGoogle Scholar
 Song CM, Bernardo PH, Chai CLL, Tong JC. CLEVER: Pipeline for designing in silico chemical libraries. J Mol Graph Model. 2009; 27(5):578–83.View ArticlePubMedGoogle Scholar
 Trinajstić N. Chemical Graph Theory, 2nd edn. Boca Raton, Florida: CRC Press; 1992, pp. 275–391. Chap. 11 Isomer Enumeration.Google Scholar
 Meringer M. Handbook of Chemoinformatics Algorithms. Boca Raton, Florida: CRC Press; 2010, pp. 233–67. Chap. 8 Structure Enumeration and Sampling.Google Scholar
 Suzuki M, Nagamochi H, Akutsu T. Efficient enumeration of monocyclic chemical graphs with given path frequencies. J Cheminformatics. 2014; 6:31.View ArticleGoogle Scholar
 Hardinger SA, University of California LADoC. Biochemistry: Chemistry 14D: Organic Reactions and Pharmaceuticals : Course Thinkbook, Lecture Supplements, Concept Focus Questions, OWLS Problems, Practice Problems. Plymouth, MI 48170: HaydenMcNeil Pub; 2008.Google Scholar