Decision tree constructed using all protein pairs. Each leaf node is labeled with the numbers of CCPs and non-CCPs assigned to it, while each internal node is labeled with the attribute (j) used to partition it further (see Table 4 or Supplementary Information for descriptions of the attributes). Two edges originate from each internal node, labeled "+" and "-", leading to the daughter nodes whose protein pairs do or do not have attribute j, respectively. Nodes with a higher percentage of CCPs than the root node are colored red, while those with a lower percentage are colored blue. The color saturation depends on the relative entropy of the node with respect to the root node. The arrowhead size of an edge approximately represents the fraction of protein pairs in the parent node that are assigned to the corresponding daughter node.
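
The coloring rule can be illustrated with a minimal sketch. It assumes that the relative entropy is the Kullback-Leibler divergence between the (CCP, non-CCP) class distribution of a node and that of the root, taken in the direction node-versus-root and measured in bits; the caption does not specify the direction or the logarithm base. The function names and the counts in the usage note are hypothetical and are not taken from the figure.

```python
import math

def relative_entropy(node_ccp, node_non, root_ccp, root_non):
    """KL divergence D(node || root), in bits, over the classes CCP / non-CCP.

    Assumption: the root contains at least one pair of each class, so its
    class probabilities are nonzero.
    """
    node_total = node_ccp + node_non
    root_total = root_ccp + root_non
    p = (node_ccp / node_total, node_non / node_total)  # node distribution
    q = (root_ccp / root_total, root_non / root_total)  # root distribution
    # Terms with p_i == 0 contribute 0 by the usual convention.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def node_color(node_ccp, node_non, root_ccp, root_non):
    """Return (hue, saturation_score) for a node relative to the root.

    The hue follows the caption: red if the node is enriched in CCPs,
    blue if it is depleted. The score is the relative entropy; how the
    figure maps it onto an actual saturation value is not specified.
    """
    node_frac = node_ccp / (node_ccp + node_non)
    root_frac = root_ccp / (root_ccp + root_non)
    if node_frac > root_frac:
        hue = "red"
    elif node_frac < root_frac:
        hue = "blue"
    else:
        hue = "none"
    return hue, relative_entropy(node_ccp, node_non, root_ccp, root_non)
```

For example, with a hypothetical root of 100 CCPs and 100 non-CCPs, `node_color(30, 10, 100, 100)` returns a red hue with a positive score, while a node with the same class proportions as the root scores zero and would be uncolored under this sketch.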