A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations
© Ryslik et al.; licensee BioMed Central Ltd. 2014
Received: 9 July 2013
Accepted: 11 March 2014
Published: 26 March 2014
It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach.
We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html.
GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure.
Cancer, one of the most widespread and heterogeneous diseases, is at its most fundamental level a disease brought on by the accumulation of somatic mutations. These mutations typically occur in either tumor suppressors or oncogenes. While oncogenic mutations tend to either deregulate or up-regulate the resulting protein behavior, mutations within tumor suppressors typically lower the activity of genes that prevent cancer. Pharmacological intervention has been shown to be more effective at inhibiting activating oncogenes than at restoring functionality of tumor suppressing genes. Combined with the theory of "oncogene addiction", which holds that many cancers depend upon a small set of key genes to drive their rapid cellular multiplication, with the rest of the mutations simply being passenger mutations [2, 3], the identification of driver oncogenic mutations has become of critical importance in cancer research.
Due to the importance of this problem, several approaches have been proposed to detect naturally selected regions in which activating mutations occur. One general approach postulates that driver mutations will have a higher non-synonymous mutation rate as compared to the background level after normalizing for the length of the gene [4–6]. Similarly, assuming that the neutral rate of nucleotide substitution is surpassed when positive selection is acting on a specific region, one can check if the ratio of nonsynonymous (Ka) to synonymous (Ks) mutations per site is greater than 1. Relatedly, Ye et al. and Ryslik et al. showed that mutational clusters can be indicative of activating mutations and that finding such clusters is a way to reduce the driver mutation search space needing to be analyzed. An alternative approach relies on creating classifiers to categorize mutations. Machine learning algorithms such as PolyPhen-2, which predicts whether a missense mutation is damaging, and CHASM [11–13], which discriminates between known driver mutations and a set of passenger mutations, rely upon a set of rules developed using a variety of machine learning techniques such as Random Forests and Support Vector Machines. These rules can be used to calculate a score for each mutation based upon both sequence and non-sequence-based features such as evolutionary conservation, size and polarity of the substituted residue as well as accessible surface area. Other classifiers, such as SIFT, use only a subset of these features, e.g. evolutionary conservation, for prediction.
While the methods based upon background mutational rates have had some success in identifying regions of positive selection or driver mutations, they nonetheless suffer from several shortcomings. First, many of these methods rely upon calculating the difference between synonymous and non-synonymous mutations but do not take into account that selection can act upon minute regions of the gene. Thus, when the mutation rates are averaged over the entire gene, the signal may be lost. Second, the methods proposed by Kreitman and Wang do not differentiate between activating gain-of-function mutations and inactivating loss-of-function non-synonymous mutations. Third, many of the machine learning methods require an extensive rule set that must first be trained using a well annotated database that is still limited. Until the requisite literature and information is developed, the machine learning algorithm is unable to create a well-performing classifier. Furthermore, the rules must be updated periodically to reflect updated knowledge and information. For a recent review of several popular methods that attempt to discern missense substitution effect on protein function see Gnad et al. and Gonzalez-Perez et al.
Building on the work of Bardelli et al. and Torkamani and Schork, which stipulated that only a small number of specific mutations can activate a protein, Ye et al. developed Non-Random Mutational Clustering (NMC) to identify potential activating mutations. NMC works on the hypothesis that absent any previously known mutational hotspot, a mutational cluster is indicative of a possible activating mutation. This is based on the observation that most amino acid substitutions are either neutral or incompatible with protein function, resulting in a concentration of activating mutations within a small subset of protein residues and domains. For the null hypothesis that mutation locations are random in the candidate protein when represented in linear form, NMC identifies clustering by evaluating whether there is statistical evidence of mutations occurring closer together on the line than expected by chance. While NMC is able to implicate some cancer related genes, it is limited by the fact that it considers the protein as a linear sequence and does not take into account the tertiary protein structure. To account for protein structure information, Ryslik et al. developed iPAC (identification of Protein Amino acid Clustering), which reorganizes the protein into a one dimensional space that preserves, as best as possible, the three dimensional amino acid pairwise distances using Multidimensional Scaling (MDS). As described by Ryslik et al., utilizing the tertiary information is critical when identifying clustering as mutations that occur far apart when the protein is considered linearly can be very close together once the protein is folded in 3D space. The 3D proximity of such mutations might thus yield novel clusters.
While it was shown that iPAC provides an improvement over NMC, the reliance upon a global method like MDS can potentially result in a distorted rearrangement of the protein, since distant residues will nevertheless have an impact on each other’s final position in one dimensional space.
In this manuscript, we provide an alternative method to iPAC by remapping the protein into one dimensional space via a graph theoretic approach. This approach allows for a more natural consideration of the protein, one that is sensitive to protein domains and linkers. We show that our methodology is effective in identifying proteins with mutational clustering that are missed by both iPAC and NMC such as NRP1 and MAP2K4. We also show that for some proteins, GraphPAC identifies fewer clusters than inferred by both iPAC and NMC while for other proteins GraphPAC identifies more clusters than the other two methods. While both GraphPAC and iPAC are an improvement over NMC since they account for tertiary structure, the differences between GraphPAC and iPAC point to the fact that different rearrangements of the protein must be considered in order to better understand the mutational clustering landscape. We show that many of the mutations in clusters identified by GraphPAC are also classified as damaging by PolyPhen-2 and as activating mutations by CHASM. By providing a more complete picture of mutational clustering than iPAC or NMC individually, GraphPAC allows us to obtain a more accurate landscape of where potential activating mutations may occur on the protein.
GraphPAC uses a four step approach to identifying mutational clusters. The first step, as described in Sections ‘Obtaining mutational data’ and ‘Obtaining the 3D structural data’, retrieves mutational and positional data from COSMIC and the PDB, respectively. After reconciling the mutational and positional databases (Section ‘Reconciling the structural and mutational data’), the residues are realized as a connected graph where each residue is a vertex, whereupon the traveling salesman problem is heuristically solved in order to find the shortest path through the protein (Section ‘Traveling salesman approach’). Once the shortest path has been identified, the protein residues are reordered along this path, providing a one dimensional ordering of the protein. The linear NMC algorithm is then used to calculate which mutations are closer together than expected by chance. Lastly, the clusters are unmapped back into the original space and the results reported back to the user. We detail each of the steps in the sections below.
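The reordering and unmapping steps described above can be illustrated with a minimal sketch. It assumes a path through the residues has already been found; the function names `remap_positions` and `unmap_cluster` are hypothetical and are not taken from the GraphPAC package.

```python
def remap_positions(path, mutated_residues):
    """Return the 1-based positions of mutated residues along the path.

    `path` lists original residue numbers in the order they are visited.
    Residues without structural information (absent from `path`) are skipped.
    """
    position_on_path = {residue: idx for idx, residue in enumerate(path, start=1)}
    return sorted(position_on_path[r] for r in mutated_residues if r in position_on_path)


def unmap_cluster(path, start, end):
    """Translate a cluster given as path positions back to original residue numbers."""
    return sorted(path[start - 1:end])


# Hypothetical example: residues 2 and 7 are five positions apart in sequence
# but adjacent on a path that follows the folded structure.
path = [1, 2, 7, 8, 3, 4, 5, 6]
remapped = remap_positions(path, [2, 7])
cluster = unmap_cluster(path, remapped[0], remapped[-1])
```

In this toy case the two mutations occupy neighbouring positions on the path, which is the situation in which a linear method applied to the reordered protein may detect a cluster that the sequence order hides.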
Obtaining mutational data
The mutational positions were obtained from the 58th version of the COSMIC database that was downloaded via the following ftp site: http://ftp.sanger.ac.uk/pub/CGP/cosmic. The database was implemented locally using Oracle 11g. Only missense mutations that were classified as "Confirmed somatic variant" or "Reported in another cancer sample as somatic" were selected, with nonsense and synonymous mutations excluded. Moreover, we only considered mutations originating from studies that were classified as whole gene screens. Next, since multiple studies can report mutational data from the same cell line, mutational redundancies were removed to avoid double counting the mutations. Lastly, only the proteins with a UniProt Accession Number were kept in order to correctly match the mutational and positional data, resulting in 777 proteins. See "COSMIC query" in Additional file 1 for the SQL code required to generate the mutational data.
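The selection rules above can be sketched as a simple filter. This is only an illustration: the actual query is the SQL in Additional file 1, and the record field names used here (`mutation_type`, `somatic_status`, `whole_gene_screen`, `uniprot_accession`, `cell_line`, `aa_change`) are hypothetical and do not reflect the COSMIC schema.

```python
# Somatic status labels quoted in the text.
ACCEPTED_STATUS = {
    "Confirmed somatic variant",
    "Reported in another cancer sample as somatic",
}


def select_mutations(records):
    """Keep missense, somatic, whole-gene-screen mutations and drop duplicates.

    Each record is a dict with hypothetical keys. A mutation reported more
    than once for the same protein and cell line is counted only once.
    """
    seen = set()
    kept = []
    for rec in records:
        if rec["mutation_type"] != "missense":
            continue  # nonsense and synonymous mutations are excluded
        if rec["somatic_status"] not in ACCEPTED_STATUS:
            continue
        if not rec["whole_gene_screen"]:
            continue
        if not rec.get("uniprot_accession"):
            continue  # needed later to match the structural data
        key = (rec["uniprot_accession"], rec["cell_line"], rec["aa_change"])
        if key in seen:
            continue  # same cell line reported by another study
        seen.add(key)
        kept.append(rec)
    return kept
```

Keying the duplicate check on protein, cell line and amino acid change is one assumed reading of "mutational redundancies"; other definitions of a duplicate are possible.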
Obtaining the 3D structural data
The PDB web interface was used to obtain the protein tertiary information for each of the 777 proteins described in Section ‘Obtaining mutational data’. Since multiple structures are often available for the same protein, all structures with a matching UniProt Accession Number were used and an appropriate multiple comparisons adjustment (see Section ‘Multiple comparison adjustment for structures’) was performed afterwards. For proteins where the resolution provided alternative conformations, the first conformation listed in the file was used. Similarly, for structures where more than one polypeptide chain with a matching UniProt Accession Number was available, the first matching chain listed in the file was used (typically chain A). Finally, after the chain and conformation were selected, the Cartesian coordinates of all the α-carbon atoms were used to represent the tertiary backbone structure of the protein. While we only used the α-carbon location to represent the residue location in this paper, our methods are robust if any of the other backbone atoms are used, including the amide nitrogen, main chain carbonyl carbon or the main chain carbonyl oxygen.
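As an illustration of the α-carbon selection described above, the following sketch reads fixed-column `ATOM` records from a PDB-format file. It is not the GraphPAC implementation: the function name is hypothetical, and it assumes that the first chain encountered is the one matching the UniProt Accession Number unless a chain is passed explicitly.

```python
def alpha_carbon_coords(pdb_lines, chain_id=None):
    """Map residue number -> (x, y, z) of the alpha-carbon for one chain.

    Uses the fixed column layout of PDB ATOM records: atom name in columns
    13-16, chain in column 22, residue number in columns 23-26 and the
    coordinates in columns 31-54. If chain_id is None, the first chain seen
    is used. When a residue lists several conformations, the first is kept.
    """
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        chain = line[21]
        if chain_id is None:
            chain_id = chain  # assumption: first chain is the matching one
        if chain != chain_id:
            continue
        residue_number = int(line[22:26])
        if residue_number in coords:
            continue  # keep only the first listed conformation
        coords[residue_number] = (
            float(line[30:38]),
            float(line[38:46]),
            float(line[46:54]),
        )
    return coords
```

A call such as `alpha_carbon_coords(open(path).read().splitlines())`, with `path` pointing at a downloaded structure file, would return the coordinates from which pairwise distances can be computed.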
Also, while X-ray crystallography was used to determine many of the tertiary structures in the PDB, we note that molecular dynamics (MD) could in principle be used to model the protein structure in solution. However, taking into account the time complexity of such an approach for larger proteins as well as the number of structures that we consider, such a task is beyond the scope of this paper. Further, as crystal structures are almost always representative of the correctly folded protein, using the current structural information is more than sufficient until MD simulations can be applied on much faster time scales. See "Structure Files" in Additional file 2 for a full listing of all the 1,904 structure/chain combinations used.
Reconciling the structural and mutational data
In order to reference the same residue in the COSMIC and PDB databases, an alignment was performed to accommodate their different numbering systems. Like iPAC, GraphPAC allows two such reconciliations. The first is based upon a pairwise alignment as described in Pages et al. while the second is based upon a numerical reconstruction from the structural information available in the PDB file. Because the PDB file structure potentially changes depending upon the structure release date, along with other technical complications, pairwise alignment was used for all the analysis described in this paper unless specifically noted. For further information on the alignment please see the documentation in the GraphPAC package available on Bioconductor. Protein/structure/chain combinations that resulted in only one mutation or no mutations on the residues for which tertiary information was available were dropped. Similar to iPAC, a successful alignment of the tertiary and mutational data was obtained for 140 proteins corresponding to 1100 unique structure/chain combinations. See "Structure Files" in Additional file 2 for a full listing and description.
Traveling salesman approach
Here, d(i,j) represents the distance between residues i and j (with d(i,i) = 0) and π(i) represents the residue that follows residue i on the tour. The difference between the three insertion methods rests on how the next residue k is selected for insertion. Under cheapest insertion, the next k to be inserted into the tour is chosen such that the increase in tour length is minimal. Under nearest insertion, at each iteration, the k that is closest to a residue already on the tour is selected. Finally, under farthest insertion, the k that is farthest away from any residue already on the tour is selected.
These algorithms have different upper bounds on their tour lengths. For example, the farthest insertion algorithm creates tours that approach of the shortest length while the nearest and cheapest insertion algorithms can be linked to the minimal spanning tree algorithm and thus have an upper bound of twice the shortest tour length when distances satisfy the triangle inequality. Because these methods differ in nature and there is no biological justification to favor one over the other, we consider all three methods when identifying clusters and then perform an appropriate multiple comparison adjustment to infer the statistical evidence of mutation clusters (see Section ‘Multiple comparison adjustment for structures’).
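The three insertion rules can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the solver used by GraphPAC: it builds a closed tour over 3D coordinates, does not cut the tour into a path, and places each selected residue k between the pair i and π(i) that minimizes d(i,k) + d(k,π(i)) − d(i,π(i)). The function names are hypothetical.

```python
from math import dist


def best_insertion(tour, k, d):
    """Return (increase in tour length, insert index) for placing k after some i."""
    best = None
    for idx, i in enumerate(tour):
        nxt = tour[(idx + 1) % len(tour)]  # pi(i), the residue following i
        cost = d[i][k] + d[k][nxt] - d[i][nxt]
        if best is None or cost < best[0]:
            best = (cost, idx + 1)
    return best


def insertion_tour(coords, rule="cheapest", start=0):
    """Build a tour over all points with nearest, farthest or cheapest insertion."""
    if rule not in ("nearest", "farthest", "cheapest"):
        raise ValueError(f"unknown rule: {rule}")
    d = [[dist(a, b) for b in coords] for a in coords]
    tour = [start]
    remaining = set(range(len(coords))) - {start}
    while remaining:
        if rule == "cheapest":
            # choose the point whose insertion lengthens the tour the least
            k = min(remaining, key=lambda c: best_insertion(tour, c, d)[0])
        else:
            # distance from a candidate to its closest point already on the tour
            to_tour = lambda c: min(d[c][t] for t in tour)
            pick = min if rule == "nearest" else max
            k = pick(remaining, key=to_tour)
        _, position = best_insertion(tour, k, d)
        tour.insert(position, k)
        remaining.remove(k)
    return tour


def tour_length(tour, coords):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist(coords[a], coords[b]) for a, b in zip(tour, tour[1:] + tour[:1]))
```

Running all three rules on the same coordinates and comparing `tour_length` gives a feel for how the resulting orderings can differ, which is why the results from the three methods are combined with a multiple comparison adjustment rather than one rule being preferred.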
Table 1. Path lengths between nearby and distant residues are statistically different. This table shows the p-values when testing the difference in path length between residues that are close versus far apart. Comparisons (close vs. far): <5 Å vs >25 Å; <10 Å vs >30 Å; <15 Å vs >35 Å; <20 Å vs >40 Å.
The above calculation is then performed on all pairwise mutations and an appropriate multiple comparison adjustment is then applied. For the remainder of this study, we use the more conservative Bonferroni correction [31, 32] to adjust for the intra-protein cluster p-values. See Section ‘Multiple comparison adjustment for structures’ for a description of how we account for the inter-protein multiple comparisons. Lastly, it is important to mention that the structural information obtained for each protein does not always contain the (x,y,z) coordinates for every residue in the protein. In such cases, in order to compare GraphPAC, iPAC and NMC on an equal basis, these missing residues are removed from the protein.
We also note that since we obtained our mutational data from COSMIC, some tissue types are more represented than others in the database. However, this scenario results in our analysis being more conservative and our findings even more significant. Assuming that mutations occur in different parts of the protein for different tissue types, when collapsing over all tissues a larger value of n is obtained while the values of i and k (as seen in Equation 2) for two specific mutations are not changed. This results in a larger p-value, signifying that clusters found when collapsing over tissue types would be even more significant if only a single tissue type were analyzed.
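The effect of n on the p-value can be checked numerically. The sketch below is an assumption-laden illustration rather than Equation 2 itself: it treats the n mutation positions as independent draws from a continuous uniform distribution on the unit interval, in which case the gap between the i-th and k-th ordered mutations has the same distribution as the (k − i)-th order statistic. The function names and the parameter `span_fraction` (the gap expressed as a fraction of the protein length) are hypothetical.

```python
from math import comb


def nmc_pvalue(n, i, k, span_fraction):
    """P(gap between i-th and k-th ordered mutation <= span_fraction).

    Under the continuous-uniform assumption this equals the probability that
    at least (k - i) of n uniform points fall below span_fraction, which is
    a binomial tail sum.
    """
    if not 1 <= i < k <= n:
        raise ValueError("require 1 <= i < k <= n")
    x = span_fraction
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k - i, n + 1))


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Same pair of mutations (same k - i, same span), more total mutations n:
p_small_n = nmc_pvalue(10, 3, 5, 0.1)
p_large_n = nmc_pvalue(20, 3, 5, 0.1)

# With n mutations there are n * (n - 1) / 2 pairwise tests to adjust for.
p_adjusted = bonferroni(p_small_n, 10 * 9 // 2)
```

Under these assumptions `p_large_n` exceeds `p_small_n`, which matches the argument above that pooling tissue types (larger n, unchanged i and k) makes the test more conservative.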
Multiple comparison adjustment for structures
Table 2. A comparison of the 15 proteins that were found to contain significant clustering via GraphPAC, iPAC or NMC
In this section we compare the results between GraphPAC, iPAC and NMC in terms of the number of structures found (Section ‘Method comparison’) and describe the new proteins identified by GraphPAC (Section ‘GraphPAC identifies novel proteins with significant clustering’). We also show the results of our method in comparison to two machine learning methods along with a description of whether our results overlap biologically relevant structures (Section ‘Cluster localization in relevant sites and performance evaluation’).
GraphPAC identifies novel proteins with significant clustering
GraphPAC identified four proteins with clustering that are missed by the iPAC algorithm: DPP4, MAP2K4, NRP1, and PCSK9. DPP4 is a serine protease that can modify tumor cell behavior and is a potential cancer therapeutic target. Both MAP2K4 and NRP1 are well known to be associated with lung cancer [35, 36]. Finally, while PCSK9 mutations are well known in causing hypercholesterolemia, recent research shows that absence of PCSK9 can provide a protective benefit against melanoma due to lower circulating LDLc. This allows for a potential additional cancer therapy via PCSK9 inhibitors. For a full listing of which structure-protein combinations were found significant, see "Results Summary" in Additional file 4. Please see Sections ‘GraphPAC finds novel proteins compared to iPAC and NMC’, ‘GraphPAC identifies additional clusters compared to iPAC and NMC’ and ‘GraphPAC finds fewer clusters compared to NMC’ for an in-depth review of selected protein-structure combinations.
Cluster localization in relevant sites and performance evaluation
We note that 9 of the 13 proteins that GraphPAC identified as having significant clustering have their most significant cluster overlap a binding site, catalytic domain or kinase domain. Out of the remaining four proteins, three proteins have their most significant cluster fall within a previously identified biologically relevant region. For instance, IDE’s most significant cluster is located on residues 684–698, a denaturation-resistant epitope region. For NRP1, which plays roles in angiogenesis and axon guidance, the most significant cluster directly overlaps the F5/8 type C 1 domain, a domain found in many blood coagulation factors. Finally, for PIK3C-α, the most significant cluster overlaps residue 1047, which has been shown to potentially increase the substrate turnover rate, a common oncogenic behavior. For further detail on relevant biological site information, please see "Relevant Sites" in Additional file 5.
Further, we evaluated the performance of GraphPAC via two well-known machine learning algorithms: CHASM and PolyPhen-2. It is critical to first note, however, that the machine learning algorithms utilize a much more detailed set of features when evaluating the mutation. Thus these algorithms may identify mutations as significant while GraphPAC would not. Nevertheless, of all the mutations that fall within significant clusters identified by GraphPAC, 93% and 91% were also identified as significant (FDR ≤ 20%) by CHASM and PolyPhen-2, respectively. We note that GraphPAC is only able to determine statistically significant clustering and not whether a mutation is truly damaging and/or activating. However, given the high percentages described above, the evidence supports the hypothesis that clustering is in fact indicative of potential driver mutations. Thus, via GraphPAC, the researcher has a fast and easily available tool to identify potential driver mutations for further study. The benefit of GraphPAC is that it can be executed with far less prior information than the machine learning approaches. For further details, see "Performance Evaluation" in Additional file 6.
Finally, we note that while GraphPAC provides an improvement in cluster identification compared to prior work, the algorithm is unable to distinguish between mutations that increase or decrease kinase activity nor between gain-of-function (GOF) or loss-of-function (LOF) mutations. As described by Lapenna and Giordano, Brognard et al., Geiger et al., Ahn et al., Lisabeth et al. and Linka et al., a large body of literature suggests that inactivating loss-of-function mutations are more common than previously thought and often occur in regions that regulate kinase activation. Nevertheless, as described above, many of the clusters identified by GraphPAC contain mutations that are classified as driver and/or damaging by common machine learning algorithms. As such, GraphPAC provides a fast and easy method to identify such potential mutations, which can then be verified and analyzed via additional approaches. These approaches can range from the aforementioned machine learning algorithms to experimental approaches that test for GOF mutations as described by Fawdar et al.
In this section we discuss in depth some of the clustering results presented in Section ‘Results’. Specifically, we review in detail three situations: 1) GraphPAC identifies novel proteins (Section ‘GraphPAC finds novel proteins compared to iPAC and NMC’), 2) GraphPAC finds additional clusters in proteins identified to contain clustering by other methods (Section ‘GraphPAC identifies additional clusters compared to iPAC and NMC’) and 3) GraphPAC finds fewer clusters compared to other methods (Section ‘GraphPAC finds fewer clusters compared to NMC’). In each of these sections, we discuss the biological relevance of our findings.
GraphPAC finds novel proteins compared to iPAC and NMC
As shown in Table 2, GraphPAC identified five additional proteins as compared to the linear NMC algorithm. In this section we will consider two of these proteins, both of which are directly related to cancer: EGFR, which is also identified by iPAC, and NRP1, which is not identified by iPAC.
GraphPAC identifies additional clusters compared to iPAC and NMC
Table 3. P-value comparison of the three algorithms for several significant clusters
Finally, the majority of mutations in cluster 61–146 also segregate along pathological lines with all the mutations in our data either occurring in lung or gastrointestinal tract/large intestine carcinomas. Specifically, residue 61 mutations are typically found in colorectal and lung cancer [6, 73] while mutations K117N and A146T are found in colorectal cancer.
GraphPAC finds fewer clusters compared to NMC
Table 4. A comparison of GraphPAC and NMC identified clusters for the BRAF structure: (a) clusters found by GraphPAC; (b) clusters found by NMC and dropped by GraphPAC.
Although it is outside the scope of this manuscript to consider every difference between Tables 4a and 4b, we observe that three of the longest clusters 464–671, 466–671 and 469–671 are dropped by GraphPAC. Since after alignment of the protein structural data to the mutational data (see Section ‘Obtaining the 3D structural data’), tertiary information was available on residues 448–603 and 610–723, these clusters cover 77.0%, 76.3% and 75.2% of all the available residues, respectively. By considering the 3D structure via GraphPAC, the longest clusters are dropped and the remaining overlapping clusters focus almost exclusively on residues 464–600.
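The coverage percentages quoted above can be reproduced with a short calculation. The sketch assumes, as the quoted figures imply, that a cluster's length is taken as the full span from its first to its last residue, without subtracting the residues 604–609 that lack tertiary information.

```python
# Residue ranges with tertiary information after alignment (from the text).
available_ranges = [(448, 603), (610, 723)]
n_available = sum(end - start + 1 for start, end in available_ranges)  # 156 + 114


def coverage_percent(cluster_start, cluster_end):
    """Cluster span as a percentage of residues with structural information."""
    span = cluster_end - cluster_start + 1
    return round(100 * span / n_available, 1)


dropped_clusters = [(464, 671), (466, 671), (469, 671)]
percentages = [coverage_percent(s, e) for s, e in dropped_clusters]
```

With 270 residues available, the three spans of 208, 206 and 203 residues give the 77.0%, 76.3% and 75.2% reported in the text.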
After structure and mutation alignment, the residue substitutions in significant clusters include: G464V, G466V, G469V, G469A, N581S, G596R, L597V, L597R, V600E, V600K, K601N and R671Q. Since mutation R671Q does not have extensive literature and comes from a non-specified tissue sample in the COSMIC database, it will no longer be considered here. Thus, by considering the tertiary structure, we significantly narrow the window of which residues to consider for potential driver mutations and can partition the protein into three segments: I) 464–599, II) 600 and III) 601. Segment I is primarily associated with lung and colorectal cancer as shown in [3, 75–77]. Segment II represents the two most common mutations in BRAF, V600E and V600K. Overall, 95% of BRAF mutations occur on V600, with some studies showing that V600E occurs within 73% to 79% of patients while V600K occurs within 12% to 19% of patients [78, 79]. Mutations at this position result in the oncogene being constitutively activated with increased kinase activity and have been found in a wide range of cancers such as metastatic melanoma, ovarian serous carcinoma and hairy cell leukemia. Furthermore, recent inhibitors, such as Vemurafenib and GSK2118436, specifically target the V600E and V600E/K mutations (respectively), supporting the hypothesis that somatic clusters can provide pharmacological targets. Lastly, segment III comprises the much less common K601N mutation, which has been observed in myeloma cases along with V600E. Since these patients share the more common BRAF mutations as well, they may also potentially benefit from BRAF inhibitors.
Further, as shown in Section ‘Results’ and described above, GraphPAC finds fewer clusters for a significant percentage of the structures analyzed. Overall, the reduction in total clusters identified can result from two sources: the removal of some residues because no tertiary data was available or the cluster is no longer significant when using the traveling salesman algorithm to account for 3D structure. The first case, which is already rare, will become increasingly more so as additional studies result in more complete and detailed structural information. For the second case, if a cluster is not found to be significant under GraphPAC when compared to NMC, a near or overlapping cluster is usually found (see Tables 4a and 4b). For BRAF specifically, under every type of graph insertion method (cheapest, nearest and farthest), every "probably damaging" or "possibly damaging" mutation (as classified by PolyPhen-2) was still identified in at least one significant cluster for the structure. For a complete analysis, see "Potential Driver Loss" in Additional file 7.
In this manuscript we provide an alternative method to utilize protein tertiary structure when identifying somatic mutation clusters. By employing a graph theoretic approach to restructuring the protein order, we identify both new clusters in proteins previously shown to have clustering as well as proteins that were not previously shown to have clustering. We have also provided several examples where we are able to identify clusters of mutations that may benefit from pharmacological treatment. Moreover, as GraphPAC uses the NMC algorithm to identify clusters rather than a fixed window size, we are able to detect clusters of varying lengths. Finally, the methodology is fast and robust with the overwhelming majority of structure/protein combinations taking under 10 minutes each to analyze on a consumer desktop.
The GraphPAC algorithm, while presenting a viable alternative to the MDS restriction of iPAC and an improvement over NMC, nevertheless contains several limitations. First, while no longer bound to the MDS requirement of iPAC, there is no closed form solution to the shortest path problem and our algorithm must appeal to heuristic approximations. Second, to satisfy the uniformity assumption, the mutation status of all residues must be known ahead of time. With the growth of high-throughput sequencing however, this issue is temporary. Next, unequal rates of mutagenesis along with hypermutability of specific genomic regions may violate the assumption that every residue has a uniform probability of mutation. To help ensure that this assumption holds, we only consider single residue missense substitutions and have removed insertions and deletions from the analysis since they tend to be sequence dependent. Further, research has shown that CpG dinucleotides may have a mutational frequency ten times or higher compared to other dinucleotides. However, in the analyses presented in Sections ‘GraphPAC finds novel proteins compared to iPAC and NMC’, ‘GraphPAC identifies additional clusters compared to iPAC and NMC’ and ‘GraphPAC finds fewer clusters compared to NMC’, only approximately 13% of the mutations used to identify clustering occurred in CpG sites. Relatedly, colorectal carcinomas contain more transition mutations while cigarette use results in more transversion mutations in lung carcinomas. Still, when considering KRAS, the overwhelming majority of substitutions occur on residues 12, 13, and 61 for both colorectal and lung cancer, implying that while the mutational landscape may vary, it does not have a significant effect on mutation location and thus would not violate the uniformity assumption.
Hence, while this analysis is influenced by a variety of factors, as are previous studies, it nevertheless appears that the primary cause of clustering is selection for a cancer phenotype.
Several areas for future research are also directly evident. First, an approach that considers the protein directly in 3D space via simulation may be employed. However, such an approach would not be able to use the order statistic methodology to identify clustering and thus might not be as sensitive for small mutation counts. Moreover, while we only consider distance when finding the shortest path through the graph, future research can incorporate the physico-chemical properties of the specific residues or domains by appropriately increasing or decreasing edge length. The potential additive effect of multiple cancer mutations in the same protein, as discussed in the case of EGFR by Hashimoto et al., can also be incorporated via additional refinement of the edge weights. Additional research is required in this area in order to incorporate these improvements.
Overall however, GraphPAC utilizes protein tertiary structure via a graph theoretic approach in identifying mutational clustering. We show that this method identifies new clusters that are otherwise missed and that in some cases, pharmaceutical targets for mutations in these clusters have already been found and therapies created. Specifically, Erlotinib and Gefitinib are used to target mutations in EGFR significant clusters (see Section ‘GraphPAC finds novel proteins compared to iPAC and NMC’) while Vemurafenib is used to target mutations that occur within BRAF significant clusters (see Section ‘GraphPAC finds fewer clusters compared to NMC’). This helps confirm the hypothesis that mutational clustering may be indicative of driver mutations and, as new protein structures become available, GraphPAC can provide a rapid methodology to identify such potential mutations.
Availability and requirements
Project Name: GraphPAC: Identification of Mutational Clusters in Proteins via a Graph Theoretical Approach.
Project Home Page: http://www.bioconductor.org/packages/release/bioc/html/GraphPAC.html
Operating system(s): Platform independent
Programming Language: R
Other Requirements: R ≥2.15 (see homepage for R package requirements).
a Under a complete graph, every vertex is connected to every other vertex. The length of the edge between vertices i and j is set equal to the distance between amino acids i and j in three-dimensional space.
b A Hamiltonian path is a walk through the graph that visits every vertex once and only once.
c For this analysis, a manual reconstruction was performed in order to include residue 61 which is listed as a histidine under isoform 2B in the Uniprot Database and a glutamine in the COSMIC database. As the substitution of one amino acid in the structure would not have a significant impact on the spatial structure of the protein, and residue 61 is a highly mutated position, the residue was kept in the analysis. As a result, amino acids 1–167 are used.
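The definitions in endnotes a and b can be made concrete with a small sketch. It builds the complete graph as a distance matrix and then walks a Hamiltonian path through it. The nearest-neighbour rule used here is just one simple heuristic chosen for brevity, and the coordinates are made up; neither should be read as the procedure or data used in the analysis.

```python
import math


def distance_matrix(coords):
    """Complete graph as a matrix: entry [i][j] is the 3D distance
    between residues i and j (every vertex is joined to every other)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def nearest_neighbor_path(dist, start=0):
    """Greedy Hamiltonian path: from the current vertex, always move to the
    closest unvisited vertex. Ties are broken by the lower vertex index."""
    path = [start]
    unvisited = set(range(len(dist))) - {start}
    while unvisited:
        current = path[-1]
        nxt = min(unvisited, key=lambda v: (dist[current][v], v))
        path.append(nxt)
        unvisited.remove(nxt)
    return path


def path_length(dist, path):
    """Total length of the edges traversed along the path."""
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


# Four hypothetical residues placed on a line for easy checking by hand.
coords = [(0, 0, 0), (10, 0, 0), (1, 0, 0), (9, 0, 0)]
dist = distance_matrix(coords)
path = nearest_neighbor_path(dist)
total = path_length(dist, path)
```

The resulting `path` visits every vertex exactly once, so it gives a one-dimensional ordering of the residues in which spatial neighbours tend to sit next to each other. A greedy walk like this is not guaranteed to find the shortest such path.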
We thank Drs. Robert Bjornson and Nicholas Carriero for their time and help in discussing this methodology.
- Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nat Med. 2004, 10 (8): 789-799. 10.1038/nm1087.
- Weinstein IB, Joe AK: Mechanisms of disease: Oncogene addiction–a rationale for molecular targeting in cancer therapy. Nat Clin Pract Oncol. 2006, 3 (8): 448-457. 10.1038/ncponc0558.
- Greenman C, Stephens P, Smith R, Dalgliesh GL, Hunter C, Bignell G, Davies H, Teague J, Butler A, Stevens C, Edkins S, O’Meara S, Vastrik I, Schmidt EE, Avis T, Barthorpe S, Bhamra G, Buck G, Choudhury B, Clements J, Cole J, Dicks E, Forbes S, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jenkinson A, Jones D, et al: Patterns of somatic mutation in human cancer genomes. Nature. 2007, 446 (7132): 153-158. 10.1038/nature05610.
- Wang T: Prevalence of somatic alterations in the colorectal cancer cell genome. Proc Natl Acad Sci. 2002, 99 (5): 3076-3080. 10.1073/pnas.261714699.
- Bardelli A, Parsons DW, Silliman N, Ptak J, Szabo S, Saha S, Markowitz S, Willson JKV, Parmigiani G, Kinzler KW, Vogelstein B, Velculescu VE: Mutational analysis of the tyrosine kinome in colorectal cancers. Science. 2003, 300 (5621): 949. 10.1126/science.1082596.
- Sjöblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JKV, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE: The consensus coding sequences of human breast and colorectal cancers. Science. 2006, 314 (5797): 268-274. 10.1126/science.1133427.
- Kreitman M: Methods to detect selection in populations with applications to the human. Annu Rev Genomics Hum Genet. 2000, 1: 539-559. 10.1146/annurev.genom.1.1.539.
- Ye J, Pavlicek A, Lunney EA, Rejto PA, Teng C: Statistical method on nonrandom clustering with application to somatic mutations in cancer. BMC Bioinformatics. 2010, 11: 11. 10.1186/1471-2105-11-11.
- Ryslik GA, Cheng Y, Cheung KH, Modis Y, Zhao H: Utilizing protein structure to identify non-random somatic mutations. BMC Bioinformatics. 2013, 14: 190. 10.1186/1471-2105-14-190.
- Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR: A method and server for predicting damaging missense mutations. Nat Methods. 2010, 7 (4): 248-249. 10.1038/nmeth0410-248.
- Carter H, Chen S, Isik L, Tyekucheva S, Velculescu VE, Kinzler KW, Vogelstein B, Karchin R: Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res. 2009, 69 (16): 6660-6667. 10.1158/0008-5472.CAN-09-1133.
- Carter H, Samayoa J, Hruban RH, Karchin R: Prioritization of driver mutations in pancreatic cancer using cancer-specific high-throughput annotation of somatic mutations (CHASM). Cancer Biol Ther. 2010, 10 (6): 582-587. 10.4161/cbt.10.6.12537.
- Douville C, Carter H, Kim R, Niknafs N, Diekhans M, Stenson PD, Cooper DN, Ryan M, Karchin R: CRAVAT: cancer-related analysis of variants toolkit. Bioinformatics. 2013, 29 (5): 647-648. 10.1093/bioinformatics/btt017.
- Breiman L: Random forests. Mach Learn. 2001, 45: 5-32. 10.1023/A:1010933404324.
- Cortes C, Vapnik V: Support-vector networks. Mach Learn. 1995, 20 (3): 273-297.
- Reva B, Antipin Y, Sander C: Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 2011, 39 (17): e118-e118. 10.1093/nar/gkr407.
- Ng PC, Henikoff S: Predicting deleterious amino acid substitutions. Genome Res. 2001, 11 (5): 863-874. 10.1101/gr.176601.
- Gnad F, Baucom A, Mukhyala K, Manning G, Zhang Z: Assessment of computational methods for predicting the effects of missense mutations in human cancers. BMC Genomics. 2013, 14 (Suppl 3): S7.
- Gonzalez-Perez A, Mustonen V, Reva B, Ritchie GRS, Creixell P, Karchin R, Vazquez M, Fink JL, Kassahn KS, Pearson JV, Bader GD, Boutros PC, Muthuswamy L, Ouellette BFF, Reimand J, Linding R, Shibata T, Valencia A, Butler A, Dronov S, Flicek P, Shannon NB, Carter H, Ding L, Sander C, Stuart JM, Stein LD, Lopez-Bigas N: International Cancer Genome Consortium Mutation Pathways and Consequences Subgroup of the Bioinformatics Analyses Working Group: Computational approaches to identify functional genetic variants in cancer genomes. Nat Methods. 2013, 10 (8): 723-729. 10.1038/nmeth.2562.
- Torkamani A, Schork NJ: Prediction of cancer driver mutations in protein kinases. Cancer Res. 2008, 68 (6): 1675-1682. 10.1158/0008-5472.CAN-07-5283.
- Borg I, Groenen PJF: Modern Multidimensional Scaling: Theory and Applications. 1997, New York: Springer
- Forbes SA, Bhamra G, Bamford S, Dawson E, Kok C, Clements J, Menzies A, Teague JW, Futreal PA, Stratton MR: The Catalogue of Somatic Mutations in Cancer (COSMIC). Curr Protoc Hum Genet. 2008, Chapter 10: Unit 10.11.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat T, Weissig H, Shindyalov I, Bourne P: The protein data bank. Nucleic Acids Res. 2000, 28: 235-242. 10.1093/nar/28.1.235.
- The UniProt Consortium: Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res. 2011, 40 (D1): D71-D75.
- Durrant J, McCammon JA: Molecular dynamics simulations and drug discovery. BMC Biology. 2011, 9: 71. 10.1186/1741-7007-9-71.
- Pages H, Aboyoun P, Gentleman R, DebRoy S: Biostrings: String objects representing biological sequences, and matching algorithms. 2012, [R package version 2.24.1]. http://www.bioconductor.org/packages/release/bioc/html/Biostrings.html
- Applegate DL: The Traveling Salesman Problem: A Computational Study. 2006, Princeton: Princeton University Press, Princeton series in applied mathematics
- Hahsler M, Hornik K: TSP—Infrastructure for the Traveling Salesperson Problem. J Stat Software. 2007, 23 (2): 1-21.
- Gutin G, Punnen AP: The Traveling Salesman Problem and its Variations. 2007, New York: Springer, No. 12 in Combinatorial optimization
- Rosenkrantz DJ, Stearns RE, Lewis PM II: An analysis of several heuristics for the traveling salesman problem. SIAM J Comput. 1977, 6 (3): 563-581. 10.1137/0206041.
- Dunn OJ: Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 1959, 54 (287): 613-621. 10.1080/01621459.1959.10501524.
- Dunn OJ: Multiple comparisons among means. J Am Stat Assoc. 1961, 56 (293): 52-64. 10.1080/01621459.1961.10482090.
- Gong Y, Kakihara Y, Krogan N, Greenblatt J, Emili A, Zhang Z, Houry WA: An atlas of chaperone–protein interactions in Saccharomyces cerevisiae: implications to protein folding pathways in the cell. Mol Syst Biol. 2009, 5. 10.1038/msb.2009.26.
- Kelly T: Fibroblast activation protein-α and dipeptidyl peptidase IV (CD26): Cell-surface proteases that activate cell signaling and are potential targets for cancer therapy. Drug Resist Updat. 2005, 8 (1–2): 51-58.
- Lantuéjoul S, Constantin B, Drabkin H, Brambilla C, Roche J, Brambilla E: Expression of VEGF, semaphorin SEMA3F, and their common receptors neuropilins NP1 and NP2 in preinvasive bronchial lesions, lung tumours, and cell lines. J Pathol. 2003, 200 (3): 336-347. 10.1002/path.1367.
- Ahn YH, Yang Y, Gibbons DL, Creighton CJ, Yang F, Wistuba II, Lin W, Thilaganathan N, Alvarez CA, Roybal J, Goldsmith EJ, Tournier C, Kurie JM: Map2k4 functions as a tumor suppressor in lung adenocarcinoma and inhibits tumor cell invasion by decreasing peroxisome proliferator-activated receptor γ2 expression. Mol Cell Biol. 2011, 31 (21): 4270-4285. 10.1128/MCB.05562-11.
- Abifadel M, Varret M, Rabès JP, Allard D, Ouguerram K, Devillers M, Cruaud C, Benjannet S, Wickham L, Erlich D, Derré A, Villéger L, Farnier M, Beucler I, Bruckert E, Chambaz J, Chanu B, Lecerf JM, Luc G, Moulin P, Weissenbach J, Prat A, Krempf M, Junien C, Seidah NG, Boileau C: Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat Genet. 2003, 34 (2): 154-156. 10.1038/ng1161.
- Seidah NG: Proprotein Convertase Subtilisin Kexin 9 (PCSK9) Inhibitors in the treatment of hypercholesterolemia and other pathologies. Curr Pharm Des. 2013, 19 (17): 3161-3172. 10.2174/13816128113199990313.
- Cavender JF, Mummert C, Tevethia MJ: Transactivation of a ribosomal gene by simian virus 40 large-T antigen requires at least three activities of the protein. J Virol. 1999, 73: 214-224.
- Jubb AM, Strickland LA, Liu SD, Mak J, Schmidt M, Koeppen H: Neuropilin-1 expression in cancer and development. J Pathol. 2012, 226: 50-60. 10.1002/path.2989.
- Maden CH, Gomes J, Schwarz Q, Davidson K, Tinker A, Ruhrberg C: NRP1 and NRP2 cooperate to regulate gangliogenesis, axon guidance and target innervation in the sympathetic nervous system. Dev Biol. 2012, 369 (2): 277-285. 10.1016/j.ydbio.2012.06.026.
- Mankoo PK, Sukumar S, Karchin R: PIK3CA somatic mutations in breast cancer: Mechanistic insights from Langevin dynamics simulations. Proteins: Struct, Funct, Bioinf. 2009, 75 (2): 499-508. 10.1002/prot.22265.
- Lapenna S, Giordano A: Cell cycle kinases as therapeutic targets for cancer. Nat Rev Drug Discovery. 2009, 8 (7): 547-566. 10.1038/nrd2907.
- Brognard J, Zhang YW, Puto LA, Hunter T: Cancer-associated loss-of-function mutations implicate DAPK3 as a tumor-suppressing kinase. Cancer Res. 2011, 71 (8): 3152-3161. 10.1158/0008-5472.CAN-10-3543.
- Geiger TR, Song JY, Rosado A, Peeper DS: Functional characterization of human cancer-derived TRKB mutations. PLoS ONE. 2011, 6 (2): e16871. 10.1371/journal.pone.0016871.
- Lisabeth EM, Fernandez C, Pasquale EB: Cancer somatic mutations disrupt functions of the EphA3 receptor tyrosine kinase through multiple mechanisms. Biochemistry. 2012, 51 (7): 1464-1475. 10.1021/bi2014079.
- Linka RM, Risse SL, Bienemann K, Werner M, Linka Y, Krux F, Synaeve C, Deenen R, Ginzel S, Dvorsky R, Gombert M, Halenius A, Hartig R, Helminen M, Fischer A, Stepensky P, Vettenranta K, Köhrer K, Ahmadian MR, Laws HJ, Fleckenstein B, Jumaa H, Latour S, Schraven B, Borkhardt A: Loss-of-function mutations within the IL-2 inducible kinase ITK in patients with EBV-associated lymphoproliferative diseases. Leukemia. 2012, 26 (5): 963-971. 10.1038/leu.2011.371.
- Fawdar S, Trotter EW, Li Y, Stephenson NL, Hanke F, Marusiak AA, Edwards ZC, Ientile S, Waszkowycz B, Miller CJ, Brognard J: Targeted genetic dependency screen facilitates identification of actionable mutations in FGFR4, MAP3K9, and PAK5 in lung cancer. Proc Natl Acad Sci. 2013, 110 (30): 12426-12431. 10.1073/pnas.1305207110.
- Herbst RS: Review of epidermal growth factor receptor biology. Int J Radiat Oncol Biol Phys. 2004, 59 (2, Supplement): S21-S26. 10.1016/j.ijrobp.2003.11.041.
- Heimberger AB, Hlatky R, Suki D, Yang D, Weinberg J, Gilbert M, Sawaya R, Aldape K: Prognostic effect of epidermal growth factor receptor and EGFRvIII in glioblastoma multiforme patients. Clin Cancer Res. 2005, 11 (4): 1462-1466. 10.1158/1078-0432.CCR-04-1737.
- Ladanyi M, Pao W: Lung adenocarcinoma: guiding EGFR-targeted therapy and beyond. Mod Pathol. 2008, 21: S16-S22.
- Markman B, Javier Ramos F, Capdevila J, Tabernero J: EGFR and KRAS in colorectal cancer. Adv Clin Chem. 2010, 51: 71-119.
- Yun CH, Boggon TJ, Li Y, Woo MS, Greulich H, Meyerson M, Eck MJ: Structures of lung cancer-derived EGFR mutants and inhibitor complexes: mechanism of activation and insights into differential inhibitor sensitivity. Cancer Cell. 2007, 11 (3): 217-227. 10.1016/j.ccr.2006.12.017.
- Simonetti S, Molina M, Queralt C, de Aguirre I, Mayo C, Bertran-Alamillo J, Sanchez J, Gonzalez-Larriba J, Jimenez U, Isla D, Moran T, Viteri S, Camps C, Garcia-Campelo R, Massuti B, Benlloch S, y Cajal S, Taron M, Rosell R: Detection of EGFR mutations with mutation-specific antibodies in stage IV non-small-cell lung cancer. J Transl Med. 2010, 8: 135. 10.1186/1479-5876-8-135.
- Masago K, Fujita S, Irisa K, Kim YH, Ichikawa M, Mio T, Mishima M: Good clinical response to gefitinib in a non-small cell lung cancer patient harboring a rare somatic epidermal growth factor gene point mutation; codon 768 AGC > ATC in exon 20 (S768I). Jpn J Clin Oncol. 2010, 40 (11): 1105-1109. 10.1093/jjco/hyq087.
- Yoshikawa S, Kukimoto-Niino M, Parker L, Handa N, Terada T, Fujimoto T, Terazawa Y, Wakiyama M, Sato M, Sano S, Kobayashi T, Tanaka T, Chen L, Liu ZJ, Wang BC, Shirouzu M, Kawa S, Semba K, Yamamoto T, Yokoyama S: Structural basis for the altered drug sensitivities of non-small cell lung cancer-associated mutants of human epidermal growth factor receptor. Oncogene. 2012, 32: 27-38.
- Yun CH, Boggon TJ, Li Y, Woo MS, Greulich H, Meyerson M, Eck MJ: Structures of lung cancer-derived EGFR mutants and inhibitor complexes: mechanism of activation and insights into differential inhibitor sensitivity. Cancer Cell. 2007, 11 (3): 217-227. 10.1016/j.ccr.2006.12.017.
- Kancha RK, Peschel C, Duyster J: The epidermal growth factor receptor-L861Q mutation increases kinase activity without leading to enhanced sensitivity toward epidermal growth factor receptor kinase inhibitors. J Thorac Oncol. 2011, 6 (2): 387-392. 10.1097/JTO.0b013e3182021f3e.
- Peraldo-Neia C, Migliardi G, Mello-Grand M, Montemurro F, Segir R, Pignochino Y, Cavalloni G, Torchio B, Mosso L, Chiorino G, Aglietta M: Epidermal Growth Factor Receptor (EGFR) mutation analysis, gene expression profiling and EGFR protein expression in primary prostate cancer. BMC Cancer. 2011, 11: 31. 10.1186/1471-2407-11-31.
- Zhang X, Pickin KA, Bose R, Jura N, Cole PA, Kuriyan J: Inhibition of the EGF receptor by binding of MIG6 to an activating kinase domain interface. Nature. 2007, 450 (7170): 741-744. 10.1038/nature05998.
- Jura N, Endres NF, Engel K, Deindl S, Das R, Lamers MH, Wemmer DE, Zhang X, Kuriyan J: Mechanism for activation of the EGF receptor catalytic domain by the juxtamembrane segment. Cell. 2009, 137 (7): 1293-1307. 10.1016/j.cell.2009.04.025.
- Hansel DE, Wilentz RE, Yeo CJ, Schulick RD, Montgomery E, Maitra A: Expression of neuropilin-1 in high-grade dysplasia, invasive cancer, and metastases of the human gastrointestinal tract. Am J Surg Pathol. 2004, 28 (3): 347-356. 10.1097/00000478-200403000-00007.
- Parikh AA, Liu WB, Fan F, Stoeltzing O, Reinmuth N, Bruns CJ, Bucana CD, Evans DB, Ellis LM: Expression and regulation of the novel vascular endothelial growth factor receptor neuropilin-1 by epidermal growth factor in human pancreatic carcinoma. Cancer. 2003, 98 (4): 720-729. 10.1002/cncr.11560.
- Hong TM, Chen YL, Wu YY, Yuan A, Chao YC, Chung YC, Wu MH, Yang SC, Pan SH, Shih JY, Chan WK, Yang PC: Targeting neuropilin 1 as an antitumor strategy in lung cancer. Clin Cancer Res. 2007, 13 (16): 4759-4768. 10.1158/1078-0432.CCR-07-0001.
- Pan Q, Chanthery Y, Liang WC, Stawicki S, Mak J, Rathore N, Tong RK, Kowalski J, Yee SF, Pacheco G, Ross S, Cheng Z, Le Couter J, Plowman G, Peale F, Koch AW, Wu Y, Bagri A, Tessier-Lavigne M, Watts RJ: Blocking neuropilin-1 function has an additive effect with anti-VEGF to inhibit tumor growth. Cancer Cell. 2007, 11: 53-67. 10.1016/j.ccr.2006.10.018.
- Appleton BA, Wu P, Maloney J, Yin J, Liang WC, Stawicki S, Mortara K, Bowman KK, Elliott JM, Desmarais W, Bazan JF, Bagri A, Tessier-Lavigne M, Koch AW, Wu Y, Watts RJ, Wiesmann C: Structural studies of neuropilin/antibody complexes provide insights into semaphorin and VEGF binding. EMBO J. 2007, 26 (23): 4902-4912. 10.1038/sj.emboj.7601906. [PDB ID: 2QQI]
- Tong Y, Tempel W, Shen L, Arrowsmith C, Edwards A, Sundstrom M, Weigelt J, Bochkarev A, Park H: Human K-Ras in complex with a GTP analogue. 2009, [http://www.rcsb.org/pdb/explore.do?structureId=3GFT] [PDB ID: 3GFT]
- Kranenburg O: The KRAS oncogene: past, present, and future. Biochim Biophys Acta Rev Canc. 2005, 1756 (2): 81-82. 10.1016/j.bbcan.2005.10.001.
- McCoy MS, Bargmann CI, Weinberg RA: Human colon carcinoma Ki-ras2 oncogene and its corresponding proto-oncogene. Mol Cell Biol. 1984, 4 (8): 1577-1582.
- Motojima K, Urano T, Nagata Y, Shiku H, Tsurifune T, Kanematsu T: Detection of point mutations in the Kirsten-ras oncogene provides evidence for the multicentricity of pancreatic carcinoma. Ann Surg. 1993, 217 (2): 138-143. 10.1097/00000658-199302000-00007.
- Zenker M, Lehmann K, Schulz AL, Barth H, Hansmann D, Koenig R, Korinthenberg R, Kreiss-Nachtsheim M, Meinecke P, Morlot S, Mundlos S, Quante AS, Raskin S, Schnabel D, Wehner LE, Kratz CP, Horn D, Kutsche K: Expansion of the genotypic and phenotypic spectrum in patients with KRAS germline mutations. J Med Genet. 2007, 44 (2): 131-135.
- Gremer L, Merbitz-Zahradnik T, Dvorsky R, Cirstea IC, Kratz CP, Zenker M, Wittinghofer A, Ahmadian MR: Germline KRAS mutations cause aberrant biochemical and physical properties leading to developmental disorders. Hum Mutat. 2011, 32: 33-43. 10.1002/humu.21377.
- Tam IYS, Chung LP, Suen WS, Wang E, Wong MCM, Ho KK, Lam WK, Chiu SW, Girard L, Minna JD, Gazdar AF, Wong MP: Distinct epidermal growth factor receptor and KRAS mutation patterns in non-small cell lung cancer patients with different tobacco exposure and clinicopathologic features. Clin Cancer Res. 2006, 12 (5): 1647-1653. 10.1158/1078-0432.CCR-05-1981.
- Qin J, Xie P, Ventocilla C, Zhou G, Vultur A, Chen Q, Liu Q, Herlyn M, Winkler J, Marmorstein R: Identification of a Novel family of BRAF V600E inhibitors. J Med Chem. 2012, 55 (11): 5220-5230. 10.1021/jm3004416. [PDB ID: 4E26]
- Naoki K, Chen TH, Richards WG, Sugarbaker DJ, Meyerson M: Missense mutations of the BRAF gene in human lung adenocarcinoma. Cancer Res. 2002, 62 (23): 7001-7003.
- Davies H, Bignell GR, Cox C, Stephens P, Edkins S, Clegg S, Teague J, Woffendin H, Garnett MJ, Bottomley W, Davis N, Dicks E, Ewing R, Floyd Y, Gray K, Hall S, Hawes R, Hughes J, Kosmidou V, Menzies A, Mould C, Parker A, Stevens C, Watt S, Hooper S, Wilson R, Jayatilake H, Gusterson BA, Cooper C, Shipley J, et al: Mutations of the BRAF gene in human cancer. Nature. 2002, 417 (6892): 949-954. 10.1038/nature00766.
- Gandhi J, Zhang J, Xie Y, Soh J, Shigematsu H, Zhang W, Yamamoto H, Peyton M, Girard L, Lockwood WW, Lam WL, Varella-Garcia M, Minna JD, Gazdar AF: Alterations in genes of the EGFR signaling pathway and their relationship to EGFR tyrosine kinase inhibitor sensitivity in lung cancer cell lines. PLoS ONE. 2009, 4 (2): e4576. 10.1371/journal.pone.0004576.
- Lovly CM, Dahlman KB, Fohn LE, Su Z, Dias-Santagata D, Hicks DJ, Hucks D, Berry E, Terry C, Duke M, Su Y, Sobolik-Delmaire T, Richmond A, Kelley MC, Vnencak-Jones CL, Iafrate AJ, Sosman J, Pao W: Routine multiplex mutational profiling of melanomas enables enrollment in genotype-driven therapeutic trials. PLoS ONE. 2012, 7 (4): e35309. 10.1371/journal.pone.0035309.
- Menzies AM, Haydu LE, Visintin L, Carlino MS, Howle JR, Thompson JF, Kefford RF, Scolyer RA, Long GV: Distinguishing clinicopathologic features of patients with V600E and V600K BRAF-mutant metastatic melanoma. Clin Cancer Res. 2012, 18 (12): 3242-3249. 10.1158/1078-0432.CCR-12-0052.
- Sosman JA, Kim KB, Schuchter L, Gonzalez R, Pavlick AC, Weber JS, McArthur GA, Hutson TE, Moschos SJ, Flaherty KT, Hersey P, Kefford R, Lawrence D, Puzanov I, Lewis KD, Amaravadi RK, Chmielowski B, Lawrence HJ, Shyr Y, Ye F, Li J, Nolop KB, Lee RJ, Joe AK, Ribas A: Survival in BRAF V600–mutant advanced melanoma treated with vemurafenib. N Engl J Med. 2012, 366 (8): 707-714. 10.1056/NEJMoa1112302.
- Grisham RN, Iyer G, Garg K, DeLair D, Hyman DM, Zhou Q, Iasonos A, Berger MF, Dao F, Spriggs DR, Levine DA, Aghajanian C, Solit DB: BRAF Mutation is associated with early stage disease and improved outcome in patients with low-grade serous ovarian cancer. Cancer. 2013, 119 (3): 548-554. 10.1002/cncr.27782.
- Ewalt M, Nandula S, Phillips A, Alobeid B, Murty VV, Mansukhani MM, Bhagat G: Real-time PCR-based analysis of BRAF V600E mutation in low and intermediate grade lymphomas confirms frequent occurrence in hairy cell leukaemia. Hematol Oncol. 2012, 30 (4): 190-193. 10.1002/hon.1023.
- Lemech C, Infante J, Arkenau HT: The potential for BRAF V600 inhibitors in advanced cutaneous melanoma: rationale and latest evidence. Ther Adv Med Oncol. 2011, 4 (2): 61-73.
- Chapman MA, Lawrence MS, Keats JJ, Cibulskis K, Sougnez C, Schinzel AC, Harview CL, Brunet JP, Ahmann GJ, Adli M, Anderson KC, Ardlie KG, Auclair D, Baker A, Bergsagel PL, Bernstein BE, Drier Y, Fonseca R, Gabriel SB, Hofmeister CC, Jagannath S, Jakubowiak AJ, Krishnan A, Levy J, Liefeld T, Lonial S, Mahan S, Mfuko B, Monti S, Perkins LM, et al: Initial genome sequencing and analysis of multiple myeloma. Nature. 2011, 471 (7339): 467-472. 10.1038/nature09837.
- Sved J, Bird A: The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model. Proc Natl Acad Sci. 1990, 87 (12): 4692-4696. 10.1073/pnas.87.12.4692.
- Hollstein M, Sidransky D, Vogelstein B, Harris CC: p53 mutations in human cancers. Science. 1991, 253 (5015): 49-53. 10.1126/science.1905840.
- Hashimoto K, Rogozin IB, Panchenko AR: Oncogenic potential is related to activating effect of cancer single and double somatic mutations in receptor tyrosine kinases. Hum Mutat. 2012, 33 (11): 1566-1575. 10.1002/humu.22145.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.