Chemistry-First Approach for Nomination of Personalized Treatment in Lung Cancer


  • A chemistry-first approach for druggable target identification for lung cancer

  • Mapping the associations between chemicals and genetic lesions in lung cancer

  • Matching chemicals with diverse patient-specific cancer-promoting mechanisms

  • Validating the effect of targeting chemically addressable mechanisms in NSCLC cells


Diversity in the genetic lesions that cause cancer is extreme. In consequence, a pressing challenge is the development of drugs that target patient-specific disease mechanisms. To address this challenge, we employed a chemistry-first discovery paradigm for de novo identification of druggable targets linked to robust patient selection hypotheses. In particular, a 200,000 compound diversity-oriented chemical library was profiled across a heavily annotated test-bed of >100 cellular models representative of the diverse and characteristic somatic lesions for lung cancer. This approach led to the delineation of 171 chemical-genetic associations, shedding light on the targetability of mechanistic vulnerabilities corresponding to a range of oncogenotypes present in patient populations lacking effective therapy. Chemically addressable addictions to ciliogenesis in TTC21B mutants and GLUT8-dependent serine biosynthesis in KRAS/KEAP1 double mutants are prominent examples. These observations indicate a wealth of actionable opportunities within the complex molecular etiology of cancer.



The future of cancer treatment lies in a personalization of medicine, where each patient’s treatment regime is tailored to the genetic diversity of their tumors. Accomplishing this requires a “therapeutic triad,” where appropriate context-specific intervention targets, tightly linked to response biomarkers, are coupled to agents to engage these targets. To date, this has been best realized in disease harboring a druggable oncogenic driver. However, many of the more prevalent and most lethal cancers do not present with this opportunity. Non-small cell lung cancer (NSCLC) is a leading cause of cancer-related death in the United States and is a highly heterogeneous disease. A clinically relevant contributor to this disease heterogeneity is the diversity of molecular etiology associated with individual NSCLC tumors. Specifically, lung squamous carcinoma (LUSC) and lung adenocarcinoma (LUAD) represent the second and the third most highly mutated tissue subtypes in The Cancer Genome Atlas (TCGA), with a mean non-synonymous mutation burden of ∼250 mutations/tumor. This greatly increases the challenge of understanding the molecular drivers underpinning a patient’s disease, knowledge that is usually the starting point for hypothesis driven design of new therapeutic approaches. However, a large mutational burden also increases the probability that NSCLCs will contain vulnerabilities, not found in normal cells, which might be exploited therapeutically. The problem is how to identify and engage such vulnerabilities. Here, we employ a chemistry-driven de novo discovery strategy tailored for coincident delivery of preclinical therapeutic triads.


A Tiered Screening and Analytic Strategy for Chemistry-First Target Discovery

To generate an experimental testbed reflective of the molecular and mechanistic heterogeneity of lung cancer, we assembled a panel of 96 NSCLC (mainly LUAD) cell lines and 4 immortalized human bronchial epithelial cell lines (HBECs) (Data S1). Reasonable concordance of the phenotypic variation of this panel with human tumors was evaluated using legacy whole-genome transcript array data (Figures 1A and S1A). High-resolution molecular characterization was then carried out by whole-exome sequencing (WES) (Data S2), RNA sequencing (RNA-seq) (Data S3), tiled SNP arrays (Data S4), reverse phase protein array (RPPA) (Data S5), and heavy carbon tracing from glucose and glutamine into selected metabolites. For 34 NSCLC cell lines, matching B cell lines from the same patients were sequenced, allowing for robust discrimination of somatic versus germline variation (Data S2). For the remaining cell lines, we developed a computational pipeline leveraging somatic alleles detected in the matched pairs and public datasets to filter probable germline variation (Figures 1B, S1B, and S1C; STAR Methods). We noted high concordance between transcript profiles from RNA-seq and hybridization arrays that were performed years apart, providing confidence in the accuracy and stability of cell line provenance (Figure S1D).

A deterministic clustering method, affinity propagation clustering (APC) (Frey and Dueck, 2007), produced more than 15 distinct phenotypic groups as defined by gene expression profiles (Figure S1E). Based on this, we devised a tiered high-throughput screening strategy to screen 202,103 chemicals across 12 cell lines representing overall phenotypic diversity of the panel (Figure S1F). Filters were implemented at each tier to enrich for small molecules that could selectively target the phenotypic variation across the NSCLC cell line panel (Figures S1F–S1I; STAR Methods). This culminated in 208 compounds and an additional 14 chemicals with known mechanism of action, which were tested for efficacy across the complete panel of 100 NSCLC cell lines at 12 doses in triplicate in two independent runs. This set was evaluated for potency and selectivity using both area under the dose response curve (AUC) and the effective dose required to reduce activity by 50% (ED50). In both instances, low values indicate sensitivity. While we observed statistically significant correlations between AUC and ED50 values, there was a subset of chemicals for which AUC was uncoupled from ED50 values (Figure S1J). ED50 values produce a large dynamic range of chemical response values corresponding to the activity inflection point, while AUC values reflect magnitude of response. As the two metrics provide complementary information, we employed AUC and ED50 values in all subsequent association analyses (Data S6). Furthermore, we examined potential correlations of chemical sensitivities (AUC and ED50 metrics) with cell doubling times (Data S6), as differences in proliferation rates can make confounding contributions to chemical sensitivity profiles within cell panels (Hafner et al., 2017). APC of the final collection of compounds, based on activity across the cell line panel, produced at least 38 distinct clusters (Figure S1K).
A ranking of this collection, which we refer to as a “precision oncology probe set” (POPS), by potency and activity revealed robust selectivity profiles (median fold changes from 2–90,000, Figures 1C and S1L).
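To make the distinction between the two response metrics concrete, the following is a minimal sketch, not the pipeline used in this study. It assumes a Hill-shaped viability curve, a trapezoidal AUC over log10 dose, and an ED50 defined as the dose at the midpoint between the top and bottom of the observed curve. The function names, dose range, and curve parameters are all hypothetical.

```python
import math

def auc_log_dose(doses, viability):
    """Trapezoidal area under the viability curve over log10(dose),
    divided by the log-dose range so the result lies between 0 and 1."""
    x = [math.log10(d) for d in doses]
    area = 0.0
    for i in range(1, len(x)):
        area += 0.5 * (viability[i] + viability[i - 1]) * (x[i] - x[i - 1])
    return area / (x[-1] - x[0])

def ed50(doses, viability):
    """Dose at the midpoint between the first (top) and lowest (bottom)
    observed response, found by linear interpolation on log10(dose).
    Returns None if the curve never crosses the midpoint."""
    half = (viability[0] + min(viability)) / 2.0
    for i in range(1, len(doses)):
        hi, lo = viability[i - 1], viability[i]
        if hi >= half >= lo and hi != lo:
            frac = (hi - half) / (hi - lo)
            x0, x1 = math.log10(doses[i - 1]), math.log10(doses[i])
            return 10 ** (x0 + frac * (x1 - x0))
    return None

def hill(dose, bottom, ec50=1.0, slope=1.0):
    """Hypothetical Hill-shaped viability curve with a top of 1.0."""
    return bottom + (1.0 - bottom) / (1.0 + (dose / ec50) ** slope)

# Twelve half-log doses (arbitrary units), mirroring a 12-dose design.
doses = [10 ** (k / 2 - 3) for k in range(12)]
full_kill = [hill(d, bottom=0.0) for d in doses]   # complete response
partial = [hill(d, bottom=0.6) for d in doses]     # shallow response

ed50_full, ed50_partial = ed50(doses, full_kill), ed50(doses, partial)
auc_full, auc_partial = auc_log_dose(doses, full_kill), auc_log_dose(doses, partial)
```

In this toy example the two curves share the same inflection point, so their ED50 values coincide, while the AUC separates the complete response from the shallow one. That is one way AUC can become uncoupled from ED50.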

To begin to pursue chemical/genetic associations, we first parsed the cell line panel based on similarity of chemical response (Figure 1D) and each quantitative molecular feature set (Figures S1M–S1P). We overlaid annotations from feature-specific clusters onto those derived from chemical sensitivity, observing unimpressive correspondence with any feature set (Figures 1E, 1F, and S1Q–S1T). This suggests global molecular diversity cannot account for the observed selective chemical responses. This observation, together with similar previous observations from our group and others, led us to pursue sparse feature selection for finding robust chemical/genetic associations (Eskiocak et al., 2017; Garnett et al., 2012; Iorio et al., 2016; Kim et al., 2013). For this, we employed a combination of regularized machine learning (elastic net) and probability-based metrics (scanning Kolmogorov-Smirnov [KS]) to isolate features from each molecular dataset predicting sensitivity to each chemical. As compelling proof of concept, these methods identified high ALK expression as predicting sensitivity to the ALK inhibitor crizotinib (Figure 1G) and EGFR mutations and amplifications as predicting sensitivity to the EGFR inhibitor erlotinib (Figure 1H). As expected, the EGFR mutant, erlotinib-sensitive cell lines have mutations in the kinase domain known to affect EGFR function. Notably, among the EGFR mutant non-responders, 2 cell lines harbored preexisting T790M mutations (Figure S1U), a known cell-autonomous adaptive mechanism promoting inhibitor resistance not detected by related chemical profiling efforts (Figure S1V). Together, these observations indicate that clinically relevant associations are discoverable within this experimental schema.
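As an illustration of the probability-based metric, the sketch below computes a plain two-sample Kolmogorov-Smirnov statistic comparing compound response between mutant and wild-type cell lines. It is a simplified stand-in and does not reproduce the scanning procedure described in STAR Methods; the response values are invented for illustration.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the empirical CDFs of the two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    d_max = 0.0
    for v in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(1 for x in sample_a if x <= v) / n_a
        cdf_b = sum(1 for x in sample_b if x <= v) / n_b
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max

# Hypothetical AUC values for one compound (low AUC = sensitive).
mutant_auc = [0.21, 0.25, 0.30, 0.33]
wild_type_auc = [0.62, 0.70, 0.74, 0.81, 0.85, 0.90]

# Mutant lines are uniformly more sensitive, so the CDFs never overlap.
d_separated = ks_statistic(mutant_auc, wild_type_auc)

# Interleaved responses give a much smaller statistic.
d_mixed = ks_statistic([0.3, 0.7, 0.5], [0.4, 0.6, 0.8])
```

A statistic near 1 indicates that the feature cleanly partitions responders from non-responders, whereas interleaved responses give a small value. A significance threshold would still have to be applied to the statistic before any association is reported.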

High-Throughput Discovery of Pharmacological Liabilities among Chemicals with Robust Selective Sensitivity Profiles

From the elastic net feature discovery approach, we noted a cohort of chemicals (12/221) for which sharp and selective sensitivity profiles (median deltas of >60-fold) were associated with expression of 1 of 9 known drug metabolism enzymes (Figures 2A and S2A). The compounds are structurally diverse and target distinct groups of cell lines (Figures S2B–S2D), prompting the attractive possibility that pharmacological liabilities could be detected and flagged early in chemistry-first target discovery screening cascades (i.e., for these chemicals, selective sensitivity may be due to selective production of a toxic metabolite). To test this, we first assessed chemical stability of each compound in groups of sensitive and resistant cell lines using liquid chromatography-mass spectrometry (LC/MS)-based approaches (Figures 2B–2E and S2E–S2J). Six compounds displayed accelerated metabolism, evident by loss of parent compound, selectively in the sensitive cell lines (Figures 2B–2E, S2E, and S2F). We further examined SW157765 as a member of a cluster of compounds associated with activity in cells with high expression of the cytochrome P450 family member, CYP4F11. Notably, the CYP4F family inhibitor, HET0016, reversed metabolism of the compound (Figure 2F), and CRISPR-mediated knockout of CYP4F11 reversed toxicity in otherwise sensitive cell lines (Figures 2G and S2K–S2M). We next predicted the activity of SW001286 and SW126788, within a panel of 26 previously untested NSCLC cells, using the weighted elastic net models. Expression of the carboxylesterases CES1 and CES1P1 was cleanly predictive of SW126788 sensitivity in this test set (Figure S2N). However, prediction accuracy of SW001286 sensitivities was lower due to 4 unanticipated non-responders (Figure S2O). As with SW157765, HET0016 rescued SW001286 toxicity in two sensitive cell lines (Figure 2H).
Thus, the metabolic products of SW001286 in CYP4F11-expressing cells may not be behaving as a general toxin(s), but rather may be targeting a selective vulnerability in sensitive cell lines. In line with this possibility, a manual examination of the KS test output indicated mutations in LKB1 as an additional marker that, together with CYP4F11 expression, better stratifies response (Figures 2I and S2P). In response to metabolic stress, LKB1 activates AMPK to suppress anabolic and activate catabolic pathways to maintain energy and redox homeostasis via inhibition of ACC1 (Jeon et al., 2012). A mechanistic connection between SW001286 sensitivity and loss-of-function LKB1 mutations was supported by the observation that depletion of ACC1 (Figure 2J), or addition of a ROS scavenger (Figure S2Q), partially rescued SW001286 sensitivity.

Chemicals correlating with expression of CYP4F11 represent the largest observed “prodrug” class (Figure 2A). The P450 class of enzymes, of which CYP4F11 is a member, can oxidize a variety of substrates. The scope of these substrates has not been fully characterized, but some chemical transformations tend to recur thematically within the class. One such transformation, the demethylation of aryl methoxy groups, is predicted to occur with two of the proposed pro-drug chemicals (Figure S2R). The remaining 3 compounds share a common furan-substituted alkene functional group that can likely be engaged by xenometabolic enzymes. Furthermore, the internal alkene is a reasonable site for enzymatic oxidation and protein conjugation. Consistent with the possibility that protein/small molecule conjugation might be at play, 72 hr ED50s were similar following either transient or sustained exposure to SW157765 (Figure S2S). In an effort to identify structural components of the molecule required for biological activity, we designed and synthesized a series of analogs of SW157765, of which analog 500-01 (Figure S2T) was found to be completely inert when tested for viability in an NSCLC cell line panel (Figure S2U). Interestingly, 500-01 differs from the parent molecule in only the hydrogenation of its internal alkene, suggesting SW157765 requires this functional group for biological activity. Considering that alkenes and other sites of unsaturation can be transformed into points of conjugation with larger biomolecules, the discovery of SW157765's active moiety converged with our hypothesis that the molecule can behave as a covalent modifier. We next sought to identify labile metabolites of SW157765 that might be susceptible to protein conjugation, with specific attention given to metabolite species whose modifications appeared at the double bond.
Mass spectrometry-based evaluation of metabolites produced by H2122 cells treated with either SW157765 or 500-1 for 8 hr identified an oxidized metabolite, unique to SW157765 (Figures S2V–S2Y), potentially containing an epoxide at the site of the internal alkene (Figure S2Z). Considering that strained epoxide ring systems are subject to facile protein adduction, we propose this epoxide metabolite is the active covalent ligand. Efforts to chemically synthesize the metabolite revealed it to be too unstable to produce in quantities needed for in vitro testing, perhaps underpinning the semi-transient nature and robust reactivity profile of the molecule in a biological setting.

As a final example of high-throughput detection of pharmacological liabilities, we noted that high expression of the multi-drug resistance transporter, ABCG2, predicts resistance to the CDK7 inhibitor THZ1 (Figure 2K). RNAi-mediated depletion of ABCG2 sensitized resistant cells to THZ1, suggesting it is an ABCG2 substrate (Figure 2L).

NOTCH2 Mutations Predict Cellular Sensitivity to Glucocorticoids

From within the Prestwick library, we noted a cluster of 5 glucocorticoid (GC) receptor agonists with highly correlated selective activity profiles and a strong association with mutations in NOTCH2 (Figures 3A and 3B). NOTCH2 has been implicated as a tumor suppressor in some settings. For example, NOTCH2 and NOTCH1 expression are oppositely correlated with prognosis in colorectal cancer, where low expression of NOTCH2 and high expression of NOTCH1 is predictive of poor patient outcome (Chu et al., 2011). Additionally, in experimental models of lung cancer, NOTCH2 loss, but not NOTCH1 loss, promotes aggressive disease (Baumgart et al., 2015). The “dispersed” pattern of NOTCH2 alleles detected among the NSCLC cell lines is reminiscent of loss-of-function alterations typically associated with a tumor suppressor (Figure S3A), and the glucocorticoid-sensitive cell lines harboring these mutations display downregulation of Notch pathway genes as compared to wild-type counterparts (Figure 3C).

Depletion of the ubiquitously expressed GC receptor, NR3C1, was sufficient to reduce cellular sensitivity to GC exposure, suggesting the selective toxicity phenotype is receptor-dependent (Figure 3D). Intriguingly, several studies have linked GC response to Notch pathway activity. For example, activation of Notch signaling is associated with GC resistance in T-ALL (Inaba and Pui, 2010) and gamma-secretase inhibitors, which block the activation of Notch, restore sensitivity to GC. Moreover, a mutually antagonistic relationship exists between Notch effector, HES1, and NR3C1, in which each represses transcription of the other (Real et al., 2009; Revollo et al., 2013). Consistent with these observations, we found significantly higher basal expression levels of NR3C1 mRNA in NOTCH2 mutant, GC-responsive cell lines (Figure 3E). Transcription of NR3C1 itself is responsive to GC induction, and we observed significant induction of NR3C1 in response to GC stimulation in these cells (Figure 3F). These observations indicate that GC-responsive cells are primed to propagate an NR3C1 signal through a GC-dependent positive feedback amplification loop.

We found the selective efficacy of GCs was preserved in 3D spheroid models of lung cancer (Figure 3G). Thus, we sought to understand the mechanism by which differential activity of Notch signaling may specify sensitivity to GCs. HES1 is a general transcriptional repressor that has been described to occupy the promoters of GC-inducible genes and acts as a master negative regulator of GC response (Revollo et al., 2013). Similarly, we found that GC exposure selectively reduced cellular HES1 protein levels in GC-sensitive NSCLC cells (Figures 3H and S3C). GC exposure was cytostatic, resulting in a selective G1/S arrest (Figure S3D). GCs suppress inflammation through transcriptional activation of anti-inflammatory genes and direct inhibition of nuclear factor κB (NF-κB) and activator protein 1 (AP-1). A well-known target of both pathways is cyclin D1, which was selectively reduced in sensitive NSCLC cells exposed to GC (Figure 3I). Finally, stable overexpression of HES1 from a GC-independent CMV promoter was sufficient to rescue GC-induced cell-cycle arrest (Figures 3J, S3E, and S3F). We therefore suspect that NOTCH2 mutations, in NSCLC cells, result in reduced Notch signaling and higher basal NR3C1 expression, priming cells to respond to GC with G1 cell-cycle arrest (Figure S3G). While GC therapy is not commonly used in therapeutic doses to treat patients with lung cancer, 5.9% of LUAD tumors and 5.1% of LUSCs in the TCGA have mutations or deletions in NOTCH2, corresponding to thousands of patients a year that could be treated with an FDA-approved therapy.

Ready Detection of a Biologically Diverse Array of Chemical/Genetic Associations

To enrich for robust chemical/genetic associations that enable productive new target pursuit, we established a strict inclusion criteria threshold for automated reporting of predictive biomarker hypotheses from the elastic net (STAR Methods). Receiver operator characteristics and odds ratios were calculated as confidence metrics. Finally, predicted responder population frequencies were evaluated using the TCGA LUAD cohort. To enable open access for community-based hypothesis testing, we integrated the final results and all associated quantitative data into a searchable web-based GUI (Data S6).

From the output, we selected 26 predicted chemical/genetic associations to experimentally evaluate for a distributive assessment of reliability and biological diversity. The weighted elastic net models derived from the training set (100 cell line panel) were applied to a distinct panel of 33 previously untested NSCLC cell lines (Data S1; test set). For 21 of the 26 chemicals, at least one cell line in the test set was predicted to be sensitive. Nine of these were validated by empirical testing (Figures 4A and S4B). In addition, 13 chemicals were selected for evaluation of conservation of selective activity in spheroid assays, 9 of which validated (Figures 4B and S4A). Four chemicals passing one or both criteria were selected for additional functional characterization (SW036310, SW151511, SW140154, SW208097).
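Applying a trained linear model to held-out cell lines amounts to computing a weighted sum of the selected features for each new line. The following is a minimal sketch of that step under stated assumptions: the weights, intercept, expression values, cell line names, and sensitivity cutoff are all invented for illustration and are not taken from the study's models. CES1 and CES1P1 are used only because they are named above as predictive features.

```python
def predict_response(weights, intercept, features):
    """Weighted sum of the selected features plus an intercept.
    Here the output is read as a predicted AUC (low = sensitive)."""
    return intercept + sum(w * features[name] for name, w in weights.items())

# Hypothetical trained model: negative weights mean that higher expression
# of the feature predicts a lower AUC, i.e. greater sensitivity.
weights = {"CES1": -0.08, "CES1P1": -0.05}
intercept = 0.95
sensitivity_cutoff = 0.5  # hypothetical AUC threshold

# Hypothetical expression values for two untested cell lines.
test_set = {
    "line_A": {"CES1": 8.0, "CES1P1": 4.0},
    "line_B": {"CES1": 1.0, "CES1P1": 0.5},
}

scores = {
    name: predict_response(weights, intercept, feats)
    for name, feats in test_set.items()
}
predicted_sensitive = sorted(
    name for name, score in scores.items() if score < sensitivity_cutoff
)
```

Each predicted call would then be compared with the empirically measured response in the test set to decide whether the association validates.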

The automated scanning KS analysis indicated that mutations in TTC21B correspond to sensitivity to the benzothiazole-containing small molecule SW036310 (Figure 4C), in which selective efficacy was preserved in spheroid assays (Figure 4D). TTC21B (aka, IFIT139B) is the only known protein to act solely as a retrograde transport motor for primary cilia. Somatic mutations in TTC21B have not been characterized in the setting of cancer; however, loss-of-function TTC21B mutations upregulate cilia-dependent processes in mice (Tran et al., 2008) and are causal mutations in human developmental diseases driven by primary cilia malfunction (ciliopathies) (Davis et al., 2011). Gain-of-function primary cilia growth and signaling occurs upon loss of TTC21B activity including activation of sonic hedgehog and potentially other processes known to be regulated by primary cilia, including NF-κB, VHL, and transforming growth factor β (TGF-β) signaling pathways. Indeed, whole genome transcript profiles indicated that gene signatures associated with activation of these pathways were selectively enriched (Figure S4B) and primary cilia were selectively detectable (Figure S4C) in TTC21B mutant, SW036310-sensitive cell lines (Figure S4B). Given these associations, we suspect that SW036310 may perturb a target(s) associated with primary cilia biology that supports survival of TTC21B mutant cells. Consistently, SW036310 sensitivity almost perfectly correlated with sensitivity to the cytoplasmic dynein inhibitor, ciliobrevin, known to disrupt primary cilia by perturbing anterograde trafficking to that organelle (Figure 4E).

From a distinct biological context, we examined two chemicals, with anti-correlated activity profiles (Figure 4F), corresponding to expression of positive and negative modulators of innate immune signaling (Figure 4G). Using the derived weighted sum elastic net model, we found SW140154 sensitivity was accurately predicted outside the training set by a combination of high expression of the negative regulator of the Toll-like receptor (TLR) signaling pathway, SARM1, and low expression of the cytokine receptor, IL18R1 (Figures 4H and S4D), while SW151511 sensitivity could be predicted by high expression of the positive regulator of the TLR pathway, PELI2 (Figures 4H and S4E). We compared cell lines on opposing ends of the sensitivity spectrum and noted that high expression of TLR pathway genes was associated with sensitivity to SW151511 and resistance to SW140154 (Figure 4I). Sensitivity to SW151511 (Figures 4B and 4J), but not SW140154 (Figures 4B and S4F), was recapitulated in spheroid culture models. We therefore selected SW151511 for examination of global gene expression responses to chemical challenge. Two sensitive and two resistant cell lines were treated with SW151511 for 24 hr prior to transcript profiling. We found significant chemically induced expression changes associated with the host defense response (Figure S4G). Notably, this signature was elevated above baseline upon compound exposure, suggesting amplification of a maladaptive innate-immune signaling program may represent a conditional vulnerability in cell lines responsive to SW151511.

Finally, low nanomolar sensitivity to SW208097 was predicted and validated to correspond to co-occurring mutations in TP53 and KEAP1 (Figures 4K and 4L). This is notable, as the molecule is a well-tolerated investigational drug (GSK923295) targeting the mitotic motor protein CENPE. GSK923295 has dose-proportional pharmacokinetics in humans and a low number of grade 3/4 adverse events; however, responder populations have not been identified. Co-occurring TP53 and KEAP1 mutations are detected in 9.6% of LUAD in the TCGA, which extrapolates to ∼17,000 patients/year potentially harboring GSK923295-responsive disease.

A Chemically Addressable Vulnerability of KRAS/KEAP1 NSCLC Cells to Perturbation of SLC2A8

KRAS mutant lung cancers are common, aggressive, and difficult to manage in the clinic. Therefore, we chose this class for in-depth pursuit of tool compound/target/biomarker triads. We previously reported the overarching phenotypic diversity among KRAS mutant lung cancer cell lines is essentially equivalent to the global phenotypic variation found across all characterized NSCLC cell lines (Kim et al., 2016). Consistent with this, we noted KRAS mutant NSCLC lines distributed across the majority of APC similarity clusters, defined by RNA-seq, within the larger NSCLC cell line panel (Figure 5A). This genomic mRNA expression diversity was mirrored by a diversity of sensitivity of KRAS mutant cell lines to the POPS collection (Figures 5B and 5C). However, automated scanning KS analyses returned 4 chemical associations with KRAS mutant subtypes that passed p value thresholds (p < 2E−4). These subtypes were defined by co-occurring mutations in KEAP1, NUP214, PTPRT, and TTC21B (Figures S5A and 5D). Among these, the association of KRAS/KEAP1 double mutant cell lines with sensitivity to SW157765 (Figure 5D) was verifiable in a test set of NSCLC lines distinct from the training set (Figure 5E), and selective efficacy was preserved in spheroid models of lung cancer (Figures 5F and S5B). KEAP1 is a major regulator of the NRF2 antioxidant response. Under normal physiological conditions, NRF2 is constantly ubiquitinated in the cytoplasm by the CUL3/KEAP1 E3 ligase/substrate adaptor complex. Upon stress, KEAP1 inactivation facilitates NRF2 nuclear translocation and consequent activation of the NRF2-dependent anti-oxidant and cytoprotective transcriptional responses. Deleterious mutations or deletions in KEAP1 are present in ∼19% of LUADs and ∼12% of LUSCs corresponding to constitutive NRF2 activity.
Co-occurring mutations in KEAP1 and KRAS are present in ∼6% of LUADs, significantly more than expected by chance (p = 0.007), suggesting they are under positive selective pressure during disease development. There were, however, a few KEAP1 wild-type cell lines that were responsive to SW157765 (Figure 5G). These included the only cell line in the panel that harbors a KEAP1 homozygous deletion (Figure S5C) resulting in undetectable KEAP1 mRNA (Figure 5G). These also included cell lines (2/2) with amino acid substitutions in the degron domain of NRF2 (Figure 5G), corresponding to hotspot NRF2 mutations detected in LUAD tumors lacking functional KEAP1 degradation motifs producing constitutively activated variants (Figure S5D). We did not detect additional variants in known NRF2 pathway genes among the 8 remaining sensitive cell lines that were KRAS/KEAP1 wild-type. However, these cell lines had a significant upregulation of an NRF2-dependent gene expression signature (p < 2.2E−16) (Figure S5E) and can be predicted to have high NRF2 pathway activity despite the absence of discernable NRF2-related lesions. High expression of this NRF2 gene signature (Figure S5F) was predictive of sensitivity to SW157765 when applied to cell lines outside the training set (Figure 5H).
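One common way to ask whether two mutations co-occur more often than expected by chance is a one-sided hypergeometric (Fisher-type) test. The sketch below illustrates that calculation. The text does not state which test produced the reported p value, so this choice is an assumption, and the tumor counts are invented rather than taken from the TCGA.

```python
from math import comb

def cooccurrence_p_value(n_total, n_a, n_b, n_both):
    """One-sided hypergeometric tail probability of observing at least
    n_both tumors carrying both mutations, given n_a tumors with mutation A
    and n_b with mutation B among n_total, if the two were independent."""
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, min(n_a, n_b) + 1)
    )
    return tail / denom

# Hypothetical cohort: 200 tumors, 60 KRAS mutant, 38 KEAP1 mutant,
# 18 double mutant. Independence predicts 60 * 38 / 200 = 11.4.
p_value = cooccurrence_p_value(n_total=200, n_a=60, n_b=38, n_both=18)
expected_double_mutants = 60 * 38 / 200
```

A small tail probability indicates that the observed number of double mutants exceeds what independent assortment of the two lesions would predict.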

SW157765 is found as a member of the “prodrug” compounds in which high expression of CYP4F11 is predictive of, and required for, cellular response (Figures 2A and 2G). Of note, CYP4F11 is a candidate NRF2 target gene whose expression is upregulated in NRF2-dependent NSCLC (Goldstein et al., 2016). Given this association, we assessed NRF2-dependent regulation of CYP4F11 in an SW157765-sensitive cell line and found that small interfering RNA (siRNA)-mediated depletion of NRF2 resulted in depletion of CYP4F11 (Figures 5I and S5G) and reduction of sensitivity to SW157765 (Figure 5J). This suggests that NRF2 pathway activity leads to selective production of a toxic SW157765 metabolite. However, we also noted that HCC44 is an SW157765-resistant cell line with high expression of CYP4F11 and corresponding SW157765 metabolism (Figure 2E), suggesting CYP4F11-dependent modification of SW157765 is not sufficient to account for chemical sensitivity. Consistent with this, we found siRNA-mediated depletion of KRAS completely rescued cellular sensitivity to SW157765 even though metabolism of the compound was unaffected (Figures 5K and 5L). Thus, KRAS and NRF2 pathway activation combine to produce a selective cellular vulnerability to SW157765 intervention.

To help identify the nature of this vulnerability, we employed an arrayed genome-scale affinity-selection/mass spectrometry screening strategy to identify SW157765 interacting proteins from among a panel of ∼14,000 candidates (Figure S6A). The non-canonical glucose transporter, GLUT8 (SLC2A8), was the sole hit, with an estimated Kd = 200 nM (Figure 6A). Parental SW157765 was used as a substrate for the binding assay rather than the CYP4F11-dependent oxidized metabolite (Figure S2Z) as we were unable to synthesize the latter de novo. As an alternate approach, we undertook docking studies to assess binding of SW157765 to GLUT8 in comparison to the predicted metabolite (Figure S2Z). A homology model of GLUT8 was developed from the homologous crystal structure of GLUT1 (Kapoor et al., 2016). A crystal structure of GLUT8 has not been published; however, GLUT1 has a 48% sequence similarity with GLUT8, and most of the critical residues for glucose transport are conserved between the two proteins. SW157765 and the epoxide metabolite were predicted to dock on top of one another, except for a deviation near the epoxide region (Figure S6B). While both chemicals were predicted to interact with residues on GLUT8 (Figure S6B), the epoxide constrains the furan ring, resulting in a significant shift of this ring (55°) relative to the parent molecule, producing stronger predicted interactions with Trp433 and a higher docking score (−7.9 versus −7.6). Notably, docking studies with the inactive analog, 500-1, indicate it fails to achieve a binding pose similar to SW157765, with fewer H-bonding interactions and a relatively poor docking score of only −6.8. In aggregate, these analyses are consistent with interaction of both SW157765 and its oxidized metabolite with GLUT8. Enhanced GLUT8 thermal stability in cells treated with SW157765, but not 500-1, provided orthogonal evidence for this interaction (Figure S6C).

GLUT8 is a member of the class III glucose transporters thought to mainly participate in translocation of glucose across the blastocyst membrane (Carayannopoulos et al., 2000). A role for GLUT8 in cancer has not been well studied, although it has been found to be significantly upregulated in endometrial cancer and in multiple myeloma relative to normal tissue. While the dominant glucose transporter in tumor cells is thought to be GLUT1, upregulated class III glucose transporters may support higher energy demands in some cases (Schmidt et al., 2009). Supporting this notion, glucose uptake and viability of a subset of multiple myeloma cell lines was found to be dependent on the continued expression of GLUT8 but not GLUT1 (McBrayer et al., 2012). Notably, we found SW157765-sensitive NSCLC cell lines were also selectively sensitive to glucose deprivation (Figure S6D) and to GLUT8 depletion (Figure 6B). Furthermore, SW157765 selectively inhibited fluorescent 2-deoxyglucose (2DG) uptake in SW157765-sensitive cells in a dose-dependent manner (Figure 6C). In contrast, GLUT1 depletion (Figures S6E and S6F) had no effect on 2DG uptake (Figure 6D) or viability (Figure S6G). These observations are consistent with action of SW157765 at the level of GLUT8 inhibition and a selective dependence of KRAS/KEAP1 mutant cells on GLUT8 for glucose consumption.

Uniformly labeled [13C6] glucose is metabolized via glycolysis to 3-phosphoglycerate (3PG), which can enter the serine biosynthetic pathway. There it is converted in a series of steps to serine, which is subsequently cleaved to produce glycine and a one-carbon intermediate that can enter the folate cycle to ultimately result in the production of purines and thymidines (Figure S6H). Of relevance to this study, high NRF2 activity was recently demonstrated to promote serine/glycine biosynthesis, in some NSCLC cell types, through ATF4-dependent expression of rate-limiting serine biosynthetic enzymes (PHGDH, PSAT1, PSPH, SHMT1, and SHMT2). De novo serine biosynthesis is upregulated in subsets of lung cancer, breast cancer, glioma, and melanoma, presumably to support glutathione and nucleotide production, and can be required for tumor cell survival. When we examined carbon flux from uniformly labeled [13C6] glucose into serine (SerM3) and glycine (GlyM2) mass isotopomers across 63 NSCLC cell lines, we found significant enrichment of heavy carbons in the SW157765-sensitive cell lines (Figure 6E), corresponding to significantly higher expression of serine biosynthetic pathway genes (Figure S6I) and selective consequences on cell survival upon depletion of the ATF4 transcription factor (Figure 6F) and PHGDH (Figure 6G), the enzyme that catalyzes the first committed step in the serine biosynthetic pathway. Taken together, these findings indicate a dependence of KRAS/KEAP1 mutant NSCLC cells on consumption of glucose to support serine biosynthetic pathway activity.
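Heavy carbon enrichment in a metabolite is typically summarized as the fraction of the total metabolite pool found in a given mass isotopomer, for example the fully labeled M+3 form of serine. The sketch below shows that calculation under simplifying assumptions: the ion intensities are invented, no correction for natural isotope abundance is applied, and the function name is hypothetical.

```python
def fractional_enrichment(isotopomer_intensities, label):
    """Fraction of the total metabolite pool present as one mass
    isotopomer (e.g. 'M3' for serine carrying three heavy carbons)."""
    total = sum(isotopomer_intensities.values())
    if total <= 0:
        raise ValueError("total isotopomer intensity must be positive")
    return isotopomer_intensities[label] / total

# Hypothetical serine isotopomer intensities after [13C6] glucose labeling.
# Serine has three carbons, so isotopomers run from M0 to M3.
sensitive_line = {"M0": 600.0, "M1": 50.0, "M2": 50.0, "M3": 300.0}
resistant_line = {"M0": 900.0, "M1": 40.0, "M2": 10.0, "M3": 50.0}

ser_m3_sensitive = fractional_enrichment(sensitive_line, "M3")
ser_m3_resistant = fractional_enrichment(resistant_line, "M3")
```

In this toy comparison the higher SerM3 fraction corresponds to greater flux of glucose-derived carbon into serine, which is the pattern described for the SW157765-sensitive lines.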

These cumulative observations led us to consider that SW157765 may be acting to reduce carbon flux through the serine biosynthetic pathway, leading to selective targeting of cancer cells dependent on this pathway. To test this, we pretreated H647 cells (KRAS/KEAP1 mutant) with SW157765 for 24 hr, an interval over which we do not observe significant induction of cell death (Figure S6J), followed by exposure to [13C2] glucose (Figure S6O). Heavy carbon labeling of serine (SerM2) reached steady state after 2 hr and was reduced 5-fold by SW157765 (Figure 6H). Iteration of this approach within a panel of NSCLC cell lines indicated that basal carbon flux through the serine biosynthetic pathway was higher in SW157765-sensitive cell lines, as expected, and that exposure to SW157765 selectively reduced serine labeling in these cell lines (Figure 6I). Notably, carbon flux from glucose through the pentose phosphate pathway (PPP; LacM1) or the citric acid cycle (TCA; CitM2) (Figures S6K, S6L, and S6O) was not affected by SW157765. Both KRAS and NRF2 are known to shunt glucose toward the PPP. Thus, the selective consequences of SW157765 exposure on serine/glycine metabolism suggest a dominant routing of available glucose to the PPP and TCA. GLUT8 may therefore support supplementary glucose consumption to an extent that provides for serine/glycine biosynthetic demands of the KRAS/KEAP1 mutant cellular context.

While KEAP1 and KRAS mutation status was robustly predictive of sensitivity to SW157765, we noted 4 unanticipated non-responders (DFCI024, HCC44, H2030, HCC4019) with co-occurring mutations in KEAP1 and KRAS and high expression of CYP4F11 (Figure S6M). Importantly, PHGDH expression was greatly reduced or absent in all 4 cell lines (Figure 6J). Additionally, cell line H2030 completely lacks mRNA expression of PSAT1 (Figures S6H and S6M). These cells may be resistant to SW157765 due to a pre-existing adaptation that reduces contributions of glycolysis to serine/glycine biosynthesis. To test this, we stably expressed either full-length PHGDH or a hypomorphic mutant (PHGDHV490M) (Tabatabaie et al., 2009) in HCC44 cells under the control of a doxycycline-inducible promoter (Figure S6N). Overexpression of PHGDH, but not PHGDHV490M, sensitized HCC44 cells to SW157765 (Figure 6K), suggesting that PHGDH re-established carbon flux into serine/glycine production with consequent reliance on GLUT8 to maintain sufficient glucose consumption.

In summary, we have shown that co-occurring mutations in KEAP1 and KRAS define a vulnerability to loss of GLUT8 function. Inhibition of GLUT8 is associated with a reduction of glucose intake, leading to a selective shunting of glucose away from serine biosynthesis. We found that overexpression of wild-type PHGDH can re-sensitize HCC44 cells to SW157765. Perhaps most intriguingly, re-introduction of PHGDH also can sensitize cells to GLUT8 inhibition. These findings suggest that shunting cellular consumption of glucose to serine biosynthesis generates a dependency on GLUT8 that can be selectively targeted with SW157765. To help assess the generality of this relationship, we profiled SW157765 for toxicity in a panel of 27 breast cancer cell lines with publicly available genomics data (Barretina et al., 2012). Importantly, we found that copy number-driven amplification of PHGDH expression together with high expression of CYP4F11 significantly corresponded to SW157765 sensitivity (Figure 6L).


The chemistry-first target nomination approach employed here was designed to leverage large-scale uncharacterized chemical diversity as a de novo discovery tool unconstrained by any preconceived notions of mechanistic relationships. Of the 202,103 chemicals employed, <1% are associated with known or suspected modes of action. However, computer-automated rediscovery of current precision medicine relationships from within the collection of known compounds authenticated the screening cascade. This included the association of erlotinib sensitivity with EGFR mutation status and the association of crizotinib sensitivity with EML4-ALK translocations. Likewise, the pipeline returned novel and robust repurposing hypotheses and biomarkers for clinically available compounds that currently lack patient selection hypotheses.

While the above observations credentialed the rigor of the experimental and data analytics pipeline, the identification of novel chemical/genetic relationships was the primary objective. To that end, 171 compounds were linked to genomic features within a 95% confidence interval by the elastic net. These chemical/genetic relationships spanned a strikingly diverse array of biological processes including selective vulnerabilities associated with host defense pathway activation, ciliogenesis, and nuclear hormone signaling. Furthermore, pharmacological relationships that were a consequence of selective chemical clearance and/or selective chemical metabolism were readily detectable.
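To illustrate why elastic net regression suits this kind of biomarker nomination, the sketch below fits a response vector against a small feature matrix and returns only the features with non-zero coefficients. It is a generic coordinate-descent elastic net under stated assumptions: columns and responses are pre-centered, the penalty settings (`alpha`, `l1_ratio`) are arbitrary, and the data and feature names are toy values invented for illustration. It does not reproduce the pipeline or the confidence-interval procedure used in this study.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Minimize (1/(2n))*||y - Xb||^2 + alpha*(l1_ratio*||b||_1
    + 0.5*(1 - l1_ratio)*||b||^2) by cyclic coordinate descent.

    Assumes centered columns and a centered response (no intercept term).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            column = [row[j] for row in X]
            z = sum(v * v for v in column) / n
            if z == 0.0:
                continue  # constant (all-zero) feature carries no signal
            # Correlation of feature j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(column, residual)) / n
            new = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1.0 - l1_ratio))
            delta = new - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(column, residual)]
                beta[j] = new
    return beta


# Toy data: 8 hypothetical cell lines, 3 hypothetical binary features coded -1/+1.
feature_names = ["feature_A", "feature_B", "feature_C"]
X = [
    [+1, +1, +1], [+1, +1, -1], [+1, -1, +1], [+1, -1, -1],
    [-1, +1, +1], [-1, +1, -1], [-1, -1, +1], [-1, -1, -1],
]
# Toy centered response driven only by feature_A.
y = [2.0, 2.0, 2.0, 2.0, -2.0, -2.0, -2.0, -2.0]

beta = elastic_net(X, y)
selected = [name for name, b in zip(feature_names, beta) if b != 0.0]
print(selected, beta)
```

The L1 component of the penalty drives uninformative coefficients exactly to zero, which yields a sparse candidate biomarker list, while the L2 component stabilizes the fit when genomic features are correlated.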

To pursue candidate “therapeutic triads,” we focused on KRAS mutant adenocarcinoma. When considered as a single class, these cell models displayed diverse and discordant response profiles to the chemical collection employed. However, significant chemical/genetic associations were detected upon segmentation of cell models according to mutations in additional genes that co-occurred with KRAS mutations at a reasonable frequency. This is in accordance with accumulating evidence that KRAS mutant lung cancers parse across multiple distinct mechanistic subtypes. Most notably, KRAS/KEAP1 double mutant NSCLC cells were selectively sensitive to the benzothiazole, SW157765, due to the convergent consequences of dual KRAS and NRF2 modulation of metabolic and xenobiotic gene regulatory programs. GLUT8 was identified as a mechanistic target of SW157765 and is selectively required to support the diversion of glucose to serine biosynthesis in this genetic background. Modulation of these regulatory programs by orthogonal means was sufficient to modulate SW157765 responsiveness. We note that lineage-restricted biomarker discovery was key to identifying the KRAS/KEAP1 synthetic chemical relationship. Of note, parallel analysis within a large cohort of breast cancer cell lines returned sensitivity-associated biomarkers indicative of a conserved biological mode-of-action for SW157765, but there was no relationship with KRAS/KEAP1 mutation status, an oncogenotype that is exceedingly rare in breast cancer.

We find that the application of a large diversity-oriented chemical collection, within a carefully delineated phenotypic screening convention, can uncover a compelling diversity of heretofore unappreciated target opportunities within the seeming cacophony of molecular etiology of lung cancer. Importantly, these target opportunities are, by nature of the discovery paradigm, associated with precision medicine strategies and pharmacologically addressable. Furthermore, it is clear that chemical vulnerabilities can be revealed that are linked to recurrent mutations in lung cancer patients that are not currently “actionable.” Thus, we argue that many undeveloped avenues remain open for productive pursuit of tumor-intrinsic precision medicine.


This work was supported by grants from the NIH (CA197717, CA176284, CA70907, CA142543), CPRIT (RP120732, RP110708, RP110708), Robert Welch Foundation (I-1414), the Korea Health Technology R&D project through the Korea Health Industry Development Institute (HI14C1324), and the National R&D Program for Cancer Control (1420100), funded by the Ministry of Health & Welfare, Republic of Korea. E.A.M. was supported by NIH training grant 5T32GM8203-27, C.D. and R.M.V. were supported by CPRIT training grant RP140110, and R.M.V. was supported by NIH training grant 5T32CA124334-09. We would like to thank Hanspeter Niederstrasser, Melissa McCoy, Shuguang Wei, Hong Chen, and Anwu Zhou in the UT Southwestern High-throughput Screening Core for their support of the large-scale screening and dose-response experiments described herein.
