The Complexity Of Cancer

A deep dive into the causes and consequences of tumour heterogeneity

John Cassidy
January 29, 2020
February 10, 2020

Two of the biggest problems facing oncology are that tumours are heterogeneous, and that they are constantly changing over time and in response to therapeutic intervention. The clinical implication of these phenomena is that drugs will not have uniform efficacy across a patient group or throughout the entire treatment cycle. OncOS, a precision AI platform for personalised cancer treatment, gives oncologists a flexible means by which to stratify their patients for clinical trials and/or FDA approved therapies.

In this technical read, I will discuss the causes and consequences of tumour heterogeneity and evolution.


Cancer has been known to be heterogeneous since its detailed study by experimental pathologists began at the start of the 19th century. At first, this heterogeneity was primarily characterised by differences in cellular morphology (1), followed by heterogeneity across surface marker expression (2) and later differences in tumour growth rates (3) and response to therapy (4). Recently, the study of tumour heterogeneity has exploded, thanks in part to large profiling endeavours such as TCGA and ICGC. These multinational collaborations have helped elucidate the true scale of diversity across human neoplasms (5).

Study of tumour heterogeneity is far from a purely academic pursuit. Early work in breast cancer, for example, allowed stratification of patients based on the presence of oestrogen receptor alpha (ERα), which led to the successful targeting of tamoxifen for ERα positive (ERα+) patients (6). More recent work has enabled comprehensive stratification of breast and other cancers (5,7).

For example, again in breast cancer, a 50 gene signature (PAM50) can be used to stratify patients into four intrinsic subtypes (luminal A, luminal B, HER2-enriched & basal-like) with distinct clinical outcomes (8,9). One study even integrated copy number (CN) data with transcriptomics to improve on these earlier classifications and uncover 11 distinct Integrative Clusters (7). Defined in 2,000 tumours, this classification was validated in over 7,500 tumours (10) and shown to clearly associate with discrete clinical outcomes, suggesting that these clusters represent distinct biological entities.

Improving the taxonomy of cancer is the initial step towards a better understanding of the drivers of tumour growth and consequently towards improved precision medicines. It is hoped that this strategy may ultimately lead to development of the next generation of targeted therapies (11).

Further to heterogeneity between patients, it has long been known that tumours harbour distinct cellular populations within their bulk (1,12). For example, Fidler and Kripke reported in 1977 that clonal populations derived from mouse metastatic melanomas varied extensively in their ability to seed metastasis in syngeneic hosts (13). Of note amongst the relatively early research into tumour heterogeneity were important observations made in human patients. Although highly unethical by today’s standards, these offered unequivocal evidence for functional heterogeneity and fuelled five decades of subsequent research. For example, various studies around the late 1960s showed by in vivo radiolabelling that the morphologically distinguishable populations of human leukemic cells differed remarkably in their proliferative potential (14, 15, 16).

The observation that human cancers contain functionally different populations was echoed by Chester Southam, an immunologist and oncologist at Memorial Sloan Kettering Cancer Center and Cornell University Medical College, who in 1962 showed that autologous engrafted human tumour cells differ in their ability to reform tumours (17). The lack of informed patient consent in these studies led to Southam being punished by the Regents of the University of the State of New York, who found him guilty of fraud, deceit, and unprofessional conduct. Echoing the questionable ethics of the time, he was elected president of the American Association for Cancer Research the following year.

Collectively, the early studies discussed showed that tumours were not simply a growth of homogeneous cells with equal proliferative potential and tumour forming ability, but contained a heterogeneous mixture of cellular populations. The observation that tumour cells differ in their ability to xeno- and auto-transplant was added to by seminal studies on teratocarcinomas (18), small cell lung carcinomas (19) and mammary adenocarcinomas (20), to give rise to the cancer stem cell (CSC) model of tumour development and heterogeneity. Early evidence that genetic aberrations were the cause of a tumour’s phenotypic traits (21) supported the idea that somatic evolution of genomic clones could occur, allowing Darwinian selection in response to spatial and temporal selective pressures (12).

Together these theories have influenced a significant proportion of the cancer research occurring today. However, translation of this research into the clinical setting has been slower (22). Cambridge Cancer Genomics was founded to help translate such findings into clinically meaningful insights that can help improve cancer therapy.

Genomic Drivers of Tumour Heterogeneity

Cancer is, first and foremost, a disease of the genome. Indeed, both inter- and intratumour heterogeneity can be explained by the genomic instability inherent to a tumour’s biology and the sequential acquisition of driver mutations. For example, adenomatous polyposis coli (APC) loss in colorectal cancer (CRC) fuels the first stages of clonal expansion (25,26).

Though changes in a tumour’s microenvironment (for example increased inflammation or immune cell infiltrate) or epigenetic regulation (for example MLH1 promoter methylation in microsatellite unstable CRC) are undoubtedly required to transform a clonal expansion of benign cells into a malignancy (21, 27), the most fundamental feature of a tumour is that it contains genomic mutations.

Through the course of tumour initiation and progression, cancer cells undergo repeated mutational events, which may or may not confer a survival advantage (or ‘fitness’) on their progeny. With time, this process generates a dominant clone (group of cells) that will expand and dominate the site where it was generated through Darwinian selection in response to spatial and temporal selective pressures (12).

When clones arise with an increased fitness (or when selective pressures change), less advantaged clones will either disappear or will be maintained as sub-clones alongside the dominant clone, acting as a reservoir from which evolution can continue (27). This compelling theory was first put forward by Peter Nowell in 1976 (12) and is supported both by early evidence that genetic aberrations were the cause of a tumour’s phenotypic traits (21) and by more recent genomics research (28,29).
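The selection dynamics described above can be illustrated with a toy model. The sketch below is not taken from any of the cited studies: it assumes a fixed set of clones, made-up fitness values, no new mutations and no spatial structure, and simply rescales each clone's share of the tumour by its relative fitness every generation.

```python
def select(frequencies, fitness, generations):
    """Deterministic selection: each generation, a clone's share of the
    tumour is rescaled by its fitness relative to the population mean."""
    freqs = list(frequencies)
    for _ in range(generations):
        mean_fitness = sum(f * w for f, w in zip(freqs, fitness))
        freqs = [f * w / mean_fitness for f, w in zip(freqs, fitness)]
    return freqs


# Hypothetical tumour with three clones; the rarest clone is the fittest.
start = [0.90, 0.09, 0.01]
fitness = [1.00, 1.05, 1.20]  # illustrative values, not measured

after = select(start, fitness, generations=50)

for name, before, now in zip("ABC", start, after):
    print(f"clone {name}: {before:.2f} -> {now:.3f}")
```

Even in this simplified setting the fittest clone comes to dominate, while the less fit clones shrink but never reach exactly zero, which is the 'reservoir' behaviour the model of sub-clonal maintenance relies on. Changing the fitness list part-way through a run would mimic a change in selective pressure such as the start of treatment.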

It is now accepted that tumours harbour various layers of genomic complexity and that the resultant heterogeneity can have a profound effect on cancer progression. Moreover, genomic instability, which fuels the diversity essential for any Darwinian process, is intertwined with both the development and maintenance of tumour heterogeneity, and the clinical consequences thereof.

Genomic instability drives mutations (errors) in a cell's DNA, leading to cancer development

Genomic Instability & Tumour Heterogeneity

According to Hanahan and Weinberg, genomic instability is an enabling characteristic that helps generate the hallmarks of cancer, and is the major driving force behind intra- and intertumoral heterogeneity (30,31). Throughout the process of tumour development, cancer cells can accrue thousands of mutations, some of which can even involve the gain or loss of entire chromosome arms (32).

However, there is evidence that the number of mutations cannot increase endlessly without adversely affecting cell fitness (33, 34, 35, 36, 37), implying the existence of a limit of tolerance. Hence, cancer cells must exist in constant balance between instability-driven cell growth and the point where the consequences of gross genomic changes become lethal to the cell (32).

Solid tumours can be classified based on the dominance of single nucleotide variants (SNVs), i.e. M-class tumours, or copy number aberrations (CNAs), i.e. C-class tumours (38).

The contribution of each class of somatic mutation to genomic instability, the balance between instability-driven cell growth & cell death, and tumour heterogeneity, is varied. For example, the total burden of SNVs at a given time depends on the efficacy of SNV appearance and clearance by, for example, p53-induced apoptosis (32).

In some cancers, e.g. microsatellite instability (MSI) high CRCs, SNV burden is increased by a loss of DNA mismatch repair and this is linked to a favourable outcome. Intolerance to a high SNV burden could be due to increased immune clearance. Indeed, MSI high CRCs and metastatic melanomas have both been shown to respond well to immune checkpoint inhibitors (39, 40, 41, 42).

Simulation studies have provided insight into how natural selection adjusts mutation rate in tumours with a high SNV burden (34,44). For example, one study found that under fluctuating environmental conditions (e.g. oxygen availability or temperature), the rate of SNV accumulation increased linearly until reaching a critical limit. SNV accumulation at a rate above this variable threshold led to population level extinction events (45). Another simulation study found that a high SNV accumulation rate initially leads to rapid tumour growth, but that beyond a certain threshold, leads to negative clonal selection and is consequently less favourable for cellular expansion (46).

We have developed a system for tracking the gain or reversion of mutations over time (for example during treatment). By following emerging mutations, we may be able to better select therapeutic interventions to counter tumour growth and drug resistance.

Interestingly, concepts such as immune surveillance, which limit the proportion of cells with a high SNV burden, are not easy to adapt to explain the toxicity associated with a high rate of SNV formation. However, there does exist a theoretical framework, seen in the development of life, for a limiting threshold in DNA replication error rate (47).

Specifically, it has been proposed that if a SNV error rate were to exceed some catastrophic threshold, then the information in the genome would be effectively decayed and the fidelity of genome maintenance across generations would be severely impacted (34,48).
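One classic way to formalise such a threshold is the error threshold from quasispecies theory. Whether this is the exact framework the cited work uses is an assumption on my part, and the numbers below are purely illustrative. In its simplest form, a 'master' genome of length L that replicates sigma times faster than its mutants is only maintained if the probability of an error-free copy, (1 - u)^L, multiplied by sigma, stays above 1, where u is the per-base error rate.

```python
import math


def copy_fidelity(per_base_error, genome_length):
    """Probability that a whole genome is copied without a single error."""
    return (1.0 - per_base_error) ** genome_length


def below_error_threshold(per_base_error, genome_length, superiority):
    """True if the master sequence can be maintained: fidelity * sigma > 1."""
    return copy_fidelity(per_base_error, genome_length) * superiority > 1.0


def max_error_rate(genome_length, superiority):
    """Per-base error rate at which (1 - u)^L * sigma equals exactly 1."""
    return 1.0 - superiority ** (-1.0 / genome_length)


# Hypothetical values: a 1,000-base 'genome' with a two-fold advantage.
L, SIGMA = 1000, 2.0
u_max = max_error_rate(L, SIGMA)

print(f"threshold error rate: {u_max:.6f}")
print(f"ln(sigma)/L approximation: {math.log(SIGMA) / L:.6f}")
print("u = 0.0005 tolerated:", below_error_threshold(0.0005, L, SIGMA))
print("u = 0.0010 tolerated:", below_error_threshold(0.0010, L, SIGMA))
```

The useful intuition is that the tolerable per-base error rate scales roughly as ln(sigma)/L, so the longer the genome that must be faithfully maintained, the lower the mutation rate a lineage can sustain.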

This conceptual framework is supported by theoretical models and evolutionary experiments (33, 34, 35, 36, 49), applicable to unicellular organisms, multicellular organisms, and neoplastic cells (50, 51). Mechanistically, it is thought that SNV error rate limits are supported by ‘gatekeeper’ genes (such as TP53) which may be induced by oxidative stress and/or high SNV mutation rates, halting the cell cycle and initiating apoptosis (52, 53, 54, 55).

CNAs differ from SNVs in that they can encompass a vast region of a cell’s genome. CNAs are thought to confer substantial phenotypic plasticity, through gene duplication or deletion, and have been described as the driving force of genetic diversification (19). Indeed, there is evidence supporting the more central role of CNAs than SNVs in the development and maintenance of neoplastic cell population diversity (56, 57, 58). However, CNAs appear limited by a similar overall rate and burden limit (32, 59, 60).

For example, in one study, fluorescent in situ hybridisation (FISH) of the centromeres of chromosomes 2 and 15 was used to define CNA driven genomic instability in ERα- breast cancer patients. The authors found that patients whose tumour cells were 45% chromosomally abnormal had a significantly better prognosis than those with a lower number of chromosomal abnormalities (60).

Similarly, TNBCs with a gene expression signature associated with high chromosomal instability are associated with increased time until relapse compared with those with low predicted instability (59). Limits on CNA abundance could be explained by biophysical constraints (e.g. chromosome size limiting alignment to the centre of the nucleus and therefore metaphase efficiency) (61), gene dosage (e.g. amplification of neoantigens or tumour suppressors in a duplicated region) (62) or apoptosis initiated by DNA damage (e.g. DNA double-strand breaks initiating p53-dependent signal transduction) (63, 64).

In summary, genomic instability is a major driver of tumour heterogeneity, yet whilst heterogeneity may be associated with a poor prognosis, instability itself may be associated with better patient outcome. These observations can be reconciled by considering that a high CNA burden may result either from multiple clones with low levels of CNA burden or from a few clones with high levels of CNA burden (32). When CNA burden is spread among many clones, the associated prognosis is less favourable, indicating that it is the CNA burden per clone that limits tumour viability.
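The distinction between tumour-wide and per-clone burden is easy to lose in prose, so here is a minimal sketch of it. The representation is an assumption made for illustration only: each clone is modelled as a set of altered genomic segments (arbitrary integer labels), the bulk burden is the number of segments altered in any clone, and the per-clone burden is the largest number carried by a single clone.

```python
def bulk_burden(clones):
    """Segments altered in at least one clone (what a bulk sample reflects)."""
    return len(set().union(*clones))


def max_clone_burden(clones):
    """Largest number of altered segments carried by any single clone."""
    return max(len(clone) for clone in clones)


# Hypothetical tumours; integers are arbitrary labels for genomic segments.
polyclonal = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]              # many clones, few CNAs each
oligoclonal = [{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7}]  # few clones, many CNAs each

for name, tumour in [("polyclonal", polyclonal), ("oligoclonal", oligoclonal)]:
    print(name, "bulk:", bulk_burden(tumour), "per-clone max:", max_clone_burden(tumour))
```

Both hypothetical tumours show the same bulk burden, but only in the second does any individual clone carry a load that might approach a tolerance limit. This is why a single bulk measurement cannot, on its own, separate the two scenarios.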

Thus, genomic instability gives rise to heterogeneity and a polyclonal tumour, but an overly high CNA or SNV burden in any specific clone limits its viability. Genomic instability and tumour heterogeneity are best considered as a delicate balance between favourable growth characteristics and cellular toxicity (32).

Intertumour Heterogeneity

The earliest events in a tumour’s evolution are fuelled by specific genomic aberrations, which can have profound effects on intertumour heterogeneity. For example, ESR1 and ERBB2 amplification, leading respectively to ERα (65) and human epidermal growth factor receptor 2 (HER2) (66) protein up-regulation, can be early events in breast cancer initiation. These events might be prognostic and predictive of drug responses, suggesting they can be used to classify cancer into different subtypes.

For example, ERα+ tumours (80% of breast cancers) tend to have a better prognosis and are treated with oestrogen receptor antagonists (e.g. tamoxifen) or aromatase inhibitors (e.g. anastrozole), whereas HER2+ tumours (20% of breast cancers) are generally faster growing, more aggressive and are treated with antibodies against HER2 (e.g. trastuzumab) (67).

We have built computational models to predict the effect of individual mutational events on the probability of treatment success. It is hoped that as these methods improve, we will be able to aid oncologists in the best possible treatment selection for an individual tumour.

In breast cancers, ESR1 and ERBB2 amplification can also occur in the same tumours, with or without the presence of progesterone receptor (PR). Indeed, the first molecular based classification that dramatically changed clinical practice and breast cancer patient outcome was based on ERα, PR and HER2 status (67).

Continued technological advances have made clear that a wide range of genomic aberrations can drive the tumorigenic process. Recently, a driver-based taxonomy of breast cancer has been defined based on copy number and gene expression data (7). The 11 molecular subtypes identified show distinct prognosis and molecular drivers, reaffirming breast cancer heterogeneity.

Beyond the effects of individual genomic aberrations, the order in which cells acquire mutations can have profound effects on intertumour heterogeneity and disease progression.

In Philadelphia chromosome negative myeloproliferative neoplasms (MPNs), recent work has demonstrated that within patients harbouring both a Janus kinase 2 (JAK2) and Tet methylcytosine dioxygenase 2 (TET2) mutation, those who acquired the TET2 mutation first were less likely to present with the MPN subtype polycythemia vera than with essential thrombocythemia (68).

Thus, complete phenotypic heterogeneity is observed between patients with the same mutational load depending on the order of mutational events.

Significantly, some of these early oncogenic driver events can also shape the subsequent clonal evolution that heavily influences intratumour heterogeneity. Taking the most extreme example, hypermethylation or mutation of MutL homolog 1 (MLH1) leads to a hypermutator phenotype in colorectal cancer (69).

This microsatellite instability phenotype both distinguishes MLH1 mutant tumours from other CRCs and leads to widespread intratumour heterogeneity (70), which has been linked to higher resistance to therapy. More recently, mutant phosphatidylinositol 3-kinase alpha catalytic subunit (PIK3CA) has been shown to enable plasticity in differentiated breast cells, paving the way towards functional intratumour heterogeneity in breast cancers with PIK3CA mutations (71, 72).

More broadly, PIK3CA and other members of the phosphatidylinositol 3-kinase (PI3K) pathway are amongst the most commonly mutated genes in breast cancer (73), and cross-talk between signalling networks emanating from mutant PIK3CA and ERα has been shown to impact significantly on breast cancer initiation and progression (74, 75).

An overview of tumour heterogeneity. From left to right: intertumour heterogeneity ensures that no two malignancies are the same; genomic clonal populations exist within a tumour (coloured); and functional heterogeneity (shape) exists within isogenic populations. This non-genomic intratumour heterogeneity is due to intrinsic epigenetic differences (not shown), interaction with the immune infiltrate (top panel), differences in tumour metabolism (e.g. hypoxia; middle panel) and interaction with the extracellular matrix and stromal component (bottom panel). Each of the depicted environments could have different effects on cellular functions. Highlighting functional consequences of heterogeneity, in the lower half, two genomic clones are depicted as resistant to chemotherapy and able to repopulate the tumour after treatment. Dead cells are coloured black in the centre of the lower panel. These resistant clones could be genomically distinct or isogenic but functionally distinct.

Intratumour Heterogeneity

The acquisition and order of driver mutations can have profound implications for intertumour heterogeneity. However, tumours are also characterised by continuous clonal evolution as they develop (76). Progeny of founder clones undergo repeated mutational events that may confer a fitness advantage with regard to specific spatial or temporal selective pressures (12).

Clonal evolution is present in precancerous and advanced lesions and helps define both inter- and, in particular, intratumour heterogeneity. For example, sequencing data on 234 biopsies of normal skin from four individuals showed multiple cancer-associated genes were under positive selection even in normal tissue (77). The authors observed clonal expansion of skin cells with early driver mutations across patients and overall found driver mutations at a density of 140 per square centimetre of sun-exposed skin. As this study focussed on pre-cancerous tissues, we cannot know if any of these early lesions would lead to tumour growth but the authors do present strong evidence that clonal evolution occurs even in the earliest stages of a neoplasm’s development (77).

Numerous groups have been able to reconstruct the clonal hierarchy of individual tumours. For example, Nik-Zainal et al. were able to combine deep sequencing with novel bioinformatics tools to reconstruct the clonal history of 21 breast cancers (28). The authors showed that breast cancers evolve through the infrequent acquisition of driver mutations, each of which allows clonal expansion and eventual dominance. Interestingly, as the most recent common ancestor appeared relatively early, minor clones were able to coexist and diversify alongside the dominant clone (28).

This model of branched evolution allows for a genetic pool of minor clones able to fuel new stages of clonal evolution if selective pressures change. In agreement with this study, by reconstructing the clonal composition of 104 triple negative breast cancers (TNBCs), another study observed a complete spectrum of molecular and clonal compositions at patient diagnosis (78).

Alongside this model of branched evolution, the survival of multiple sub-dominant clones can be explained by the spatial segregation of clones across the tumour as a whole. This pattern was hinted at in the pre-cancerous clonal expansions of the normal skin (77) and fully considered in renal cell carcinoma (79) and lung cancers (80).

Indeed, in a recent study from Caravagna et al., the temporal order of some genomic changes in a tumour could be inferred from multiregional sequencing (81). The authors used transfer learning to transition neural networks trained on their own datasets to large multi-region sequencing datasets from lung, breast, renal, and colorectal cancer, in each case detecting repeated evolutionary trajectories in subgroups of patients.

A public release of the authors’ software package, ‘REVOLVER’, could empower researchers to stratify patient groups on the basis of how their tumours evolved (81). We take this one step further by developing tools to predict how tumours are evolving in real time and on a patient-by-patient basis. These kinds of advances allow oncologists to know which medication to prescribe, and at which point in a patient’s treatment cycle, to best treat an individual patient.

Clinical Consequences of Tumour Heterogeneity

The typical attrition rate of new investigational drugs submitted for clinical trials is around 88% (65). Consequently, the average cost of bringing a new therapeutic agent through to regulatory approval, a process that can take a decade, is over $2.56B (65,66).

In a previous article, we discussed how AI and big data analytics were improving this unsustainable rate of attrition in clinical trials. However, a paradigm shift is required in the field to halt the declining availability of new cancer medicines.

In order to reduce the attrition of experimental cancer agents and improve the outcome of patients already treated with targeted agents, we must develop a more comprehensive picture of tumour heterogeneity.

Tumour heterogeneity means that cancer medicines won't necessarily have the same effectiveness across all tumours

The first and most profound consequence of tumour heterogeneity for clinical practice is that chemotherapy and targeted agents do not have the same efficacy across malignancies of the same subtype or even across the same tumour (67).

In breast cancer, for example, the earliest classifications stratified based on the presence or absence of hormone receptors (ER/PR) and HER2. This first molecular stratification had unprecedented clinical implications exemplified by the strong benefit of oestrogen pathway inhibitors in ER+ and anti-HER2 therapy in HER2+ breast cancers.

With the advent of large scale sequencing projects, our stratification of breast cancer has become more precise (68). Early genomic classifications based on single parameters (e.g. PAM50 and gene expression) have evolved into complex integrative methodologies designed to capture heterogeneity across multiple levels, such as the 11 Integrative Clusters defined by Curtis et al. (7). Multi-parameter stratification continues to improve.

Now, efforts are underway to stratify both breast (69) and colorectal cancers (70) based on immune infiltrates and immuno-genomic signature. Such classification could allow more targeted use of, for example, novel immunotherapies (44).

Clonal populations within the same tumour can have profound influence on response to therapy, the emergence of drug resistance and disease progression.

Currently, our ability to predict the emergence of drug resistance in tumours requires a priori knowledge of resistance mechanisms and the identification of resistance-associated clones within a tumour. However, there is some evidence that the development of resistance is an inevitable consequence of single agent targeted therapies (71), and that by following individual genomic signatures over time, we may be able to first predict, and then counter, the emergence of drug resistance. We are working tirelessly to get these novel, complex stratification biomarkers into clinical practice and to develop ways in which drug resistance can be predicted.

Typically, resistance results from the outgrowth of specific pre-existing populations within a tumour rather than from de novo evolution (72).

Indeed, the wider the diversity of minor clonal populations in a tumour, the more likely it is that resistance will arise. Such an association between tumour heterogeneity and drug resistance has been noted in ovarian (73), and oesophageal (74) cancers. Additionally, basal-like TNBCs have previously been linked with shorter disease free survival compared to non-basal-like TNBCs and tend to be associated with higher clonal diversity (38).
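'Clonal diversity' can be quantified in several ways, and the cited studies may each use a different metric. As one hedged illustration, the sketch below uses the Shannon index, a common ecological diversity measure, applied to hypothetical clone fractions. Higher values mean the tumour is split more evenly across more clones.

```python
import math


def shannon_diversity(clone_fractions):
    """Shannon index H = -sum(p * ln p) over clone fractions.

    Fractions are normalised first, so raw cell counts also work.
    """
    total = sum(clone_fractions)
    probs = [f / total for f in clone_fractions if f > 0]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical clone fractions for three tumours.
monoclonal = [1.0]
dominated = [0.97, 0.01, 0.01, 0.01]  # one dominant clone, three minor clones
even = [0.25, 0.25, 0.25, 0.25]       # four equally sized clones

for name, tumour in [("monoclonal", monoclonal), ("dominated", dominated), ("even", even)]:
    print(f"{name}: H = {shannon_diversity(tumour):.3f}")
```

Note that the index alone does not say which clones carry resistance mechanisms. It only captures how much raw material is available for selection to act on once therapy begins.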

Epidermal growth factor receptor (EGFR) is a well-established driver of CRC and anti-EGFR therapy shows clear benefit in a subset of the metastatic disease. However, a plethora of events have been shown to predict drug sensitivity (primary resistance) and acquired resistance to anti-EGFR therapy in this setting (71). Interestingly, resistant populations have been shown with mutations in RAS, BRAF and PIK3CA or amplifications in KRAS, ERBB2 and MET.

While the mechanisms of resistance are genetically heterogeneous, they functionally converge on key signalling pathways which might aid the identification of biomarkers of disease progression (76).

Similarly, numerous avenues to PARP-inhibitor resistance have been described in breast and ovarian cancers, in either a BRCA1-dependent or BRCA1-independent (e.g. 53BP1/REV7 loss) fashion (77,78). Each mechanism of resistance results in a clone regaining the ability to undergo homologous recombination, suggesting that functional biomarkers of resistance may be possible.

Metastasis is the ultimate cause of 90% of all cancer deaths (79). Aside from the development of resistance and recurrence of disease, heterogeneity among tumour populations widens the diversity available for the evolution of metastatic populations.

The long-standing observation that some cells within a tumour were able to form secondary tumours at a higher frequency than others was one of the key arguments for the CSC hypothesis (15). However, multi-region sequencing studies have found that multiple distinct genomic clones are able to form metastases in pancreatic cancer (80), suggesting that a single ‘CSC clone’ is not necessarily responsible for cancer dissemination.

Additionally, new research suggests that metastatic sites must be ‘primed’ before disseminating cells can form distant metastasis (81,82). It is possible that clonal cooperation could contribute to this effect, with one cellular population releasing cytokines and the other disseminating into the circulation.


Both intra- and inter- tumour heterogeneity have profound clinical consequences in terms of differential response to therapy, development of drug resistance and disease progression. Beyond stratified medicine, a better understanding of the causes and consequences of clonal heterogeneity within a tumour will allow a deeper understanding of the emergence of drug resistance. By studying the evolution of clonal populations, we may be able to predict, and ultimately counter, the emergence of drug resistant and metastatic clonal populations.

Our mission is to ensure that each patient gets the right drug, at the right time, to beat their cancer. To deliver on this mission, we develop AI-focussed tools to analyse and interpret tumour heterogeneity and clonal evolution, helping oncologists make the right clinical decisions on a per patient basis.

  • Written by John Cassidy, CEO at
  • Edited by Belle Taylor, Strategic Partnerships Manager at