Cancer Sub-Clones to Predict Tumour Evolution

NeurIPS 2019 fact-sheets: overviews of the research we presented, and its future impact

Adnan Akbar
January 16, 2020
February 10, 2020


The majority of cancer treatments end in failure due to Intra-Tumour Heterogeneity (ITH): the presence of different types of cancer cells within the same tumour. ITH in cancer is represented by “clonal evolution”, the process by which different cancer cells, or sub-clones, compete with each other for resources under conditions of Darwinian natural selection. Predicting the growth of these sub-clones within a tumour is among the key challenges of modern cancer research.

Predicting tumour behaviour enables the creation of risk profiles for patients and the optimisation of their treatment by therapeutically targeting the sub-clones that are most likely to grow. This paper presents our novel data-driven approach and its potential to predict tumour evolution. We also highlight the limitations of the approach given current technology, and how these methods could be improved in the future with better technology and more data.

"This paper demonstrates the possibility of predicting the course of tumour evolution using data-driven methods."

How does it work?

This work is based on the intuition that if we can capture the true characteristics of sub-clones within a tumour and represent them as features, a machine learning algorithm can be trained to predict tumour growth and behaviour over time. To this end, we extracted several features based on the location of the underlying somatic (non-inherited genetic) mutations and combined them with longitudinal liquid biopsy data (blood tests containing information on cancer DNA) to train and optimise our machine learning algorithms. Random forest regression gave the best-performing model for predicting clonal evolution. Prediction results for two randomly chosen patients are shown in the figure below.
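To make the idea of random forest regression more concrete, here is a minimal, self-contained sketch in plain Python. It is an illustration only and not the implementation used in the paper: the data are synthetic, and the three features (current sub-clone frequency, a driver flag, and a location-based score), the growth rule used to generate targets, and all function names are assumptions made up for this example.

```python
import random

def avg(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean (lower means a purer group)."""
    m = avg(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, depth, max_depth, n_feats, rng):
    """Grow one regression tree. A leaf is a float; a node is a tuple."""
    if depth >= max_depth or len(rows) < 4:
        return avg(targets)
    best = None
    # Consider only a random subset of features at each split.
    for f in rng.sample(range(len(rows[0])), n_feats):
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < threshold]
            right = [t for r, t in zip(rows, targets) if r[f] >= threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    if best is None:  # every sampled feature was constant
        return avg(targets)
    _, f, threshold = best
    lo = [i for i, r in enumerate(rows) if r[f] < threshold]
    hi = [i for i, r in enumerate(rows) if r[f] >= threshold]
    return (f, threshold,
            build_tree([rows[i] for i in lo], [targets[i] for i in lo],
                       depth + 1, max_depth, n_feats, rng),
            build_tree([rows[i] for i in hi], [targets[i] for i in hi],
                       depth + 1, max_depth, n_feats, rng))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node

def fit_forest(rows, targets, n_trees=25, max_depth=5, n_feats=2, seed=0):
    """Train each tree on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx],
                                 0, max_depth, n_feats, rng))
    return forest

def predict(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return avg([predict_tree(tree, row) for tree in forest])

# Synthetic data (NOT real patient data). Hypothetical features per sub-clone:
# current frequency, driver flag, and a made-up location-based score.
data_rng = random.Random(42)
rows, targets = [], []
for _ in range(200):
    freq = data_rng.uniform(0.01, 0.5)
    driver = float(data_rng.random() < 0.3)
    location_score = data_rng.random()
    rows.append([freq, driver, location_score])
    # Assumed toy rule: driver sub-clones grow, the others shrink slightly.
    growth = 1.3 if driver else 0.9
    targets.append(freq * growth + data_rng.gauss(0, 0.01))

forest = fit_forest(rows[:150], targets[:150])
mae = avg([abs(predict(forest, r) - t)
           for r, t in zip(rows[150:], targets[150:])])
baseline = avg([abs(avg(targets[:150]) - t) for t in targets[150:]])
print(f"forest MAE: {mae:.3f}, predict-the-mean baseline MAE: {baseline:.3f}")
```

The comparison against a baseline that always predicts the training mean is a simple sanity check that the model has learned something from the features. In practice a library implementation such as scikit-learn's RandomForestRegressor would be the usual choice; the hand-written version here is only meant to show the moving parts (bootstrap sampling, random feature subsets, and averaging over trees).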

In our analysis, the algorithm performed better on sub-clones containing driver mutations, demonstrating the role of selective advantage in sub-clonal evolution. More details about the results and performance can be found in the paper.

Sub-clone prediction comparison, applying our algorithm to two randomly chosen patients

What’s the impact?

This paper demonstrates the possibility of predicting the course of tumour evolution using data-driven methods. This has profound implications for the clinical management of cancer as a chronic disease. Interestingly, our models performed best on ‘driver’ mutations, i.e. those thought to be primarily responsible for driving clonal growth. This adds weight to the biological significance of the results.

We strongly believe that combining advancements in technology with more data will lead to a more accurate prediction model for cancer evolution and ultimately greatly improve cancer care.

Find out more

This blog post gives a high-level overview of a paper presented at the NeurIPS 2019 workshop: Learning Meaningful Representations of Life.

To learn more about this research, read the full paper or have a look at our earlier blog post, which explains the problem in more detail.

We published 5 papers in total at NeurIPS 2019. Check out our press release to learn about our other Machine Learning advances.

  • Written by Adnan Akbar, Data Scientist at
  • Edited by Belle Taylor, Strategic Communications and Partnerships Manager at
  • Thanks to Geoffroy Dubourg-Felonneau and Harry Clifford for valuable discussions