NeurIPS 2018

December 8, 2018

Cambridge Cancer Genomics (CCG), a Y Combinator backed startup building the precision AI platform for personalised cancer medicine, today announces the publication of two research papers at NeurIPS 2018, both applying machine learning to cancer genomics datasets. The first outlines a method for predicting the outcome of different therapies in more than 2,500 breast cancer patients. The second is a best-in-class variant caller, which uses the same principles behind facial recognition to identify the mutations that cause cancers.

Together, these tools enable 1) accurate identification of the molecular drivers of a single tumour and 2) prediction of whether targeting these drivers would be a useful therapeutic strategy. CCG has made the underlying technologies freely available for non-commercial use.

Dr Harry Clifford, CTO and Study Lead, says: “When you drill down into the DNA changes behind cancer, you quickly find that no two tumours are the same. To apply cancer therapies more successfully to any given tumour, we need a deeper understanding of what exactly has gone wrong in each case at a molecular level. This starts with effective tools to capture that information. The approaches we're developing at CCG will have widespread applications, from identifying targets for new therapy development, to deciding which personalised approach is best for a given patient. We are thrilled to have the opportunity to share our work at the world’s most prestigious machine learning conference.”

About the individual studies:

1) Interlacing Personal and Reference Genomes for Machine Learning Disease-Variant Detection

Summary: Differences in our DNA underlie many aspects of human health, from rare genetic diseases to cancer. In this paper, we build a new class of software for detecting DNA variants. Based on the same principles behind facial recognition, our technique can identify cancer variants with unparalleled accuracy. We hope that releasing this software for non-commercial use will lead to more successful targeted therapy and personalised cancer medicine.

Variants in DNA underlie traits inherited from our parents, define the difference between two individuals of the same species and are the root cause of diseases such as cancer. Analysis of these variants enables the identification of potentially fatal diseases and conditions. Variant detection is the foundation of targeted cancer therapy and personalised medicine. If we get variant detection wrong, then the drugs we choose could be wrong, and patients may suffer. In this paper, CCG has developed a new class of variant detection algorithm, based on computer vision. In benchmark tests with ground truth datasets, our software already outperforms those developed by Google and the Broad Institute of Harvard and MIT. Our tool is available for free for non-commercial use.

2) A Framework for Implementing Machine Learning on Omics Data

Summary: Despite recent advances in the field of cancer therapy, first line treatments still fail for two out of three cancer patients. In this study, Cambridge Cancer Genomics has 1) developed an open source tool to allow machine learning researchers to work on cancer genomic datasets, and 2) used this tool to predict how effective treatment will be with an accuracy of more than 80%.

Despite all the recent breakthroughs in cancer treatment, the reality is that first line treatment still fails for two out of three cancer patients. Despite their best intentions, oncologists often don’t know whether a particular drug will be effective for a particular patient. As whole genome sequencing becomes more affordable, molecular profiling and “omics” data (from genomes and beyond) will be available for all cancer patients. At CCG, we believe that this data has the potential to enable oncologists to make smarter decisions about which drug to use in which circumstance. In this paper, we demonstrate the success of machine learning in pooling sequencing data from thousands of breast cancers and predicting outcome for patients at risk of unsuccessful treatment, with more than 80% accuracy. CCG’s genomics data pipeline, which is freely available for non-commercial use, could open the door to increased use of AI for analysis of cancer sequencing data.

About Cambridge Cancer Genomics

Cambridge Cancer Genomics is a Y Combinator backed startup building the underlying tools to empower oncologists to make the best therapeutic decisions for their patients. Its precision AI platform, OncOS, is fast becoming the global open standard for next generation sequencing analysis and clinical decision support in cancer medicine. By focussing on machine learning based analytics of serial liquid biopsy samples, CCG is building predictive models to understand how tumours evolve and how this can affect response to therapy. As this knowledge base of tumour genomic evolution increases, OncOS will become the operating system for personalised cancer medicine. At CCG, our mission is to ensure every patient gets the right drug, at the right time, to beat their cancer.

Press Enquiries

For more details, or to request an interview, please contact: