This time in our Datawatch series, we shine a spotlight on the oncologists around the world using data science to predict, treat, and mitigate cancer.

If you’ve read any of our other Datawatch blogs, you’ll have seen that people worldwide have made incredible, life-saving achievements with data science in recent years. From saving the lives of hikers lost in the woods to monitoring and mitigating the effects of climate change, the potential of data science is truly limitless.

And this instalment is no exception. Cancer is a condition that’s affected most people in one way or another, and finding ways to treat it has been the focus of the world’s top scientists and oncologists for decades.

Now, researchers have started to make major breakthroughs across the prediction, diagnosis, treatment, and prevention of cancers using the power of data science. And it’s changing the lives of many people battling the condition.

Spotting cancer early with predictive analytics

One of the most valuable areas of oncology that data science can support is the initial prediction stage. Predictive analytics can be an incredibly useful tool for helping oncologists detect tumours and categorise them based on their level of danger to patients.

This can be used on a broad scale to stratify populations by risk, taking into account age, gender, ethnicity, family history, and lifestyle factors. And with the right data, oncologists can make even more accurate predictions on a case-by-case basis.

Combining computer vision and machine learning, oncologists can capture important data about a tumour – such as its radius, perimeter, compactness, and proximity to organs – and identify whether it’s malignant, metastatic, or benign. Predictive analytics can also be an effective way to identify whether cancers will re-emerge in existing patients by measuring specific patterns and forecasting the reappearance of cancer cells.
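To make the idea of classifying a tumour from its measurements a little more concrete, here is a minimal sketch in Python. It is not the method used by any study mentioned in this article. It assumes a simple k-nearest-neighbours vote, only two labels (benign and malignant), and a handful of made-up measurements. The `TRAINING_CASES` values and the `classify` function are hypothetical and purely illustrative.

```python
import math
from collections import Counter

# Hypothetical, made-up measurements: (radius mm, perimeter mm, compactness).
# These are illustrative values only, not real patient data.
TRAINING_CASES = [
    ((6.0, 40.0, 0.05), "benign"),
    ((7.0, 46.0, 0.06), "benign"),
    ((6.5, 43.0, 0.07), "benign"),
    ((12.0, 85.0, 0.15), "malignant"),
    ((13.5, 95.0, 0.18), "malignant"),
    ((11.0, 80.0, 0.14), "malignant"),
]


def feature_ranges(cases):
    """Return (min, max) for each feature so values can be scaled to 0..1."""
    columns = list(zip(*(features for features, _ in cases)))
    return [(min(column), max(column)) for column in columns]


def scale(features, ranges):
    """Scale each feature into the 0..1 range seen in the training cases."""
    return tuple(
        (value - low) / (high - low) if high > low else 0.0
        for value, (low, high) in zip(features, ranges)
    )


def classify(features, cases=TRAINING_CASES, k=3):
    """Label a tumour by majority vote among its k most similar known cases."""
    ranges = feature_ranges(cases)
    target = scale(features, ranges)
    by_distance = sorted(
        cases,
        key=lambda case: math.dist(target, scale(case[0], ranges)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

Calling `classify((6.2, 41.0, 0.055))` compares the new measurements with the known cases and returns the label most of its nearest neighbours share. Real systems train on far larger, clinically validated datasets and use more sophisticated models.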

For conditions such as breast cancer, which is only caught in its earliest stages 20% of the time, predictive analytics can dramatically improve forecasting accuracy.

For example, one study trained a predictive analytics tool with samples from 70 patients with stage 0 breast cancer, all of whom had undergone mastectomies and had at least ten additional years of medical records available. Using this data, the model correctly distinguished aggressive from non-aggressive disease in a series of 100 micrographs 96% of the time – a considerable improvement on the 70% achieved by human physicians.

Treating patients with the help of data science

In some cases, even when cancers are predicted and spotted early, they can’t be prevented. But data science is also helping oncologists worldwide treat cancers to reduce their impact on patients. 

Using radiomic data – quantitative data extracted from medical images using algorithms – some oncologists in the US can determine which lung cancer patients will benefit from chemotherapy.

Radiomics looks closely at how the cancer will respond to the treatment, uncovering specific characteristics of the tumour that aren’t clear to the naked eye – including unique patterns of heterogeneity inside and outside the tumour.
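As a rough illustration of what extracting quantitative data from a medical image can look like, the sketch below computes a few simple first-order statistics for the pixels inside a tumour outline. It assumes the image is a small grid of intensity values and the outline is a matching grid of 0s and 1s. The function name `first_order_features` is hypothetical, and real radiomics pipelines compute many more features on full scans.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def first_order_features(image, mask):
    """Summarise pixel intensities inside a region of interest.

    image: 2D list of integer intensities.
    mask:  2D list of 0/1 flags with the same shape (1 = inside the region).
    """
    pixels = [
        value
        for image_row, mask_row in zip(image, mask)
        for value, inside in zip(image_row, mask_row)
        if inside
    ]
    if not pixels:
        raise ValueError("mask selects no pixels")

    # Shannon entropy of the intensity histogram: higher values mean the
    # region is more varied (heterogeneous) in intensity.
    total = len(pixels)
    counts = Counter(pixels)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())

    return {
        "mean": mean(pixels),
        "std": pstdev(pixels),
        "entropy": entropy,
        "area_pixels": total,
    }
```

Running the same function with a second mask covering the tissue just outside the tumour would give a comparable set of numbers for the surrounding region, which is one simple way to compare heterogeneity inside and outside the tumour.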

Elsewhere in the world, oncologists are taking data-driven approaches to treat other types of cancer. One study in Korea used a deep-learning-based chemotherapy recommendation model for patients with colorectal cancer. The model learns from past clinical cases to recommend personalised treatments for each patient, based on their specific cancer and other key personal characteristics.

New data means groundbreaking new research

One of the key factors that’s held oncologists back from embracing data science in the past is the lack of high-quality medical data available to study. But in recent years, that’s changed.

Now, research bodies have access to large publicly available databases that include a wide variety of data from cancer patients across different races, genders, ages, and genetic makeups. 

Min Zhang, Associate Director of Data Science at the Purdue Centre of Cancer Research, is working on an innovative data science project studying the products of metabolism in the body. This includes studying sugars, amino acids, and other molecules known as metabolites in a patient’s body to predict whether that patient has or will get cancer.

Looking at these metabolites individually doesn’t offer a lot of answers. But using data science to study groups of metabolites across a patient’s body, Zhang’s team identified that they work together to perform specific functions.

This means the team can use the groups of metabolites as biomarkers to screen patients for colorectal cancer, using a blood sample and avoiding the need for invasive procedures like colonoscopies. Zhang even uses machine learning methods to study how genes regulate each other as the patients’ cancer progresses – offering vital insights on how to treat the patient.
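The idea of looking at groups of metabolites rather than individual ones can be sketched in a few lines of Python. This is not Zhang’s team’s method. It simply assumes each metabolite reading is compared with a reference average and the results are averaged within each group. The metabolite names, group names, and reference values below are all hypothetical placeholders.

```python
from statistics import mean

# Hypothetical reference values per metabolite: (average, standard deviation).
# Placeholder numbers for illustration only, not real clinical ranges.
REFERENCE = {
    "metabolite_a": (5.0, 1.0),
    "metabolite_b": (2.0, 0.5),
    "metabolite_c": (10.0, 2.0),
}

# Hypothetical grouping of metabolites that are assumed to act together.
GROUPS = {
    "group_1": ["metabolite_a", "metabolite_b"],
    "group_2": ["metabolite_c"],
}


def group_scores(sample, reference=REFERENCE, groups=GROUPS):
    """Average the standardised deviation of each metabolite within a group."""
    scores = {}
    for group_name, members in groups.items():
        z_scores = [
            (sample[name] - reference[name][0]) / reference[name][1]
            for name in members
        ]
        scores[group_name] = mean(z_scores)
    return scores
```

A group score far from zero suggests that the whole group is behaving differently from the reference, even when no single metabolite stands out on its own. Any threshold for flagging a patient would need to come from clinical validation.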

As medical data like this continues to become more readily available to oncologists and researchers around the world, data science will become even more valuable in the fight against cancer. And that means many more lives will be saved every day, all around the world.

For more fascinating insights into the role analytics plays in managing real-world problems, read more of our Datawatch and Inside Analytics blogs here.

Co-authored by: Patrick Cronin and Abhishek Jain
  • Patrick Cronin

    Patrick is a seasoned business executive and analytics practitioner with 10+ years of experience in developing impactful analytics solutions that grow topline revenue across industries and functions. He leads our Marketing and Commercialisation sales for Life Science industries in North America. Patrick has an MBA from the University of Rochester and studied mathematics at Geneseo State University. In his spare time, he enjoys watching sports and reading a good book, and he is the resident culinary expert on our U.S. team.

  • Abhishek Jain

    Abhishek is a Vice President, Advanced Analytics Solutions for The Smart Cube. He is passionate about developing and implementing analytical solutions for Fortune 500 companies, helping them understand customers and make better business decisions. Abhishek specialises in predictive analytics and visual storytelling around consumers and operations across the Retail, CPG and Life Sciences domains, focusing on data science and stakeholder management.
