Dana-Farber Researchers Use Machine Learning to Understand Rare Familial Blood Cancer

Key Takeaways:

  • In certain families, almost everyone develops Waldenström’s macroglobulinemia, a blood cancer, but no one has found a biological explanation for the increased risk.
  • Researchers at Dana-Farber have put together a data atlas that includes a range of genomic, transcriptomic, epigenetic, and clinical data from patients and healthy donors to try to understand familial WM.
  • The team is using machine learning to generate hypotheses in the hopes that they can develop better diagnostics and individualized therapies for the disease.

Waldenström’s macroglobulinemia (WM) is a blood cancer that is extremely rare, affecting around 1,500 people in the US each year. About 20 percent of those cases are considered familial, meaning that many members of the same family also have some form of blood cancer, such as myeloma or lymphoma.

An even smaller group — five percent of those familial cases — only get Waldenström’s. In those families, the inheritance pattern is striking. 

“A grandparent has it, the mother has it, most of the children are eventually diagnosed,” says Zachary Hunter, PhD, a researcher at the Bing Center for Waldenström’s Macroglobulinemia at Dana-Farber. “It goes straight down the line.” 

A team of Dana-Farber researchers has been caring for patients with Waldenström’s and studying the disease’s roots for decades. They’ve identified gene mutations that drive the disease and helped develop targeted therapies used to treat it. But now they want to take on this new mystery: What is causing Waldenström’s — a disease that occurs when certain genes mutate spontaneously — to occur so reliably in some families?  

There must be a biological explanation. But so far, no obvious one has come up. In other forms of cancer, some families carry inherited mutations in a gene, such as BRCA2, that increase the risk of certain cancers. But families with WM share no such inheritance.  

So the researchers have brought together a powerhouse of patients, data, and machine learning techniques to look even deeper into the biology for an explanation. They hope the effort will help them identify families with a high risk of Waldenström’s and find ways to prevent and treat the disease. 

“Families are worried. They want to know if we can develop preventative therapeutics,” says Steven Treon, MD, PhD, director of the Bing Center for Waldenström’s Macroglobulinemia. “They want to know if we should be treating people with familial disease differently. These are the questions we want to answer.” 


Progress treating a rare disease 

WM is a cancer that occurs in blood cells called B cells. The cancerous cells accumulate in bone marrow and secrete a protein that enters the bloodstream and causes a range of symptoms. Many patients with the condition do not need treatment right away. When the disease advances, multiple forms of therapy are available, but there is currently no cure.

People come from around the world to the Bing Center for care, as it is one of the few places that specifically focuses on WM. The center also conducts research intended to improve treatment, including tumor genomic research that helped Treon, Hunter, and colleagues open the door to personalized targeted therapy for the disease.

For instance, about a decade ago the team studied the tumor genomes of dozens of their patients and discovered that MYD88 gene mutations occur in over 90 percent of WM patients and that CXCR4 mutations occur in up to 40 percent of WM patients. These mutations help the team determine which treatments are more likely to be beneficial for individual patients.

“These are the defining mutations in Waldenström’s,” says Treon. “But people aren’t born with them, so what predisposes some families to getting them and developing Waldenström’s?” 

Finding and understanding the roots of familial risk 

To answer this question, the team needs a much broader dataset. They have collected data from about 300 people with WM and a slightly smaller number of healthy people.  

The data includes tumor genome and germline genome sequences. It also contains data about regulators that can change the way a cell uses genetic instructions to produce proteins, such as transcription factors and epigenetic markers.  

Lastly, they are looking at clinical data, such as disease onset, severity, and treatment responses. “None of this would be possible without our amazing clinical team and care center, our lab and informatics teams, and our patient volunteers,” says Hunter. 

Their goal is to sift through this data and find “hits,” says Kris Richardson, a data scientist at the Bing Center. Hits are signals in the data that show up consistently among people with other data in common, such as being in the same family or having similar symptoms of the disease. 

“We’ll look for common signals across an entire family, for example,” says Richardson. 

Finding those signals, however, is not a task for humans. There are hundreds of genes in a potential region of interest in the genome. Each gene might be transcribed into dozens of different variations, called isoforms, of messenger RNA, which provide the instructions for building a protein.  

The data is, in fact, so complex that the team is relying on machine learning to find these patterns by looking across the data and recognizing commonalities a human might miss. 
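To make the idea of a “hit” concrete, here is a minimal sketch in Python. It is not the team's method and it is not machine learning: it is a simple set-based filter standing in for the kind of pattern the article describes, namely a signal shared by every member of a family but rare among healthy donors. The function name, thresholds, feature labels, and toy records are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def find_family_hits(samples, min_family_fraction=1.0, max_control_fraction=0.1):
    """Return, per family, the features shared within the family but rare in controls.

    Each sample is a dict with:
      'family'   -- a family label, or None for a healthy donor (assumed convention)
      'features' -- a set of feature labels observed in that person
    """
    controls = [s for s in samples if s["family"] is None]

    # Group affected individuals by family label.
    families = defaultdict(list)
    for s in samples:
        if s["family"] is not None:
            families[s["family"]].append(s)

    hits = {}
    for family, members in families.items():
        # Every feature seen in at least one family member is a candidate.
        candidates = set().union(*(m["features"] for m in members))
        family_hits = []
        for feature in sorted(candidates):
            family_fraction = sum(feature in m["features"] for m in members) / len(members)
            control_fraction = (
                sum(feature in c["features"] for c in controls) / len(controls)
                if controls
                else 0.0
            )
            if family_fraction >= min_family_fraction and control_fraction <= max_control_fraction:
                family_hits.append(feature)
        hits[family] = family_hits
    return hits


# Toy, made-up records: two families and four healthy donors.
toy_samples = [
    {"family": "F1", "features": {"isoform_A", "variant_1"}},
    {"family": "F1", "features": {"isoform_A", "variant_2"}},
    {"family": "F1", "features": {"isoform_A", "variant_1"}},
    {"family": "F2", "features": {"isoform_B", "variant_1"}},
    {"family": "F2", "features": {"isoform_B"}},
    {"family": None, "features": {"variant_1"}},
    {"family": None, "features": {"variant_1", "variant_2"}},
    {"family": None, "features": set()},
    {"family": None, "features": {"variant_2"}},
]

hits = find_family_hits(toy_samples)
print(hits)
```

In this toy data, only the isoform labels survive the filter, because the variant labels are either missing from some family members or common among the healthy donors. A real analysis would involve far more features per person than a filter like this can handle sensibly, which is the scale problem the article describes.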

Some of the hits found by machine learning will be biologically meaningless. “That’s why we humans are still involved,” says Hunter. 

But some have generated intriguing connections. For instance, the team has found cases where genes that had no evident mutations ended up producing proteins that were not structurally or functionally the same as the protein normally created by that gene. The difference was a level deeper — in the isoform created during gene expression. 

The team is following up on these leads and looking for more. They hope that one of these signals could help the team find a way to help patients, particularly those with familial WM.  

“That is the reason why we do all this,” says Hunter. “We want to be able to help people.” 
