AI Predicts Prostate Cancer Prognosis Fairly Across Races

Fact checked by Sabrina Serani

Mack Roach III, MD, discussed a study on artificial intelligence’s generalizability for prostate cancer across racial groups.

Mack Roach III, MD

A multimodal artificial intelligence (AI) algorithm was shown to perform well across racial subgroups without evidence of algorithmic bias, according to data from a study published in JCO Clinical Cancer Informatics.1

The study used a dataset from prospective phase 3 randomized trials, controlling for treatment biases, and included over 5000 patients, nearly 1000 of whom were Black. AI algorithms analyzed digitized biopsy slides, improving prognosis prediction without racial bias.

The findings suggest that race does not significantly impact prostate cancer outcomes, supporting the use of AI for personalized treatment recommendations. Mack Roach III, MD, emphasized the importance of diverse patient populations and ongoing clinical trials for future advancements in cancer treatment.

“Using this data, we create an opportunity to have a more evidence-based way to make recommendations and to think about prognosis and to think about the future directions that we are moving in. I think that it encourages the people to look toward the future that we think we are going to get better at refining treatment, not overtreating, not undertreating, and also not making assumptions that are not valid about the nature of the interaction between race and treatment,” Roach, professor of radiation oncology, medical oncology, and urology at the University of California, San Francisco, told Targeted Oncology™ in an interview.

In the interview, Roach further discussed this study on artificial intelligence’s generalizability for prostate cancer across racial groups.

Targeted Oncology™: What was the rationale behind investigating AI's generalizability for prostate cancer across racial groups?

Roach: One of the major controversies in prostate cancer is, what is the impact of race on outcome? We do know that prostate cancer is 1.5 to 2 times more common in [Black] men. For many years, there have been different opinions about whether there is an inherent biologic factor or not, and so it is important for us to distinguish between the incidence of the disease and the biologic behavior of a disease once it is diagnosed.

The dataset that we published is based on men who were diagnosed and treated for prostate cancer on prospective phase 3 randomized trials. The value of that resource is that the quality of care and the eligibility are controlled in such a way that biases do not really enter into the quality of treatment, because the patients have to sign a consent form, and they have to be stratified. Then, they are randomized. Their follow-up is systematic, and the question of race is more easily answered because we have adjusted for insurance coverage and made sure that everybody got the same treatment. [We] have long follow-up and large sample sizes.

Some studies are flawed because patients got different kinds of treatments. Some get surgery, some get radiation, some get hormone therapy, some get a different duration of hormone therapy, some get different doses of radiation. But all of these variables are controlled for, and so it creates an opportunity for us to look at whether or not an algorithm that utilizes the power of artificial intelligence can help us get some clarity on whether race is really important in terms of predicting outcome.

How does this AI use images and data to predict prostate cancer outcomes?

We have standard criteria that we use to determine the patient's prognosis; these include the [prostate-specific antigen (PSA)] level, the stage of the [disease], the grade of the [disease], and treatment, and age and other variables can play a role. What is added here is that the slides from the patient's biopsies are digitized. They are put into a format that can be fed into a computer, and the computer can look for things and identify parameters that the pathologist does not see. These things that are identified by the computer can be modified in such a way as to allow us to have better and more accurate estimates of prognosis, so we can define survival, the risk of metastasis, the risk of recurrence, and so forth, more precisely than we can without using the artificial intelligence.

We then take what we found and then look at, does race play a role? Is there a difference in the distribution by race? Do [Black patients] have a more aggressive signature when you use artificial intelligence? One of the aspects of the title is that it says assessing algorithmic fairness. One thing we do not want to do is to induce additional biases against certain groups that are known to have worse outcomes historically. The beauty of this dataset is that we can document that the algorithm used is fair to the various populations included in the dataset.

Can you discuss the importance of algorithmic fairness?

There have been other studies where people have developed an AI algorithm—for example, facial recognition. AI has also been used to diagnose patients who would need certain treatments for eye diseases. One of the things they found was that there was an algorithm developed to predict a certain eye disease in a predominantly [White] population, and they sought to apply this to a group of patients—I believe it was in Indonesia—and because of pigmentation differences and how they affected the appearance of the retina in the eye, the algorithm did not work. There were biases related to the fact that it was developed in one cohort and then utilized in a different cohort, and it did not work.

So, how do we determine that there is no algorithmic bias? Using traditional statistical methods, we already had an idea of what the impact of race was or was not, based on patients previously treated on our randomized trials, so we had a baseline understanding of how race affected prognosis. It turns out that when we apply this AI algorithm, it gives us more accurate information about the patient's prognosis, but it did not appear to interact with race; that is, it did not indicate that people of one racial group did worse or better as a result of applying the AI algorithm. It did not induce any bias, and therefore we determined that it was algorithmically fair.

The implications are broader than prostate cancer. They reflect on the issue of using AI and provide an example of how you can apply data from one place to guard against the possibility that you might have induced bias by using artificial intelligence. In a broad sense, it is an example of how you can use artificial intelligence and be assured that you are not inducing bias. It also helps us understand more about prostate cancer, because we all want to know what the important things are that we need to do or not do. Now we understand that race was not explaining some of the population-reported differences.

Previous studies going back 30 years suggested that there were differences that could be attributed to race, implying that there was a biological difference by race which affected the prognosis. This data allows us to say that, based on the largest dataset of patients treated on prospective phase 3 randomized trials, there is no evidence to support the notion that there are inherent biological differences by race which are major drivers of outcomes.

Why is a diverse patient population important for AI tool development in oncology?

If all the patients were White or Asian or Black, then we would not be able to use the data to show that there is no difference. By having a dataset that is very heterogeneous in terms of ethnicity, age, and stage, we can show that it is agnostic. In terms of the algorithm, it does not discriminate one group vs another. It simply is driven by the outcome and the relationship between what things look like on the pathology slide. This is a unique dataset with more than 5000 patients, nearly 1000 of whom were [Black], and with long follow-up.

One of the problems is that a lot of studies have 2, 3, 4, or 5 years of follow-up, but this data goes out to patients treated more than 20 years ago. Not only do we have big numbers in terms of the population of patients and excellent control on the quality of treatment and systematic way the treatment was applied to these patients, but we have long follow-up with clinically meaningful end points such as metastasis-free survival and overall survival.

Looking at the data, AI showed strong predictions in both racial groups. What does this imply for broad use?

There were certain groups that did not have anything to do with race who did particularly poorly, and there were some groups that were identified that did very well. When we see a newly diagnosed patient with prostate cancer, we will be reassured that we can say, “Mr. Jones, I have good news for you. You are going to do particularly well. And it has nothing to do with your race. It has something to do with the biology of your prostate cancer.” In addition to that, for those subpopulations who did poorly, we can now home in and go back and look at those patients, their characteristics, their biopsy specimen, and look for mutations in their tumors. This could help us define treatments that might be more beneficial in certain populations of patients. We might need to do different tests, we might need to change [or] add additional drugs, and so forth.

Not only does it give us an example of how AI should be done—how AI could be applied while ensuring algorithmic fairness—but also how data like this can help us define subpopulations of patients that might benefit in particular because of the aggressiveness, or the lack of aggressiveness, of their disease. It benefits [patients], regardless of race. Whether they [have] low-risk or high-risk [disease], it allows us to give a more personalized, more scientifically based recommendation of prognosis and treatment.

What were the main findings and key takeaways for an oncologist?

The key takeaway is that there is no good evidence that race should be used at this point in time to predict prognosis, and that this is an example for the future. There is a lot of optimism for the use of other AI biomarkers. We have preliminary data to suggest that this type of approach might be useful in deciding which patients should have their lymph nodes radiated. For example, we will have information on which patients should receive hormone therapy for a short period of time and which patients might receive it for a long period of time. Those things are not in this paper, but they provide a basis for us to believe that when we apply it in other circumstances to make treatment recommendations, because the distribution of the AI signature was similar by race, we can expect that its utility for predicting prognosis and driving treatment recommendations will also be similar by race. It should create some comfort for those who are concerned about jumping to invalid conclusions when using this for patient-directed care.

How might community oncologists practically use such an AI tool for prostate cancer in the future?

Using this data, we create an opportunity to have a more evidence-based way to make recommendations and to think about prognosis and to think about the future directions that we are moving in. I think that it encourages the people to look toward the future that we think we are going to get better at refining treatment, not overtreating, not undertreating, and also not making assumptions that are not valid about the nature of the interaction between race and treatment.

I would encourage [patients] to participate in clinical trials going forward, because that allows us to develop these datasets, and we need to make sure that our clinical trials continue to be well funded. That is a big challenge with resources being compromised at the National Cancer Institute. It is only because we have such studies that we will be able to understand how to appropriately select treatment. We continue to hope that research of this nature will be funded, and that patients will participate, because it is critical. There is no experiment [other than a clinical trial] that is going to help us determine what the best treatment is, and without patient participation and funding, we will not really make this kind of advance.

REFERENCE:
Roach M 3rd, Zhang J, Mohamad O, et al. Assessing algorithmic fairness with a multimodal artificial intelligence model in men of African and non-African origin on NRG oncology prostate cancer phase III trials. JCO Clin Cancer Inform. 2025;9:e2400284. doi:10.1200/CCI-24-00284
