Improving Identification Within Asian American Subgroups With Ovarian Cancer


Sarah Lee, MD, MBA, discussed a study to better disaggregate the subgroup of Asian American patients with ovarian cancer.

3D rendering of ovarian cancer: © Dr_Microbe -

Identifying differences within patient subgroups can provide valuable information for clinicians and researchers, particularly to evaluate the distinct needs of the subgroups that exist and also to determine the generalizability of findings. Sarah Lee, MD, MBA, and fellow researchers sought to investigate the feasibility of disaggregating Asian American patient subgroups to analyze trends in ovarian cancer diagnoses.

The researchers aimed to see if separating Asian ethnicities within the Asian American race would be practical to assess ovarian cancer diagnosis trends.

Lee, a gynecologic oncology fellow at NYU Langone, and her colleagues looked at patients diagnosed with ovarian cancer between 2011 and 2023 in a particular health care system. Data on race, ethnicity, and medical background were collected from electronic health records (EHRs) and a tumor registry.

Asian American patients were then separated into their specific ethnicity or country of origin based on EHR data and their main language spoken, if available. For each patient, a validated method using surnames was used to determine their ethnicity. The success rate of each method for separating Asian subgroups was the main outcome measured.

Results showed that out of 90 Asian American patients, 48% were identified as Chinese, 18% Indian, 14% Filipino, and 3% Korean. Researchers were able to disaggregate 61% of Asian patients using information they gave about their race and ethnicity. An additional 16% were separated by their documented preferred non-English language.

For the remaining patients, for whom those methods were unavailable, researchers were able to disaggregate an additional 15% using surnames. Overall, 9% of Asian American patients could not be separated into subgroups using any of the methods.

The study concluded that separating Asian American subgroups using self-reported data, language preference, and surnames is possible for most patients. It is possible for health care systems to routinely collect this data to improve reporting on health outcomes for Asian American subgroups.

In an interview with Targeted Oncology™, Lee discussed the study that was presented at the 2024 Society of Gynecologic Oncology (SGO) Annual Meeting on Women’s Health.

Targeted Oncology: What was the background for the study you presented at SGO?

Lee: We know that disaggregation of data is needed to ensure health equity because we really need to understand the granularity of the patients that make up studies. When you just say “White” or “other” [in a study], these can mask the distinctive cultures or different ethnicities and countries of origin and races of people within those larger categories. Granular data and smaller disaggregated data are needed to understand who the people are that make up the studies, but also to hopefully target interventions that could be culturally and linguistically appropriate in subpopulations.

We were specifically interested in looking at Asian American populations. There [are] over 20 million Asian Americans in the United States. Asian Americans are sometimes under the category of “other,” but oftentimes, Asian American and Pacific Islander patients are grouped as AAPI in studies. Despite the fact that there [are] 20 million [Asian Americans] and [it is the] fastest growing racial ethnic subgroup in the US, we thought that we should better understand how these patients are divided or disaggregated.

Could you talk about the methodology for the study?

Previously, we looked at the National Cancer Database for [gynecologic] cancers. Then, we saw persistently that 25% of Asian American patients could not be disaggregated into a subgroup of ethnicity or country of origin across cervix, uterus, and ovarian cancers, and that proportion rose over time. We were interested in seeing methodologies at an institutional level to see if we can increase the percentage of people that were able to be disaggregated within the Asian American cohort.

We used 3 methodologies. The first one used was self-identification in the electronic health record system. This is what the patient self-identifies as, within a disaggregated Asian American group. The second methodology used is if the patient indicated in the electronic health record that they used, or they prefer, a non-English Asian language (Korean, Japanese, Thai), then we use that as a proxy.

The third methodology that we brought over from social sciences is the surname methodology by [Diane] Lauderdale [PhD] from University of Chicago. She uses surnames and race groups to disaggregate Asian Americans, and this validated methodology was derived from about 1.8 million people in Social Security files who were born outside of the United States. This methodology has over 20,000 surnames. One example is the last name Kim. That last name is disaggregated as Korean, regardless of what race you specify. If [a patient] specified Asian or [the “other” category for race], if [they] have the last name Kim, [they are] strongly predicted to be Korean. This is in contrast to the last name Bang. The patient is only strongly predicted to be Korean if [they] specified Asian as [their] race. Another example is the last name Lee, like my last name. Unfortunately, you cannot disaggregate the last name Lee, because this is represented across multiple races and also multiple countries of origin within Asia.

We use all 3 methodologies to see, can we maximize the number of patients that are disaggregated? Can we do better than the national database’s 75%?
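The 3-step approach Lee describes can be read as a simple cascade: try self-identification first, then preferred language, then surname. The Python sketch below is only an illustration of that ordering under stated assumptions. The function name, field names, and tiny lookup tables are hypothetical and are not the study's code or the actual Lauderdale surname list, which contains over 20,000 surnames. The 3 surname entries mirror the examples from the interview: Kim predicts Korean regardless of recorded race, Bang predicts Korean only when race is recorded as Asian, and Lee is left unresolved.

```python
from typing import Optional, Tuple

# Hypothetical language-to-subgroup proxy table (illustration only).
LANGUAGE_TO_SUBGROUP = {"Korean": "Korean", "Japanese": "Japanese", "Thai": "Thai"}

# Hypothetical surname rules: surname -> (predicted subgroup, requires race == "Asian").
# "lee" is deliberately absent because it cannot be assigned to a single subgroup.
SURNAME_RULES = {
    "kim": ("Korean", False),   # predictive regardless of recorded race
    "bang": ("Korean", True),   # predictive only if race is recorded as Asian
}


def disaggregate(
    self_identified: Optional[str],
    preferred_language: Optional[str],
    surname: Optional[str],
    recorded_race: Optional[str],
) -> Tuple[Optional[str], str]:
    """Return (subgroup, method used), trying each method in order."""
    # Step 1: self-identification, treated as the gold standard.
    if self_identified:
        return self_identified, "self-identification"

    # Step 2: preferred non-English Asian language as a proxy.
    subgroup = LANGUAGE_TO_SUBGROUP.get(preferred_language or "")
    if subgroup:
        return subgroup, "language"

    # Step 3: surname lookup, conditional on recorded race where required.
    rule = SURNAME_RULES.get((surname or "").strip().lower())
    if rule:
        subgroup, requires_asian = rule
        if not requires_asian or recorded_race == "Asian":
            return subgroup, "surname"

    # No method applied: the patient stays in the aggregated group.
    return None, "unclassified"
```

In this sketch, a patient with no self-identified subgroup, English as the preferred language, and the surname Kim would be returned as Korean by the surname step, while the same patient with the surname Lee would remain unclassified.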

What were the findings from this study?

We were similar to the National Cancer Database. Using the self-identification on electronic health record alone, 61% of the patients were able to be disaggregated into an Asian American cohort. When we added language, we were able to disaggregate a total of 77%. So that is in line with the National Cancer Database. If we just look at non-English language, we were only able to disaggregate 37% or about one-third of the patients. That is because the majority of the patients indicated English as their preferred language, so non-English language alone is not the best way to disaggregate patients. Then, the surname methodology alone disaggregated 70%, which is higher than language and higher than self-identification. When we use all 3, so using self-identification, language, and surnames, we were able to disaggregate 91% of Asian American patients, again, in comparison to about 75% in the national databases. Using all 3 methodologies was effective, and it was feasible in disaggregating Asian Americans.

What are the implications of these findings?

The first thing is, we know that self-identification is still the gold standard. Some health systems are rolling out different ways to incorporate self-identification. Some people have scripts, some people incorporated [it] into electronic health records. But I think the important thing for patients is that we need to explain to patients why we are collecting this disaggregated data, because patients might have concerns regarding privacy or confidentiality. I think we need to explain to patients why understanding who the patients are is important. Then, at a larger picture, clinical trials and research are important to understand the generalizability, to understand the representation of the different subgroups.

What is the importance of identifying disparities within subgroups?

Previously, we looked at [gynecologic] oncology clinical trials that were on, and we only found that about two-thirds of published [gynecologic] oncology studies included race or the race breakdown of who is included in the studies. This is challenging in a couple of ways. One, we do not know who the patients are, so we do not know if it represents a diverse patient population. And from an ideological perspective, we need to make sure that we are including a diverse patient population in our trials to really make sure that we are able to apply those results to a diverse patient population and then to ensure generalizability. Disaggregation is one way that you can understand who is in the studies and to identify if there [are] gaps in the disparities. We only know by breaking down who was in your study and reporting that, not just to, but in abstracts, presentation, plenaries, and manuscripts. That is how we can understand, are we meeting our goals and including a diverse patient population? Is this generalizable to my patients and my population of patients that I serve?

Lee, S. Methodologies for the disaggregation of Asian racial and ethnic subgroups to determine trends in ovarian cancer. Presented at: 2024 Society of Gynecologic Oncology Annual Meeting on Women’s Health. March 15-18, 2024. San Diego, CA.