April 28, 2026

AI Models May Serve as Scalable Adjunct to Oncology Documentation Workflows

Key Takeaways

Advanced LLMs demonstrated high sensitivity for error detection in complex oncology documentation, including discrepancies likely to affect management, such as incorrect chemotherapy regimens or discordant laboratory values.
In simulated discharge summaries, Gemini 2.5 Pro detected 97.8% of injected errors and GPT o4-mini-high detected 87.8%, compared with 47.8% mean detection by oncology specialists.
Correction suggestions were variably accurate, with stronger performance on clear internal inconsistencies than on nuanced clinical ambiguities requiring expert judgment.
Synthetic-vignette evaluation limits external validity, and performance may shift with institutional documentation styles, EHR configurations, and real-world ambiguity.
Implementation into EHR workflows could enable chart audits and real-time inconsistency flagging, but requires rigorous validation for bias, reproducibility, data security, and false-positive mitigation.

Study finds Gemini and GPT catch oncology chart errors, improving documentation accuracy and patient safety with clinician oversight.

Large language models (LLMs) may serve as a valuable supplement to oncology documentation workflows by detecting and correcting documentation errors in clinical records, according to a recent study published in JCO Clinical Cancer Informatics.¹

In a 2-phase evaluation, investigators assessed the performance of contemporary advanced LLMs, namely Google’s Gemini 2.5 Pro and OpenAI’s GPT o4-mini-high, in identifying errors within complex hematology/oncology documentation. Across 1000 synthetic clinical scenarios, the models demonstrated the ability to detect and, in some cases, correct inconsistencies involving diagnoses, treatment plans, and laboratory data.

“Advanced LLMs can serve as powerful assistants for clinical documentation reviews, substantially reducing the risk of oversight and clinician workload,” Peter May, MD, MPH, and colleagues wrote in the publication.¹ “Integrating LLM‐driven error flagging into electronic health record workflows offers a promising strategy for enhancing documentation accuracy, treatment quality, and patient safety in oncology.”

Key Findings: An Ability to Detect and Correct

The study found that the LLMs were able to identify a substantial proportion of documentation errors across simulated oncology cases, with performance exceeding that of human reviewers in several scenarios. Within complex discharge summaries, Gemini 2.5 Pro and GPT o4-mini-high identified 97.8% and 87.8% of injected errors, respectively, compared with a mean detection rate of 47.8% among human oncology specialists. In contrast, Gemma 3 27B, a local LLM, demonstrated lower sensitivity, detecting 35.6% of errors. Error detection included clinically relevant discrepancies that could plausibly influence patient management, such as incorrect chemotherapy regimens or discordant laboratory values.

In addition to detection, LLMs demonstrated partial capability in proposing corrections. However, accuracy varied depending on the complexity of the case and the type of error. Straightforward inconsistencies were more reliably addressed than nuanced clinical ambiguities.

Importantly, the models maintained contextual coherence in most cases, suggesting potential utility as a decision-support adjunct rather than a standalone system. The authors noted that even partial error detection could reduce cognitive burden on clinicians and mitigate risk in high-volume oncology practices.

Study Methodology

The analysis included 2 distinct phases. First, the authors evaluated LLM performance using standardized, synthetic oncology vignettes designed to reflect real-world documentation complexity. Second, the models were tested on more nuanced clinical scenarios to assess generalizability and robustness.

Performance metrics focused on sensitivity for error detection, qualitative accuracy of suggested corrections, and the ability to preserve clinically relevant context. The authors emphasized that oncology documentation presents particular challenges because of multimodal data inputs, evolving treatment regimens, and the need for precise staging and biomarker annotation.
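For readers less familiar with the metric, sensitivity here is the share of deliberately injected errors that a reviewer (model or human) flagged. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the function name, reviewer labels, and counts are hypothetical and are not taken from the study.

```python
def sensitivity(detected: int, injected: int) -> float:
    """Return the fraction of injected errors that a reviewer flagged."""
    if injected <= 0:
        raise ValueError("injected must be a positive count")
    if not 0 <= detected <= injected:
        raise ValueError("detected must be between 0 and injected")
    return detected / injected


# Hypothetical counts for illustration only; NOT data from the study.
# Each entry maps a reviewer label to (errors detected, errors injected).
hypothetical_reviews = {
    "reviewer_a": (18, 20),
    "reviewer_b": (9, 20),
}

rates = {
    name: sensitivity(detected, injected)
    for name, (detected, injected) in hypothetical_reviews.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Sensitivity alone does not account for false positives, so a reviewer that flags everything would score perfectly on this measure.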

Limitations and Clinical Implications

The reliance on synthetic clinical vignettes is a study limitation because such vignettes may not fully capture the variability and ambiguity of real-world oncology documentation. External validation in live clinical environments is necessary to establish generalizability. Additionally, performance may vary across institutional documentation styles and electronic health record systems. The authors also highlighted the need for rigorous evaluation of bias, reproducibility, and data security before implementation into practice.

Overall, the findings highlight the potential of artificial intelligence (AI)-assisted review as a scalable approach to improving patient safety in high-risk oncology settings, where documentation inaccuracies can have downstream consequences for treatment decisions. Specifically, the integration of AI-assisted documentation review could provide an additional safety layer, with potential applications such as automated chart audits and real-time flagging of inconsistencies during documentation.

Despite this promise, the authors cautioned that LLM outputs still require clinician oversight. False positives and inappropriate corrections remain a concern, particularly in cases requiring nuanced clinical judgment. As such, these tools are best conceptualized as augmentative rather than autonomous systems.

REFERENCES

1. May P, Nokodian S, Nuernbergk C, et al. Artificial intelligence-assisted error detection in complex clinical documentation: Leveraging large language models to enhance patient safety in oncology. JCO Clin Cancer Inform. 2026;10:e2500194. doi:10.1200/CCI-25-00194
