Explainable AI for Prognostic Factor Identification in Colorectal Cancer: An Electronic Health Records Analysis

Authors

  • Amena Mahmoud, Department of Computer Science, Faculty of Computers and Information, Kafrelsheikh University, Egypt, and Department of Information and Communication Sciences, Faculty of Science and Technology, Sophia University, Japan. ORCID: https://orcid.org/0009-0006-9909-0430

DOI:

https://doi.org/10.59543/ijmscs.v3i.15123

Keywords:

Explainable AI, colorectal cancer, prognostic factors, electronic health records, machine learning, survival analysis.

Abstract

Colorectal cancer (CRC) prognosis remains challenging because of the disease's heterogeneity and the complex interplay of clinical, demographic, and molecular factors. This study combines explainable artificial intelligence (XAI) with electronic health records (EHRs) to develop interpretable machine learning models for identifying prognostic factors in CRC. Using a retrospective cohort of 8,247 patients, we extracted 1,247 features from EHRs, including demographic, laboratory, treatment, and natural language processing (NLP)-derived data. After rigorous feature selection, six machine learning models were evaluated. XGBoost achieved the highest performance (C-index: 0.798, 95% CI: 0.785–0.811), significantly outperforming traditional Cox models (C-index: 0.742) and established prognostic scores. SHAP and LIME analyses identified both established prognostic factors (e.g., TNM stage, age) and novel predictors, such as temporal albumin trends and neutrophil-to-lymphocyte ratio (NLR); the novel predictors accounted for 40% of the top prognostic features. Oncology experts reviewed the findings and confirmed their clinical relevance and biological plausibility. The study demonstrates that XAI-enhanced models can improve prognostic accuracy while providing transparent, actionable insights that connect complex machine learning outputs to clinical decision-making. These results highlight the potential of integrating comprehensive EHR data with XAI to advance precision oncology in CRC care.
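
For readers unfamiliar with the concordance index (C-index) reported in the abstract, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the study's implementation: the function name `concordance_index`, the simplified handling of tied survival times, and the five-patient toy data are all assumptions made for illustration. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the paper's code).

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    risks  : predicted risk score; higher means worse expected prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true ordering is unknown
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # model ranked the earlier event as higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study cohort.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.6, 0.3, 0.1]

print(concordance_index(toy_times, toy_events, toy_risks))  # 0.875
```

In the toy data, 8 pairs are comparable and 7 of them are ranked correctly, which gives 7/8 = 0.875. Read the same way, the reported C-index of 0.798 for XGBoost means that the model ranks roughly 80% of comparable patient pairs in the correct order.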

Published

2025-11-05

How to Cite

Amena Mahmoud. (2025). Explainable AI for Prognostic Factor Identification in Colorectal Cancer: An Electronic Health Records Analysis. International Journal of Mathematics, Statistics, and Computer Science, 3, 457-475. https://doi.org/10.59543/ijmscs.v3i.15123

Section

Articles