An Interpretable Hybrid AI Framework Combining Machine Learning and Large Language Models for Clinical Diabetes Prediction

Authors

  • Radha A
  • Mallika R.M

DOI:

https://doi.org/10.46647/rdems0205060

Keywords:

Diabetes mellitus; Machine learning; Large language model; Clinical decision support; Risk prediction; Explainable AI; GPT-4; Claude; Gemini

Abstract

The hybrid system comprises three classifiers (Logistic Regression, Gradient Boosting, and Random Forest) and uses a diagnostic schema of 100 input parameters to build a synthetic cohort of 1,000 patients that is representative of the real-world patient population in its demographic, clinical, lifestyle, and laboratory characteristics. Classifier performance was evaluated using accuracy, AUC, precision, recall, and F1 score. The large language model (LLM) component was evaluated and compared across GPT-4, Claude 3.5 Sonnet, and Gemini 1.5 Pro on three tasks: structured data extraction, schema validation, and explanation generation. All three classifiers achieved an AUC greater than 0.960 and an F1 score greater than 0.969; Logistic Regression had the highest AUC (0.972) and the most closely balanced precision (0.979) and recall (0.984). Among the LLMs, GPT-4 showed the highest schema adherence (96.2%) and explanation faithfulness (94.8%), with Claude 3.5 Sonnet close behind on both metrics (schema adherence 95.6%, explanation faithfulness 93.7%). The hybrid system separates the generative language abilities of the LLM from the numerical predictive abilities of the classifiers, so that risk scores come from the classifiers while the LLM handles extraction, validation, and explanation. The proposed system can therefore combine the performance advantages of calibrated ML classifiers with the clinical usability of an LLM-enabled interface, supporting early diabetes risk assessment while keeping the clinical decision-making process transparent.
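As a quick consistency check on the reported metrics, the F1 score implied by the Logistic Regression precision and recall can be recomputed from the standard harmonic-mean definition. The sketch below is illustrative only: the function and variable names are hypothetical, and the only inputs are the two values quoted in the abstract, not the authors' data or code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for Logistic Regression.
lr_precision = 0.979
lr_recall = 0.984

# Hypothetical variable name; this is a recomputation, not a reported figure.
lr_f1 = f1_score(lr_precision, lr_recall)
print(f"Implied F1 for Logistic Regression: {lr_f1:.3f}")
```

The recomputed value lies above the 0.969 floor stated for all three classifiers, so the quoted precision, recall, and F1 figures are mutually consistent.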

Published

2026-05-14