Date of Award
12-11-2025
Degree Type
Thesis
Degree Name
Master of Science (M.S.)
Department
Computer Science
First Advisor
Manar Samad
Abstract
Understanding causal relationships between input variables and outcomes is critical for scientific advancement. The famous adage in medicine, "correlation does not imply causation," holds profound significance for understanding disease etiology. Although explainable machine learning (ML) has shown major advances in predictive modeling, it remains unclear how ML-derived important variables relate to causal variables. This thesis investigates the association between causal variables and those identified as correlated and important for ML-based prediction to bridge a critical knowledge gap in data science. Specifically, the thesis explores two research questions: when and to what extent are (1) statistically correlated variables also causal, and (2) variables important for ML-based predictions also causal? To answer these questions, this work introduces a novel framework for nonlinear Causal Structure Discovery (CSD) to quantify causal strengths and enable CSD with mixed-type data. The proposed method is evaluated on real-world heart failure (HF) data sets and validated on 16 tabular data sets from diverse domains. Comparative experiments demonstrate that the nonlinear CSD model identifies more clinically meaningful causal variables than its linear counterpart. Findings show that important features of the ML classifiers are strongly associated with causal variables in male and female heart failure diagnoses. A similar association is observed in 11 out of 16 tabular data sets. Overall, nonlinear CSD models are more accurate than their linear counterparts and produce results more closely aligned with ML-based variable importance. This thesis demonstrates an effective method to provide causal explainability of machine learning through variable importance.
Recommended Citation
Hou, Yina, "Causal Structure Discovery for Explaining Important Features in Machine Learning" (2025). Tennessee State University Alumni Theses and Dissertations. 305.
https://digitalscholarship.tnstate.edu/alumni-etd/305
