Attention-Based Learning of Tabular Data
Abstract
Deep learning has had tremendous success with text and image data under supervised learning. However, labeling real-world data is costly, and labels are often unavailable. Despite this success, deep learning struggles with classification and clustering of tabular data. Tabular data is at the heart of many application domains, and there traditional machine learning still outperforms deep learning. The overarching goal of this thesis is to narrow that gap by contributing to two objectives. First, we challenge the assumption that data are independent and identically distributed (i.i.d.), which is standard in machine learning. We use graph neural networks to capture relationships between samples and attention-based learning to prioritize specific features in tabular data. This approach relaxes the i.i.d. assumption and lets us leverage relationships between both features and samples. Results show that relaxing the i.i.d. assumption beats traditional methods on six out of ten datasets. Second, we explore the feasibility of deep learning for clustering. To this end, we propose a novel deep clustering method that incorporates attention. Clustering accuracy measured on sixteen tabular datasets demonstrates the effectiveness of between-feature attention for deep clustering. Furthermore, our method outperforms existing deep clustering methods, bringing deep clustering closer to traditional methods. Overall, the results of this thesis show that relaxing the i.i.d. assumption improves representation learning on tabular data, as demonstrated by better classification and clustering performance. However, traditional machine learning remains competitive. We discuss the limitations of deep learning and of our proposed method, along with directions for future work to improve tabular data representation.
Subject Area
Computer science|Information science
Recommended Citation
Shourav Rabbani, "Attention-Based Learning of Tabular Data" (2024). ETD Collection for Tennessee State University. Paper AAI30996950.
https://digitalscholarship.tnstate.edu/dissertations/AAI30996950