As the inference accuracy of machine learning models has improved, deep learning (DL) models have been increasingly applied to structured data in a wide spectrum of real-world applications, including product recommendation, online advertising, healthcare analytics and risk analysis. However, unlike unstructured data, structured data is high-dimensional and sparse, and therefore gives rise to a large number of parameters in DL models, making them more prone to overfitting. To alleviate overfitting, various regularization methods have been designed that constrain the model parameters as a means of controlling model complexity. Unfortunately, these methods are often restricted to regularizing the parameter values directly, without considering the intrinsic correlations and dependencies between the attribute fields of structured data, which are key to effective structured data modeling. In this paper, we re-examine DL for structured data from the new perspective of attribute interactions. In particular, we seek to explicitly model and regularize the pairwise relationships between attribute fields of structured data, in a field-adaptive manner, via a proposed attentive and interpretable framework called ATT-Reg. Specifically, in this framework, a set of attentive weight matrices is introduced for each attribute field to model its clearly different relationships with its neighboring attribute fields. Further, we derive from the Bayesian viewpoint a novel Attentive Regularization method that imposes adaptive regularization strengths on different pairs of attribute fields, based on the informativeness of their relationship, which is calculated using both data-driven information and functional dependency (FD) knowledge. Such adaptive regularization helps each attribute field learn discriminative and diversified representations for more effective predictive analytics.
We also develop a feature attribution method to support more interpretable predictions. We validate the effectiveness of ATT-Reg on six real-world datasets. Extensive experimental results show that ATT-Reg achieves significant improvements over state-of-the-art graph models, attentive models and regularization methods, and supports a high degree of interpretability.