Abstract

A workmanlike but nevertheless very effective combination of statistical and relational learning uses a statistical learner to construct models with features identified (quite often, separately) by a relational learner. This form of model-building has a long history in Inductive Logic Programming (ILP), with roots in the early 1990s with the LINUS system. Additional work has also been done in the field under the categories of propositionalisation and relational subgroup discovery, where a distinction has been made between elementary and non-elementary features, and statistical models have been constructed using one or the other kind of feature. More recently, constructing relational features has become an essential step in many model-building programs in the emerging area of Statistical Relational Learning (SRL). To date, not much work—theoretical or empirical—has been done on what kinds of relational features are sufficient to build good statistical models. On the face of it, the features that are needed are those that capture diverse and complex relational structure. This suggests that the feature-constructor should examine as rich a space of relational descriptions as possible. One example is the space of all possible features in first-order logic, given the constraints of the problem being addressed. In practice, it may be intractable for a relational learner to search such a space effectively for features potentially useful to a statistical learner. Additionally, the statistical learner may itself be able to capture some kinds of complex structure by combining simpler features. Based on these observations, we investigate empirically whether it is acceptable for a relational learner to examine a more restricted space of features than is actually necessary for the full statistical model.
Specifically, we consider five sets of features, partially ordered by the subset relation, bounded above by F_d, the set of features corresponding to definite clauses subject to domain-specific restrictions, and bounded below by F_e, the set of "elementary" features subject to substantial additional constraints. Our results suggest that: (a) for the relational datasets used in the ILP literature, features from F_d may not be required; and (b) models obtained with a standard statistical learner using features from the more restricted subsets are comparable to the best reported to date.
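The two-stage scheme the abstract describes — a relational learner proposing boolean features, a statistical learner then modelling over the resulting feature vectors — can be sketched as below. This is a minimal illustration, not the paper's code: the toy relational facts, the has_bond feature template, and the fixed feature set are all hypothetical, standing in for features a relational learner would find by searching a clause space such as F_e or F_d.

```python
# Toy propositionalisation sketch (hypothetical example, not the paper's code).
# Relational "background knowledge": bond facts (molecule, atom1, atom2).
bonds = {("m1", "c", "o"), ("m1", "c", "c"), ("m2", "c", "h")}

# An "elementary"-style feature: a single existential condition on one example.
def has_bond(atom1, atom2):
    return lambda mol: any(m == mol and a == atom1 and b == atom2
                           for (m, a, b) in bonds)

# A relational learner would discover such features by clause search;
# here we simply fix a small illustrative feature set by hand.
features = [has_bond("c", "o"), has_bond("c", "c"), has_bond("c", "h")]

def propositionalise(mol):
    """Map a relational example to a boolean feature vector,
    ready for any standard statistical learner."""
    return [int(f(mol)) for f in features]

print(propositionalise("m1"))  # [1, 1, 0]
print(propositionalise("m2"))  # [0, 0, 1]
```

The feature vectors produced this way are ordinary tabular data, which is what lets an off-the-shelf statistical learner be applied unchanged in the second stage.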
