Class expression induction as concept space exploration: From DL-Foil to DL-Focl

Giuseppe Rizzo, Nicola Fanizzi, Claudia D’Amato

doi:10.1016/j.future.2020.02.071

Abstract

The Web of Data is one of the perspectives of the Semantic Web. In this context, concept learning services, supported by multirelational machine learning, have been integrated into various tools that help knowledge engineers carry out tasks related to the construction, completion and maintenance of knowledge bases: essentially, they are used to elicit new candidate concept definitions (i.e. axioms regarding classes) to be incorporated into the knowledge bases, possibly as replacements for previous ones. Several reference approaches rely on a covering strategy to generalize input examples, which can be regarded as a form of hill-climbing search that explores a huge discrete conceptual space. Methods adopting this strategy are known to be affected by an inherent problem of myopia. In particular, our DL-Foil has been shown to suffer from this problem, as its algorithm is based on a stochastic yet informed exploration of the concept space, by means of a refinement operator, to generate partial descriptions iteratively. To tackle this problem and enhance the performance of our system, we have introduced a series of extensions of the original DL-Foil algorithm that have led to various releases of its spin-off, DL-Focl. Essentially, they aim at reducing the aforementioned problem through specific strategies grounded on either the integration of meta-heuristics, such as repeated hill-climbing and tabu search, or the employment of some form of lookahead. In this work, we present consolidated and extended releases of both DL-Foil and DL-Focl along various dimensions: better heuristics and stop conditions; more complex refinement operators, with the possibility to perform the specialization adopting iterative deepening or lookahead strategies; and improved versions of the algorithm based on the repeated hill-climbing strategy, with new quality criteria, and of the one based on tabu search, with a different policy for managing the local memory.
All the implementations of these approaches have been extensively evaluated in three experimental sessions involving various publicly available knowledge bases and fragments extracted from the Linked Data cloud. The results indicate some lessons to be learned: our approaches outperformed a popular reference system from the DL-Learner framework on learning problems where the open-world semantics is explicitly considered, and they exhibited comparable performance on a benchmark of datasets from contexts with an intended underlying closed-world semantics.

Full Text