Abstract

Data validation in cancer registration is a critical operation, but it is resource-intensive and has traditionally depended on proprietary software. Ontology-based AI is a novel approach that uses machine reasoning over axioms formally described in description logic. It differs from deep learning techniques but does not exclude them. The advantage of the ontology approach lies in its ability to address a number of challenges concurrently. Its disadvantages are the computational costs, which increase with language expressivity and data set size, and the class containment restrictions imposed by description logics. Both aspects would benefit from the availability of design patterns, which is the motivation behind this study. We modelled the European cancer registry data validation rules in description logic using a number of design patterns and showed the viability of the approach. Reasoning speed is a limiting factor for large cancer registry data sets comprising many hundreds of thousands of records, but this can be offset to a certain extent by developing the ontology in a modular way. Data validation is also a highly parallelisable process. Important future work in this domain would be to identify and optimise reusable design patterns, paying particular attention to avoiding unintended reasoning efficiency hotspots.

