Abstract

Population-based cancer registry data provide a key epidemiological resource for monitoring cancer in defined populations. Validation of the data variables contributing to a common data set is necessary to remove statistical bias; the process is currently performed centrally. An ontology-based approach promises advantages in devolving the validation process to the registry level, but the checks regarding multiple primary tumours have presented a hurdle. This work presents a solution by modelling the international rules for multiple primary cancers in description logic. Topography groupings described in the rules had to be further categorised in order to simplify the axioms. Description logic expressivity was constrained as far as possible for reasons of automatic reasoning performance. The axioms were consistently able to trap all the different types of scenarios signalling violation of the rules. Batch processing of many records was performed using the Web Ontology Language application programme interface. Performance issues for large data sets were circumvented by using the software interface to perform the reasoning operations on the basis of the axioms encoded in the ontology. These results remove one remaining hurdle in developing a purely ontology-based solution for performing the European harmonised data-quality checks, with a number of inherent advantages including the formalisation and integration of the validation rules within the domain data model itself.

Highlights

  • Population-based cancer registries (CRs) play a pivotal role in the surveillance of cancer at population level [1]

  • Axiom (A2) flags the fact that patient p1 contains at least one invalid MPT: where the morphology codes are identical, the couplet is an invalid multiple primary tumour (MPT) and is classified as a duplicate primary tumour

  • Axiom alerts user


Summary

Methods

A cancer patient may have multiple cancers, but the condition for an MPT is that the cancers are independent of each other. For the purpose of reporting cancer incidence rates, tumours that are not independent are counted only once. The rules determining MPTs depend on topography (location of the tumour) and morphology (form/structure of the tumour), where topographies and morphologies are encoded according to the third edition of the International Classification of Diseases for Oncology (ICD-O-3 [27]). The process for determining a violation of the rules for MPTs is shown in the flow chart of Figure 1. All tumours recorded for an MPT patient need to be compared pairwise.
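The pairwise comparison described above can be sketched in a few lines of Python. This is only an illustration and not the description-logic axioms of the paper: the `Tumour` class, the `same_site` grouping (which simply compares the three-character topography site) and the `is_suspect_pair` test are all hypothetical simplifications, and the actual international rules rely on topography and morphology groupings that are not reproduced here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Tumour:
    tumour_id: str
    topography: str  # ICD-O-3 topography code, e.g. "C50.1"
    morphology: str  # ICD-O-3 morphology code, e.g. "8500/3"


def same_site(a: Tumour, b: Tumour) -> bool:
    # ASSUMPTION: placeholder grouping that compares only the
    # three-character site (e.g. "C50"). The real rules define
    # their own topography groupings.
    return a.topography[:3] == b.topography[:3]


def is_suspect_pair(a: Tumour, b: Tumour) -> bool:
    # ASSUMPTION: simplified stand-in for the rule check. A pair at the
    # same site with identical morphology codes is treated as a
    # possible duplicate rather than two independent primaries.
    return same_site(a, b) and a.morphology == b.morphology


def suspect_pairs(tumours: list[Tumour]) -> list[tuple[str, str]]:
    # Compare every unordered pair of tumours once: n tumours give
    # n * (n - 1) / 2 comparisons.
    return [
        (a.tumour_id, b.tumour_id)
        for a, b in combinations(tumours, 2)
        if is_suspect_pair(a, b)
    ]


# Hypothetical records for one patient.
patient_tumours = [
    Tumour("t1", "C50.1", "8500/3"),
    Tumour("t2", "C50.4", "8500/3"),
    Tumour("t3", "C18.7", "8140/3"),
]
flagged = suspect_pairs(patient_tumours)
```

With these hypothetical records, three pairs are compared and only the pair (t1, t2) is flagged, since both tumours share the site C50 and the same morphology code. Because each pair is checked independently, the number of comparisons grows quadratically with the number of tumours per patient, which is usually small.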

