Pelvic ring disruptions result from blunt injury mechanisms and are potentially lethal mainly due to associated injuries and massive pelvic hemorrhage. The severity of pelvic fractures in trauma victims is frequently assessed by grading the fracture according to the Tile AO/OTA classification in whole-body Computed Tomography (CT) scans. Due to the high volume of whole-body CT scans generated in trauma centers, the overall information content of a single whole-body CT scan and low manual CT reading speed, an automatic approach to Tile classification would provide substantial value, e. g., to prioritize the reading sequence of the trauma radiologists or enable them to focus on other major injuries in multi-trauma patients. In such a high-stakes scenario, an automated method for Tile grading should ideally be transparent such that the symbolic information provided by the method follows the same logic a radiologist or orthopedic surgeon would use to determine the fracture grade. This paper introduces an automated yet interpretable pelvic trauma decision support system to assist radiologists in fracture detection and Tile grading. To achieve interpretability despite processing high-dimensional whole-body CT images, we design a neurosymbolic algorithm that operates similarly to human interpretation of CT scans. The algorithm first detects relevant pelvic fractures on CTs with high specificity using Faster-RCNN. To generate robust fracture detections and associated detection (un)certainties, we perform test-time augmentation of the CT scans to apply fracture detection several times in a self-ensembling approach. The fracture detections are interpreted using a structural causal model based on clinical best practices to infer an initial Tile grade. We apply a Bayesian causal model to recover likely co-occurring fractures that may have been rejected initially due to the highly specific operating point of the detector, resulting in an updated list of detected fractures and corresponding final Tile grade. Our method is transparent in that it provides fracture location and types, as well as information on important counterfactuals that would invalidate the system's recommendation. Our approach achieves an AUC of 0.89/0.74 for translational and rotational instability,which is comparable to radiologist performance. Despite being designed for human-machine teaming, our approach does not compromise on performance compared to previous black-box methods.
Read full abstract