Abstract

Appropriate reference intervals are essential when using laboratory test results to guide medical decisions. Conventional approaches for the establishment of reference intervals rely on large samples from healthy and homogeneous reference populations. However, this approach is associated with substantial financial and logistic challenges, subject to ethical restrictions in children, and limited in older individuals due to the high prevalence of chronic morbidities and medication. To overcome these restrictions, we implemented an indirect method for reference interval estimation, which uses mixed physiological and abnormal test results from clinical information systems. The algorithm minimizes the difference between an estimated parametric distribution and a truncated part of the observed distribution, specifically, the Kolmogorov-Smirnov distance between a hypothetical Gaussian distribution and the observed distribution of test results after Box-Cox transformation. Simulations of common laboratory tests with increasing proportions of abnormal test results show reliable reference interval estimations even in challenging simulation scenarios, when <20% of test results are abnormal. Additionally, reference intervals generated using samples from a university hospital’s laboratory information system with a gradually increasing proportion of abnormal test results remained stable, even if samples from units with a substantial prevalence of pathologies were included. A high-performance open-source C++ implementation is available at https://gitlab.miracum.org/kosmic.

Highlights

  • Indirect methods use data from laboratory information systems, which contain both physiological and abnormal test results, to overcome the restrictions mentioned above[7,8,9]

  • A method developed by Arzideh et al.[9,16,17,18,19] has been used to establish reference intervals for adults[13] and children[8,10,11,12,20]. This method uses a truncation interval of the range of test results in the input dataset after Box-Cox transformation to estimate a distribution of supposedly physiological test results, and can estimate non-Gaussian distributions

  • We employ an approach based on previous works by Arzideh et al.[9,16,17,18,19] and our experience in applying them to pediatric and adult datasets[8,10,11,12,13]. This procedure is based on the assumption that the physiological samples in the input dataset can be modeled with a parametric distribution, and that a truncation interval T exists within the dataset, in which the proportion of abnormal test results is negligible
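To illustrate how a Gaussian distribution on the Box-Cox-transformed scale translates into a reference interval on the original scale, the following is a minimal sketch. It assumes the conventional definition of a reference interval as the central 95% of physiological results (2.5th to 97.5th percentile); the function names and the example parameter values are hypothetical and are not taken from the published implementation.

```python
import math
from statistics import NormalDist


def box_cox(x, lam):
    """Box-Cox transform of a positive value x (log transform when lam == 0)."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a value from the Box-Cox-transformed scale back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


def reference_interval(lam, mu, sigma, lower=0.025, upper=0.975):
    """Reference limits implied by a Gaussian(mu, sigma) on the transformed scale.

    The 2.5th and 97.5th percentiles are an assumption here (the conventional
    central 95% interval); other percentiles can be passed explicitly.
    """
    model = NormalDist(mu, sigma)
    return (
        inverse_box_cox(model.inv_cdf(lower), lam),
        inverse_box_cox(model.inv_cdf(upper), lam),
    )


# Hypothetical example: a log-normal analyte (lam = 0) with a median of 14 units.
low, high = reference_interval(lam=0, mu=math.log(14), sigma=0.1)
print(f"reference interval: {low:.2f} to {high:.2f}")
```

With λ = 0 the transformation is a logarithm, so the interval is asymmetric around the median on the original scale, which is how the approach can describe non-Gaussian distributions of test results.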


Summary

Introduction

Indirect methods use data from laboratory information systems, which contain both physiological and abnormal test results, to overcome the restrictions mentioned above[7,8,9]. As large numbers of test results are readily available from laboratory information systems, this enables the establishment of reference intervals specific to different populations, age-groups, analytical devices, and even batches and reagents. Extensive experience with these methods exists in children, where unique ethical challenges limit access to blood samples to create reference intervals[8,10,11,12], and in challenging adult populations with a high proportion of patients with substantial morbidity and mortality[13]. The truncation interval, the Box-Cox transformation parameter λ, and the parameters of the Gaussian distribution μ and σ are estimated using an elaborate statistical process, which is implemented within a freely available software package (https://www.dgkl.de/verbandsarbeit/arbeitsgruppen/entscheidungsgrenzen-richtwerte/). Because this implementation relies on both Microsoft Excel and the R software environment, it requires human interaction, cannot be integrated into analysis pipelines, and suffers from technical difficulties and poor performance (reference interval estimation can take minutes). The resulting lack of confidence intervals limits more widespread use and enhancement of this approach. To facilitate evaluation of the algorithm, a web-based application allows analysis of datasets without local installation of the provided tools.
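The core idea of comparing a Gaussian model with the observed distribution inside a truncation interval can be sketched as follows. This is one possible formulation under stated assumptions, not the published estimation procedure: the function `truncated_ks_distance`, the simulated data, the fixed truncation interval, and the one-dimensional grid over μ are all hypothetical simplifications (the actual method also estimates λ, σ, and the truncation interval itself).

```python
import math
import random
from statistics import NormalDist


def box_cox(x, lam):
    """Box-Cox transform of a positive value x (log transform when lam == 0)."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def truncated_ks_distance(values, lam, mu, sigma, t_low, t_high):
    """Kolmogorov-Smirnov distance inside the truncation interval [t_low, t_high].

    Compares the empirical CDF of the results inside the interval with a
    Gaussian(mu, sigma) conditioned on the same interval, both evaluated on
    the Box-Cox-transformed scale.
    """
    model = NormalDist(mu, sigma)
    a, b = box_cox(t_low, lam), box_cox(t_high, lam)
    mass = model.cdf(b) - model.cdf(a)  # model probability inside the interval
    inside = sorted(box_cox(v, lam) for v in values if t_low <= v <= t_high)
    n = len(inside)
    if mass <= 0 or n == 0:
        return 1.0  # worst possible distance when nothing can be compared
    distance = 0.0
    for i, y in enumerate(inside, start=1):
        f = (model.cdf(y) - model.cdf(a)) / mass  # conditional model CDF
        # The empirical CDF jumps from (i - 1) / n to i / n at y.
        distance = max(distance, i / n - f, f - (i - 1) / n)
    return distance


# Simulated mixed dataset (all numbers are arbitrary assumptions):
# 90% physiological results, 10% abnormal results shifted to higher values.
rng = random.Random(42)
physiological = [math.exp(rng.gauss(2.6, 0.1)) for _ in range(9000)]
abnormal = [math.exp(rng.gauss(3.2, 0.3)) for _ in range(1000)]
results = physiological + abnormal

# Truncation interval chosen by hand so that few abnormal results fall inside.
t_low, t_high = math.exp(2.4), math.exp(2.7)

d_true = truncated_ks_distance(results, 0, 2.6, 0.1, t_low, t_high)
d_wrong = truncated_ks_distance(results, 0, 2.8, 0.1, t_low, t_high)

# Crude grid search over mu only, with lam and sigma held fixed.
candidates = [2.4 + 0.05 * k for k in range(9)]
best_mu = min(
    candidates,
    key=lambda m: truncated_ks_distance(results, 0, m, 0.1, t_low, t_high),
)
print(f"distance at true mu: {d_true:.3f}, at shifted mu: {d_wrong:.3f}")
print(f"best mu on grid: {best_mu:.2f}")
```

In this sketch the distance is small for the parameters used to simulate the physiological results and grows when μ is shifted, even though 10% of all results are abnormal, because the comparison is restricted to an interval in which abnormal results are rare.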

