Abstract

Assays that measure substance concentration in urine, serum or other biological matrices often have a limit of detection. When concentration levels fall below the limit, the exact measures cannot be obtained and are therefore left censored. Common practice for addressing the censoring issue is to delete or 'fill in' the censored observations in data analysis, which often results in biased or inefficient estimates. Assuming the concentration or transformed concentration follows a normal distribution, a Tobit regression model can be applied. When the study population is heterogeneous, for example due to the existence of a latent group of subjects who lack the substance, the problem becomes more challenging. In this paper, we conduct intensive simulation studies to investigate the statistical issues in analyzing censored data and compare different methods in which the censored data are treated either as a dependent variable or as an independent variable. We also analyze triclosan data in the NHANES study and metabolites data in the Bogalusa Heart Study to illustrate the issues. Some guidelines for analyzing such censored data are provided.

Highlights

  • Measures of substance concentration in urine, serum or other biological matrices that fall below the assay limit of detection are common in epidemiological and medical research (Nassan et al, 2017; Ferrero et al, 2017; Østergren et al, 2017; Kim et al, 2018; Zhao et al, 2018; Gomez et al, 2019; Maule et al, 2019)

  • We propose a mixture model for outcomes obtained from heterogeneous populations, and develop joint modeling for a censored predictor that comes from either a single population or heterogeneous populations

  • Because the Tobit model is inappropriate for modeling outcomes from heterogeneous populations, a mixture model is proposed for those cases (Moulton and Halsey, 1995; Taylor et al, 2001; Reisetter et al, 2017)

Summary

Introduction

Measures of substance concentration in urine, serum or other biological matrices that fall below the assay limit of detection are common in epidemiological and medical research (Nassan et al, 2017; Ferrero et al, 2017; Østergren et al, 2017; Kim et al, 2018; Zhao et al, 2018; Gomez et al, 2019; Maule et al, 2019). When the concentration levels are under the detection limit (DL), denoted as L, accurate measures cannot be obtained. Instead, their values are only partially known and are left censored. Besides deleting the censored observations, a common approach is the 'fill-in' method, in which the censored observations are replaced by a constant such as L. This approach is widely applied in epidemiological and medical research because it is simple to implement, but it often leads to biased estimates. In the triclosan example, subjects who have triclosan in their urine, namely the exposed group, but whose concentration levels are lower than L are censored. If the data from the exposure population follow a censored normal distribution, data from the whole sample follow a mixture of censored normal and degenerate distributions. In this case, applying a Tobit model can produce biased estimates. We conduct intensive simulation studies to investigate statistical issues for different methods and use two real data examples to illustrate the methods.
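To make the contrast between the fill-in method and the Tobit model concrete, the following is a minimal Python sketch rather than the authors' implementation. It simulates a single exposure population with one covariate, censors the outcome at L, and compares the least-squares slope after filling in L with a Tobit maximum likelihood fit. It also writes down a mixture likelihood for the heterogeneous case. All function names, parameter values and the simulated data are illustrative assumptions, as is the assumption that subjects in the latent unexposed group always fall below L.

```python
import numpy as np
from scipy import optimize, stats


def tobit_negloglik(params, y, X, L):
    """Negative log-likelihood of a normal model left-censored at L."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # log scale keeps sigma positive
    mu = X @ beta
    censored = y <= L  # censored values are stored as L
    ll = np.where(
        censored,
        stats.norm.logcdf((L - mu) / sigma),              # P(Y < L)
        stats.norm.logpdf((y - mu) / sigma) - log_sigma,  # density of observed Y
    )
    return -ll.sum()


def mixture_negloglik(params, y, X, L):
    """Tobit likelihood plus a latent unexposed group of proportion p.

    Assumption (hypothetical): unexposed subjects always fall below L.
    """
    p = 1.0 / (1.0 + np.exp(-params[-1]))  # logit scale keeps p in (0, 1)
    beta, log_sigma = params[:-2], params[-2]
    sigma = np.exp(log_sigma)
    mu = X @ beta
    censored = y <= L
    ll = np.where(
        censored,
        np.log(p + (1.0 - p) * stats.norm.cdf((L - mu) / sigma)),
        np.log1p(-p) + stats.norm.logpdf((y - mu) / sigma) - log_sigma,
    )
    return -ll.sum()


def fit_tobit(y, X, L):
    """Maximize the Tobit likelihood, starting from the least-squares fit."""
    start = np.append(np.linalg.lstsq(X, y, rcond=None)[0], np.log(y.std()))
    res = optimize.minimize(tobit_negloglik, start, args=(y, X, L), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])


# Simulated single exposure population (illustrative values only).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.0, 1.0])
y_latent = X @ true_beta + rng.normal(size=n)
L = -0.5
y_obs = np.where(y_latent < L, L, y_latent)  # fill in L below the limit

# Fill-in method: ordinary least squares on the filled-in outcome.
beta_fill = np.linalg.lstsq(X, y_obs, rcond=None)[0]

# Tobit model: treat values at L as left censored.
beta_tobit, sigma_tobit = fit_tobit(y_obs, X, L)

# Sanity check: with p close to 0 the mixture reduces to the Tobit model.
params_hat = np.append(beta_tobit, np.log(sigma_tobit))
tobit_nll = tobit_negloglik(params_hat, y_obs, X, L)
mixture_nll = mixture_negloglik(np.append(params_hat, -50.0), y_obs, X, L)

print("fill-in slope:", beta_fill[1])
print("Tobit slope:  ", beta_tobit[1])
```

In this sketch the fill-in slope is pulled toward zero because every censored outcome is replaced by the same value, while the Tobit fit uses the probability of falling below L for those observations. The mixture likelihood adds the proportion p of unexposed subjects to that probability, which is the part a plain Tobit model omits when the sample is heterogeneous.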

As Dependent Variable
Tobit Model for Single Population
Mixture Model for Heterogeneous Populations
As Independent Variable
Predictor From Heterogeneous Populations
Simulation Studies
From a Single Exposure Population
From Heterogeneous Populations
Bogalusa Heart Study
Findings
Discussion
