Abstract

Claims modeling is a classical actuarial task that aims to understand the claim distribution given a set of risk factors. Yet some risk factors may be subject to misrepresentation, giving rise to bias in the estimated risk effects. Motivated by the unique characteristics of real health insurance data, we propose a novel class of two-part aggregate loss models that can (a) account for the semi-continuous feature of aggregate loss data, (b) test and adjust for misrepresentation risk in insurance ratemaking, and (c) incorporate an arbitrary number of correctly measured risk factors. The unobserved status of misrepresentation is captured via a latent factor shared by the two regression models on the occurrence and size of aggregate losses. For the complex two-part model, we derive explicit iterative formulas for the expectation maximization algorithm adopted in parameter estimation. Analytical expressions are obtained for the observed Fisher information matrix, ensuring computational efficiency in large-sample inference on risk effects. We perform extensive simulation studies to demonstrate the convergence of the estimators and their robustness under model misspecification. We illustrate the practical usefulness of the models through two empirical applications based on real medical claims data.
