Abstract

In this paper, motivated by an important problem in evolutionary biology, we develop two sieve-type estimators for distributions that are mixtures of a finite number of discrete atoms and continuous distributions under the framework of measurement error models. While there is a large literature on deconvolution problems, only two articles have previously addressed the problem taken up in our article, and they use relatively standard Fourier deconvolution. As a result, the estimators suggested in those two articles are seriously degraded by boundary effects and negativity. A major contribution of our article is the correct handling of boundary effects; our method is asymptotically unbiased at the boundaries and is guaranteed to be nonnegative. We use roughness penalization to improve the smoothness of the resulting estimator and reduce the estimation variance. We illustrate the performance of the proposed estimators via our real driving application in evolutionary biology and two simulation studies. Furthermore, we establish asymptotic properties of the proposed estimators.

Highlights

  • The research described in this paper is motivated primarily by an important application in evolutionary biology, where biologists are interested in estimating the distribution of virus mutation effects (Burch et al., 2007; Lee et al., 2010)

  • The target distribution of the mutation effect is a mixture of a point mass at zero and a positively supported continuous distribution, which has a non-smooth left boundary at the origin, i.e. the density is discontinuous at the left boundary of the support

  • Sieve type estimators have been proposed for deconvolution problems by Cordy and Thomas (1997), whose technique can be extended to our sieve estimator (Section 3.2) when degenerate distributions are used to approximate the continuous mixture component
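
The mixture structure described in these highlights can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes an exponential continuous component, additive normal measurement error, and hypothetical parameter values (`p_zero`, `rate`, `noise_sd`). It shows why the problem is hard: the true effects are never negative and have an atom at zero, yet the noisy observations spread across the origin.

```python
import random


def simulate_observations(n, p_zero=0.3, rate=2.0, noise_sd=0.5, seed=0):
    """Draw n (true effect, noisy observation) pairs.

    Assumed model (hypothetical, for illustration only):
      effect = 0                  with probability p_zero (point mass at zero)
      effect ~ Exponential(rate)  otherwise (positively supported component)
      observation = effect + Normal(0, noise_sd) measurement error
    """
    rng = random.Random(seed)
    effects, observed, from_atom = [], [], []
    for _ in range(n):
        is_atom = rng.random() < p_zero
        x = 0.0 if is_atom else rng.expovariate(rate)
        effects.append(x)
        observed.append(x + rng.gauss(0.0, noise_sd))
        from_atom.append(is_atom)
    return effects, observed, from_atom


effects, observed, from_atom = simulate_observations(10_000)

# Share of draws that came from the point mass at zero.
frac_zero = sum(from_atom) / len(from_atom)
# Share of noisy observations below zero, although no true effect is negative.
frac_negative_obs = sum(1 for y in observed if y < 0.0) / len(observed)

print(f"fraction of true effects at the atom: {frac_zero:.3f}")
print(f"fraction of negative observations:    {frac_negative_obs:.3f}")
```

In this setting, an estimator has to recover both the weight of the atom and the shape of the continuous component from `observed` alone, including the discontinuity of the density at the origin.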


Summary

Introduction

The research described in this paper is motivated primarily by an important application in evolutionary biology, where biologists are interested in estimating the distribution of virus mutation effects (Burch et al., 2007; Lee et al., 2010). Only two recent works (van Es et al., 2008; Lee et al., 2010) consider mixtures of one discrete atom and one continuous component in the context of measurement error models. An interesting direction for future work is to extend boundary-correction methods, for example the boundary kernels of Zhang and Karunamuni (2009), to the problem of estimating discrete/continuous mixture distributions, which is the focus of this paper. Sieve-type estimators have been proposed for deconvolution problems by Cordy and Thomas (1997), whose technique can be extended to our sieve estimator (Section 3.2) when degenerate distributions are used to approximate the continuous mixture component. The online supplement (Lee et al., 2013) contains technical details of the optimization algorithm for our estimators and the proofs of the theorems.

Description of the virus lineage data
Model and methodology
A standard sieve estimator
Parameter selection
Simulation studies
Simulation Study I
Simulation Study II
Estimation results
Validation of the exponential assumption
Findings
Consistency of the proposed estimators