Abstract

This paper proposes a fast and accurate empirical saddlepoint approximation algorithm, EmpSPA, that is applicable to large-scale genome-wide association studies. Saddlepoint approximation (SPA) uses the entire cumulant-generating function (CGF), which considerably improves the accuracy of approximating the null distribution of a test statistic. In some cases, however, it is technically challenging to derive a closed-form expression for the CGF, which limits the use of SPA. We propose an empirical approach to estimating the CGF that bypasses this technical challenge and therefore has high potential for extension to complex trait analysis. Furthermore, we establish the asymptotic equivalence of SPA and EmpSPA. Finally, numerical simulations demonstrate that EmpSPA and SPA perform similarly and that both are more accurate than the standard normal approximation.
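To make the idea of an empirically estimated CGF concrete, the following is a minimal sketch, not the authors' EmpSPA implementation. It assumes the statistic is a sum of n independent draws from the empirical distribution of a sample `x`, estimates the CGF as K(t) = log(mean(exp(t x))), and plugs it into the classical Lugannani-Rice tail formula. The function names (`empirical_cgf`, `emp_spa_upper_tail`), the bisection solver, and the normal fallback near the mean are all illustrative assumptions.

```python
import math


def empirical_cgf(x, t):
    """Return (K(t), K'(t), K''(t)) for the empirical distribution of x."""
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(t * xi for xi in x)
    weights = [math.exp(t * xi - shift) for xi in x]
    total = sum(weights)
    m1 = sum(w * xi for w, xi in zip(weights, x)) / total
    m2 = sum(w * xi * xi for w, xi in zip(weights, x)) / total
    k0 = shift + math.log(total / len(x))
    return k0, m1, m2 - m1 * m1


def normal_sf(z):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def emp_spa_upper_tail(x, n, s, iters=200):
    """Approximate P(S >= s), where S is a sum of n iid draws from x (assumed setup)."""
    target = s / n
    if not (min(x) < target < max(x)):
        raise ValueError("s / n must lie strictly inside the range of x")

    # K'(t) is increasing in t, so bracket the root of K'(t) = target and bisect.
    lo, hi = -1.0, 1.0
    while empirical_cgf(x, lo)[1] > target:
        lo *= 2.0
    while empirical_cgf(x, hi)[1] < target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if empirical_cgf(x, mid)[1] < target:
            lo = mid
        else:
            hi = mid
    t_hat = 0.5 * (lo + hi)

    if abs(t_hat) < 1e-6:
        # The Lugannani-Rice formula is singular at the mean; as a simplification,
        # fall back to the normal approximation there.
        _, mean, var = empirical_cgf(x, 0.0)
        return normal_sf((s - n * mean) / math.sqrt(n * var))

    k0, _, k2 = empirical_cgf(x, t_hat)
    w = math.copysign(math.sqrt(max(0.0, 2.0 * n * (t_hat * target - k0))), t_hat)
    v = t_hat * math.sqrt(n * k2)
    phi_w = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    return normal_sf(w) + phi_w * (1.0 / v - 1.0 / w)


# Usage with a small, made-up sample standing in for null residuals.
sample = [-1.3, -0.4, -0.2, 0.1, 0.3, 0.5, 1.0]
p_tail = emp_spa_upper_tail(sample, n=50, s=15.0)
```

The appeal of this construction is that only the observed sample is needed to evaluate K, K' and K'', so no analytic CGF has to be derived. How the paper builds the empirical CGF for a specific association test statistic is not described in the abstract and is not reproduced here.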
