Abstract

Exact distributions of run statistics are traditionally obtained using combinatorial methods, which, under certain situations, become very tedious. Run distributions of multiple object systems, although appearing frequently in applications from various fields, such as computational biology, are not commonly used, due in part to the lack of easy-to-use formulas. In this article, a method for evaluating partition functions of lattice models in the field of statistical mechanics is used to develop a systematic method to study various run statistics in multiple object systems. By using particular generating functions for the specified situation under study, many new distributions can be obtained in a unified and coherent way. The method makes it possible to manipulate formulas of run statistics by using binomial identities to obtain more general, yet simpler formulas. To illustrate the applications of the general method, the distributions of the total number of runs and the longest runs are investigated. Novel and general explicit formulas are derived for the distribution and moments of the total number of runs, and simple explicit formulas are derived for the distributions of the longest runs. In addition, some classical run statistics are recovered and generalized in the same unified way. As examples of applications to biological sequence analysis, the run statistics developed using the general method are applied to several protein sequences to examine their global and local features.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.