Abstract

Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed integer-valued random variables. Let $Y_{t - m + 1,t}$, for $t = m, m + 1, \ldots$, denote the moving sum of the $m$ consecutive variables $X_{t - m + 1}, \ldots, X_t$. Let $N_{m,T} = \max_{m \leq t \leq T} \{Y_{t - m + 1,t}\}$, and let $\tau_{k,m}$ be the waiting time until the moving sum of the $X_i$'s in a scanning window of $m$ trials is at least $k$. We derive tight bounds for the equivalent probabilities $P(\tau_{k,m} > T) = P(N_{m,T} < k)$. We apply the bounds to two problems in molecular biology: the distribution of the length of the longest almost-matching subsequence in aligned amino acid sequences, and the distribution of the largest net charge within any $m$ consecutive positions in a charged alphabet string.
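To make the definitions concrete, the following is a minimal Python sketch of the moving-sum maximum $N_{m,T}$ and the waiting time $\tau_{k,m}$, together with a plain Monte Carlo estimate of $P(N_{m,T} < k)$. It is an illustration only and is not the method of the paper, which derives analytic bounds. The function names and the `sample_x` callback are hypothetical, introduced here for the example.

```python
import random


def moving_sum_max(xs, m):
    """Return N_{m,T}: the largest sum of m consecutive entries of xs (T = len(xs))."""
    if m <= 0 or m > len(xs):
        raise ValueError("need 1 <= m <= len(xs)")
    window = sum(xs[:m])          # Y_{1,m}
    best = window
    for t in range(m, len(xs)):   # slide the window one position at a time
        window += xs[t] - xs[t - m]
        best = max(best, window)
    return best


def waiting_time(xs, m, k):
    """Return tau_{k,m}: the first 1-indexed t with Y_{t-m+1,t} >= k, or None if
    no window in xs reaches k (that is, tau_{k,m} > len(xs))."""
    if m <= 0 or m > len(xs):
        raise ValueError("need 1 <= m <= len(xs)")
    window = sum(xs[:m])
    if window >= k:
        return m
    for t in range(m, len(xs)):
        window += xs[t] - xs[t - m]
        if window >= k:
            return t + 1
    return None


def estimate_prob_below(sample_x, m, T, k, trials=10_000, seed=0):
    """Monte Carlo estimate of P(N_{m,T} < k) = P(tau_{k,m} > T).

    sample_x is a hypothetical callback: it takes a random.Random instance and
    returns one integer draw of X_i.
    """
    rng = random.Random(seed)
    below = 0
    for _ in range(trials):
        xs = [sample_x(rng) for _ in range(T)]
        if moving_sum_max(xs, m) < k:
            below += 1
    return below / trials
```

For example, with Bernoulli trials one could pass `lambda rng: 1 if rng.random() < 0.1 else 0` as `sample_x` and compare the simulated estimate against an analytic bound. The event that `waiting_time` returns `None` coincides with `moving_sum_max(xs, m) < k`, which mirrors the identity $P(\tau_{k,m} > T) = P(N_{m,T} < k)$.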
