Abstract

Identifying changepoints is an important problem in molecular genetics. Our motivating example comes from cancer genetics, where interest focuses on identifying regions of a chromosome with an increased likelihood of containing a tumor suppressor gene. Loss of heterozygosity (LOH) is a binary measure of allelic loss; abrupt changes in LOH frequency along the chromosome may mark the boundaries of a region containing a tumor suppressor gene. Our interest is in testing for the presence of multiple changepoints in order to identify regions of increased LOH frequency. A complicating factor is the substantial heterogeneity in LOH frequency across patients: some patients have a very high LOH frequency, while others have a low one. We develop a procedure for identifying multiple changepoints in heterogeneous binary data. We propose both approximate and full maximum-likelihood approaches and compare them with a naive approach that ignores the heterogeneity in the binary data. The methodology is used to estimate the pattern of LOH frequency on chromosome 13 in esophageal cancer patients and to isolate a region of inflated LOH frequency that may contain a tumor suppressor gene. Using simulations, we show that our approach works well and that it is robust to departures from some key modeling assumptions.
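
To make the naive comparison concrete, the following is a minimal sketch of a single-changepoint likelihood scan that pools all patients and ignores heterogeneity. It is an illustration under stated assumptions, not the procedure developed in the paper: the function names, the data layout (one list of 0/1 LOH indicators per patient, no missing values), and the restriction to one changepoint are all hypothetical choices made for the example.

```python
import math


def bernoulli_loglik(successes, trials):
    """Maximized Bernoulli log-likelihood for `successes` out of `trials`."""
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0  # the MLE sits on the boundary, so the log-likelihood is 0
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


def naive_single_changepoint(data):
    """Scan every split point for one change in pooled LOH frequency.

    `data` is a list of per-patient lists of 0/1 indicators, one per marker
    (hypothetical layout; all patients are assumed to share the same markers).
    Returns (best_split, lr_statistic): markers [0, best_split) form the left
    segment, and lr_statistic is twice the log-likelihood gain over a model
    with a single constant frequency.
    """
    n_patients = len(data)
    n_markers = len(data[0])
    # Pool across patients: this is exactly where heterogeneity is ignored.
    col_sums = [sum(row[j] for row in data) for j in range(n_markers)]
    total = sum(col_sums)
    null_ll = bernoulli_loglik(total, n_patients * n_markers)

    best_split, best_ll = None, -math.inf
    left = 0
    for k in range(1, n_markers):
        left += col_sums[k - 1]
        ll = (bernoulli_loglik(left, n_patients * k)
              + bernoulli_loglik(total - left, n_patients * (n_markers - k)))
        if ll > best_ll:
            best_split, best_ll = k, ll
    return best_split, 2.0 * (best_ll - null_ll)


# Toy input: three patients, six markers, LOH only on the last three markers.
toy = [[0, 0, 0, 1, 1, 1] for _ in range(3)]
split, lr = naive_single_changepoint(toy)
```

The sketch only locates the best split and reports the likelihood-ratio statistic. It does not assess significance, handle multiple changepoints, or model patient-level heterogeneity, which is the complication the abstract highlights.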

