Abstract

In this paper, we investigate the problem of signal motif discovery. In the formulation considered, the goal is to find an unknown pattern (or motif) that is repeated across multiple data sets. This formulation applies to various domains, such as DNA sequence alignment and matching, motif discovery in time series, and object detection and localization in images. We minimize an objective that sums, over the signal collections (or sets), a measure of the difference between a candidate instance in each collection and the unknown pattern. Additionally, a non-negative instance-dependent penalty is introduced. The proposed general objective captures well-known problems (e.g., blind joint time-delay estimation). Because the problem is non-convex and the approach often has an integer-programming flavor, a brute-force solution is non-polynomial and computationally prohibitive for large-scale problems. We propose an efficient polynomial-time (quadratic in the number of instances) bipartite-graph-based approximation to solve the problem. We provide a theoretical analysis of the proposed solution, including bounds on the gap from the optimal solution and conditions for optimality. In particular, we show that the objective value of the proposed solution is no more than twice the objective value of the optimal solution. To illustrate the merit of the proposed approach, we present qualitative and quantitative empirical analyses on several applications and compare our method to appropriate alternatives.
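
To make the kind of objective described above concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that the unknown pattern is restricted to be one of the candidate instances, that instances are fixed-length numeric tuples, and that the difference measure is Euclidean distance. The names `anchor_motif_search`, `signal_sets`, `distance`, and `penalty` are hypothetical. Each instance is tried in turn as a stand-in pattern, and the cheapest instance (distance plus penalty) is picked from every other set, so the number of distance evaluations is quadratic in the total number of instances. No approximation guarantee is claimed for this sketch.

```python
from math import dist as euclidean  # Euclidean distance between two points


def anchor_motif_search(signal_sets, distance=euclidean, penalty=lambda inst: 0.0):
    """Select one instance per set by trying every instance as the stand-in pattern.

    signal_sets: list of lists of equal-length numeric tuples (assumed layout).
    distance:    difference measure between two instances (assumed Euclidean).
    penalty:     non-negative, instance-dependent penalty (assumed zero by default).
    Returns (best_cost, best_choice), with one chosen instance per set.
    """
    best_cost, best_choice = float("inf"), None
    for i, anchor_set in enumerate(signal_sets):
        for anchor in anchor_set:
            # Start with the penalty of the anchor instance itself.
            cost = penalty(anchor)
            choice = []
            for j, candidates in enumerate(signal_sets):
                if j == i:
                    # The anchor represents its own set.
                    choice.append(anchor)
                    continue
                # In every other set, take the instance with the smallest
                # distance-to-anchor plus penalty.
                pick = min(candidates, key=lambda c: distance(anchor, c) + penalty(c))
                cost += distance(anchor, pick) + penalty(pick)
                choice.append(pick)
            if cost < best_cost:
                best_cost, best_choice = cost, choice
    return best_cost, best_choice
```

For example, calling `anchor_motif_search` on three sets of 2-D points returns the total cost together with the selected instance from each set; a custom `penalty` callable can be passed to discourage particular instances.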
