Abstract

Computer vision has advanced significantly thanks to supervised machine learning algorithms, which efficiently extract information from high-dimensional data such as images and videos. These techniques are particularly effective at recognizing the presence or absence of entities in domains where labeled data are abundant. However, supervised learning alone is not sufficient in applications that require annotating each unique entity in a crowded scene while respecting known domain-specific structures of those entities. This problem, known as data association, provides fertile ground for the application of combinatorial optimization. In this review paper, we present a unified framework based on column generation for three computer vision applications, namely multiperson tracking, multiperson pose estimation, and multicell segmentation, each of which can be formulated as a set packing problem with a massive number of variables. Column generation algorithms solve these problems without enumerating all variables explicitly. To enhance the solution process, we provide a general approach for applying subset-row inequalities to tighten the formulations, and we introduce novel dual-optimal inequalities to reduce the dual search space. The proposed algorithms and their enhancements are successfully applied to the three aforementioned problems and achieve superior performance over benchmark approaches. The common framework allows us to leverage operations research methodologies to tackle computer vision problems efficiently.
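
For orientation, the set packing formulation and the column generation pricing step mentioned above can be sketched as follows. This is a generic textbook form under stated assumptions, not necessarily the notation or objective sense used in the paper: the symbols $\mathcal{D}$ (observations such as detections), $\mathcal{G}$ (all feasible entities such as tracks, poses, or cells), $\theta_g$ (cost of entity $g$, negative when the entity is desirable), $G_{dg}$ (1 if entity $g$ contains observation $d$), $\gamma_g$ (selection variable), and $\lambda_d$ (dual variable) are all illustrative.

```latex
% Set packing: choose entities so that each observation is used at most once.
\begin{align}
  \min_{\gamma \in \{0,1\}^{|\mathcal{G}|}} \quad
    & \sum_{g \in \mathcal{G}} \theta_g \gamma_g \\
  \text{s.t.} \quad
    & \sum_{g \in \mathcal{G}} G_{dg}\, \gamma_g \le 1
      \qquad \forall d \in \mathcal{D}
\end{align}

% Column generation: solve the LP relaxation over a small subset
% \hat{\mathcal{G}} \subseteq \mathcal{G}, read off the duals
% \lambda_d \ge 0 of the packing constraints, then search for a
% column with negative reduced cost (the pricing problem):
\begin{equation}
  g^{\star} \in \arg\min_{g \in \mathcal{G}}
    \; \theta_g + \sum_{d \in \mathcal{D}} \lambda_d\, G_{dg}
\end{equation}
% If the minimum is negative, add g^{\star} to \hat{\mathcal{G}} and
% re-solve; otherwise the restricted LP solution is optimal for the
% full LP relaxation.
```

Because $\mathcal{G}$ is never enumerated, the pricing problem has to be solved by a problem-specific routine that exploits the structure of the entities; which routine applies to each of the three applications is not specified here.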
