Abstract

We study the classic NP-Hard problem of finding the maximum k-set coverage in the data stream model: given a set system of m sets that are subsets of a universe $\{1,\ldots,n\}$, find k sets that cover the largest number of distinct elements. The problem can be approximated up to a factor $1-1/e$ in polynomial time. In the streaming-set model, the sets and their elements are revealed online. The main goal of our work is to design algorithms, with approximation guarantees as close as possible to $1-1/e$, that use sublinear space $o(mn)$. Our main results are:

We also study the maximum k-vertex coverage problem in the dynamic graph stream model. In this model, the stream consists of edge insertions and deletions of a graph on N vertices. The goal is to find k vertices that cover the largest number of distinct edges.

Highlights

  • The maximum set coverage problem is a classic NP-Hard problem that has a wide range of applications including facility and sensor allocation [33], information retrieval [6], influence maximization in marketing strategy design [29], and the blog monitoring problem where we want to choose a small number of blogs that cover a wide range of topics [41].

  • In the application considered by Saha and Getoor [41], the universe corresponds to n topics of interest to a reader, each subset corresponds to a blog that covers some of these topics, and the goal is to maximize the number of topics that the reader learns about if she can only choose k blogs. It is well known that the greedy algorithm, which repeatedly picks the set that covers the largest number of uncovered elements, is a 1 − 1/e approximation.

  • We present polynomial time data stream algorithms that achieve a 1 − 1/e − ε approximation for arbitrarily small ε.

Summary

Introduction

The maximum set coverage problem is a classic NP-Hard problem that has a wide range of applications including facility and sensor allocation [33], information retrieval [6], influence maximization in marketing strategy design [29], and the blog monitoring problem where we want to choose a small number of blogs that cover a wide range of topics [41]. In the application considered by Saha and Getoor [41], the universe corresponds to n topics of interest to a reader, each subset corresponds to a blog that covers some of these topics, and the goal is to maximize the number of topics that the reader learns about if she can only choose k blogs. It is well known that the greedy algorithm, which repeatedly picks the set that covers the largest number of uncovered elements, is a 1 − 1/e approximation. The maximum vertex coverage problem is a special case of this problem in which the universe corresponds to the edges of a given graph, and each node of the graph has a corresponding set containing the edges incident to that node. For this problem, algorithms based on linear programming are known to achieve a 3/4 approximation for general graphs [1] and an 8/9 approximation for bipartite graphs [15]. Note that any algorithm for the dynamic graph stream model can be used in the streaming-set model; the streaming-set model is a special case in which there are no deletions and edges are grouped by endpoint.
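The greedy rule and the reduction from vertex coverage to set coverage described above can be illustrated with a short Python sketch. This is the textbook offline greedy, which keeps the whole input in memory; it is not one of the streaming algorithms of this work. The names `greedy_max_coverage` and `vertex_sets`, and the toy blog data, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def greedy_max_coverage(
    sets: Dict[Hashable, Set[Hashable]], k: int
) -> Tuple[List[Hashable], Set[Hashable]]:
    """Offline greedy: pick up to k sets, each time the one with the most uncovered elements."""
    covered: Set[Hashable] = set()
    chosen: List[Hashable] = []
    remaining = dict(sets)  # shallow copy so the caller's dict is not modified
    for _ in range(min(k, len(remaining))):
        # Set with the largest marginal gain; ties go to the earliest-inserted name.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining set covers anything new
        covered |= remaining.pop(best)
        chosen.append(best)
    return chosen, covered


def vertex_sets(edges: Iterable[Tuple[int, int]]) -> Dict[int, Set[Tuple[int, int]]]:
    """Vertex coverage as set coverage: one set of incident edges per node."""
    incident: Dict[int, Set[Tuple[int, int]]] = {}
    for u, v in edges:
        e = (min(u, v), max(u, v))  # canonical form of an undirected edge
        incident.setdefault(u, set()).add(e)
        incident.setdefault(v, set()).add(e)
    return incident


# Hypothetical toy instance: blogs mapped to the topics they cover.
blogs = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
chosen_blogs, topics = greedy_max_coverage(blogs, k=2)

# Path graph 1-2-3-4 viewed as a set system over its edges.
chosen_nodes, covered_edges = greedy_max_coverage(
    vertex_sets([(1, 2), (2, 3), (3, 4)]), k=2
)
```

On the toy instance, greedy first takes blog `c` (four new topics) and then `a` (three new topics). On the path graph, two vertices are enough to cover all three edges.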

Related Work
Our Contributions
Algorithms for maximum k-set coverage
Other Algorithmic Results
Budgeted Maximum Coverage
Algorithms for Maximum k-Vertex Coverage
Algorithm for Near-Regular Hypergraphs
Lower Bounds
