Abstract
Given a water distribution network, where should we place sensors to quickly detect contaminants? Or, which blogs should we read to avoid missing important stories? These seemingly different problems share common structure: outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network in order to detect the spreading of a virus or information as quickly as possible. We present a general methodology for near-optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of submodularity. We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near-optimal placements while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs. We evaluate our approach on several large real-world problems, including a model of a water distribution network from the EPA and real blog data. The obtained sensor placements are provably near optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity and generalization questions.
Highlights
Submodular functions: what do we know about optimizing submodular functions? Hill-climbing (at each step, add the sensor with the highest marginal gain) is near optimal, achieving 1-1/e (~63%) of optimal. But: 1) this only works for the unit-cost case; 2) the hill-climbing algorithm is slow: at each iteration we need to re-evaluate the marginal gains, so it scales as O(|V|B).
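The hill-climbing procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DETECTS` table (which contamination scenarios each location would detect) and the `coverage` reward are made-up toy data, and the function names are hypothetical rather than taken from the paper. The counter shows where the cost comes from: every round re-evaluates the marginal gain of every remaining node.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy data: scenarios detected by a sensor at each location.
DETECTS: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
    "e": {8},
}


def coverage(placement) -> int:
    """Reward of a placement: number of distinct scenarios detected."""
    covered: Set[int] = set()
    for node in placement:
        covered |= DETECTS[node]
    return len(covered)


def hill_climb(
    candidates, reward: Callable[[FrozenSet[str]], int], k: int
) -> Tuple[List[str], int]:
    """Add, k times, the node with the highest marginal gain (unit costs)."""
    chosen: List[str] = []
    remaining = list(candidates)
    evaluations = 0
    for _ in range(k):
        base = reward(frozenset(chosen))
        best_node, best_gain = None, float("-inf")
        for node in remaining:  # every remaining node is re-evaluated
            gain = reward(frozenset(chosen) | {node}) - base
            evaluations += 1
            if gain > best_gain:
                best_node, best_gain = node, gain
        if best_node is None:
            break
        chosen.append(best_node)
        remaining.remove(best_node)
    return chosen, evaluations


placement, n_evals = hill_climb(DETECTS, coverage, k=2)
print(placement, coverage(placement), n_evals)
```

With 5 candidate nodes and 2 sensors, the loop performs 5 + 4 = 9 marginal-gain evaluations, which is consistent with the O(|V|B) growth noted above when every sensor has unit cost.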
We want to select a set of nodes to detect the process effectively
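We must show that R is submodular: the benefit of adding a sensor to a small placement is at least the benefit of adding it to a larger placement that contains the small one (diminishing returns)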
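To make the diminishing-returns property concrete, here is a small brute-force check in Python. It is a sketch on assumed toy data, not the proof for the reward R used in the paper: `DETECTS`, `coverage`, and `is_submodular` are hypothetical names introduced for illustration. The check tests, for every pair of placements A ⊆ B and every sensor s outside B, that the gain of adding s to A is at least the gain of adding s to B.

```python
from itertools import combinations

# Hypothetical toy data: scenarios detected by a sensor at each location.
DETECTS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5},
    "d": {1, 5},
}


def coverage(placement):
    """Number of distinct scenarios detected by the placement."""
    covered = set()
    for node in placement:
        covered |= DETECTS[node]
    return len(covered)


def marginal_gain(reward, placement, sensor):
    """Increase in reward from adding one sensor to a placement."""
    return reward(set(placement) | {sensor}) - reward(set(placement))


def is_submodular(reward, ground_set):
    """Exhaustively test diminishing returns (only feasible for tiny sets)."""
    items = sorted(ground_set)
    subsets = [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]
    for small in subsets:
        for large in subsets:
            if not small <= large:
                continue
            for sensor in items:
                if sensor in large:
                    continue
                if marginal_gain(reward, small, sensor) < marginal_gain(
                    reward, large, sensor
                ):
                    return False
    return True


print(is_submodular(coverage, DETECTS))                 # coverage reward
print(is_submodular(lambda p: len(p) ** 2, DETECTS))    # a non-submodular reward
```

The coverage reward passes the check, while the squared-size reward fails it because each added sensor is worth more than the previous one.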
Summary
Submodular functions: what do we know about optimizing submodular functions? Hill-climbing (i.e., greedy: at each step, add the sensor with the highest marginal gain) is near optimal, achieving 1-1/e (~63%) of optimal. But: 1) this only works for the unit-cost case (each sensor/location costs the same); 2) the hill-climbing algorithm is slow: at each iteration we need to re-evaluate the marginal gains, so it scales as O(|V|B).
We want to select a set of nodes to detect the process effectively. Given a budget B for sensors and data on how contaminations spread over the network, select a subset of nodes A that maximizes the expected reward subject to cost(A) < B.
Theorem [Nemhauser et al. '78]: If f is monotone and submodular, k-step hill-climbing produces a set S for which f(S) is at least (1-1/e) of the optimal value over sets of size k.
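The guarantee in the theorem can be checked numerically on a tiny instance. The following Python sketch compares k-step hill-climbing with an exhaustive search; the `DETECTS` data and all function names are assumptions made up for this example, and the instance is deliberately built so that the greedy choice is not optimal.

```python
import math
from itertools import combinations

# Hypothetical toy data: scenarios detected by a sensor at each location.
# "a" looks best on its own, but "b" and "c" together detect more.
DETECTS = {
    "a": {1, 2, 3, 4},
    "b": {1, 2, 5},
    "c": {3, 4, 6},
    "d": {7},
}


def f(placement):
    """Monotone submodular reward: number of distinct scenarios detected."""
    covered = set()
    for node in placement:
        covered |= DETECTS[node]
    return len(covered)


def greedy(k):
    """k-step hill-climbing with unit costs; ties go to the first node."""
    chosen = set()
    for _ in range(k):
        best = max(
            (node for node in DETECTS if node not in chosen),
            key=lambda node: f(chosen | {node}),
        )
        chosen.add(best)
    return chosen


def brute_force(k):
    """Best placement of exactly k sensors, by trying every combination."""
    return max((set(combo) for combo in combinations(DETECTS, k)), key=f)


k = 2
greedy_value = f(greedy(k))
optimal_value = f(brute_force(k))
bound = (1 - 1 / math.e) * optimal_value
print(greedy_value, optimal_value, round(bound, 2))
```

Here hill-climbing picks "a" first and ends with a reward of 5, whereas the optimum ("b" and "c") reaches 6. The greedy result still sits well above the (1-1/e) fraction of the optimum, as the theorem requires.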