Abstract

Modern communication networks operate under high expectations for performance and resilience, mainly because of the continuous proliferation of inelastic, highly distributed applications. In this context, closely monitoring the state, behavior, and performance of networking devices and their traffic, and quickly troubleshooting problems as they arise, are essential for operating network infrastructures. In this thesis, we make several contributions, based on in-band network telemetry and data plane programmability, that advance the discipline of network monitoring and operation. We formalize telemetry orchestration problems, prove their NP-completeness, and propose polynomial-time heuristics that efficiently solve real instances of these problems. We also design a system that combines in-band telemetry and in-network computation to enable highly accurate, fine-grained detection and diagnosis of service-level objective violations. Finally, we introduce an approach that recovers from network link and node failures at data-plane timescales via policy-optimal paths, and we discuss opportunities and challenges in adapting this approach to other time-sensitive network management tasks.
