Abstract

Performance problem diagnosis is a critical part of network operations in ISPs. Service providers troubleshoot the performance of their networks using a combination of approaches: active monitoring infrastructure, data collection (SNMP, NetFlow, router logs, table dumps, etc.), and customer trouble tickets. Some of these approaches, however, do not scale to wide-area inter-domain networks because such data is unavailable; moreover, troubleshooting is either reactive (e.g., driven by customer complaints) or, typically, automated using static thresholds. In this article, we describe the design and implementation of Pythia, a system for root cause analysis and localization of performance problems in ISP networks. Our approach works with legacy monitoring infrastructure (e.g., perfSONAR deployments) and does not need specialized active probing tools or network data. Pythia provides a language for network operators to define performance problem signatures, and performs near-real-time performance diagnosis and localization. We describe our deployment of Pythia on perfSONAR monitors in production networks in Georgia, covering over 250 inter-domain paths.
