The analysis and modeling of computer network traffic is a daunting task considering the amount of available data. This is obvious in the spatial dimension of the problem, since the number of interacting computers, gateways and switches can easily reach several thousand, even in a local area network (LAN) setting. It is also true in the time dimension: Willinger and Paxson (see Ann. Statist., vol.25, no.5, p.1856-66, 1997) cite the figures of 439 million packets and 89 gigabytes of data for a single week's record of the activity of a university gateway in 1995. The complexity of the problem increases further when considering wide area network (WAN) data. In light of the above, an important notion for modern network engineering is that of invariants, i.e., characteristics that are observed with some reproducibility and independently of the precise settings of the network under consideration. In this tutorial article, we focus on two such invariants related to the time dimension of the problem, namely, long-range dependence, or self-similarity, and heavy-tail marginal distributions.