Telecommunications service providers (SPs) should set survivability expectations by guaranteeing a maximum allowed system downtime for service-level agreement (SLA)-differentiated services. SPs should also use network resources effectively, given bounded network capacity and the growth of future data traffic. To improve the availability of differentiated services and achieve high resource efficiency, we present a novel restoration scheme that jointly considers the accumulated downtime and the SLA requirements of faulty connections. While most related work has focused on providing statistical availability guarantees at the time a connection is provisioned, our approach recognizes that, once a connection has been in existence for some time, it may be “ahead of” (or “behind”) its performance guarantee depending on the network outages it has experienced, so the resources allocated to it may be revised judiciously. When a link failure occurs, two sets of faulty connections are examined: (a) connections whose primary or restoration path is disrupted by the failure and (b) connections that are in the “down” state due to previous failures that have not yet been repaired. An affected connection is switched to its pre-computed restoration path (or, if necessary, an alternate one) when its accumulated downtime plus the link repair time would exceed its SLA requirement. The scheme thus provides differentiated restoration to existing connections upon a link failure in order to satisfy their availability requirements. We also propose an upgraded version of the scheme that incorporates both excess capacity and resource preemption. Given the network capacities and the current network state, including routing information for all existing connections, a faulty connection is restored to its restoration path as long as there is enough excess capacity along the path.
Otherwise, when protection switching of a high-SLA connection fails because of limited bandwidth on one or more links, the connection preempts restoration capacity on each link from a low-SLA connection, provided that both disrupted connections share the same restoration capacity and the availability requirement of the low-SLA connection is not violated. Finally, we report simulation results for a large carrier-scale network to show the computational performance of the proposed algorithm. The results demonstrate that the algorithm achieves a high availability satisfaction rate and good resource utilization, and greatly reduces protection-switching overhead.
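The decision rules described above can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper. The `Connection` fields, the function names, the use of hours as the time unit, and the reading of "high-SLA" as "smaller downtime budget" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Connection:
    # All fields are hypothetical; the abstract does not define a data model.
    name: str
    allowed_downtime: float      # SLA downtime budget (hours, assumed unit)
    accumulated_downtime: float  # downtime experienced so far (hours)
    bandwidth: float             # capacity needed on every link of a path


def needs_restoration(conn: Connection, repair_time: float) -> bool:
    """Switch only if waiting for the repair would break the SLA budget."""
    return conn.accumulated_downtime + repair_time > conn.allowed_downtime


def try_restore(conn: Connection, path: List[str],
                excess: Dict[str, float]) -> bool:
    """Reserve excess capacity along the restoration path, if every link has enough."""
    if any(excess[link] < conn.bandwidth for link in path):
        return False  # at least one link lacks capacity; nothing is reserved
    for link in path:
        excess[link] -= conn.bandwidth
    return True


def may_preempt(high: Connection, low: Connection, repair_time: float,
                share_capacity: bool) -> bool:
    """Preemption test: shared restoration capacity, stricter SLA on the
    preempting side, and the preempted connection can still meet its SLA."""
    return (share_capacity
            and high.allowed_downtime < low.allowed_downtime
            and not needs_restoration(low, repair_time))


if __name__ == "__main__":
    gold = Connection("gold", allowed_downtime=1.0,
                      accumulated_downtime=0.8, bandwidth=1.0)
    bronze = Connection("bronze", allowed_downtime=10.0,
                        accumulated_downtime=2.0, bandwidth=1.0)
    excess = {"a-b": 1.0, "b-c": 2.0}
    repair_time = 4.0

    for conn in (gold, bronze):
        if needs_restoration(conn, repair_time):
            restored = try_restore(conn, ["a-b", "b-c"], excess)
            print(conn.name, "restored" if restored else "blocked")
        else:
            print(conn.name, "can wait for the repair")
```

In this sketch a connection with enough remaining downtime budget is simply left down until the link is repaired, which is how the scheme avoids unnecessary protection switching.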