Abstract

We explain how a natural disaster caused network failures, describe the restoration steps that were taken, and present lessons learned from the recovery. At 21:26 on December 26, 2006 (UTC+9), a serious undersea earthquake measuring 7.1 on the Richter scale struck off the coast of Taiwan. The earthquake caused significant damage to submarine cable systems, and the resulting fiber cable failures shut down communications for several countries on the Asia Pacific networks. In the first post-earthquake recovery step, BGP routers detoured traffic along redundant backup paths, which provided poor-quality connections. Subsequently, operators engineered traffic to improve the quality of the recovered communication. To avoid filling narrow-bandwidth links with detoured traffic, the operators had to change the BGP routing policy. Despite this routing-level first aid, a few institutions could not connect directly to the research and education (R&E) network community because they had only a single link to the network. For these single-link networks, the commodity link was temporarily used for connectivity. Then, cable connection configurations at the switches were changed to provide high bandwidth and next-generation Internet service. From the whole restoration procedure, we learned two lessons. First, redundant BGP routing information is useful for recovering connectivity but does not provide the bandwidth needed for the re-routed traffic load. Second, collaboration between operators is valuable in solving traffic engineering issues such as poor-quality re-routing and lost connections of single-link networks.

Highlights

  • As the Internet grows, networks become larger and more complex, and the number of components, such as routers, switches, and fiber cables, increases

  • Before the network failures caused by the earthquake, Asia Pacific Research and Education (R&E) network operators considered the removal of useless routes urgent, because routing had become too complicated after Trans-Eurasian Information Network 2 (TEIN2) started

  • The network failures caused by the 2006 earthquake showed that many challenges remain in fault-tolerant network management research

Summary

Introduction

As the Internet grows, networks become larger and more complex, and the number of components, such as routers, switches, and fiber cables, increases. The operators changed the BGP routing policy related to the congested ASes. In spite of the routing-level restoration, a few institutions were still not directly connected to the R&E network community because they had only a single link to the network. The fiber break caused by the Taiwan earthquake raised restoration issues related to BGP rerouting. In such an emergency, the backup routes should be chosen based on available bandwidth and RTT. Since the fiber break required an urgent network recovery process, network operators configured re-routing based on their experience with bandwidth and RTT. From this experience, we have learned that redundant physical backup links and routes are important for providing bandwidth and connectivity, and that the Quality-of-Service (QoS) after recovery is important.
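The idea of choosing backup routes by available bandwidth and RTT can be illustrated with a minimal sketch. It is not the procedure the operators used, which relied on their own experience. The `BackupRoute` type, the `pick_backup_route` function, the path names, and all numbers are hypothetical and chosen only for illustration. The sketch assumes one simple rule: among routes with enough spare bandwidth for the detoured load, take the one with the lowest RTT; if none can carry the load, fall back to the route with the most spare bandwidth.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BackupRoute:
    """Hypothetical description of one candidate backup path."""
    name: str
    available_mbps: float  # spare bandwidth on the narrowest link of the path
    rtt_ms: float          # measured round-trip time over the path


def pick_backup_route(routes: Sequence[BackupRoute], demand_mbps: float) -> BackupRoute:
    """Pick a backup route for a detoured load of demand_mbps.

    Assumed rule (illustrative only): prefer routes that can carry the
    whole load and take the lowest RTT among them; otherwise take the
    route with the most spare bandwidth.
    """
    if not routes:
        raise ValueError("no backup routes available")

    feasible = [r for r in routes if r.available_mbps >= demand_mbps]
    if feasible:
        # Lowest RTT wins; ties are broken by larger spare bandwidth.
        return min(feasible, key=lambda r: (r.rtt_ms, -r.available_mbps))

    # No route can carry the full load: limit congestion as far as possible.
    return max(routes, key=lambda r: (r.available_mbps, -r.rtt_ms))


# Made-up candidate paths for illustration.
candidates = [
    BackupRoute("path-a", available_mbps=45.0, rtt_ms=120.0),
    BackupRoute("path-b", available_mbps=622.0, rtt_ms=310.0),
    BackupRoute("path-c", available_mbps=155.0, rtt_ms=180.0),
]

chosen = pick_backup_route(candidates, demand_mbps=100.0)
print(chosen.name)
```

With these made-up values, the shortest-RTT path is skipped because it cannot carry the 100 Mbps load, which mirrors the point that restored connectivity alone does not guarantee usable bandwidth.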

Research and education network activities in the Asia Pacific area
Background of R&E network
Network failures caused by Taiwan earthquake
C: Commodity link connection
Network restoration methods
Lessons
Summary
References