Abstract

Many geo-distributed services at web-scale companies still rely on databases (DBs) primarily optimized for single-site performance. At AT&T, this is exemplified by services in the network control plane that rely on third-party software built on DBs such as MariaDB and PostgreSQL, which do not provide strict serializability across sites without a significant performance impact. Moreover, it is often impractical for these services to rewrite their code to use newer DBs optimized for geo-distribution. In this paper, Metric is presented: a novel drop-in solution for DB clustering across sites that services can use without changing a single line of code. Metric leverages the single-site performance of an existing service's DB and combines it with a cross-site clustering solution based on an entry-consistent redo log that is specifically tailored for geo-distribution. Detailed correctness arguments are presented, and extensive evaluations with various benchmarks show that Metric outperforms other solutions for the access patterns in our production use cases, where service replicas access different tables on different sites. In particular, Metric achieves up to 56% lower latency and 5.2x higher throughput than MariaDB and PostgreSQL clustering, and up to 90% lower latency and 26x higher throughput than CockroachDB and TiDB, systems that are designed to support geo-distribution.
