Abstract
As distributed databases grow in popularity, there is ever-growing research into new database architectures designed from the start with built-in self-tuning and self-healing features. In real-world deployments, however, migrating to these entirely new systems is impractical, and the challenge is to keep massive fleets of existing databases available under constant software and hardware change. Apache Cassandra is one such existing database: it helped popularize scale-out distributed databases and runs some of the largest deployments of any open-source distributed database. In this paper, we demonstrate the techniques needed to transform the typical, highly manual Apache Cassandra deployment into a self-healing system. We start by composing specialized agents to surface the signals a self-healing deployment needs and to execute local actions. Then we show how to combine the signals from these agents into the cluster-level control planes required to safely iterate on and evolve existing deployments without compromising database availability. Finally, we show how to create simulated models of the database’s behavior, allowing rapid iteration with minimal risk. With these systems in place, it is possible to create a truly self-healing database system within existing large-scale Apache Cassandra deployments.