Rethinking Distributed Systems for the Datacenter
Today's most popular applications are deployed as massive-scale distributed systems in the datacenter. Keeping data consistent and available despite server failures and concurrent updates is a formidable challenge. Two well-known abstractions, strongly consistent replication and serializable transactions, can free developers from these challenges by transparently masking failures and treating complex updates as atomic units. Yet the conventional wisdom is that these techniques are too expensive to deploy in high-performance systems.
I will demonstrate a new approach to designing distributed systems that allows strongly consistent distributed systems to be built with little to no performance cost. Taking advantage of the properties and capabilities of the datacenter environment, we can co-design distributed protocols and the network layer. Specifically, I will describe two systems for state machine replication, Speculative Paxos and Network-Ordered Paxos, and one for distributed transaction processing, Eris, built using this approach. They are able to achieve 5-17x performance improvements over conventional designs. Moreover, they achieve performance within 2% of their weakly consistent alternatives, demonstrating that strong consistency and high performance are not incompatible.
Dan Ports is Research Assistant Professor in Computer Science and Engineering at the University of Washington, where he leads the distributed systems research group. His group's research focuses on building practical distributed systems with strong theoretical underpinnings. Prior to joining the faculty at UW in 2015, Dan received the Ph.D. from MIT (2012), where he was advised by Barbara Liskov, and completed a postdoc at UW CSE. His research has been recognized with best paper awards at NSDI and OSDI.