Building highly-available and fault-tolerant distributed systems is hard enough, but making them elastic is even harder. Elastic distributed systems enable operators to reduce costs (by deallocating resources when they are unnecessary) and increase overall throughput (by reallocating resources where they can be used more effectively).
In this talk Ben introduces elasticity with concrete examples from parallel computing before exploring some of the fundamentals of building elastic distributed systems, specifically addressing software architectural patterns that promote versus deter elasticity. In addition to what it takes to build elastic distributed systems I’ll discuss the compute infrastructure necessary for supporting them, drawing on some of the primitives provided by Apache Mesos for illustration.