An open source cluster management framework designed to automate the provisioning, management, and scaling of distributed systems.

Its primary responsibilities are:

  1. Resource management
  2. Fault tolerance
  3. Scalability
  4. Lifecycle management

It automates failure detection, leadership election, and resource partition management.
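
To make these three mechanisms concrete, here is a minimal in-memory sketch in Python. It is not the framework's API: the `ToyCluster` class, its method names, the heartbeat timeout, and the "lowest live node name wins" election rule are all assumptions chosen for illustration. A real cluster manager would typically rely on a coordination service and a more careful rebalancing strategy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyCluster:
    """Hypothetical model for illustration only; not the framework's real API."""

    num_partitions: int
    heartbeat_timeout: float = 5.0  # assumed value, in seconds
    last_heartbeat: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        # Record the most recent time a node reported in.
        self.last_heartbeat[node] = now

    def live_nodes(self, now: float) -> List[str]:
        # Failure detection: a node is live if its last heartbeat is recent enough.
        return sorted(
            node
            for node, seen in self.last_heartbeat.items()
            if now - seen <= self.heartbeat_timeout
        )

    def leader(self, now: float) -> Optional[str]:
        # Leader election (assumed rule): the live node with the lowest name.
        live = self.live_nodes(now)
        return live[0] if live else None

    def assignment(self, now: float) -> Dict[int, str]:
        # Partition management: spread partitions round-robin over live nodes.
        live = self.live_nodes(now)
        if not live:
            return {}
        return {p: live[p % len(live)] for p in range(self.num_partitions)}


cluster = ToyCluster(num_partitions=4)
for name in ("node-a", "node-b", "node-c"):
    cluster.heartbeat(name, now=0.0)

# node-a stops sending heartbeats; the other two keep reporting.
cluster.heartbeat("node-b", now=10.0)
cluster.heartbeat("node-c", now=10.0)

live = cluster.live_nodes(now=12.0)      # node-a has timed out
leader = cluster.leader(now=12.0)        # leadership moves to node-b
partitions = cluster.assignment(now=12.0)  # partitions are spread over survivors
```

Deriving the leader and the partition map from the current set of live nodes means that a single missed heartbeat window is enough to trigger both re-election and reassignment, which is the kind of bookkeeping the framework is meant to take off the application's hands.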