Kubernetes Deployments vs StatefulSets


Kubernetes Problem Overview


I've been doing a lot of digging on Kubernetes, and I like what I see! One thing I haven't been able to get a clear idea about is the exact distinction between the Deployment and StatefulSet resources, and in which scenarios you would use each (or whether one is generally preferred over the other).

Kubernetes Solutions


Solution 1 - Kubernetes

Deployments and ReplicationControllers are meant for stateless usage and are rather lightweight. StatefulSets are used when state has to be persisted. For that reason, StatefulSets use volumeClaimTemplates (claims on persistent volumes) so that state is kept across component restarts.

So if your application is stateful, or if you want to deploy stateful storage on top of Kubernetes, use a StatefulSet.

If your application is stateless, or if its state can be rebuilt from backend systems during startup, use a Deployment.
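
As a rough illustration of the stateless case, a minimal Deployment manifest might look like the sketch below. The name `web`, the `app: web` label, and the `nginx:1.25` image are assumptions chosen for the example, not part of the original answer.

```yaml
# Sketch of a stateless Deployment; name, label, and image are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # three interchangeable pods
  selector:
    matchLabels:
      app: web                # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
```

Because no volume is declared, any pod can be deleted and replaced without losing anything the application depends on.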

Further details about running stateful applications can be found in the 2016 Kubernetes blog entry about stateful applications.

Solution 2 - Kubernetes

  • Deployment - You specify a PersistentVolumeClaim that is shared by all pod replicas. In other words, a shared volume.

    The backing storage must support the ReadWriteMany or ReadOnlyMany access mode if the replica pods can be scheduled on more than one node.

  • StatefulSet - You specify volumeClaimTemplates so that each replica pod gets a unique PersistentVolumeClaim associated with it. In other words, no shared volume.

    Here, the backing storage can have ReadWriteOnce accessMode.

    StatefulSet is useful for running clustered systems, e.g. a Hadoop cluster or a MySQL cluster, where each node has its own storage.
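
The shared-volume Deployment case from the first bullet can be sketched roughly as follows. The names `shared-data` and `reader`, the `busybox:1.36` image, and the 1Gi size are hypothetical, and the sketch assumes the cluster has a storage class that can actually provision ReadWriteMany volumes.

```yaml
# One claim, referenced by every replica of the Deployment (shared volume).
# All names and sizes here are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany           # needed when replicas land on different nodes
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reader
spec:
  replicas: 2
  selector:
    matchLabels:
      app: reader
  template:
    metadata:
      labels:
        app: reader
    spec:
      containers:
        - name: app
          image: busybox:1.36
          command: ["sh", "-c", "ls /data && sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: shared-data   # the same claim for all replicas
```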

Solution 3 - Kubernetes

TL;DR

A Deployment is a resource for deploying a stateless application. If it uses a PVC, all replicas share the same volume, and none of them has its own state.

[Figure: Persistence in a Deployment with replicas]

A StatefulSet is used for stateful applications: each replica of the pod has its own state and uses its own volume.

[Figure: Persistence in a StatefulSet with replicas]

A DaemonSet is a controller, similar to a ReplicaSet, that ensures a copy of the pod runs on all (or a selected subset of) the nodes of the cluster. If a node is added to or removed from the cluster, the DaemonSet automatically adds or deletes the pod on that node.

[Figure: Persistence in a DaemonSet]
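
For comparison, a minimal DaemonSet might look like the sketch below. The name `node-agent`, the `busybox:1.36` image, and the `/var/log` host path are assumptions made for illustration. Note that there is no `replicas` field: the number of pods follows the number of eligible nodes.

```yaml
# Sketch of a DaemonSet that runs one pod per eligible node.
# Name, image, and host path are illustrative assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
        - name: agent
          image: busybox:1.36
          command: ["sh", "-c", "tail -f /dev/null"]
          volumeMounts:
            - name: host-logs
              mountPath: /host/var/log
              readOnly: true
      volumes:
        - name: host-logs
          hostPath:
            path: /var/log    # node-local data, different on every node
```

Nodes with taints that the pod does not tolerate (often the control-plane nodes) will not receive a pod unless matching tolerations are added.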

I have written in more detail about the differences between Deployments, StatefulSets, and DaemonSets, and about how to deploy a sample application using these resources, in "K8s: Deployments vs StatefulSets vs DaemonSets".

Solution 4 - Kubernetes

StatefulSet

Use a StatefulSet for stateful distributed applications that require each node to have persistent state. A StatefulSet lets you configure an arbitrary number of nodes for a stateful application or component through configuration (replicas = N).

There are two kinds of stateful distributed applications: master-master and master-slave. All nodes in a master-master configuration and the slave nodes in a master-slave configuration can make use of a StatefulSet.

Examples:

  • Master-slave: DataNodes (slaves) in a Hadoop cluster
  • Master-master: database nodes in a Cassandra cluster

Each pod (replica/node) in a StatefulSet has a unique and stable network identity. For example, in a Cassandra StatefulSet named 'cassandra' with N replica nodes, each Cassandra pod (node) has:

  • An ordinal index: 0, 1, ..., N-1
  • A stable network ID: cassandra-0, cassandra-1, ..., cassandra-(N-1)
  • A separate persistent volume, created from a volume claim template, i.e. separate storage for every pod (node)
  • An ordered lifecycle: by default, pods are created in the order 0 to N-1 and terminated in the reverse order, N-1 to 0

Refer: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
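
The Cassandra example above could be sketched as the following pair of manifests. This is a minimal illustration, not a production Cassandra setup: the `cassandra:4.1` image, the 10Gi size, the port name, and the replica count are assumptions.

```yaml
# Headless Service: gives each pod a stable DNS name such as
# cassandra-0.cassandra.<namespace>.svc.cluster.local (default cluster domain).
apiVersion: v1
kind: Service
metadata:
  name: cassandra
spec:
  clusterIP: None             # "headless": no virtual IP, per-pod DNS records
  selector:
    app: cassandra
  ports:
    - name: cql
      port: 9042
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra      # the headless Service this set belongs to
  replicas: 3                 # pods cassandra-0, cassandra-1, cassandra-2
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:4.1
          ports:
            - name: cql
              containerPort: 9042
          volumeMounts:
            - name: data
              mountPath: /var/lib/cassandra
  volumeClaimTemplates:       # one PVC per pod: data-cassandra-0, data-cassandra-1, ...
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```

If ordered startup is not needed, setting `podManagementPolicy: Parallel` in the StatefulSet spec relaxes the default one-at-a-time ordering.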

Deployment

A Deployment, on the other hand, is suitable for stateless applications and services whose nodes (pods) do not require any special identity. A load balancer can route a request to any of them, because all are equal. A Deployment is useful for creating any number of interchangeable nodes through configuration (replicas = N).
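
The load-balancing point can be illustrated with an ordinary (non-headless) Service placed in front of a Deployment. The sketch assumes the Deployment's pods carry the hypothetical label `app: web` and listen on port 80.

```yaml
# Sketch of a Service that spreads traffic across all matching pods.
# The name, label, and ports are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web                  # any pod with this label may receive a request
  ports:
    - port: 80
      targetPort: 80
```

Clients address the Service name rather than individual pods, which is why the pods need no stable identity of their own.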

Solution 5 - Kubernetes

The difference between a StatefulSet and a Deployment

A StatefulSet can be thought of as a special kind of Deployment with the following properties:

  • Each pod in a StatefulSet has a stable, unique network identifier that can be used to discover the other members of the cluster. If the StatefulSet is named kafka, the first pod is called kafka-0, the second kafka-1, and so on.
  • The start and stop order of the pod replicas is controlled. When the nth pod is started, the preceding n-1 pods are already running and ready.
  • Each pod uses stable persistent storage, implemented through PersistentVolumes (PV) and PersistentVolumeClaims (PVC). When a pod is deleted, the storage associated with the StatefulSet is not deleted by default, to keep the data safe.
  • The PV volumes bound to the StatefulSet store the pod state data. A StatefulSet is also used together with a headless Service, and its spec declares which headless Service it belongs to.
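
The default "keep the volumes" behaviour can be written out explicitly in the StatefulSet spec, assuming a Kubernetes version recent enough to support the `persistentVolumeClaimRetentionPolicy` field (check the documentation for your cluster version). The fragment below is a sketch and omits the other required fields.

```yaml
# Fragment only: shows the retention policy with its default values.
# The StatefulSet name is illustrative.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain       # keep PVCs when the StatefulSet is deleted
    whenScaled: Retain        # keep PVCs when replicas are scaled down
  # serviceName, replicas, selector, template, and volumeClaimTemplates
  # are omitted here for brevity.
```

Setting either value to `Delete` instead asks Kubernetes to remove the corresponding claims automatically.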

Solution 6 - Kubernetes

Comparing StatefulSets with ReplicaSets

Feature: State

  • StatefulSet: Stateful
  • Deployment: Stateless

Feature: Definition

  • StatefulSet: Stateful applications typically involve some database, such as Cassandra, MongoDB, or MySQL, and process reads and/or writes to it.
  • Deployment: Usually, frontend components have completely different scaling requirements than the backends, so we tend to scale them individually. Backends such as databases are also usually much harder to scale than (stateless) frontend web servers. The term “stateless” means that no past data or state is stored or needs to persist when a new container is created.

Feature: Behaviour

  • StatefulSet: When a stateful pod instance dies (or the node it’s running on fails), the pod instance needs to be resurrected on another node, and the new instance gets the same name, network identity, and state as the one it’s replacing.
  • Deployment: Pod replicas managed by a Deployment are mostly stateless, so they can be replaced with a completely new pod replica at any time.

Feature: Pod Mechanism

  • StatefulSet: Pods created by the StatefulSet aren’t exact replicas of each other. Each can have its own set of volumes (in other words, storage and thus persistent state), which differentiates it from its peers.
  • Deployment: When a Deployment replaces a pod, the new pod is a completely new pod with a new hostname and IP.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | SS781 | View Question on Stackoverflow
Solution 1 - Kubernetes | pagid | View Answer on Stackoverflow
Solution 2 - Kubernetes | Emmanuel Osimosu | View Answer on Stackoverflow
Solution 3 - Kubernetes | Ali Kahoot | View Answer on Stackoverflow
Solution 4 - Kubernetes | Anurag | View Answer on Stackoverflow
Solution 5 - Kubernetes | Adler.Liu | View Answer on Stackoverflow
Solution 6 - Kubernetes | Gupta | View Answer on Stackoverflow