Kubernetes Pod Warning: 1 node(s) had volume node affinity conflict

Docker · Kubernetes · Persistent Volumes

Docker Problem Overview


I am trying to set up a Kubernetes cluster. I have a Persistent Volume, a Persistent Volume Claim and a Storage Class all set up and running, but when I want to create a pod from a deployment, the pod is created but hangs in the Pending state. After describe I get only this warning: "1 node(s) had volume node affinity conflict." Can somebody tell me what I am missing in my volume configuration?

apiVersion: v1
kind: PersistentVolume
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: mariadb-pv0
  name: mariadb-pv0
spec:
  volumeMode: Filesystem
  storageClassName: local-storage
  local:
    path: "/home/gtcontainer/applications/data/db/mariadb"
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  claimRef:
    namespace: default
    name: mariadb-claim0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/cvl-gtv-42.corp.globaltelemetrics.eu
            operator: In
            values:
            - master

status: {}

Docker Solutions


Solution 1 - Docker

The error "volume node affinity conflict" happens when the persistent volume claims that the pod is using are scheduled on different zones, rather than on one zone, and so the actual pod was not able to be scheduled because it cannot connect to the volume from another zone. To check this, you can see the details of all the Persistent Volumes. To check that, first get your PVCs:

$ kubectl get pvc -n <namespace>

Then get the details of the Persistent Volumes (not the Volume Claims):

$  kubectl get pv

Find the PVs that correspond to your PVCs and describe them:

$  kubectl describe pv <pv1> <pv2>

Check the Source.VolumeID for each of the PVs; most likely they will be in different availability zones, which is why your pod gets the affinity error. To fix this, create a StorageClass for a single zone and use that StorageClass in your PVC.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: region1storageclass
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true" # if encryption required
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - eu-west-2b # this is the availability zone, will depend on your cloud provider
    # multi-az can be added, but that defeats the purpose in our scenario
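
As a sketch of the "use that StorageClass in your PVC" part, a claim referencing the single-zone StorageClass above could look like the following (the claim name and requested size are just placeholders for your own workload):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data                      # hypothetical name, use your own claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: region1storageclass  # the StorageClass defined above
  resources:
    requests:
      storage: 2Gi                       # placeholder size

With volumeBindingMode: WaitForFirstConsumer the volume is only provisioned once the pod is scheduled, so it ends up in the same zone as the node that runs the pod.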

Solution 2 - Docker

There a few things that can cause this error:

  1. The node isn't labeled properly. I had this issue on AWS when my worker node didn't have the appropriate labels (the master had them though), like these:

    failure-domain.beta.kubernetes.io/region=us-east-2

    failure-domain.beta.kubernetes.io/zone=us-east-2c

After patching the node with those labels, the "1 node(s) had volume node affinity conflict" error was gone and the PV, PVC and pod were deployed successfully. The value of these labels is cloud-provider specific. Basically, it is the job of the cloud provider (with the --cloud-provider option defined in kube-controller-manager, kube-apiserver and kubelet) to set those labels. If the appropriate labels aren't set, check that your cloud provider integration is correct. I used kubeadm, so it is cumbersome to set up, but with other tools, kops for instance, it works right away. An example kubectl label command is shown after this list.

  2. Based on your PV definition and the usage of the nodeAffinity field, you are trying to use a local volume (read the local volume description here, official docs). In that case, make sure that you set the nodeAffinity field like this (it worked in my case on AWS):

    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - my-node  # it must be the name of your node (kubectl get nodes)

So that after creating the resource and running describe on it, it will show up like this:

         Required Terms:  
                    Term 0:  kubernetes.io/hostname in [your node name]

  3. The StorageClass definition (named local-storage, which is not posted here) must be created with volumeBindingMode set to WaitForFirstConsumer for local storage to work properly. Refer to the storage class local description example in the official docs to understand the reason behind that; a minimal sketch of such a StorageClass follows below.
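
As a minimal sketch for point 3 (assuming the class really is named local-storage, as referenced by the question's PV), such a StorageClass could look like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner  # local volumes have no dynamic provisioner
volumeBindingMode: WaitForFirstConsumer

And for point 1, a missing zone/region label can be added to a node by hand (node name and values are placeholders):

$ kubectl label nodes <your-node-name> failure-domain.beta.kubernetes.io/zone=us-east-2c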

Solution 3 - Docker

0. If you didn't find the solution in other answers...

In our case the error happened on an AWS EKS cluster freshly provisioned with Pulumi (see the full source here). The error drove me nuts, since I didn't change anything, I just created a PersistentVolumeClaim as described in the Buildpacks Tekton docs:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: buildpacks-source-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

I didn't change anything else from the default EKS configuration and also didn't add/change any PersistentVolume or StorageClass (in fact I didn't even know how to do that). As the default EKS setup seems to rely on 2 nodes, I got the error:

0/2 nodes are available: 2 node(s) had volume node affinity conflict.

Reading through Sownak Roy's answer I got a first clue what to do, but I didn't know how to do it. So for the folks interested, here are all my steps to resolve the error:

1. Check EKS nodes failure-domain.beta.kubernetes.io labels

As described in the section Stateful applications in this post, the two nodes are provisioned in different AWS availability zones than the persistent volume (PV) that is created by applying our PersistentVolumeClaim described above.

To check that, you need to look into / describe your nodes. First list them with kubectl get nodes:

$ kubectl get nodes
NAME                                             STATUS   ROLES    AGE     VERSION
ip-172-31-10-186.eu-central-1.compute.internal   Ready    <none>   2d16h   v1.21.5-eks-bc4871b
ip-172-31-20-83.eu-central-1.compute.internal    Ready    <none>   2d16h   v1.21.5-eks-bc4871b

and then have a look at the Label section using kubectl describe node <node-name>:

$ kubectl describe node ip-172-77-88-99.eu-central-1.compute.internal
Name:               ip-172-77-88-99.eu-central-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=t2.medium
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eu-central-1
                    failure-domain.beta.kubernetes.io/zone=eu-central-1b
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-172-77-88-99.eu-central-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=t2.medium
                    topology.kubernetes.io/region=eu-central-1
                    topology.kubernetes.io/zone=eu-central-1b
Annotations:        node.alpha.kubernetes.io/ttl: 0
...

In my case the node ip-172-77-88-99.eu-central-1.compute.internal has failure-domain.beta.kubernetes.io/region defined as eu-central-1 and the AZ label failure-domain.beta.kubernetes.io/zone set to eu-central-1b.

And the other node defines failure-domain.beta.kubernetes.io/zone as eu-central-1a:

$ kubectl describe nodes ip-172-31-10-186.eu-central-1.compute.internal
Name:               ip-172-31-10-186.eu-central-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=t2.medium
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eu-central-1
                    failure-domain.beta.kubernetes.io/zone=eu-central-1a
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-172-31-10-186.eu-central-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=t2.medium
                    topology.kubernetes.io/region=eu-central-1
                    topology.kubernetes.io/zone=eu-central-1a
Annotations:        node.alpha.kubernetes.io/ttl: 0
...
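
A shorter way to compare the zones of all nodes at a glance (the -L flag just adds label columns to the kubectl get output):

$ kubectl get nodes -L failure-domain.beta.kubernetes.io/zone -L topology.kubernetes.io/zone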

2. Check the PersistentVolume's topology.kubernetes.io labels

Now we should check the PersistentVolume automatically provisioned after we manually applied our PersistentVolumeClaim. Use kubectl get pv:

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                           STORAGECLASS   REASON   AGE
pvc-93650993-6154-4bd0-bd1c-6260e7df49d3   1Gi        RWO            Delete           Bound    default/buildpacks-source-pvc   gp2                     21d

followed by kubectl describe pv <pv-name>

$ kubectl describe pv pvc-93650993-6154-4bd0-bd1c-6260e7df49d3
Name:              pvc-93650993-6154-4bd0-bd1c-6260e7df49d3
Labels:            topology.kubernetes.io/region=eu-central-1
                   topology.kubernetes.io/zone=eu-central-1c
Annotations:       kubernetes.io/createdby: aws-ebs-dynamic-provisioner
...

The PersistentVolume was configured with the label topology.kubernetes.io/zone set to the AZ eu-central-1c, which makes our Pods complain about not finding their volume, since they are in a completely different AZ!

3. Add allowedTopologies to StorageClass

As stated in the Kubernetes docs, one solution to the problem is to add an allowedTopologies configuration to the StorageClass. If you have already provisioned an EKS cluster like me, you need to retrieve your already defined StorageClass with

kubectl get storageclasses gp2 -o yaml

Save it to a file called storage-class.yml and add an allowedTopologies section that matches your nodes' failure-domain.beta.kubernetes.io labels like this:

allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - eu-central-1a
    - eu-central-1b

The allowedTopologies configuration defines that the failure-domain.beta.kubernetes.io/zone of the PersistentVolume must be either eu-central-1a or eu-central-1b, not eu-central-1c!

The full storage-class.yml looks like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
parameters:
  fsType: ext4
  type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - eu-central-1a
    - eu-central-1b

Apply the enhanced StorageClass configuration to your EKS cluster with

kubectl apply -f storage-class.yml

4. Delete PersistentVolumeClaim, add storageClassName: gp2 to it and re-apply it

In order to get things working again, we need to delete the PersistentVolumeClaim first.

To map the PersistentVolumeClaim to our previously defined StorageClass, we need to add storageClassName: gp2 to the PersistentVolumeClaim definition in our pvc.yml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: buildpacks-source-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: gp2

Finally re-apply the PersistentVolumeClaim with kubectl apply -f pvc.yml. This should resolve the error.
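
To double-check the fix (an optional sanity check, not part of the original steps), you can verify that the freshly provisioned PV now carries one of the allowed zones:

$ kubectl get pv
$ kubectl describe pv <new-pv-name> | grep topology.kubernetes.io/zone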

Solution 4 - Docker

The "1 node(s) had volume node affinity conflict" error is created by the scheduler because it can't schedule your pod to a node that conforms with the persistenvolume.spec.nodeAffinity field in your PersistentVolume (PV).

In other words, you say in your PV that a pod using this PV must be scheduled to a node with a label of kubernetes.io/cvl-gtv-42.corp.globaltelemetrics.eu = master, but this isn't possible for some reason.

There may be various reasons that your pod can't be scheduled to such a node:

  • The pod has node affinities, pod affinities, etc. that conflict with the target node
  • The target node is tainted
  • The target node has reached its "max pods per node" limit
  • There exists no node with the given label

The place to start looking for the cause is the definition of the node and the pod.
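
A few commands that usually help to narrow this down (pod, node and PV names are placeholders):

$ kubectl describe pv <pv-name>                     # check the Node Affinity section
$ kubectl get nodes --show-labels                   # does any node carry the expected label?
$ kubectl describe node <node-name> | grep Taints   # is the matching node tainted?
$ kubectl describe pod <pod-name>                   # the scheduler events list the exact reasons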

Solution 5 - Docker

Great answer by Sownak Roy. I've had the same case of a PV being created in a different zone than the node that was supposed to use it. The solution I applied was based on Sownak's answer, only in my case it was enough to specify the storage class without the allowedTopologies list, like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cloud-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer

Solution 6 - Docker

In my case, the root cause was that the persistent volume was in us-west-2c while the new worker nodes were relaunched in us-west-2a and us-west-2b. The solution is to either have more worker nodes so that they cover more zones, or to remove/widen the node affinity for the application so that more worker nodes qualify to be bound to the persistent volume.
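
A quick way to compare the zones involved (on older clusters the zone label may still be failure-domain.beta.kubernetes.io/zone instead of topology.kubernetes.io/zone):

$ kubectl get nodes -L topology.kubernetes.io/zone
$ kubectl get pv --show-labels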

Solution 7 - Docker

A different case, from GCP GKE. Assume that you are using a regional cluster and you created two PVCs. Both were created in different zones (you didn't notice).

In the next step you try to run a pod which mounts both PVCs. You have to schedule that pod to a specific node in a specific zone, but because your volumes are in different zones, k8s won't be able to schedule it and you will receive the described problem.

For example, two simple PVCs on a regional cluster (nodes in different zones):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: disk-a
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: disk-b
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Next, a simple pod:

apiVersion: v1
kind: Pod
metadata:
  name: debug
spec:
  containers:
    - name: debug
      image: pnowy/docker-tools:latest
      command: [ "sleep" ]
      args: [ "infinity" ]
      volumeMounts:
        - name: disk-a
          mountPath: /disk-a
        - name: disk-b
          mountPath: /disk-b
  volumes:
    - name: disk-a
      persistentVolumeClaim:
        claimName: disk-a
    - name: disk-b
      persistentVolumeClaim:
        claimName: disk-b

Finally, as a result, it could happen that k8s won't be able to schedule the pod because the volumes are in different zones.
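
One way to avoid this (a sketch, assuming the GKE Persistent Disk CSI driver, which is enabled by default on current GKE versions) is to create both PVCs from a StorageClass with volumeBindingMode: WaitForFirstConsumer, so the disks are only provisioned once the pod has been scheduled and therefore end up in the zone of the chosen node:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-wait-for-consumer   # hypothetical name
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
volumeBindingMode: WaitForFirstConsumer

Both PVCs above would then reference it via storageClassName: pd-wait-for-consumer.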

Solution 8 - Docker

  1. Make sure the Kubernetes node has the required label. You can verify the node labels using:
kubectl get nodes --show-labels

One of the Kubernetes nodes should show you the name/label of the persistent volume, and your pod should be scheduled on the same node.

  2. Make sure the requested size in the PersistentVolumeClaim matches the size of the PersistentVolume. If the size does not match, either correct resources.requests.storage in the PersistentVolumeClaim or delete the old PersistentVolume and create a new one with the correct size.

Verification steps:

  1. Describe your persistent volume:
kubectl describe pv postgres-br-proxy-pv-0

Output:

...
Node Affinity:
  Required Terms:
    Term 0:        postgres-br-proxy in [postgres-br-proxy-pv-0]
...
  2. Show node labels:
kubectl get nodes --show-labels

Output:

NAME    STATUS   ROLES    AGE   VERSION   LABELS
node3   Ready    <none>   19d   v1.17.6   postgres-br-proxy=postgres-br-proxy-pv-0

If the persistent volume's label is not present on the node that your pod is supposed to use, the pod won't get scheduled.
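
If no node carries that label yet, it can be added manually so that it matches the PV's Required Terms (the node name and label are taken from the example output above):

kubectl label nodes node3 postgres-br-proxy=postgres-br-proxy-pv-0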

Solution 9 - Docker

After some headache-inducing investigation, there are a few things that need to be checked:

Azure:

  • Does your cluster have more than one zone selected? (zone 1, 2, 3)
  • Does your default storage class have the correct storage provider? (ZRS Zone-Redundant Storage)

If not:

  • change the storage class to use the correct provider
  • create a backup of the PV data
  • stop the deployment that is using the PVC (set replicas to 0)
  • delete the PVC and confirm that the associated PV is deleted
  • re-apply the PVC config yaml (without a reference to the old storage class name)
  • start the deployment that is using the PVC (set replicas to 1)
  • manually import the backup data

Example storageclass for AKS:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zone-redundant-storage
parameters:
  skuname: StandardSSD_ZRS
provisioner: disk.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

GKE:

  • Does your cluster have more than one zone selected? (Zone A, B, C)
  • Does your default storage class have the replication-type parameter? (replication-type: regional-pd)

If not:

  • change the storage class to use the correct parameters
  • create a backup of the PV data
  • stop the deployment that is using the PVC (set replicas to 0)
  • delete the PVC and confirm that the associated PV is deleted
  • re-apply the PVC config yaml (without a reference to the old storage class name)
  • start the deployment that is using the PVC (set replicas to 1)
  • manually import the backup data

Example storageclass for GKE:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard-regional-pd-storage
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-standard
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer

After that, the PVs will be redundant across the selected zones, allowing a pod to access the PV from nodes in different zones.
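
If the new class should also become the cluster default (both checklists above refer to the default storage class), the standard is-default-class annotation can be switched; the name of the old default class varies per provider (e.g. default on AKS, standard-rwo on GKE), so treat it as a placeholder:

kubectl patch storageclass <old-default-class> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
kubectl patch storageclass zone-redundant-storage -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'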

Solution 10 - Docker

One cause of this error is a definition like the one below (Kafka ZooKeeper in this example) which uses multiple PVCs for one container. If they land on different nodes, you will get something like: "...volume node affinity conflict". The solution here is to use one PVC definition and use subPath on the volumeMount.

Problem

      ...
      volumeMounts:
        - mountPath: /data
          name: kafka-zoo-data
        - mountPath: /datalog
          name: kafka-zoo-datalog
  restartPolicy: Always
  volumes:
    - name: kafka-zoo-data
      persistentVolumeClaim:
        claimName: "zookeeper-data"
    - name: kafka-zoo-datalog
      persistentVolumeClaim:
        claimName: "zookeeper-datalog"

Resolved

      ...
      volumeMounts:
        - mountPath: /data
          subPath: data
          name: kafka-zoo-data
        - mountPath: /datalog
          subPath: datalog
          name: kafka-zoo-data
  restartPolicy: Always
  volumes:
    - name: kafka-zoo-data
      persistentVolumeClaim:
        claimName: "zookeeper-data"

Solution 11 - Docker

Almost the same problem is described here: https://github.com/kubernetes/kubernetes/issues/61620

"If you're using local volumes, and the node crashes, your pod cannot be rescheduled to a different node. It must be scheduled to the same node. That is the caveat of using local storage, your Pod becomes bound forever to one specific node."

Solution 12 - Docker

Most likely you just reduced the number of nodes in your Kubernetes cluster and some "regions" are not available anymore...

Something worth mentioning... if your pod ends up in a different zone than the persistent volume, then:

  • your disk access times will degrade significantly (your "local" persistent storage is not local anymore; even with Amazon's / Google's hyper-fast fiber links it is still traffic across data centers)
  • you will be paying for "cross-regional network" traffic (on your AWS bill it is something that goes into "EC2-other", and only after drilling down into the AWS bill can you spot it)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Question: Krzysztof (View Question on Stackoverflow)
Solution 1 - Docker: Sownak Roy (View Answer on Stackoverflow)
Solution 2 - Docker: Alexz (View Answer on Stackoverflow)
Solution 3 - Docker: jonashackt (View Answer on Stackoverflow)
Solution 4 - Docker: weibeld (View Answer on Stackoverflow)
Solution 5 - Docker: Erokos (View Answer on Stackoverflow)
Solution 6 - Docker: blueheart_2 (View Answer on Stackoverflow)
Solution 7 - Docker: Przemek Nowak (View Answer on Stackoverflow)
Solution 8 - Docker: Vishrant (View Answer on Stackoverflow)
Solution 9 - Docker: Nik (View Answer on Stackoverflow)
Solution 10 - Docker: tjg184 (View Answer on Stackoverflow)
Solution 11 - Docker: jitendra (View Answer on Stackoverflow)
Solution 12 - Docker: Piotr (View Answer on Stackoverflow)