Support k8s taints and tolerations, node affinity and node selectors #833

Closed
opened 2024-06-03 02:33:49 +00:00 by dboreham · 12 comments
Owner

Often k8s deployment policy is guided by tags (labels and taints) applied to nodes and pods that are then used to control (to a greater or lesser degree) which nodes the pods may be placed upon (node affinity). This is called "taints and tolerations" in k8s-speak, after the metadata used to specify said policy.

Task is to implement support in stack orchestrator for specifying pod toleration data.

See: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
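
For reference, the pairing works like this: a taint on a node repels pods unless the pod carries a matching toleration. Below is a minimal, illustrative sketch of that matching rule using the kubernetes Python client's model classes. The key/value names are placeholders, and this is a simplification of the scheduler's real logic, not its implementation:

```python
from kubernetes import client

# A taint as it would appear on a node, and a toleration as it would appear
# in a pod spec. Key, value and effect here are placeholder examples.
taint = client.V1Taint(key="dedicated", value="special-workload", effect="NoSchedule")
toleration = client.V1Toleration(
    key="dedicated", operator="Equal", value="special-workload", effect="NoSchedule"
)


def tolerates(tol: client.V1Toleration, taint: client.V1Taint) -> bool:
    """Simplified matching rule; ignores empty-key wildcard edge cases and
    NoExecute toleration_seconds handling."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        return tol.key is None or tol.key == taint.key
    # Operator "Equal" (the default): key and value must both match.
    return tol.key == taint.key and tol.value == taint.value


print(tolerates(toleration, taint))  # True: a pod with this toleration may land on the node
```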

dboreham self-assigned this 2024-06-03 02:33:49 +00:00
Author
Owner
@srw
Author
Owner
Metadata is added to this struct: https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1PodSpec.md, here: https://git.vdb.to/cerc-io/stack-orchestrator/src/branch/main/stack_orchestrator/deploy/k8s/cluster_info.py#L365
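
For example, a sketch of how toleration, node-selector and node-affinity metadata would be attached to a V1PodSpec via the kubernetes Python client. This is not the actual cluster_info.py code, and the keys, labels and values are placeholders:

```python
from kubernetes import client

# Illustrative values only; in SO these would come from the deployment's configuration.
tolerations = [
    client.V1Toleration(key="dedicated", operator="Equal",
                        value="special-workload", effect="NoSchedule"),
]
node_selector = {"disktype": "ssd"}
affinity = client.V1Affinity(
    node_affinity=client.V1NodeAffinity(
        required_during_scheduling_ignored_during_execution=client.V1NodeSelector(
            node_selector_terms=[
                client.V1NodeSelectorTerm(
                    match_expressions=[
                        client.V1NodeSelectorRequirement(
                            key="topology.kubernetes.io/zone",
                            operator="In",
                            values=["zone-a"],
                        )
                    ]
                )
            ]
        )
    )
)

# The relevant V1PodSpec fields are tolerations, node_selector and affinity.
pod_spec = client.V1PodSpec(
    containers=[client.V1Container(name="example", image="nginx:1.25")],
    tolerations=tolerations,
    node_selector=node_selector,
    affinity=affinity,
)
```

The YAML examples in the comments below map directly onto these model fields (tolerations, nodeSelector, affinity).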
dboreham changed title from Support k8s taints and tolerations to Support k8s taints and tolerations, node affinity and node selectors 2024-06-03 02:45:15 +00:00
Author
Owner
https://medium.com/saas-infra/taints-and-tolerations-node-affinity-and-node-selector-explained-f329653c2bc6
Author
Owner

For clarity: SO will only support adding the relevant metadata to pod descriptors it creates. It won't address the other side of this feature: specifying the metadata on the nodes.
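
For context only, since it's out of scope: the node-side metadata would be applied with kubectl taint / kubectl label or, equivalently, via the API. A hypothetical sketch with the kubernetes Python client, where the node name and keys are placeholders:

```python
from kubernetes import client, config

config.load_kube_config()
api = client.CoreV1Api()

node_name = "worker-1"  # placeholder node name

# Taint the node: pods without a matching toleration won't schedule there.
api.patch_node(node_name, {
    "spec": {
        "taints": [
            {"key": "dedicated", "value": "special-workload", "effect": "NoSchedule"}
        ]
    }
})

# Label the node: matched by nodeSelector / node affinity on pods.
api.patch_node(node_name, {
    "metadata": {
        "labels": {"disktype": "ssd"}
    }
})
```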

Author
Owner

Example pod metadata:

```
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: topology.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: registry.k8s.io/pause:2.0
```
Author
Owner
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: application-server
...
spec:
  template:
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - database
            topologyKey: topology.kubernetes.io/zone
            # Only Pods from a given rollout are taken into consideration when calculating pod affinity.
            # If you update the Deployment, the replacement Pods follow their own affinity rules
            # (if there are any defined in the new Pod template)
            matchLabelKeys:
            - pod-template-hash
```
Author
Owner
```
apiVersion: v1
kind: Pod
metadata:
  labels:
    # Assume that all relevant Pods have a "tenant" label set
    tenant: tenant-a
...
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # ensure that pods associated with this tenant land on the correct node pool
      - matchLabelKeys:
          - tenant
        topologyKey: node-pool
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # ensure that pods associated with this tenant can't schedule to nodes used for another tenant
      - mismatchLabelKeys:
        - tenant # whatever the value of the "tenant" label for this Pod, prevent
                 # scheduling to nodes in any pool where any Pod from a different
                 # tenant is running.
        labelSelector:
          # We have to have the labelSelector which selects only Pods with the tenant label,
          # otherwise this Pod would hate Pods from daemonsets as well, for example,
          # which aren't supposed to have the tenant label.
          matchExpressions:
          - key: tenant
            operator: Exists
        topologyKey: node-pool
```
Author
Owner
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine
```
Author
Owner
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.16-alpine
```
Author
Owner

Toleration spec examples:

```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  tolerations:
  - key: "example-key"
    operator: "Exists"
    effect: "NoSchedule"
Author
Owner
```
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
Author
Owner
Initial version of this feature is merged, doc here: https://git.vdb.to/cerc-io/stack-orchestrator/src/branch/main/docs/k8s-deployment-enhancements.md