A detailed explanation of the deployment YAML

1. Introduction to basic YAML syntax

1. Basic syntax

  • Case sensitive
  • Indentation indicates hierarchy
  • Tabs are not allowed for indentation; only spaces are allowed
  • The number of spaces does not matter, as long as elements at the same level are left-aligned

2. Data types

  • object
    A key-value pair is written as key: value; the colon must be followed by a space
key:
     child-key1: value1
     child-key2: value2
  • array
    Each line starting with - is one element of the array
- value1
- value2
- value3
  • scalar
    String, Boolean, Integer, Float, Null, Time, Date
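
As a rough illustration of the three data types, the sketch below shows the Python structures that an object, an array and a few scalars map to. It is modeled on the snippets above. To stay free of third-party dependencies it parses the equivalent JSON text with the standard json module instead of using a YAML parser, which is an assumption of this example; the scalar keys s, b, i, f and n are made up for illustration.

```python
import json

# JSON equivalent of the YAML object example (a key with two child keys).
obj = json.loads('{"key": {"child-key1": "value1", "child-key2": "value2"}}')

# JSON equivalent of the YAML array example.
arr = json.loads('["value1", "value2", "value3"]')

# A few scalars: string, boolean, integer, float and null.
scalars = json.loads('{"s": "text", "b": true, "i": 3, "f": 1.5, "n": null}')

print(type(obj).__name__, type(arr).__name__)
print({name: type(value).__name__ for name, value in scalars.items()})
```

An object becomes a dict, an array becomes a list, and each scalar becomes the matching Python type.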

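2. Deployment file details

A deployment has five top-level fields:

  1. apiVersion: the API version of the resource
  2. kind: the type of the resource
  3. metadata: the metadata of the resource
  4. spec: the specification and desired state of the resource
  5. status: the actual state of the resource

Complete example. It includes server-populated fields such as uid, resourceVersion and status, so it is what the cluster returns rather than what you would write by hand. The extensions/v1beta1 API group shown here stopped serving deployments in Kubernetes 1.16; manifests for current clusters use apps/v1.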

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2020-11-04T09:00:37Z"
  generation: 1
  labels:
    app: gbavqfbyfltzqfuu-serving
    version: v1
  name: gbavqfbyfltzqfuu-serving-v1
  namespace: aipaas-modelserving
  resourceVersion: "121103285"
  selfLink: /apis/extensions/v1beta1/namespaces/aipaas-modelserving/deployments/gbavqfbyfltzqfuu-serving-v1
  uid: 34db5072-1e7c-11eb-b71b-fa163efea19e
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: gbavqfbyfltzqfuu-serving
      version: v1
  strategy:
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 100%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: gbavqfbyfltzqfuu-serving
        version: v1
    spec:
      containers:
      - env:
        - name: VECLIB_MAXIMUM_THREADS
          value: "1"
        - name: MKL_NUM_THREADS
          value: "1"
        - name: NUMEXPR_NUM_THREADS
          value: "1"
        - name: NVIDIA_VISIBLE_DEVICES
          value: none
        - name: OPENBLAS_NUM_THREADS
          value: "1"
        - name: OMP_NUM_THREADS
          value: "1"
        image: 172.16.1.222:10004/library/pmml-serving:v1.2
        imagePullPolicy: IfNotPresent
        name: gbavqfbyfltzqfuu-serving-v1
        ports:
        - containerPort: 8888
          name: aipaas
          protocol: TCP
        readinessProbe:
          failureThreshold: 2
          httpGet:
            httpHeaders:
            - name: Authorization
              value: Bearer eyJhbG......
            path: /openscoring/model/serving
            port: 8888
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 3
        resources:
          limits:
            cpu: "1"
            memory: "2147483648"
          requests:
            cpu: 250m
            memory: "536870912"
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      nodeSelector:
        node: worker
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2020-11-04T09:00:37Z"
    lastUpdateTime: "2020-11-04T09:00:37Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2020-11-04T09:00:37Z"
    lastUpdateTime: "2020-11-04T09:01:36Z"
    message: ReplicaSet "gbavqfbyfltzqfuu-serving-v1-7bd89cd5c9" has successfully
      progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

1. Introduction to metadata

metadata:
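  annotations:       # Custom annotations (non-identifying key-value notes)
  generation:        # Sequence number of the desired state, incremented when the spec changes
  labels:            # Labels, used to identify and select the resource
    app:
    version:
  name:              # deployment name; must be unique within a namespace
  namespace:         # The namespace the deployment belongs to
  resourceVersion:   # Internal version of the object, set by the server
  selfLink:          # URL of the object, set by the server (deprecated in newer Kubernetes versions)
  uid:               # Unique ID of the object, set by the server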

In the online serving scenario, several versions of a model service run at the same time. Each version has its own deployment, and all versions share one service. The service is tied to the pods of those deployments through the app label, and the deployments under the same service are told apart by app plus version.
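
The label matching described here can be sketched as a subset test: a selector matches when every one of its key-value pairs appears in the labels of the pod. The function name matches and the label value demo-serving are hypothetical and only serve the illustration.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Return True when every key/value pair of the selector is in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())

# Hypothetical pod labels for two versions of the same model service.
pod_v1 = {"app": "demo-serving", "version": "v1"}
pod_v2 = {"app": "demo-serving", "version": "v2"}

# The shared service selects on app only; each deployment adds the version.
service_selector = {"app": "demo-serving"}
deployment_v1_selector = {"app": "demo-serving", "version": "v1"}

print(matches(service_selector, pod_v1), matches(service_selector, pod_v2))
print(matches(deployment_v1_selector, pod_v1), matches(deployment_v1_selector, pod_v2))
```

The service selector matches both pods, while the v1 deployment selector matches only the v1 pod.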

2. spec introduction

  1. spec.progressDeadlineSeconds

Optional field. The number of seconds the deployment controller waits before it reports, through the deployment status, that the rollout has stalled.

  2. spec.replicas

Optional field. The desired number of pods; the default is 1.

  3. spec.revisionHistoryLimit

Optional field. The number of old ReplicaSets to keep so that earlier versions can be rolled back to; the rest are garbage-collected in the background.

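  4. spec.selector

Specifies which pods the deployment manages. It must match the labels in spec.template. The field was optional in extensions/v1beta1, which this example uses, but it is required in apps/v1.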

  5. spec.strategy

The strategy used to replace old pods with new ones, either RollingUpdate or Recreate:

  • RollingUpdate
    • Pods are replaced gradually (rolling update)
    • maxUnavailable sets the maximum number of pods that may be unavailable during the update. The value is either an absolute number or a percentage of the desired pod count; a percentage is converted to an absolute number by rounding down
    • maxSurge sets the maximum number of pods that may exist above the desired pod count. The value is either an absolute number or a percentage; a percentage is converted to an absolute number by rounding up
  • Recreate
    All existing pods are killed before new pods are created
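
The rounding rules for maxSurge and maxUnavailable can be worked through with a small sketch. The helper names resolve and rollout_bounds are hypothetical, and the code is a simplification that only covers the arithmetic described above, not every rule the deployment controller applies.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic: ceiling for maxSurge, floor for maxUnavailable.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)

def rollout_bounds(replicas: int, max_surge, max_unavailable):
    """Return (most pods allowed at once, fewest pods that must stay available)."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable

print(rollout_bounds(10, "25%", "25%"))   # 25% of 10 is 2.5: surge 3, unavailable 2
print(rollout_bounds(1, "100%", "100%"))  # the values used in the sample deployment
```

With 10 replicas and 25% for both settings, at most 13 pods exist at once and at least 8 stay available. With the sample's single replica and 100% for both, the old pod may be removed before the new one is ready.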
  6. spec.template

Required field. It describes the pods the deployment creates. It has exactly the same schema as a pod, except that it is nested and has no apiVersion or kind fields.
spec.template:
    metadata:
      creationTimestamp: null
      labels:
        app:
        version:
    spec:
      containers:
      - env:                      # List of environment variables in the container, see 6.1 for details
        image:                    # The image the container runs
        imagePullPolicy:          # Image pull policy, see 6.2 for details
        name:                     # The name of the container
        ports:                    # List of ports the container exposes
        - containerPort: 8888     # Port number the container listens on
          name: aipaas            # Port name
          protocol: TCP           # Port protocol: TCP, UDP or SCTP; default TCP
        readinessProbe:           # Health check, see 6.3 for details
        resources:                # Resource configuration, see 6.4 for details
        terminationMessagePath: /dev/termination-log  # File the container's termination message is read from
        terminationMessagePolicy: File                # Take the termination message from that file
      dnsPolicy: ClusterFirst     # DNS policy
      nodeSelector:               # Pod scheduling strategy, see 6.5 for details
      restartPolicy:              # Container restart strategy, see 6.6 for details
      schedulerName:              # Scheduler, see 6.7 for details
      terminationGracePeriodSeconds: # Graceful deletion, see 6.8 for details
      securityContext: {}

6.1 Environment variables

spec.containers.env: 
    - name: VECLIB_MAXIMUM_THREADS
      value: "1"
    - name: MKL_NUM_THREADS
      value: "1"
    - name: NUMEXPR_NUM_THREADS
      value: "1"
    - name: OPENBLAS_NUM_THREADS
      value: "1"
    - name: OMP_NUM_THREADS 
      value: "1"    
    - name: NVIDIA_VISIBLE_DEVICES
      value: none
    - name: ConCurrencyFlag
      value: "false"
    - name: SERVER_PROCESS_NUM
      value: "1"

The five environment variables VECLIB_MAXIMUM_THREADS, MKL_NUM_THREADS, NUMEXPR_NUM_THREADS, OPENBLAS_NUM_THREADS and OMP_NUM_THREADS control the number of threads used by the numerical libraries. Their values equal the number of CPUs in the pod.

NVIDIA_VISIBLE_DEVICES is the GPU setting. When the pod does not use a GPU, add this environment variable and set it to none.

ConCurrencyFlag and SERVER_PROCESS_NUM are MPS-related environment variables; see the section "Introduction to MPS" at the end for details.
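
As a sketch of how the thread variables could be derived from the CPU limit of the pod, the hypothetical helper below builds the env list. Rounding a fractional limit up to a whole thread, with a minimum of one, is an assumption of this example and is not stated in the text.

```python
import math

THREAD_VARS = [
    "VECLIB_MAXIMUM_THREADS", "MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS",
    "OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS",
]

def thread_env(cpu_limit: str, use_gpu: bool = False) -> list:
    """Build env entries from a CPU quantity such as "1" or "500m"."""
    if cpu_limit.endswith("m"):
        cores = int(cpu_limit[:-1]) / 1000
    else:
        cores = float(cpu_limit)
    # Assumption: round fractional cores up and never go below one thread.
    threads = str(max(1, math.ceil(cores)))
    env = [{"name": name, "value": threads} for name in THREAD_VARS]
    if not use_gpu:
        # Hide all GPUs from a pod that does not use one.
        env.append({"name": "NVIDIA_VISIBLE_DEVICES", "value": "none"})
    return env

for entry in thread_env("1"):
    print(entry)
```

For a CPU limit of "1" this yields the five thread variables set to "1" plus NVIDIA_VISIBLE_DEVICES set to none, matching the sample deployment.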

6.2 Image pull policy

Always: the image is pulled from the image registry every time the container starts
Never: only local images are used
IfNotPresent: the local image is used if it exists; otherwise the image is pulled from the registry

6.3 Health Check

livenessProbe: when the health check fails, the container is restarted
readinessProbe: when the health check fails, traffic is no longer sent to the pod

spec.containers.readinessProbe:
    httpGet:
      httpHeaders:
      - name: Authorization
        value: Bearer xxxxxxx # token
      path: /health           # request path
      port: 8888              # request port
      scheme: HTTP            # request protocol
    initialDelaySeconds: 30   # Seconds to wait after the container starts before the first check
    periodSeconds: 30         # Interval between checks, in seconds; default 10
    successThreshold: 1       # Consecutive successes needed after a failure for the check to count as passed; default 1
    failureThreshold: 2       # Consecutive failures needed for the check to count as failed; default 3
    timeoutSeconds: 3         # Seconds to wait for a response; default 1
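
How successThreshold and failureThreshold interact can be sketched as a small counter over consecutive probe results. This is a simplified model with a hypothetical function name, not the kubelet's implementation.

```python
def readiness_transitions(results, success_threshold=1, failure_threshold=3,
                          ready=False):
    """Yield the ready flag after each probe result (True = probe succeeded)."""
    successes = failures = 0
    for ok in results:
        if ok:
            successes, failures = successes + 1, 0
            if successes >= success_threshold:
                ready = True
        else:
            failures, successes = failures + 1, 0
            if failures >= failure_threshold:
                ready = False
        yield ready

# Thresholds from the probe above: successThreshold 1, failureThreshold 2.
states = list(readiness_transitions([True, False, True, False, False],
                                    success_threshold=1, failure_threshold=2))
print(states)
```

A single failed probe does not change the state; only the second consecutive failure marks the pod as not ready.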

6.4 Resource information

spec.containers.resources:
    limits:                     # Upper bounds on resource usage
      cpu: "1"                  # CPU, in cores
      memory: "2147483648"      # Memory; suffixes such as Mi and Gi are accepted, a plain number means bytes (this is 2Gi)
      nvidia.com/gpu: "1"       # GPU
    requests:                   # Resources requested for the container, used for scheduling
      cpu: 250m                 # Fractions of a core are written in millicores: 250m is 0.25 core
      memory: "536870912"       # Memory (this is 512Mi)
      nvidia.com/gpu: "1"       # GPU; must equal the limit
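
The units in the comments above can be checked with a short conversion sketch. The helpers memory_bytes and cpu_millicores are hypothetical and handle only a subset of the quantity suffixes Kubernetes accepts.

```python
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}

def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "512Mi" or "2147483648" to bytes."""
    for suffix, factor in {**BINARY, **DECIMAL}.items():
        if quantity.endswith(suffix):
            return round(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: the value is already in bytes

def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "1" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)

print(memory_bytes("2Gi"), memory_bytes("512Mi"))
print(cpu_millicores("1"), cpu_millicores("250m"))
```

The byte counts used in the sample, 2147483648 and 536870912, are the same quantities as 2Gi and 512Mi.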

6.5 Pod scheduling strategy

spec.nodeSelector:
    node: worker  # The pod is scheduled only onto nodes that carry the label node=worker

6.6 Restart strategy

  • Always: the container is restarted no matter how it terminates
  • Never: the container is never restarted, no matter how it terminates
  • OnFailure: the container is restarted only if it exits with a non-zero exit code

For pods managed by a deployment, Always is the only allowed value.

spec.restartPolicy: Always
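
The three restart policies amount to a small decision on the exit code. A minimal sketch follows, with a hypothetical function name:

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether a terminated container is restarted under a policy."""
    if policy == "Always":
        return True
    if policy == "Never":
        return False
    if policy == "OnFailure":
        return exit_code != 0  # restart only after a failure
    raise ValueError(f"unknown restartPolicy: {policy}")

for policy in ("Always", "Never", "OnFailure"):
    print(policy, should_restart(policy, 0), should_restart(policy, 1))
```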

6.7 Scheduler

After the filtering (pre-selection) and scoring (prioritization) phases, K8S runs the pod on the node with the highest score. If several nodes tie for the highest score, the scheduler picks one of them at random.

spec.schedulerName: default-scheduler
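
The filter, score and random tie-break sequence can be sketched as follows. The node data, the fits rule and the score function are all hypothetical; the real default scheduler combines many filters and scoring plugins.

```python
import random

def pick_node(nodes, fits, score, rng=random):
    """Filter nodes, score the survivors, break ties among the best at random."""
    feasible = [node for node in nodes if fits(node)]
    if not feasible:
        return None  # no node fits, so the pod would stay Pending
    best = max(score(node) for node in feasible)
    return rng.choice([node for node in feasible if score(node) == best])

# Hypothetical nodes: name, labels and free CPU in millicores.
nodes = [
    {"name": "n1", "labels": {"node": "worker"}, "free_cpu": 500},
    {"name": "n2", "labels": {"node": "worker"}, "free_cpu": 2000},
    {"name": "n3", "labels": {"node": "master"}, "free_cpu": 4000},
    {"name": "n4", "labels": {"node": "worker"}, "free_cpu": 2000},
]

chosen = pick_node(
    nodes,
    # Filter: honor nodeSelector node=worker and a 250m CPU request.
    fits=lambda n: n["labels"].get("node") == "worker" and n["free_cpu"] >= 250,
    # Score: prefer the node with the most free CPU.
    score=lambda n: n["free_cpu"],
)
print(chosen["name"])  # n2 or n4, chosen at random
```

Node n3 has the most free CPU but is filtered out by the label, so the choice falls at random on one of the two tied worker nodes.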

6.8 Graceful deletion

spec.terminationGracePeriodSeconds: 30

The pod upgrade (delete) process:

  1. K8S first starts a new pod
  2. When the new pod becomes Ready, K8S adds it to the Endpoints of the service so that it takes part in load balancing
  3. K8S removes the old pod from the Endpoints and sets its status to Terminating. From this point no new requests reach the old pod
  4. At the same time, K8S sends a SIGTERM signal to the old pod and waits for up to terminationGracePeriodSeconds (30 seconds by default)
  5. If the old pod is still running when the grace period ends, K8S forcibly kills it

terminationGracePeriodSeconds should therefore be long enough for the program to handle the SIGTERM signal, finish all in-flight requests and complete all transactions before it exits.
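
On the application side, handling SIGTERM could look roughly like the sketch below. The function names and the request loop are hypothetical; a real model server would hook the same flag into its own serving loop.

```python
import signal
import threading

shutting_down = threading.Event()

def handle_sigterm(signum, frame):
    """Stop taking new work; requests already in progress may finish."""
    shutting_down.set()

def install_handler():
    # Call this from the main thread of the serving process at startup.
    signal.signal(signal.SIGTERM, handle_sigterm)

def serve(requests, process):
    """Process queued requests until the queue is empty or SIGTERM arrives."""
    done = []
    for request in requests:
        if shutting_down.is_set():
            break  # exit cleanly, well within terminationGracePeriodSeconds
        done.append(process(request))
    return done

print(serve([1, 2, 3], lambda request: request * 2))
```

The handler only sets a flag, so the request being processed completes normally and the loop stops before taking the next one.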

3. status introduction

status describes the actual state of the K8S object in the cluster. It is usually maintained by the controller of the resource rather than set by the user.

status: 
  conditions:
  - lastTransitionTime: "2020-10-27T01:06:52Z"
    lastUpdateTime: "2020-10-27T01:06:52Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2020-10-27T01:06:52Z"
    lastUpdateTime: "2020-10-27T01:07:52Z"
    message: ReplicaSet "uvzobilkwkmsfqca-serving-v1-d9c5f7bdf" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
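  availableReplicas: 1     # number of available pods
  observedGeneration: 1    # generation of the spec most recently observed by the controller
  readyReplicas: 1         # number of ready pods
  replicas: 1              # total number of pods targeted by the deployment
  updatedReplicas: 1       # number of pods that already run the latest template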
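
These status fields are enough to tell whether a rollout has finished. The sketch below is loosely modeled on the kind of checks kubectl rollout status performs; it is a simplification with a hypothetical function name, and it takes the deployment as a plain dict.

```python
def rollout_complete(deployment: dict) -> bool:
    """Return True when the status shows the latest spec fully rolled out."""
    spec = deployment["spec"]
    status = deployment.get("status", {})
    generation = deployment["metadata"].get("generation", 0)
    if status.get("observedGeneration", 0) < generation:
        return False  # the controller has not seen the latest spec yet
    desired = spec.get("replicas", 1)
    updated = status.get("updatedReplicas", 0)
    return (
        updated >= desired                              # all pods updated
        and status.get("replicas", 0) <= updated        # no old pods left
        and status.get("availableReplicas", 0) >= updated
    )

# Values taken from the sample status above.
sample = {
    "metadata": {"generation": 1},
    "spec": {"replicas": 1},
    "status": {"observedGeneration": 1, "replicas": 1, "updatedReplicas": 1,
               "readyReplicas": 1, "availableReplicas": 1},
}
print(rollout_complete(sample))
```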

5. Introduction to MPS

Online services that use a GPU can increase GPU utilization by turning on MPS (NVIDIA Multi-Process Service). The settings below are added to the container:

# spec.containers.env
- name: ConCurrencyFlag
  value: "True"
- name: SERVER_PROCESS_NUM
  value: "1"

lifecycle:
   preStop:
     exec:
       command:
         - /bin/bash
         - /model_serving/model_server/stop_mps.sh

securityContext:
    capabilities:
      add:
        - SYS_ADMIN
    procMount: Default

Tags: Operation & Maintenance

Posted by marvelade on Sat, 04 Feb 2023 15:18:13 +0530