1. Introduction to basic YAML syntax
1. Basic syntax
- Case sensitive
- Indentation indicates hierarchy
- Indent with spaces only; tabs are not allowed
- The number of spaces does not matter, as long as elements at the same level are left-aligned
2. Data types
- object
- An object is a set of key-value pairs written as key: value; a space is required after the colon
    key:
      child-key1: value1
      child-key2: value2
- array
- Lines starting with "- " (a hyphen followed by a space) are array elements
    - value1
    - value2
    - value3
- scalar
String, Boolean, Integer, Float, Null, Time, Date
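As a small illustration of the scalar types listed above, the fragment below writes one value of each type. All key names are hypothetical. Whether an unquoted timestamp or date is loaded as a time/date value or as a plain string depends on the parser and the YAML version it implements, so treat the last two lines as a sketch rather than guaranteed behavior.

```yaml
# Hypothetical keys, one per scalar type
name: pmml-serving              # String (quotes are optional here)
threads: "1"                    # String: the quotes keep 1 from being read as an Integer
enabled: true                   # Boolean
replicas: 3                     # Integer
ratio: 0.25                     # Float
owner: null                     # Null (can also be written as ~)
created: 2020-11-04T09:00:37Z   # Time (timestamp)
day: 2020-11-04                 # Date
```

The quoted form matters in Kubernetes manifests: environment variable values must be strings, which is why the Deployment sample writes value: "1" rather than value: 1.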
2. Deployment file in detail
A Deployment has five top-level fields:
- apiVersion: the API version of the resource
- kind: the type of the resource
- metadata: the metadata of the resource
- spec: the specification and desired state of the resource
- status: the actual state of the resource
Complete example:
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      annotations:
        deployment.kubernetes.io/revision: "1"
      creationTimestamp: "2020-11-04T09:00:37Z"
      generation: 1
      labels:
        app: gbavqfbyfltzqfuu-serving
        version: v1
      name: gbavqfbyfltzqfuu-serving-v1
      namespace: aipaas-modelserving
      resourceVersion: "121103285"
      selfLink: /apis/extensions/v1beta1/namespaces/aipaas-modelserving/deployments/gbavqfbyfltzqfuu-serving-v1
      uid: 34db5072-1e7c-11eb-b71b-fa163efea19e
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: gbavqfbyfltzqfuu-serving
          version: v1
      strategy:
        rollingUpdate:
          maxSurge: 100%
          maxUnavailable: 100%
        type: RollingUpdate
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: gbavqfbyfltzqfuu-serving
            version: v1
        spec:
          containers:
          - env:
            - name: VECLIB_MAXIMUM_THREADS
              value: "1"
            - name: MKL_NUM_THREADS
              value: "1"
            - name: NUMEXPR_NUM_THREADS
              value: "1"
            - name: NVIDIA_VISIBLE_DEVICES
              value: none
            - name: OPENBLAS_NUM_THREADS
              value: "1"
            - name: OMP_NUM_THREADS
              value: "1"
            image: 172.16.1.222:10004/library/pmml-serving:v1.2
            imagePullPolicy: IfNotPresent
            name: gbavqfbyfltzqfuu-serving-v1
            ports:
            - containerPort: 8888
              name: aipaas
              protocol: TCP
            readinessProbe:
              failureThreshold: 2
              httpGet:
                httpHeaders:
                - name: Authorization
                  value: Bearer eyJhbG......
                path: /openscoring/model/serving
                port: 8888
                scheme: HTTP
              initialDelaySeconds: 30
              periodSeconds: 30
              successThreshold: 1
              timeoutSeconds: 3
            resources:
              limits:
                cpu: "1"
                memory: "2147483648"
              requests:
                cpu: 250m
                memory: "536870912"
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          nodeSelector:
            node: worker
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
    status:
      availableReplicas: 1
      conditions:
      - lastTransitionTime: "2020-11-04T09:00:37Z"
        lastUpdateTime: "2020-11-04T09:00:37Z"
        message: Deployment has minimum availability.
        reason: MinimumReplicasAvailable
        status: "True"
        type: Available
      - lastTransitionTime: "2020-11-04T09:00:37Z"
        lastUpdateTime: "2020-11-04T09:01:36Z"
        message: ReplicaSet "gbavqfbyfltzqfuu-serving-v1-7bd89cd5c9" has successfully progressed.
        reason: NewReplicaSetAvailable
        status: "True"
        type: Progressing
      observedGeneration: 1
      readyReplicas: 1
      replicas: 1
      updatedReplicas: 1
1. Introduction to metadata
    metadata:
      annotations:        # List of custom annotations
      generation:
      labels:             # Labels, used to identify the resource
        app:
        version:
      name:               # Deployment name; must be unique within a namespace
      namespace:          # The namespace the Deployment belongs to
      resourceVersion:
      selfLink:
      uid:
In the online serving scenario, several versions of a model service can run at the same time. Each version has its own Deployment, and all versions share one Service. The Deployments are associated with the Service through the label app, and the Deployments under the same Service are told apart by the combination of app and version.
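The sharing of one Service across versions can be sketched as follows. This is an assumption-laden illustration, not the manifest used by the original system: the Service name, the app value, and the port numbers are hypothetical. Note that a Service selector matches the labels on the Pods, that is, the labels under spec.template.metadata.labels of each Deployment.

```yaml
# Hypothetical Service shared by every version of one model service
apiVersion: v1
kind: Service
metadata:
  name: my-model-serving          # hypothetical name
  namespace: aipaas-modelserving
spec:
  selector:
    app: my-model-serving         # selects Pods of all versions; no version label here
  ports:
  - name: aipaas
    port: 8888
    targetPort: 8888
    protocol: TCP
```

Because the selector omits version, Pods from the v1 and v2 Deployments both receive traffic, while each Deployment's own selector (app plus version) keeps it from managing the other version's Pods.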
2. Introduction to spec
- spec.progressDeadlineSeconds
Optional field. The number of seconds the Deployment controller waits before reporting, through the Deployment status, that a rollout is stuck
- spec.replicas
Optional field. The desired number of Pods; the default is 1
- spec.revisionHistoryLimit
Optional field. The number of old ReplicaSets to retain for rolling back to earlier versions; the rest are garbage collected in the background
- spec.selector
Specifies which Pods the Deployment manages. It is optional in the extensions/v1beta1 API used in this example, but required in apps/v1
- spec.strategy
The strategy for replacing old Pods with new ones, either RollingUpdate or Recreate:
- RollingUpdate
  - Pods are replaced gradually through a rolling update
  - maxUnavailable sets the maximum number of Pods that may be unavailable during the update. The value is either an absolute number or a percentage of the desired Pod count; a percentage is converted to an absolute number by rounding down
  - maxSurge sets the maximum number of Pods that may exist above the desired Pod count. The value is either an absolute number or a percentage of the desired Pod count; a percentage is converted to an absolute number by rounding up
- Recreate
  All existing Pods are killed before new Pods are created
- spec.template
Required field. The template for the Pods that the Deployment creates. It has exactly the same schema as a Pod, except that it is nested and does not need the apiVersion and kind fields.
    spec.template:
      metadata:
        creationTimestamp: null
        labels:
          app:
          version:
      spec:
        containers:
        - env:                        # Environment variables of the container, see 6.1 for details
          image:                      # Image the container runs
          imagePullPolicy:            # Image pull policy, see 6.2 for details
          name:                       # Name of the container
          ports:                      # List of ports exposed by the container
          - containerPort: 8888       # Port number
            name: aipaas              # Port name
            protocol: TCP             # Port protocol, supports TCP and UDP, default TCP
          readinessProbe:             # Health check, see 6.3 for details
          resources:                  # Resource configuration, see 6.4 for details
          terminationMessagePath: /dev/termination-log   # File the container's termination message is written to
          terminationMessagePolicy: File
        dnsPolicy: ClusterFirst       # DNS policy
        nodeSelector:                 # Pod scheduling strategy, see 6.5 for details
        restartPolicy:                # Container restart strategy, see 6.6 for details
        schedulerName:                # Scheduler, see 6.7 for details
        terminationGracePeriodSeconds:   # Graceful deletion, see 6.8 for details
        securityContext: {}
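The rounding rules for maxSurge and maxUnavailable described under spec.strategy are easiest to see with numbers. The fragment below is a worked example with illustrative values (replicas: 10 and 25% are assumptions chosen so that the two roundings differ); it is not taken from the sample Deployment, which uses replicas: 1 and 100%.

```yaml
# Illustrative values only
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%         # 10 * 0.25 = 2.5, rounded up to 3: at most 13 Pods exist at once
      maxUnavailable: 25%   # 10 * 0.25 = 2.5, rounded down to 2: at least 8 Pods stay available
```

With the Recreate strategy, strategy contains only type: Recreate and no rollingUpdate block.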
6.1 Environment variables
    spec.containers.env:
    - name: VECLIB_MAXIMUM_THREADS
      value: "1"
    - name: MKL_NUM_THREADS
      value: "1"
    - name: NUMEXPR_NUM_THREADS
      value: "1"
    - name: OPENBLAS_NUM_THREADS
      value: "1"
    - name: OMP_NUM_THREADS
      value: "1"
    - name: NVIDIA_VISIBLE_DEVICES
      value: none
    - name: ConCurrencyFlag
      value: "false"
    - name: SERVER_PROCESS_NUM
      value: "1"
The five variables VECLIB_MAXIMUM_THREADS, MKL_NUM_THREADS, NUMEXPR_NUM_THREADS, OPENBLAS_NUM_THREADS and OMP_NUM_THREADS control how many threads the corresponding numerical libraries use. Set each of them to the number of CPUs allocated to the Pod.
NVIDIA_VISIBLE_DEVICES controls GPU visibility. When the Pod does not use a GPU, add this variable and set it to none.
ConCurrencyFlag and SERVER_PROCESS_NUM are MPS-related environment variables; see the section "Introduction to MPS" for details.
6.2 Strategies for containers to pull images
Always: The image is pulled from the image registry every time
Never: Only the local image is used
IfNotPresent: The local image is used if it exists; otherwise the image is pulled from the registry
6.3 Health Check
livenessProbe: When the health check fails, the container is restarted
readinessProbe: When the health check fails, traffic is no longer sent to the container
    spec.readinessProbe:
      httpGet:
        httpHeaders:
        - name: Authorization
          value: Bearer xxxxxxx     # Token
        path: /health               # Request path
        port: 8888                  # Request port
        scheme: HTTP                # Request protocol
      initialDelaySeconds: 30       # Seconds to wait after the container starts before the first health check
      periodSeconds: 30             # Interval between health checks, in seconds; default is 10
      successThreshold: 1           # Consecutive successes needed after a failure for the check to count as successful; default is 1
      failureThreshold: 2           # Consecutive failures needed for the check to count as failed; default is 3
      timeoutSeconds: 3             # Health check response timeout, in seconds; default is 1
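The sample above only configures a readinessProbe. A livenessProbe uses the same fields; the fragment below is a minimal sketch in which the /health path and all timing values are assumptions, not settings from the original deployment.

```yaml
# Sketch only: path and timings are assumed values
livenessProbe:
  httpGet:
    path: /health            # assumed health endpoint
    port: 8888
    scheme: HTTP
  initialDelaySeconds: 60    # give the server time to load the model before the first check
  periodSeconds: 30
  failureThreshold: 3        # restart the container after 3 consecutive failures
  timeoutSeconds: 3
```

A common choice is to make the liveness check more tolerant than the readiness check, so that a briefly overloaded container is taken out of load balancing first and restarted only if it stays unhealthy.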
6.4 Resource information
    spec.resources:
      limits:                       # Upper bound on resources
        cpu: "1"                    # CPU, in cores
        memory: "2147483648"        # Memory; units such as Mi/Gi may be given, a value without a unit is in bytes
        nvidia.com/gpu: "1"         # GPU
      requests:                     # Resources the container requires
        cpu: 250m                   # Fractions of a core are written in millicores (m); 250m is 0.25 core
        memory: "536870912"         # Memory
        nvidia.com/gpu: "1"         # GPU
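The byte values in the sample are easier to read with binary unit suffixes. The fragment below expresses the same CPU and memory settings with units; it is an equivalent rewrite for illustration, not a change to the original configuration.

```yaml
resources:
  limits:
    cpu: "1"          # same as 1000m
    memory: 2Gi       # 2 * 1024^3 = 2147483648 bytes
  requests:
    cpu: 250m         # same as 0.25
    memory: 512Mi     # 512 * 1024^2 = 536870912 bytes
```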
6.5 pod scheduling strategy
    spec.nodeSelector:
      node: worker    # The Pod is scheduled only onto nodes that carry the label node=worker
6.6 Restart strategy
- Always: The Pod is restarted no matter how it terminates
- Never: The Pod is never restarted, no matter how it terminates
- OnFailure: The Pod is restarted only if it exits with a non-zero exit code
For the Pod template of a Deployment, Always is the only value allowed.
spec.restartPolicy: Always
6.7 Scheduler
After the filtering (predicate) and scoring (priority) phases, K8S runs the Pod on the node with the highest score. If several nodes tie for the highest score, the scheduler picks one of them at random.
spec.schedulerName: default-scheduler
6.8 Graceful deletion
spec.terminationGracePeriodSeconds: 30
How a Pod is replaced (deleted) during an upgrade:
- K8S first starts a new Pod
- When the new Pod becomes Ready, K8S creates an Endpoint for it so that it is included in load balancing
- K8S removes the Endpoint of the old Pod and sets the old Pod's status to Terminating. From this point on, no new requests reach the old Pod
- At the same time, K8S sends SIGTERM to the old Pod and waits up to terminationGracePeriodSeconds (30 seconds by default)
- If the old Pod is still running when terminationGracePeriodSeconds expires, K8S kills it forcibly
- Therefore, set terminationGracePeriodSeconds long enough for all in-flight requests to be processed, and make the program handle SIGTERM so that every transaction is completed before it exits
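The shutdown sequence above can be given extra slack with a preStop hook, which runs before SIGTERM is sent to the container. The fragment below is a minimal sketch of that pattern for spec.template.spec; the container name, the 10-second sleep, and the 60-second grace period are assumptions, and the image must contain a sleep binary for the hook to work.

```yaml
# Sketch only: names and durations are assumed values
spec:
  terminationGracePeriodSeconds: 60   # total budget: preStop hook plus shutdown after SIGTERM
  containers:
  - name: model-server                # hypothetical container name
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "10"]    # wait for Endpoint removal to propagate before SIGTERM arrives
```

The grace period covers both the preStop hook and the time the program needs after SIGTERM, so it should be larger than the hook duration plus the longest expected request.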
3. Introduction to status
The actual state of the K8S object in the current cluster. It is normally maintained by the resource's controller.
    status:
      conditions:
      - lastTransitionTime: "2020-10-27T01:06:52Z"
        lastUpdateTime: "2020-10-27T01:06:52Z"
        message: Deployment has minimum availability.
        reason: MinimumReplicasAvailable
        status: "True"
        type: Available
      - lastTransitionTime: "2020-10-27T01:06:52Z"
        lastUpdateTime: "2020-10-27T01:07:52Z"
        message: ReplicaSet "uvzobilkwkmsfqca-serving-v1-d9c5f7bdf" has successfully progressed.
        reason: NewReplicaSetAvailable
        status: "True"
        type: Progressing
      availableReplicas: 1
      observedGeneration: 1     # The metadata.generation most recently observed by the controller
      readyReplicas: 1          # Number of ready instances
      replicas: 1               # Total number of instances
      updatedReplicas: 1        # Number of updated instances
5. Introduction to MPS
Online services that use GPUs can raise GPU utilization by enabling MPS (NVIDIA Multi-Process Service).
    # spec.containers
    - env:
      - name: ConCurrencyFlag
        value: "True"
      - name: SERVER_PROCESS_NUM
        value: "1"
      lifecycle:
        preStop:
          exec:
            command:
            - /bin/bash
            - /model_serving/model_server/stop_mps.sh
      securityContext:
        capabilities:
          add:
          - SYS_ADMIN
        procMount: Default