K8s binary node expansion scheme
The k8s environment here was deployed based on the documents at the following link, and is now being expanded.
Link: https://pan.baidu.com/s/1_DopTLhdWujC3_6Ed3gyxg Extraction code: eu2g
There is no document online that you can just copy and use directly; most authors write for their own needs. A binary deployment is not as simple as kubeadm, where one or two commands basically finish the job, and usually all you have is a one-off copy of the deployment document from when the cluster was first built. Before adding a new node, we should first consider several questions.
1. Which services are mandatory on a node?
docker, a CNI plugin (calico, flannel), kubelet, kube-proxy
2. How do they establish a connection with the k8s master?
That is, how do kubelet and kube-proxy connect to the apiserver?
From the following example we can see that every component's kubeconfig records how to reach the apiserver, and kube-proxy and kubelet are no exception.
[root@master kubernetes]# find . | grep -nr 6443
scheduler.kubeconfig:5:    server: https://192.168.0.110:6443
kubelet.kubeconfig:5:    server: https://192.168.0.110:6443
kube-proxy.kubeconfig:5:    server: https://192.168.0.110:6443
bootstrap-kubelet.kubeconfig:5:    server: https://192.168.0.110:6443
admin.kubeconfig:5:    server: https://192.168.0.110:6443
controller-manager.kubeconfig:5:    server: https://192.168.0.110:6443
In addition, certificate files are required to establish the connection: as https://192.168.0.110:6443 shows, HTTPS is used here. Communication in a binary k8s deployment relies on SSL certificates and the TLS protocol; if you are not familiar with them, look them up first.
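As a quick sanity check (a minimal sketch; the path follows the certificate layout used later in this article), you can inspect the cluster CA certificate's subject and validity period with openssl:
# Print the subject, issuer and validity dates of the cluster CA
openssl x509 -in /etc/kubernetes/pki/ca.pem -noout -subject -issuer -dates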
3. What about expanding from a single master to multiple masters?
This is also straightforward and follows the same logic; I will add that solution another day. It is not very different from the single-master deployment described earlier. The real question before expanding is: why do we want more masters? Everyone knows what high availability is, so what is the core problem it has to solve? We need to understand two core components, the apiserver and etcd: the apiserver is the hub for all components, and etcd stores the metadata of the k8s cluster.
For the apiserver, keep in mind that every component and every interaction must go through it; if the apiserver is overwhelmed and dies, the cluster collapses, so the benefit of multiple masters is obvious. How to do it is also simple: for example, I use haproxy + keepalived to round-robin across the master nodes' ip:6443 ports, and the problem is basically solved (see the sketch below).
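A minimal haproxy sketch of that idea, assuming keepalived floats a VIP between the haproxy nodes (the bind port 16443 is an arbitrary choice here, and the master IPs simply reuse the etcd example addresses below):
# Round-robin the masters' apiserver ports behind one TCP frontend
cat <<EOF > /etc/haproxy/haproxy.cfg
defaults
    mode tcp
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend k8s-apiserver
    bind *:16443
    default_backend k8s-masters

backend k8s-masters
    balance roundrobin
    server k8s-master01 192.168.0.107:6443 check
    server k8s-master02 192.168.0.108:6443 check
    server k8s-master03 192.168.0.109:6443 check
EOF
# Clients (kubeconfigs) would then point at https://VIP:16443 instead of a single master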
Etcd is mainly responsible for data redundancy: even if one master node goes down, at least one other node's etcd still holds the cluster information. By default etcd serves its HTTP API on port 2379 and uses port 2380 for peer communication. You can establish the etcd cluster relationship directly in the configuration, as shown below.
In the etcd configuration file:
initial-cluster: 'k8s-master01=https://192.168.0.107:2380,k8s-master02=https://192.168.0.108:2380,k8s-master03=https://192.168.0.109:2380'
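To confirm the members actually formed a cluster, you can query etcd directly (a sketch, assuming etcdctl is installed and the certificates follow the /etc/etcd/ssl layout copied in section 3):
# List the etcd members over TLS using the v3 API
ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.0.107:2379,https://192.168.0.108:2379,https://192.168.0.109:2379 \
  --cacert=/etc/etcd/ssl/etcd-ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  member list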
1. initialize system configuration
1.1 configuring hosts
192.168.0.110 master1
192.168.0.102 node1
192.168.0.63 node2
192.168.0.55 node3
1.2 configure yum source
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
1.3 installation of necessary tools
yum install -y wget jq psmisc vim net-tools telnet yum-utils device-mapper-persistent-data lvm2 git
1.4 disable firewalld, dnsmasq and selinux on the node
systemctl disable --now firewalld
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
1.5 disable the swap partition on the node and comment out swap in fstab
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
1.6 synchronize time on the node
rpm -ivh http://mirrors.wlnmp.com/centos/wlnmp-release-centos.noarch.rpm
yum install ntpdate -y
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time2.aliyun.com
# Add to crontab
*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com
1.7 configure resource limits on the node:
ulimit -SHn 65535
vim /etc/security/limits.conf
# Add the following at the end
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
* soft memlock unlimited
* hard memlock unlimited
1.8 passwordless SSH login
# Execute on the master node
ssh-copy-id -i .ssh/id_rsa.pub 192.168.0.55
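If the master does not yet have a key pair, generate one first (a minimal sketch; the non-interactive flags are an assumption, adjust to your environment):
# Create an RSA key pair without a passphrase, then run ssh-copy-id for each node
ssh-keygen -t rsa -f /root/.ssh/id_rsa -N ""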
1.9 copy installation files to the node
# Execute on the master node
scp -r /root/k8s-ha-install/ node3:/root/
1.10 kernel upgrade
yum update -y --exclude=kernel*
cd /root
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
cd /root && yum localinstall -y kernel-ml*
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
grubby --default-kernel
reboot
uname -a
1.11 change host name
hostname node3
hostnamectl set-hostname node3
bash
1.12 installing ipvsadm
#Configure the ipvs modules on all nodes. In kernel 4.19+ nf_conntrack_ipv4 has been renamed to nf_conntrack; on kernels below 4.19, nf_conntrack_ipv4 is enough.
yum install ipvsadm ipset sysstat conntrack libseccomp -y
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
vim /etc/modules-load.d/ipvs.conf
# Add the following
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
ip_vs_sh
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
#Start the service
systemctl enable --now systemd-modules-load.service
#Check whether the modules are loaded
[root@node3 ~]# lsmod | grep -e ip_vs -e nf_conntrack
nf_conntrack_netlink    40960  0
nfnetlink               16384  3 nf_conntrack_netlink,ip_set
ip_vs_ftp               16384  0
nf_nat                  32768  2 nf_nat_ipv4,ip_vs_ftp
ip_vs_sed               16384  0
ip_vs_nq                16384  0
ip_vs_fo                16384  0
ip_vs_dh                16384  0
ip_vs_lblcr             16384  0
ip_vs_lblc              16384  0
ip_vs_wlc               16384  0
ip_vs_lc                16384  0
ip_vs_sh                16384  0
ip_vs_wrr               16384  0
ip_vs_rr                16384  13
ip_vs                  151552  37 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed,ip_vs_ftp
nf_conntrack           143360  6 xt_conntrack,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink,ip_vs
nf_defrag_ipv6          20480  1 nf_conntrack
nf_defrag_ipv4          16384  1 nf_conntrack
libcrc32c               16384  3 nf_conntrack,nf_nat,ip_vs
1.13 enable the kernel parameters required by the k8s cluster; configure them on all nodes:
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
#Execute the command
sysctl --system && sysctl -p
2. install Docker
yum install docker-ce-19.03.* -y
[root@node3 ~]# vim /etc/docker/daemon.json
#Change the content as follows; it is recommended to copy it from another node
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "insecure-registries": ["https://hub.test.com:18443"]
}
#Start docker
systemctl daemon-reload && systemctl enable --now docker
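Before moving on it is worth confirming that docker really picked up the systemd cgroup driver, since a mismatch with kubelet's cgroupDriver is exactly the failure shown in the troubleshooting section below (a quick optional check, not part of the original steps):
# Should print "Cgroup Driver: systemd"; if it shows cgroupfs, recheck daemon.json and restart docker
docker info 2>/dev/null | grep -i "cgroup driver"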
3. preparation before installing core services
# Create the required directories on node3 (including /etc/etcd/ssl, which receives the etcd certificates below)
mkdir -p /var/lib/kubelet /var/log/kubernetes /etc/systemd/system/kubelet.service.d /etc/kubernetes/manifests/ /etc/kubernetes/pki /etc/etcd/ssl /opt/cni/bin
#Execute on the master node
scp /usr/local/bin/kube{let,-proxy} node3:/usr/local/bin/
#Copy the related certificates to the node
[root@master1 ~]# for FILE in etcd-ca.pem etcd.pem etcd-key.pem;do scp /etc/etcd/ssl/$FILE node3:/etc/etcd/ssl/; done
etcd-ca.pem                    100% 1367   672.2KB/s   00:00
etcd.pem                       100% 1452     1.3MB/s   00:00
etcd-key.pem                   100% 1675     1.3MB/s   00:00
[root@master1 ~]# for FILE in pki/ca.pem pki/ca-key.pem pki/front-proxy-ca.pem bootstrap-kubelet.kubeconfig; do scp /etc/kubernetes/$FILE node3:/etc/kubernetes/${FILE}; done
ca.pem                         100% 1411   966.2KB/s   00:00
ca-key.pem                     100% 1675     1.3MB/s   00:00
front-proxy-ca.pem             100% 1143   899.7KB/s   00:00
bootstrap-kubelet.kubeconfig   100% 2300   139.2KB/s   00:00
4. configure and install kubelet
vim /usr/lib/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/local/bin/kubelet
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target

vim /etc/systemd/system/kubelet.service.d/10-kubelet.conf

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig --kubeconfig=/etc/kubernetes/kubelet.kubeconfig"
Environment="KUBELET_SYSTEM_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_CONFIG_ARGS=--config=/etc/kubernetes/kubelet-conf.yml --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.2"
Environment="KUBELET_EXTRA_ARGS=--node-labels=node.kubernetes.io/node='' "
ExecStart=
ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_SYSTEM_ARGS $KUBELET_EXTRA_ARGS

vim /etc/kubernetes/kubelet-conf.yml

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
cgroupsPerQOS: true
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerLogMaxFiles: 5
containerLogMaxSize: 10Mi
contentType: application/vnd.kubernetes.protobuf
cpuCFSQuota: true
cpuManagerPolicy: none
cpuManagerReconcilePeriod: 10s
enableControllerAttachDetach: true
enableDebuggingHandlers: true
enforceNodeAllocatable:
- pods
eventBurst: 10
eventRecordQPS: 5
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
evictionPressureTransitionPeriod: 5m0s
failSwapOn: true
fileCheckFrequency: 20s
hairpinMode: promiscuous-bridge
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 20s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMinimumGCAge: 2m0s
iptablesDropBit: 15
iptablesMasqueradeBit: 14
kubeAPIBurst: 10
kubeAPIQPS: 5
makeIPTablesUtilChains: true
maxOpenFiles: 1000000
maxPods: 110
nodeStatusUpdateFrequency: 10s
oomScoreAdj: -999
podPidsLimit: -1
registryBurst: 10
registryPullQPS: 5
resolvConf: /etc/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 2m0s
serializeImagePulls: true
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
volumeStatsAggPeriod: 1m0s
#Start service
systemctl daemon-reload
systemctl enable --now kubelet
#Execute on the master
[root@master1 ~]# kubectl get nodes
NAME      STATUS   ROLES    AGE   VERSION
master1   Ready    <none>   29d   v1.20.0
node1     Ready    <none>   29d   v1.20.0
node2     Ready    <none>   29d   v1.20.0
node3     Ready    <none>   3m    v1.20.0
Troubleshooting:
[root@node3 ~]# journalctl -xefu kubelet | grep fai
Jun 02 19:33:39 node3 kubelet[24182]: F0602 19:33:39.786231   24182 server.go:269] failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Jun 02 19:33:39 node3 systemd[1]: Unit kubelet.service entered failed state.
Jun 02 19:33:39 node3 systemd[1]: kubelet.service failed.
Jun 02 19:33:54 node3 kubelet[24431]: E0602 19:33:54.914011   24431 kubelet.go:1271] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
Jun 02 19:33:54 node3 kubelet[24431]: E0602 19:33:54.922192   24431 nodelease.go:49] failed to get node "node3" when trying to set owner ref to the node lease: nodes "node3" not found
Jun 02 19:33:54 node3 kubelet[24431]: E0602 19:33:54.993972   24431 eviction_manager.go:260] eviction manager: failed to get summary stats: failed to get node info: node "node3" not found
Jun 02 19:33:55 node3 kubelet[24431]: W0602 19:33:55.518361   24431 driver-call.go:149] FlexVolume: driver call failed: executable: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds, args: [init], error: fork/exec /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds: no such file or directory, output: ""
##Check the /etc/docker/daemon.json configuration; it is best to copy it directly from another healthy node
Jun 02 19:36:34 node3 kubelet[26467]: E0602 19:36:34.012055   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 10s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
Jun 02 19:36:42 node3 kubelet[26467]: E0602 19:36:42.628358   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 10s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
Jun 02 19:36:59 node3 kubelet[26467]: E0602 19:36:59.097043   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 20s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
Jun 02 19:37:02 node3 kubelet[26467]: E0602 19:37:02.628171   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 20s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
#This does not matter: kube-proxy is not installed yet, so the CNI plugin pod is in CrashLoopBackOff
#There are some strange logs, but as long as kubelet itself starts normally, don't worry about them for now; dependency problems such as kube-proxy and the CNI plugin resolve themselves once those are installed and the pods restart.
5. configure and install kube-proxy
Execute on the master node
scp /etc/kubernetes/kube-proxy.kubeconfig node3:/etc/kubernetes/kube-proxy.kubeconfig
scp kube-proxy/kube-proxy.conf node3:/etc/kubernetes/kube-proxy.conf
scp kube-proxy/kube-proxy.service node3:/usr/lib/systemd/system/kube-proxy.service
Start kube-proxy on the node3 node
systemctl daemon-reload
systemctl enable --now kube-proxy
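Since section 1.12 loads the ipvs modules, kube-proxy is presumably running in ipvs mode here; a quick optional check (assuming ipvs mode is indeed set in the copied kube-proxy.conf) is to list the virtual servers it programs:
# Cluster Service addresses (for example 10.96.0.1:443) should show up as ipvs virtual servers
ipvsadm -Ln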
6. CNI
I use calico here. Because system plugins like this are deployed as a DaemonSet, you can apply the manifest directly and a calico-node pod will be scheduled onto the new node.
kubectl apply -f calico-etcd.yaml
kubectl get po -n kube-system
If a container's status is abnormal, you can use kubectl describe or kubectl logs to inspect it.
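For example (the pod name here is the crash-looping one from the troubleshooting logs above; substitute whatever kubectl get po reports on your cluster):
# Show events for the pod, then the log of its previous failed container
kubectl -n kube-system describe po calico-node-7zvxw
kubectl -n kube-system logs calico-node-7zvxw -c calico-node --previous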