K8s binary node expansion scheme

This expansion of the k8s environment is based on the documents at the link below:
Link: https://pan.baidu.com/s/1_DopTLhdWujC3_6Ed3gyxg Extraction code: eu2g

There is no document online that you can copy and use directly; most authors write theirs around their own needs. A binary deployment is not as simple as kubeadm, where one or two commands basically finish the job; it is essentially a replay of the deployment document used when the cluster was first built. Before adding a node, we should think through a few questions.

1. Which services are mandatory on the node?

docker, the CNI plug-in (calico, flannel), kubelet, kube-proxy

2. How do they establish a relationship with the k8s master?

That is, how do kubelet and kube-proxy establish a connection with the apiserver?

From the example below we can see that every component's kubeconfig records how to connect to the apiserver, and kube-proxy and kubelet are no exception.

[root@master kubernetes]# find . | grep -nr 6443
scheduler.kubeconfig:5:    server: https://192.168.0.110:6443
kubelet.kubeconfig:5:    server: https://192.168.0.110:6443
kube-proxy.kubeconfig:5:    server: https://192.168.0.110:6443
bootstrap-kubelet.kubeconfig:5:    server: https://192.168.0.110:6443
admin.kubeconfig:5:    server: https://192.168.0.110:6443
controller-manager.kubeconfig:5:    server: https://192.168.0.110:6443

In addition, certificate files are required to establish the connection: as you can see from https://192.168.0.110:6443, the HTTPS protocol is used here. This also touches on how binary k8s components communicate, namely SSL certificates and the TLS protocol; if you are not familiar with them, read up on those first.
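As a quick way to see what a certificate actually contains (using the certificate paths that appear in section 3 below), standard openssl commands can print the subject, issuer and validity period:

#Inspect the cluster CA and etcd CA certificates
openssl x509 -in /etc/kubernetes/pki/ca.pem -noout -subject -issuer -dates
openssl x509 -in /etc/etcd/ssl/etcd-ca.pem -noout -subject -issuer -dates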

3. What about expanding a single master node to multiple master nodes?

That is also simple and all of a piece; I will add that solution another day. It is not very different from the single-master deployment described earlier. The real question before expanding is: why do we want more masters? Everyone knows what high availability is, but what is its core problem here? We need to understand two core components, apiserver and etcd: apiserver is the hub for all components, and etcd stores the metadata of the k8s cluster.

For the apiserver, keep in mind that every component and every interaction goes through it; if the apiserver is overwhelmed and dies, the cluster collapses. So the benefit of multiple masters is obvious, and the implementation is simple: for example, I use haproxy + keepalived to round-robin across the masters' ip:6443 ports, and the problem is basically solved.
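As a minimal sketch of that idea (the master addresses are the example ones from the etcd snippet below; the keepalived VIP and the choice of frontend port are assumptions you would adapt), the haproxy part looks roughly like this:

#Round-robin TCP traffic to all masters on 6443; if haproxy runs on the masters
#themselves, bind a different port such as 16443 so it does not clash with the
#local apiserver
cat >> /etc/haproxy/haproxy.cfg <<EOF
frontend k8s-apiserver
    bind *:6443
    mode tcp
    default_backend k8s-masters

backend k8s-masters
    mode tcp
    balance roundrobin
    server k8s-master01 192.168.0.107:6443 check
    server k8s-master02 192.168.0.108:6443 check
    server k8s-master03 192.168.0.109:6443 check
EOF
systemctl restart haproxy

All kubeconfigs would then point at the keepalived VIP instead of a single master's address.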

etcd is mainly responsible for data redundancy: even if one master node goes down, at least one other node's etcd still holds the cluster information. By default etcd serves its HTTP API on port 2379 and uses port 2380 for peer communication. You can establish the etcd cluster relationship directly in the configuration, as shown below.

Via the etcd configuration file:

initial-cluster: 'k8s-master01=https://192.168.0.107:2380,k8s-master02=https://192.168.0.108:2380,k8s-master03=https://192.168.0.109:2380'
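Assuming a three-member cluster like the example above, its health can be checked with etcdctl, using the etcd certificates that are copied to the nodes in section 3:

ETCDCTL_API=3 etcdctl \
  --endpoints="https://192.168.0.107:2379,https://192.168.0.108:2379,https://192.168.0.109:2379" \
  --cacert=/etc/etcd/ssl/etcd-ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem \
  endpoint health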

1. initialize system configuration

1.1 configuring hosts

192.168.0.110   master1
192.168.0.102   node1
192.168.0.63    node2
192.168.0.55    node3
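One way to apply these entries on every node:

cat >> /etc/hosts <<EOF
192.168.0.110   master1
192.168.0.102   node1
192.168.0.63    node2
192.168.0.55    node3
EOF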

1.2 configure yum source

curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo

1.3 installation of necessary tools

yum install -y wget jq psmisc vim net-tools telnet yum-utils device-mapper-persistent-data lvm2 git

1.4 disable firewalld, dnsmasq, NetworkManager and selinux on the node

systemctl disable --now firewalld 
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager

setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

1.5 disable the swap partition on the node and comment out swap in fstab

swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab

1.6 synchronize time on the node

rpm -ivh http://mirrors.wlnmp.com/centos/wlnmp-release-centos.noarch.rpm
yum install ntpdate -y

ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time2.aliyun.com
# Add to crontab
*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com
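To add that entry non-interactively (assuming root's crontab), something like this works:

(crontab -l 2>/dev/null; echo "*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com") | crontab -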

1.7 configure limits on the node:

ulimit -SHn 65535

vim /etc/security/limits.conf
# Add the following at the end
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
* soft memlock unlimited
* hard memlock unlimited 

1.8 passwordless SSH login

#Execute on the master node

ssh-copy-id -i .ssh/id_rsa.pub 192.168.0.55 
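If the master does not have a key pair yet, generate one first; then verify that the login really is password-free:

#Generate a key pair if ~/.ssh/id_rsa does not exist yet
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
#Verify passwordless login to the new node
ssh 192.168.0.55 hostname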

1.9 copy installation files to the node

#Execute on the master node

scp -r /root/k8s-ha-install/ node3:/root/

1.10 kernel upgrade

yum update -y --exclude=kernel* 

cd /root
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
cd /root && yum localinstall -y kernel-ml*

grub2-set-default  0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
grubby --default-kernel

reboot 

uname -a

1.11 change host name

hostname node3
hostnamectl set-hostname node3
bash

1.12 installing ipvsadm

#All nodes need the ipvs modules. In kernel 4.19+ nf_conntrack_ipv4 was renamed to nf_conntrack; on kernels below 4.19, loading nf_conntrack_ipv4 is enough

yum install ipvsadm ipset sysstat conntrack libseccomp -y


modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
vim /etc/modules-load.d/ipvs.conf 
	# Add the following
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
ip_vs_sh
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip



#Start service
systemctl enable --now systemd-modules-load.service

#Check if it is loaded
[root@node3 ~]#  lsmod | grep -e ip_vs -e nf_conntrack
nf_conntrack_netlink    40960  0 
nfnetlink              16384  3 nf_conntrack_netlink,ip_set
ip_vs_ftp              16384  0 
nf_nat                 32768  2 nf_nat_ipv4,ip_vs_ftp
ip_vs_sed              16384  0 
ip_vs_nq               16384  0 
ip_vs_fo               16384  0 
ip_vs_dh               16384  0 
ip_vs_lblcr            16384  0 
ip_vs_lblc             16384  0 
ip_vs_wlc              16384  0 
ip_vs_lc               16384  0 
ip_vs_sh               16384  0 
ip_vs_wrr              16384  0 
ip_vs_rr               16384  13 
ip_vs                 151552  37 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed,ip_vs_ftp
nf_conntrack          143360  6 xt_conntrack,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink,ip_vs
nf_defrag_ipv6         20480  1 nf_conntrack
nf_defrag_ipv4         16384  1 nf_conntrack
libcrc32c              16384  3 nf_conntrack,nf_nat,ip_vs

1.13 enable the kernel parameters required by the k8s cluster; configure these on all nodes:

cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720

net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF


#Execute command
sysctl --system && sysctl -p

2. install Docker

yum install docker-ce-19.03.* -y

[root@node3 ~]# vim /etc/docker/daemon.json 
#Edit the content as follows; it is recommended to copy this file from another healthy node
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
      "max-size": "100m"
  },
  "insecure-registries": ["https://hub.test.com:18443"]
}


#Start docker
systemctl daemon-reload && systemctl enable --now docker
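It is worth confirming right away that docker really picked up the systemd cgroup driver, because a mismatch with kubelet's cgroupDriver shows up later as a kubelet startup failure (see the troubleshooting section):

docker info | grep -i cgroup
#Expected output: Cgroup Driver: systemd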

3. preparation before installing core services

mkdir -p /var/lib/kubelet /var/log/kubernetes /etc/systemd/system/kubelet.service.d /etc/kubernetes/manifests/  /etc/kubernetes/pki /opt/cni/bin

#Execute on the master node
scp /usr/local/bin/kube{let,-proxy} node3:/usr/local/bin/ 

#Copy the related certificates to the new node
[root@master1 ~]# for FILE in etcd-ca.pem etcd.pem etcd-key.pem;do  scp /etc/etcd/ssl/$FILE node3:/etc/etcd/ssl/; done
etcd-ca.pem                                                                                                                                                  100% 1367   672.2KB/s   00:00    
etcd.pem                                                                                                                                                     100% 1452     1.3MB/s   00:00    
etcd-key.pem                                                                                                                                                 100% 1675     1.3MB/s   00:00    
[root@master1 ~]# for FILE in pki/ca.pem pki/ca-key.pem pki/front-proxy-ca.pem bootstrap-kubelet.kubeconfig; do  scp /etc/kubernetes/$FILE node3:/etc/kubernetes/${FILE}; done
ca.pem                                                                                                                                                       100% 1411   966.2KB/s   00:00    
ca-key.pem                                                                                                                                                   100% 1675     1.3MB/s   00:00    
front-proxy-ca.pem                                                                                                                                           100% 1143   899.7KB/s   00:00    
bootstrap-kubelet.kubeconfig                                                                                                                                 100% 2300   139.2KB/s   00:00   

4. configure and install kubelet

vim  /usr/lib/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/local/bin/kubelet

Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target



vim /etc/systemd/system/kubelet.service.d/10-kubelet.conf

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig --kubeconfig=/etc/kubernetes/kubelet.kubeconfig"
Environment="KUBELET_SYSTEM_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_CONFIG_ARGS=--config=/etc/kubernetes/kubelet-conf.yml --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.2"
Environment="KUBELET_EXTRA_ARGS=--node-labels=node.kubernetes.io/node='' "
ExecStart=
ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_SYSTEM_ARGS $KUBELET_EXTRA_ARGS



vim /etc/kubernetes/kubelet-conf.yml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
cgroupsPerQOS: true
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerLogMaxFiles: 5
containerLogMaxSize: 10Mi
contentType: application/vnd.kubernetes.protobuf
cpuCFSQuota: true
cpuManagerPolicy: none
cpuManagerReconcilePeriod: 10s
enableControllerAttachDetach: true
enableDebuggingHandlers: true
enforceNodeAllocatable:
- pods
eventBurst: 10
eventRecordQPS: 5
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
evictionPressureTransitionPeriod: 5m0s
failSwapOn: true
fileCheckFrequency: 20s
hairpinMode: promiscuous-bridge
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 20s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMinimumGCAge: 2m0s
iptablesDropBit: 15
iptablesMasqueradeBit: 14
kubeAPIBurst: 10
kubeAPIQPS: 5
makeIPTablesUtilChains: true
maxOpenFiles: 1000000
maxPods: 110
nodeStatusUpdateFrequency: 10s
oomScoreAdj: -999
podPidsLimit: -1
registryBurst: 10
registryPullQPS: 5
resolvConf: /etc/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 2m0s
serializeImagePulls: true
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
volumeStatsAggPeriod: 1m0s

#Start service

systemctl daemon-reload
systemctl enable --now kubelet

#Execute on the master node
[root@master1 ~]# kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master1   Ready    <none>   29d     v1.20.0
node1     Ready    <none>   29d     v1.20.0
node2     Ready    <none>   29d     v1.20.0
node3     Ready    <none>   3m      v1.20.0
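If node3 never shows up here, one thing worth checking on the master (since kubelet joins via bootstrap-kubelet.kubeconfig) is whether its TLS bootstrap CSR is stuck in Pending; whether approval is automatic depends on how the original cluster was configured:

kubectl get csr
#Approve a pending CSR manually if needed (csr-xxxxx is a placeholder name)
kubectl certificate approve csr-xxxxx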

Troubleshooting:

[root@node3 ~]# journalctl -xefu kubelet | grep fai
Jun 02 19:33:39 node3 kubelet[24182]: F0602 19:33:39.786231   24182 server.go:269] failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Jun 02 19:33:39 node3 systemd[1]: Unit kubelet.service entered failed state.
Jun 02 19:33:39 node3 systemd[1]: kubelet.service failed.
Jun 02 19:33:54 node3 kubelet[24431]: E0602 19:33:54.914011   24431 kubelet.go:1271] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
Jun 02 19:33:54 node3 kubelet[24431]: E0602 19:33:54.922192   24431 nodelease.go:49] failed to get node "node3" when trying to set owner ref to the node lease: nodes "node3" not found
Jun 02 19:33:54 node3 kubelet[24431]: E0602 19:33:54.993972   24431 eviction_manager.go:260] eviction manager: failed to get summary stats: failed to get node info: node "node3" not found
Jun 02 19:33:55 node3 kubelet[24431]: W0602 19:33:55.518361   24431 driver-call.go:149] FlexVolume: driver call failed: executable: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds, args: [init], error: fork/exec /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds: no such file or directory, output: ""
##Check the /etc/docker/daemon.json configuration; it is best to copy it directly from another healthy node
Jun 02 19:36:34 node3 kubelet[26467]: E0602 19:36:34.012055   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 10s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
Jun 02 19:36:42 node3 kubelet[26467]: E0602 19:36:42.628358   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 10s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
Jun 02 19:36:59 node3 kubelet[26467]: E0602 19:36:59.097043   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 20s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"
Jun 02 19:37:02 node3 kubelet[26467]: E0602 19:37:02.628171   26467 pod_workers.go:191] Error syncing pod 5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5 ("calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "back-off 20s restarting failed container=calico-node pod=calico-node-7zvxw_kube-system(5a0942a7-1b9e-46a1-b4d0-e25afc5eabd5)"

#This doesn't matter: kube-proxy is not installed yet, so the cni plug-in pod goes into CrashLoopBackOff

#There are some odd log entries, but as long as kubelet itself starts normally, don't worry about them for now; dependency problems such as kube-proxy and the cni plug-in resolve themselves once those are installed and the pods restart
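For the cgroup driver error in particular, after correcting /etc/docker/daemon.json so docker also uses systemd (matching cgroupDriver: systemd in kubelet-conf.yml above), restarting both services should clear it:

systemctl restart docker
systemctl restart kubelet
journalctl -u kubelet -f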

5. configure and install kube-proxy

Execute on the master node:

 scp /etc/kubernetes/kube-proxy.kubeconfig node3:/etc/kubernetes/kube-proxy.kubeconfig
 scp kube-proxy/kube-proxy.conf node3:/etc/kubernetes/kube-proxy.conf
 scp kube-proxy/kube-proxy.service node3:/usr/lib/systemd/system/kube-proxy.service

Start kube-proxy on node3:

systemctl daemon-reload
systemctl enable --now kube-proxy
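Assuming kube-proxy is configured for ipvs mode (which is why the ipvs modules were loaded in 1.12; check the kube-proxy.conf copied from the master), the virtual server rules should start to appear on node3. A quick check:

systemctl status kube-proxy --no-pager
ipvsadm -Ln | head -20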

6. cni plug-in

I use calico here. Since plug-ins like this are deployed as a DaemonSet, the new node is covered by simply applying the manifest again:

kubectl apply -f calico-etcd.yaml

kubectl  get po -n kube-system

If a container's status is abnormal, you can use kubectl describe or kubectl logs to inspect it, for example:
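For instance, for the calico pod from the logs above (substitute your own pod name):

kubectl -n kube-system describe pod calico-node-7zvxw
kubectl -n kube-system logs calico-node-7zvxw -c calico-node --previous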
