Starting with version 1.20, Kubernetes requires a Linux kernel of 4.19 or later. If your kernel is older, upgrade it first.
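The kernel requirement can be checked before going any further. The following is a minimal sketch, assuming GNU sort with the -V (version sort) option is available; the function names version_ge and kernel_base_version are hypothetical helpers, not part of any tool used in this guide.

```shell
# Return 0 if version $2 is greater than or equal to version $1.
# Relies on sort -V (version sort), which GNU coreutils provides.
version_ge() {
    local required="$1" actual="$2"
    [ "$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n1)" = "$required" ]
}

# Strip the distribution suffix, e.g. "3.10.0-1160.el7.x86_64" -> "3.10.0".
kernel_base_version() {
    printf '%s\n' "${1%%-*}"
}

# Compare the running kernel against the 4.19 threshold used in this guide.
current=$(kernel_base_version "$(uname -r)")
if version_ge 4.19 "$current"; then
    echo "kernel $current: OK"
else
    echo "kernel $current: older than 4.19, upgrade first"
fi
```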
Basic environment configuration
Check uniqueness
Make sure the MAC address and product_uuid are unique on every node.
- Use ip link or ifconfig -a to get the MAC addresses of the network interfaces.
- Use sudo cat /sys/class/dmi/id/product_uuid to check the product_uuid.
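Hardware devices generally have unique addresses, but some virtual machines may share identical values. Kubernetes uses these values to uniquely identify the nodes in the cluster. If they are not unique on each node, the installation may fail.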
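Once the values have been collected from every node (one per line), duplicates are easy to spot. This is a small sketch; find_duplicates is a hypothetical helper and the UUIDs below are made-up sample data, not values from a real cluster.

```shell
# Print every value that appears more than once on stdin (one value per line).
find_duplicates() {
    sort | uniq -d
}

# Made-up product_uuid values from three nodes; the first and the third
# are identical, as can happen with VMs cloned from one template.
printf '%s\n' \
    '4c4c4544-0000-1010-8000-000000000001' \
    '4c4c4544-0000-1010-8000-000000000002' \
    '4c4c4544-0000-1010-8000-000000000001' | find_duplicates
```

An empty result means every collected value is unique.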
Edit hosts
Edit this file on both the master and the workers.
[root@master-all ~]# cat /etc/hosts
192.168.230.202 master-all
192.168.230.203 Node-1
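If the entries are added by script rather than by hand, it helps to make the step repeatable. The sketch below appends an entry only when it is missing; add_host_entry is a hypothetical helper name, and the usage lines assume the addresses shown above.

```shell
# Append "IP NAME" to a hosts file unless that IP is already mapped to that name.
# $1 = hosts file, $2 = IP address, $3 = host name.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    if awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        return 0   # entry already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (run as root on every node):
#   add_host_entry /etc/hosts 192.168.230.202 master-all
#   add_host_entry /etc/hosts 192.168.230.203 Node-1
```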
Disable swap
Run on both the master and the workers.
Disable temporarily
[root@master-all ~]# swapoff -a
Disable permanently (takes effect after a reboot)
[root@master-all ~]# sed -i 's/\(.*[[:space:]]swap[[:space:]].*\)/#\1/g' /etc/fstab
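The sed expression above prefixes every line containing a swap field with #, so running it twice adds a second #. A slightly more defensive variant skips lines that are already commented. This is a sketch that filters stdin so it can be tried safely on sample text; comment_out_swap is a hypothetical name and the device paths are sample values.

```shell
# Comment out swap entries read from stdin, leaving existing comments untouched.
comment_out_swap() {
    sed '/^[[:space:]]*#/!s/^\(.*[[:space:]]swap[[:space:]].*\)$/#\1/'
}

# Sample fstab content: only the active swap line gains a leading '#'.
printf '%s\n' \
    '/dev/mapper/centos-root /     xfs  defaults 0 0' \
    '/dev/mapper/centos-swap swap  swap defaults 0 0' \
    '#/dev/sdb1 swap swap defaults 0 0' | comment_out_swap
```

To apply the same expression to the real file, pass it to sed -i against /etc/fstab as in the command above, ideally after taking a backup copy.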
Disable SELinux
Run on both the master and the workers.
Disable temporarily
[root@master-all ~]# setenforce 0
Disable permanently (takes effect after a reboot)
[root@master-all ~]# sed -i 's/\(^SELINUX=\).*/\1disabled/' /etc/selinux/config
Disable the firewall
Run on both the master and the workers.
[root@master-all ~]# systemctl stop firewalld && systemctl disable firewalld
Modify kernel parameters
Edit on both the master and the workers.
[root@master-all ~]# vim /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
[root@master-all ~]# modprobe br_netfilter
[root@master-all ~]# sysctl -p /etc/sysctl.d/k8s.conf
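Since kubeadm's preflight checks fail when these keys are not set to 1, it can be useful to verify the file on each node before continuing. The following is a sketch; check_k8s_sysctl_conf is a hypothetical helper that only inspects the file contents, not the live kernel values.

```shell
# Check that a sysctl configuration file sets each required key to 1.
# Prints the keys that are missing or set to another value; returns 1 if any.
check_k8s_sysctl_conf() {
    local file="$1" key value missing=0
    for key in net.bridge.bridge-nf-call-ip6tables \
               net.bridge.bridge-nf-call-iptables \
               net.ipv4.ip_forward; do
        # Take the last assignment of the key, ignoring spaces around '='.
        value=$(awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k { v = $2; gsub(/[ \t]/, "", v); last = v }
            END { print last }' "$file" 2>/dev/null)
        if [ "$value" != "1" ]; then
            echo "$key"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_k8s_sysctl_conf /etc/sysctl.d/k8s.conf && echo "sysctl file OK"
```

Note that modprobe br_netfilter does not persist across reboots on its own; on systemd-based systems, listing the module in a file under /etc/modules-load.d/ is one common way to load it at boot.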
Install Docker
Install on both the master and the workers.
[root@master-all ~]# yum -y install docker
[root@master-all ~]# systemctl enable docker && systemctl start docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
This guide uses an Alibaba Cloud registry mirror.
[root@Node-1 ~]# vim /etc/docker/daemon.json
{
"registry-mirrors": ["https://n275i9sm.mirror.aliyuncs.com"]
}
[root@master-all ~]# systemctl daemon-reload
[root@master-all ~]# systemctl restart docker
Configure the yum repository
On both the master and the workers, create the repo file /etc/yum.repos.d/kubernetes.repo pointing at a mirror in China.
[root@master-all ~]# vim /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kube*
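Install the Kubernetes components
Install the components on both the master and the workers.
--disableexcludes=kubernetes: ignore the exclude=kube* rule of the kubernetes repo for this command, so that the packages can be installed from it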
[root@master-all ~]# yum -y install kubelet kubeadm kubectl --disableexcludes=kubernetes
[root@master-all ~]# systemctl enable kubelet && systemctl start kubelet
To install a specific version:
yum -y install kubelet-1.20.15 kubeadm-1.20.15 kubectl-1.20.15 --disableexcludes=kubernetes
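Master installation
Initialization configuration
When editing the initialization configuration, make sure the file contents are correct; mistakes will cause the initialization to fail.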
[root@master-all ~]# kubeadm config print init-defaults > kubeadm-init.yaml
[root@master-all ~]# vim kubeadm-init.yaml
Change advertiseAddress: 1.2.3.4 to the IP address of this machine.
Change imageRepository: k8s.gcr.io to imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers.
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  # Default token
  token: abcdef.0123456789abcdef
  # Token lifetime
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # Advertised address
  advertiseAddress: 192.168.230.202
  # Listening port
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master-all
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
# Image repository
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
# Version
kubernetesVersion: v1.18.0
networking:
  dnsDomain: cluster.local
  # Default Service subnet (the pod subnet would be set separately with podSubnet)
  serviceSubnet: 10.96.0.0/12
scheduler: {}
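The two manual edits described above (advertiseAddress and imageRepository) can also be applied non-interactively. This is a sketch under the assumption that the defaults file has the layout shown above; patch_kubeadm_config is a hypothetical helper that filters stdin.

```shell
# Rewrite advertiseAddress and imageRepository in a kubeadm config read from stdin.
# $1 = IP address of this node, $2 = image repository.
patch_kubeadm_config() {
    local ip="$1" repo="$2"
    sed -e "s|^\([[:space:]]*advertiseAddress:\).*|\1 ${ip}|" \
        -e "s|^\(imageRepository:\).*|\1 ${repo}|"
}

# Usage:
#   kubeadm config print init-defaults \
#     | patch_kubeadm_config 192.168.230.202 registry.cn-hangzhou.aliyuncs.com/google_containers \
#     > kubeadm-init.yaml
```

Review the generated file before running kubeadm init, since other fields such as the node name and kubernetesVersion may still need adjusting.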
Pull the images
[root@master-all ~]# kubeadm config images pull --config kubeadm-init.yaml
W0411 17:59:25.394894 1717 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7
Initialize Kubernetes
Make sure SELinux is either disabled or configured to allow this.
[root@master-all ~]# kubeadm init --config kubeadm-init.yaml
W0411 18:00:39.037355 2000 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-all kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.230.202]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-all localhost] and IPs [192.168.230.202 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-all localhost] and IPs [192.168.230.202 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0411 18:00:41.149242 2000 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0411 18:00:41.149789 2000 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 15.004506 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master-all as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master-all as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.230.202:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:d3cfcef4ced07d6f648074a15026599db6671a780f9032ec425c3fa75fce8bda
To undo the changes made by kubeadm init and start over, run:
kubeadm reset
Copy the kubeconfig file
[root@master-all ~]# mkdir -p $HOME/.kube
[root@master-all ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master-all ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Network configuration
Configure the Calico network
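Note that the address range in calico.yaml (CALICO_IPV4POOL_CIDR) must be consistent with kubeadm-init.yaml: either edit kubeadm-init.yaml before initialization, or edit calico.yaml afterwards. Strictly speaking, the value should match the pod subnet (networking.podSubnet) and should not overlap networking.serviceSubnet. This walkthrough reuses 10.96.0.0/12, the Service subnet, for the pool; a separate, non-overlapping pod range is the safer choice.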
[root@master-all ~]# wget --no-check-certificate https://docs.projectcalico.org/v3.11/manifests/calico.yaml
[root@master-all ~]# vim calico.yaml
- name: CALICO_IPV4POOL_CIDR
  value: "10.96.0.0/12"
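Because CALICO_IPV4POOL_CIDR has to line up with the subnets in kubeadm-init.yaml, it can help to check whether two CIDR ranges overlap before applying the manifest. The following is a minimal sketch that handles IPv4 only; the function names are hypothetical, and 10.244.0.0/16 is just an example of a range outside the Service subnet.

```shell
# Convert a dotted IPv4 address to an integer.
ip_to_int() {
    # Split on dots in a subshell so the IFS change does not leak.
    ( IFS=.; set -- $1; echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )) )
}

# Print "START END" (as integers) for a CIDR such as 10.96.0.0/12.
cidr_range() {
    local ip="${1%/*}" len="${1#*/}" base mask start end
    base=$(ip_to_int "$ip")
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    start=$(( base & mask ))
    end=$(( start | (~mask & 0xFFFFFFFF) ))
    echo "$start $end"
}

# Return 0 if the two CIDR ranges share at least one address.
cidrs_overlap() {
    # After this line: $1/$2 = start/end of the first range, $3/$4 = second range.
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

if cidrs_overlap 10.96.0.0/12 10.244.0.0/16; then
    echo "ranges overlap"
else
    echo "ranges do not overlap"
fi
```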
Apply the manifest
[root@master-all ~]# kubectl apply -f calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
Wait a few minutes, then check the node and network status.
[root@master-all ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-all Ready master 12m v1.18.1
[root@master-all ~]# kubectl get pod -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-75d56dfc47-6hxnj 1/1 Running 0 10m 10.109.177.70 master-all <none> <none>
kube-system calico-node-tpz8m 1/1 Running 1 10m 192.168.230.202 master-all <none> <none>
kube-system coredns-546565776c-4pdj9 1/1 Running 2 12m 10.109.177.71 master-all <none> <none>
kube-system coredns-546565776c-6bcw8 1/1 Running 2 12m 10.109.177.72 master-all <none> <none>
kube-system etcd-master-all 1/1 Running 2 12m 192.168.230.202 master-all <none> <none>
kube-system kube-apiserver-master-all 1/1 Running 2 12m 192.168.230.202 master-all <none> <none>
kube-system kube-controller-manager-master-all 1/1 Running 1 12m 192.168.230.202 master-all <none> <none>
kube-system kube-proxy-ln9kx 1/1 Running 2 12m 192.168.230.202 master-all <none> <none>
kube-system kube-scheduler-master-all 1/1 Running 2 12m 192.168.230.202 master-all <none> <none>
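Worker nodes
For the environment setup on worker nodes, see Basic environment configuration above.
Get the token
The token created by a fresh master installation is valid for 24 hours. If it has expired, generate a new one with kubeadm token create.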
[root@cloud-master ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
abcdef.0123456789abcdef 21h 2023-03-23T12:49:21+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
[root@cloud-master ~]# kubeadm token create
ifh484.gycxuxm7lykgcgsc
Get the hash
Compute the hash of the CA certificate's public key with the following command:
[root@cloud-master ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
75cd976ed97df824b12a7bdb2675d719a2f786da365e3187439b7adb1c8d560c
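Join the node to the master
<master-ip>: IP address of the master node
<master-port>: API server port on the master node
<token>: token generated on the master node
<hash>: hash value generated on the master node
Once you have the token and the hash, run the following command: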
kubeadm join <master-ip>:<master-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>
[root@Node-1 ~]# kubeadm join 192.168.230.202:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:d3cfcef4ced07d6f648074a15026599db6671a780f9032ec425c3fa75fce8bda
W0411 18:15:25.273383 1583 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
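When the join command is assembled by hand from its four parts, a quick format check catches copy-paste mistakes before kubeadm runs. This is a sketch; the helper names are hypothetical, and the checks only validate the shape of the token (6 characters, a dot, 16 characters) and of the SHA-256 hash, not whether they are valid for the cluster.

```shell
# Return 0 if $1 has the bootstrap token shape: [a-z0-9]{6}.[a-z0-9]{16}
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Return 0 if $1 is a 64-character lowercase hex string (a SHA-256 digest).
is_sha256_hex() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{64}$'
}

# Print the kubeadm join command for the given master IP, port, token and hash.
build_join_command() {
    local ip="$1" port="$2" token="$3" hash="$4"
    is_bootstrap_token "$token" || { echo "invalid token: $token" >&2; return 1; }
    is_sha256_hex "$hash" || { echo "invalid hash: $hash" >&2; return 1; }
    printf 'kubeadm join %s:%s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
        "$ip" "$port" "$token" "$hash"
}

# Alternatively, running the following on the master prints a ready-made join command:
#   kubeadm token create --print-join-command
```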
Copy the kubeconfig file
[root@Node-1 ~]# mkdir -p $HOME/.kube
[root@Node-1 ~]# cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config
Verify
[root@Node-1 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-all Ready master 16m v1.18.1
node-1 Ready <none> 113s v1.18.1
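For clusters with more than a couple of nodes, the same check can be scripted by filtering the kubectl get nodes output. This is a sketch; not_ready_nodes is a hypothetical helper, and it treats any STATUS other than exactly Ready (including combined states such as Ready,SchedulingDisabled) as not ready.

```shell
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose STATUS column is not exactly "Ready". The header line is skipped.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage:
#   kubectl get nodes | not_ready_nodes
```

An empty result means every node reports Ready.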
Problems encountered
Error:
failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
Solution:
[root@master-all ~]# sed -i 's/\(.*native.cgroupdriver=\)systemd\(.*\)/\1cgroupfs\2/' /lib/systemd/system/docker.service
[root@master-all ~]# systemctl daemon-reload
[root@master-all ~]# systemctl restart docker
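To confirm the two drivers agree after the change, they can be compared directly. The sketch below assumes the kubelet driver is set through the cgroupDriver field of a KubeletConfiguration file (kubeadm writes one to /var/lib/kubelet/config.yaml); in some setups it is passed as a --cgroup-driver flag instead, which this sketch does not cover. Both function names are hypothetical.

```shell
# Print the cgroupDriver value from a kubelet configuration file;
# prints nothing if the key is absent.
kubelet_cgroup_driver() {
    awk -F: '$1 == "cgroupDriver" { gsub(/[ \t"]/, "", $2); print $2 }' "$1"
}

# Compare the Docker driver ($1) with the kubelet driver ($2).
drivers_match() {
    if [ "$1" = "$2" ]; then
        echo "cgroup drivers match: $1"
    else
        echo "cgroup driver mismatch: docker=$1 kubelet=$2" >&2
        return 1
    fi
}

# Usage:
#   drivers_match "$(docker info -f '{{.CgroupDriver}}')" \
#                 "$(kubelet_cgroup_driver /var/lib/kubelet/config.yaml)"
```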
Error:
[root@master-all ~]# kubectl get pod
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Solution:
[root@master-all ~]# mkdir -p $HOME/.kube
[root@master-all ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# If you are not root, also chown the copied file to your user
Error:
[root@Node-1 ~]# kubeadm join 192.168.230.101:6443 --token n3qdmn.btk974cnv1bfskbt \
> --discovery-token-ca-cert-hash sha256:859476404c5ed23aef3d78e3b8fc09684dde7fbf8ce0ad446774162ccec2e0d1
W0319 09:28:32.847414 24070 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution:
[root@Node-1 ~]# vim /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
[root@Node-1 ~]# sysctl -p /etc/sysctl.d/k8s.conf
docker info fails to run
[root@cloud-master ~]# kubeadm config print init-defaults > kubeadm-init.yaml
W0322 11:02:47.598862 1997 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': exit status 1
Solution:
[root@cloud-master ~]# docker info
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
[root@cloud-master ~]# systemctl start docker
Nodes in NotReady state
All nodes report NotReady, the coredns pods are Pending, and the kubelet log contains errors.
[root@cloud-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
cloud-master NotReady control-plane,master 3h11m v1.20.15
cloud-node1 NotReady <none> 49s v1.20.15
cloud-node2 NotReady <none> 46s v1.20.15
[root@cloud-master ~]# kubectl get pod -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-54d67798b7-r8bcx 0/1 Pending 0 3h11m <none> <none> <none> <none>
kube-system coredns-54d67798b7-tvgdh 0/1 Pending 0 3h11m <none> <none> <none> <none>
kube-system etcd-cloud-master 1/1 Running 1 3h11m 192.168.230.201 cloud-master <none> <none>
kube-system kube-apiserver-cloud-master 1/1 Running 1 3h11m 192.168.230.201 cloud-master <none> <none>
kube-system kube-controller-manager-cloud-master 1/1 Running 1 3h11m 192.168.230.201 cloud-master <none> <none>
kube-system kube-proxy-hm2gr 1/1 Running 0 102s 192.168.230.203 cloud-node2 <none> <none>
kube-system kube-proxy-lsddr 1/1 Running 1 3h11m 192.168.230.201 cloud-master <none> <none>
kube-system kube-proxy-npdg6 1/1 Running 0 105s 192.168.230.202 cloud-node1 <none> <none>
kube-system kube-scheduler-cloud-master 1/1 Running 1 3h11m 192.168.230.201 cloud-master <none> <none>
[root@cloud-master ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Wed 2023-03-22 13:00:54 CST; 3h 1min ago
Docs: https://kubernetes.io/docs/
Main PID: 981 (kubelet)
CGroup: /system.slice/kubelet.service
└─981 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_c...
Mar 22 16:01:31 cloud-master kubelet[981]: W0322 16:01:31.649411 981 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 22 16:01:33 cloud-master kubelet[981]: E0322 16:01:33.763948 981 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Solution: install a network plugin such as Flannel, Calico, Weave Net, Cilium, Kube-router, Romana, Canal, Contiv, or Nuage Networks VSP.
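The kubelet message "no networks found in /etc/cni/net.d" suggests a quick way to tell whether a network plugin has written its configuration yet. This is a sketch; cni_config_present is a hypothetical helper, and the file extensions it looks for are an assumption about how the installed plugin names its configuration files.

```shell
# Return 0 if the directory contains at least one CNI configuration file.
# $1 = directory to inspect (defaults to /etc/cni/net.d).
cni_config_present() {
    local dir="${1:-/etc/cni/net.d}" f
    for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
        # An unmatched glob stays literal, so test that the path really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Usage:
#   cni_config_present || echo "no CNI config yet: install a network plugin"
```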