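1 Preface {#1-%E5%89%8D%E8%A8%80}
With DataEase releasing a new version every month, more and more users are trying out its features. Many of them have suggested not only feature improvements but also improvements to the deployment options.
At first there was only one deployment option, local mode. This later evolved into three: simple mode, local mode, and cluster mode.
This article describes the problems encountered with the key components of a cluster deployment, along with their solutions.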
2 MySQL Deployment {#2-mysql-%E9%83%A8%E7%BD%B2}
2.1 Installation error: Public key for mysql-community-libs-compat-5.7.37-1.el7.x86_64.rpm is not installed {#2.1-%E5%AE%89%E8%A3%85%E6%8A%A5%E9%94%99%EF%BC%9Apublic-key-for-mysql-community-libs-compat-5.7.37-1.el7.x86_64.rpm-is-not-installed}
Cause: MySQL's GPG key has been updated, so the new key must be imported.
Solution:
rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022
# Then rerun the installation
yum install mysql-server
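2.2 Status check error: Slave_IO_Running: Connecting {#2.2-%E6%9F%A5%E7%9C%8B%E7%8A%B6%E6%80%81%E6%8A%A5%E9%94%99%EF%BC%9Aslave_io_running%EF%BC%9Aconnecting}
Cause:
- The network is unreachable.
- The firewall port is not open.
Solution:
- Network unreachable: use the ping command to check whether replies come back from the master.
- Firewall port not open: open the port MySQL uses for replication (3306 by default).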
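The two checks above can be sketched as a short script. This is a minimal sketch rather than the article's own procedure: it assumes firewalld is the active firewall and that the master listens on the default port 3306. The helper names `slave_io_state`, `slave_io_error`, and `open_tcp_port` are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not part of MySQL.

# Print the value of Slave_IO_Running from `SHOW SLAVE STATUS\G` output on stdin.
slave_io_state() {
  sed -n 's/^ *Slave_IO_Running: *//p' | head -n 1
}

# Print Last_IO_Error from the same output; it usually names the real cause.
slave_io_error() {
  sed -n 's/^ *Last_IO_Error: *//p' | head -n 1
}

# Open a TCP port with firewalld (assumption: firewalld is in use; run as root).
open_tcp_port() {
  local port="${1:-3306}"
  firewall-cmd --permanent --add-port="${port}/tcp" && firewall-cmd --reload
}

# Usage on the slave (MASTER_HOST is a placeholder):
#   ping -c 3 "$MASTER_HOST"
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | slave_io_state
# Usage on the master, as root:
#   open_tcp_port 3306
```

If `slave_io_state` still prints `Connecting` after the port is open, the `Last_IO_Error` field may point to a different cause, such as wrong replication credentials.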
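2.3 Command error: dependency check failed {#2.3-%E6%89%A7%E8%A1%8C%E5%91%BD%E4%BB%A4%E6%8A%A5%E9%94%99%EF%BC%9A%E4%BE%9D%E8%B5%96%E6%A3%80%E6%B5%8B%E5%A4%B1%E8%B4%A5}
Cause: rpm detected that the package's dependencies are not satisfied.
Solution: skip the dependency check and force the installation.
# Original command
rpm -ivh mysql-community-libs-5.7.12-1.el6.x86_64.rpm
# New command, with the dependency check skipped
rpm -ivh mysql-community-libs-5.7.12-1.el6.x86_64.rpm --force --nodeps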
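Forcing the installation with `--nodeps` leaves the missing dependencies unresolved. As a hedged alternative, which is not the method used in this article, yum can be asked to resolve them. The sketch below assumes the host can reach a yum repository that provides the dependencies. `install_local_rpm` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: install a local RPM and let yum pull in its dependencies.
install_local_rpm() {
  local rpm_file="$1"
  if [ ! -f "$rpm_file" ]; then
    echo "RPM file not found: $rpm_file" >&2
    return 1
  fi
  # List what the package file requires, to see what the check complained about.
  rpm -qpR "$rpm_file"
  # yum resolves and installs the dependencies together with the package.
  yum localinstall -y "$rpm_file"
}

# Usage:
#   install_local_rpm mysql-community-libs-5.7.12-1.el6.x86_64.rpm
```

If no repository is reachable, the forced installation shown above remains the fallback.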
2.4 MySQL fails to start with the error Failed to start SYSV:MySQL database server... {#2.4-%E5%90%AF%E5%8A%A8-mysql-%E5%A4%B1%E8%B4%A5%EF%BC%8C%E6%8A%A5%E9%94%99-failed-to-start-sysv%EF%BC%9Amysql-database-server%E2%80%A6}
Cause: the runtime directory has not been created, or the mysql user does not own it.
Solution:
mkdir -p /var/run/mysqld/
chown mysql:mysql /var/run/mysqld/
3 Redis Deployment {#3-redis-%E9%83%A8%E7%BD%B2}
3.1 Deployment error: -bash: docker-compose: command not found {#3.1-%E9%83%A8%E7%BD%B2%E6%8A%A5%E9%94%99%EF%BC%9A-bash%3A-docker-compose%3A-command-not-found}
Cause: docker-compose is not installed.
Solution:
# Check whether pip3 is already installed
pip3 -V
# Install pip3
yum -y install epel-release
yum -y install python3-pip
# Upgrade pip
pip3 install --upgrade pip -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
# Install Docker Compose
pip3 install docker-compose -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
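After the installation, it may help to confirm that the commands are actually on the PATH before retrying the deployment. The following is a minimal sketch. `have_cmd` and `report_tool` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for verifying that a command-line tool is available.

# Return 0 if the named command can be found on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print whether a tool is installed; return non-zero when it is missing.
report_tool() {
  if have_cmd "$1"; then
    echo "$1: installed"
  else
    echo "$1: missing"
    return 1
  fi
}

# Usage:
#   report_tool pip3
#   report_tool docker-compose && docker-compose --version
```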
3.2 Deployment error: docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory')) {#3.2-%E9%83%A8%E7%BD%B2%E6%8A%A5%E9%94%99%EF%BC%9Adocker.errors.dockerexception%3A-error-while-fetching-server-api-version%3A-(%E2%80%98connection-aborted.%E2%80%99%2C-filenotfounderror(2%2C-%E2%80%98no-such-file-or-directory%E2%80%99))}
Cause: Docker is not running, or Docker is not installed.
Solution:
# Start Docker
systemctl start docker
# Check the Docker process
ps -ef | grep docker
# Run the deployment again
docker-compose up -d
# Install Docker (if it is not installed)
wget https://download.docker.com/linux/static/stable/x86_64/docker-18.06.3-ce.tgz && tar -zxvf docker-18.06.3-ce.tgz && cp docker/* /usr/bin/
# Create a docker.service file under /etc/systemd/system/ with the following content, which registers Docker as a systemd service
vi /etc/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd --selinux-enabled=false --insecure-registry=127.0.0.1
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
# Set permissions on the Docker unit file, reload systemd, then start Docker and enable it at boot
chmod +x /etc/systemd/system/docker.service && systemctl daemon-reload && systemctl start docker && systemctl enable docker.service && systemctl status docker
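The error in 3.2 means the Docker client could not find the daemon socket, so the first step is to tell "not installed" apart from "not running". The sketch below does that. It assumes the default socket path `/var/run/docker.sock`, and `docker_state` is a hypothetical helper name. A stale socket file could make it report `running` incorrectly, so treat the result as a hint.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report why the Docker daemon may be unreachable.
#   $1 = daemon binary name (default: dockerd)
#   $2 = daemon socket path (default: /var/run/docker.sock, an assumption)
docker_state() {
  local bin="${1:-dockerd}"
  local sock="${2:-/var/run/docker.sock}"
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "not-installed"
  elif [ ! -S "$sock" ]; then
    echo "not-running"
  else
    echo "running"
  fi
}

# Usage:
#   case "$(docker_state)" in
#     not-installed) echo "Install Docker first" ;;
#     not-running)   systemctl start docker ;;
#     running)       docker-compose up -d ;;
#   esac
```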
4 Kettle Deployment {#4-kettle-%E9%83%A8%E7%BD%B2}
4.1 Running mount -a fails with: mount.nfs: Connection refused {#4.1-%E6%89%A7%E8%A1%8C-mount-a-%E5%91%BD%E4%BB%A4%E6%8A%A5%E9%94%99%EF%BC%9Amount.nsf-connection-refused}
Cause: the ports are unreachable, so the client cannot connect to ports 111 and 2049 on the NFS server.
Solution:
Either turn off the firewall on the NFS server, or keep it running and open ports 111 and 2049.
# Check the firewall status
systemctl status firewalld
service iptables status
# Stop the firewall temporarily
systemctl stop firewalld
service iptables stop
# Disable the firewall permanently
systemctl disable firewalld
chkconfig iptables off
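Before turning the firewall off entirely, it can help to confirm which of the two ports is actually blocked, and then open only those ports. The sketch below assumes bash (for `/dev/tcp`), the coreutils `timeout` command, and firewalld on the NFS server. The function names are hypothetical. Depending on the NFS version, additional ports (for example the mountd port) may also be needed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for diagnosing NFS port reachability.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Probe the two ports named in the error cause; return 1 if either fails.
check_nfs_ports() {
  local host="$1" port rc=0
  for port in 111 2049; do
    if port_open "$host" "$port"; then
      echo "port ${port}: reachable"
    else
      echo "port ${port}: blocked or closed"
      rc=1
    fi
  done
  return "$rc"
}

# On the NFS server, as root: open only the two ports and keep the firewall on.
open_nfs_ports() {
  firewall-cmd --permanent --add-port=111/tcp --add-port=2049/tcp &&
    firewall-cmd --reload
}

# Usage on the client (NFS_SERVER_IP is a placeholder):
#   check_nfs_ports "$NFS_SERVER_IP"
```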
5 Keepalived Deployment {#5-keepalived-%E9%83%A8%E7%BD%B2}
5.1 Deployment error: ERROR: cannot verify www.keepalived.org's certificate, issued by '/C=US/O=Let's Encrypt/CN=R3':Issued certificate has expired. To connect to www.keepalived.org insecurely, use `--no-check-certificate' {#5.1-%E9%83%A8%E7%BD%B2%E6%8A%A5%E9%94%99%EF%BC%9Aerror%3A-cannot-verify-www.keepalived.org%E2%80%99s-certificate%2C-issued-by-%E2%80%98%2Fc%3Dus%2Fo%3Dlet%E2%80%99s-encrypt%2Fcn%3Dr3%E2%80%99%3Aissued-certificate-has-expired.-to-connect-to-www.keepalived.org-insecurely%2C-use-%60%E2%80%93no-check-certificate%E2%80%99}
Cause: requests to an HTTPS site verify the site's certificate, and here the verification fails, most likely because the local CA certificate bundle is out of date.
Solution:
# Install or update the CA certificate bundle so that verification succeeds (this does not skip the check)
yum install -y ca-certificates