High Availability
Common Patroni and etcd Commands
On every server configured to run the etcd service, start the service as root:
systemctl start etcd
To start the etcd service automatically at boot, run the following command as root:
systemctl enable etcd
Check the etcd running status:
$HALO_BASE/product/shield/etcd/v3.5.2/etcdctl endpoint status --cluster=true -w table
Add an etcd node:
etcdctl member add etcd_node4 --peer-urls="http://192.168.63.138:2380"
Remove an etcd node (ID is the member ID shown by etcdctl member list):
etcdctl member remove ID
On all primary and standby servers, start the patroni service as root:
systemctl start patroni
To start the patroni service automatically at boot, run the following command as root:
systemctl enable patroni
List the primary and standby nodes:
patronictl list
Pause automatic failover:
patronictl pause
Resume automatic failover:
patronictl resume
Perform a manual switchover:
patronictl switchover
Master [node1]:
Candidate ['node2', 'node3'] []: node2
When should the switchover take place (e.g. 2022-05-10T13:57 ) [now]:
Current cluster topology
Restart one of the nodes:
patronictl restart halo-cluster node3
When should the restart take place (e.g. 2022-05-10T13:58) [now]:
Are you sure you want to restart members node3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2) []:
Success: restart on member node3
Change the IP address of the VIP by editing the callback script:
vi /u01/app/halo/product/shield/patroni/scripts/patroni_callback.sh
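Adding a Patroni Replica Node
Add a replica node to the high-availability environment
Set up streaming replication for the new replica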
pg_basebackup -F p -X stream -v -P -h 192.168.63.134 -p 1921 -U replica -D /data/halo -R -C --slot d
Configure the Patroni service on the new replica node
[root@D ~]# su - halo -c "/u01/app/halo/product/shield/patroni/conf/patroni_config.sh"
Starting to run patroni configuration setup
Please set HALO_BASE, HALO_HOME, PGDATA environment variables before proceed
Press y/Y to continue, any other key to cancel
y
Please input IP list where etcd cluster will be running
e.g. 192.168.1.1,192.168.1.2,192.168.1.3
192.168.63.134,192.168.63.135,192.168.63.136,192.168.63.138
Input current node name
Default: D
Input VIP for the HA cluster
192.168.63.137
192.168.63.137 will be used as VIP
Input network interface to bind the VIP
Default: ens33
ens33 will be used as network interface
Input VIP netmask
Default: 255.255.255.0
255.255.255.0 will be used as VIP netmask
Input VIP broadcast address
Default: 192.168.63.255
192.168.63.255 will be used as VIP broadcast address
Initialize python ...
Python initializing done.
Initializing done.
Following steps to be done manually.
1. Add the following line into the end of .bash_profile of user halo
export PATH=/u01/app/halo/product/shield/patroni/python/bin:\$PATH
export PATRONICTL_CONFIG_FILE=/u01/app/halo/product/shield/patroni/conf/patroni_halo.yml
2. Run following commands as root
ln -s /u01/app/halo/product/shield/patroni/conf/patroni.service /usr/lib/systemd/system/patroni.service
3. Add the following line to /etc/sudoers
halo ALL=(ALL) NOPASSWD: /usr/sbin/ip, /usr/bin/arping, /usr/sbin/iptables
Start the Patroni service as root
systemctl start patroni
Check the Patroni status from a primary or replica node
[root@D ~]# su - halo -c "patronictl list"
+--------------+---------------------+---------+---------+----+-----------+
| Member       | Host                | Role    | State   | TL | Lag in MB |
+ Cluster: halo-cluster (7278223077580441725) -+---------+----+-----------+
| D            | 192.168.63.138:1921 | Replica | running |  8 |         0 |
| halo_node431 | 192.168.63.134:1921 | Leader  | running |  8 |           |
| halo_node768 | 192.168.63.135:1921 | Replica | running |  8 |         0 |
| halo_node864 | 192.168.63.136:1921 | Replica | running |  8 |         0 |
+--------------+---------------------+---------+---------+----+-----------+
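When scripting around this output, it can help to extract the current leader's member name. The following is a minimal sketch, assuming the table layout shown above (member name in the first column, role in the third); find_leader is a hypothetical helper name, not part of Patroni.

```shell
# Print the member name of every row whose Role column contains "Leader".
# Reads `patronictl list` table output on stdin.
find_leader() {
    # Split on "|": field 2 is Member, field 4 is Role.
    awk -F'|' '$4 ~ /Leader/ { gsub(/ /, "", $2); print $2 }'
}

# Usage (hypothetical): su - halo -c "patronictl list" | find_leader
```

Parsing a human-readable table is fragile across versions, so treat this as a convenience for interactive checks rather than a stable interface.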
Removing a Patroni Replica Node
systemctl stop patroni    # simply stop the Patroni service on that node
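Adding and Removing etcd Cluster Nodes
For leader election to survive the loss of a node, an etcd cluster needs at least three members, and the member count should be odd. etcd can generally be deployed on the same servers as the Halo database.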
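The preference for an odd member count follows from majority quorum arithmetic: a cluster of N voting members needs floor(N/2) + 1 of them to agree, so it tolerates N minus that many failures. A small sketch of the arithmetic (the function names are illustrative only):

```shell
# Quorum arithmetic for a cluster of N voting members.
# quorum = floor(N/2) + 1; tolerated failures = N - quorum.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare cluster sizes 1 through 5.
for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Three and four members both tolerate exactly one failure, which is why growing from three to four adds a node without adding resilience.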
Add a new node to the existing cluster
[halo@A ~]$ etcdctl member add etcd_node4 --peer-urls="http://192.168.63.138:2380"
Member 733ab0a74571fc23 added to cluster 5a516dc7fa11b364
ETCD_NAME="etcd_node4"
ETCD_INITIAL_CLUSTER="etcd_node4=http://192.168.63.138:2380,etcd_node1=http://192.168.63.134:2380,etcd_node3=http://192.168.63.136:2380,etcd_node2=http://192.168.63.135:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.63.138:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
The etcd configuration file on the new node must include the settings printed above.
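The ETCD_INITIAL_CLUSTER value is a comma-separated list of name=peer-URL pairs. As a rough sketch, it could be assembled from name=ip pairs as below; build_initial_cluster is a hypothetical helper, and it assumes every peer uses http on port 2380 as in this cluster.

```shell
# Build an ETCD_INITIAL_CLUSTER value from "name=ip" arguments.
# Assumption: all peers listen on http://<ip>:2380.
build_initial_cluster() (
    result=""
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first "="
        ip=${pair#*=}       # text after the first "="
        entry="${name}=http://${ip}:2380"
        if [ -z "$result" ]; then
            result=$entry
        else
            result="${result},${entry}"
        fi
    done
    printf '%s\n' "$result"
)

build_initial_cluster \
    etcd_node1=192.168.63.134 \
    etcd_node2=192.168.63.135 \
    etcd_node3=192.168.63.136 \
    etcd_node4=192.168.63.138
```

The order of entries does not have to match the order printed by etcdctl member add, but the set of names and URLs should.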
View the current cluster membership
[halo@A ~]$ etcdctl member list --write-out=table
+------------------+-----------+------------+----------------------------+----------------------------+------------+
|        ID        |  STATUS   |    NAME    |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+-----------+------------+----------------------------+----------------------------+------------+
| 733ab0a74571fc23 | unstarted |            | http://192.168.63.138:2380 |                            |      false |
| a0220f1a96be1741 | started   | etcd_node1 | http://192.168.63.134:2380 | http://192.168.63.134:2379 |      false |
| b159f239465eb878 | started   | etcd_node3 | http://192.168.63.136:2380 | http://192.168.63.136:2379 |      false |
| ff561bac97ec17fc | started   | etcd_node2 | http://192.168.63.135:2380 | http://192.168.63.135:2379 |      false |
+------------------+-----------+------------+----------------------------+----------------------------+------------+
Configure etcd on the new node
[root@D conf]# su - halo -c "/u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd_config.sh"
Starting to run etcd configuration setup
Please set HALO_BASE environment variables before proceed
Press y/Y to continue, any other key to cancel
y
Please input IP list where etcd cluster will be running
e.g. 192.168.1.1,192.168.1.2,192.168.1.3
192.168.63.134,192.168.63.135,192.168.63.136,192.168.63.138
192.168.63.138 will be used as node IP
Initializing done.
Following steps to be done manually.
Run following commands as root
ln -s /u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd.service /usr/lib/systemd/system/etcd.service
Edit the etcd configuration file on the new node
[root@D conf]# cat /u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd.conf
ETCD_DATA_DIR="/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node4.etcd"
ETCD_LISTEN_PEER_URLS="http://192.168.63.138:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.63.138:2379,http://127.0.0.1:2379"
ETCD_NAME="etcd_node4"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.63.138:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.63.138:2379"
ETCD_INITIAL_CLUSTER="etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380"
ETCD_INITIAL_CLUSTER_TOKEN="halo-etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"    # must be changed to "existing"
# ETCD_ENABLE_V2="true"
ETCD_LOG_OUTPUTS="/u01/app/halo/product/shield/etcd/v3.5.2/logs/etcd.log"
ETCD_LOG_LEVEL="warn"
ETCD_ENABLE_LOG_ROTATION="true"
ETCD_LOG_ROTATION_CONFIG_JSON='{"maxsize": 50, "maxage": 10, "maxbackups": 0, "localtime": false, "compress": true}'
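The change to ETCD_INITIAL_CLUSTER_STATE can be made in an editor, or scripted. One possible sketch follows; set_cluster_state_existing is a hypothetical helper, and it assumes the setting sits on its own line starting with ETCD_INITIAL_CLUSTER_STATE= as in the file shown above.

```shell
# Rewrite ETCD_INITIAL_CLUSTER_STATE to "existing" in an etcd.conf-style file.
# Writes to a temporary file first, so a failed sed leaves the original intact.
set_cluster_state_existing() (
    conf=$1
    tmp="${conf}.tmp.$$"
    sed 's/^ETCD_INITIAL_CLUSTER_STATE=.*/ETCD_INITIAL_CLUSTER_STATE="existing"/' "$conf" > "$tmp" \
        && mv "$tmp" "$conf"
)

# Usage (run as the file's owner):
#   set_cluster_state_existing /u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd.conf
```

Because the file is replaced rather than edited in place, it is worth confirming its ownership and permissions afterwards.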
Start the etcd service on the new node
systemctl start etcd
Check the cluster membership again
[halo@A ~]$ etcdctl member list --write-out=table
+------------------+---------+------------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |    NAME    |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+------------+----------------------------+----------------------------+------------+
| 733ab0a74571fc23 | started | etcd_node4 | http://192.168.63.138:2380 | http://192.168.63.138:2379 |      false |
| a0220f1a96be1741 | started | etcd_node1 | http://192.168.63.134:2380 | http://192.168.63.134:2379 |      false |
| b159f239465eb878 | started | etcd_node3 | http://192.168.63.136:2380 | http://192.168.63.136:2379 |      false |
| ff561bac97ec17fc | started | etcd_node2 | http://192.168.63.135:2380 | http://192.168.63.135:2379 |      false |
+------------------+---------+------------+----------------------------+----------------------------+------------+
Remove an existing etcd node
[halo@A ~]$ etcdctl member remove b159f239465eb878
Member b159f239465eb878 removed from cluster 5a516dc7fa11b364
[halo@A ~]$ etcdctl member list --write-out=table
+------------------+---------+------------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |    NAME    |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+------------+----------------------------+----------------------------+------------+
| 733ab0a74571fc23 | started | etcd_node4 | http://192.168.63.138:2380 | http://192.168.63.138:2379 |      false |
| a0220f1a96be1741 | started | etcd_node1 | http://192.168.63.134:2380 | http://192.168.63.134:2379 |      false |
| ff561bac97ec17fc | started | etcd_node2 | http://192.168.63.135:2380 | http://192.168.63.135:2379 |      false |
+------------------+---------+------------+----------------------------+----------------------------+------------+
etcd Backup and Restore
Back up the etcd cluster
etcdctl --endpoints="http://192.168.63.134:2379" --debug snapshot save /tmp/etcd-snapshot-`date +%Y%m%d`.db
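For a scheduled backup, the dated file name can be factored into a small helper. This is a sketch only: snapshot_path and backup_etcd are hypothetical names, and the /tmp default mirrors the command above rather than a recommended location.

```shell
# Print the snapshot path for a directory and an optional date stamp (YYYYMMDD).
# With no date argument, today's date is used.
snapshot_path() {
    dir=$1
    stamp=${2:-$(date +%Y%m%d)}
    printf '%s/etcd-snapshot-%s.db\n' "$dir" "$stamp"
}

# Save a snapshot from one endpoint; fails with a message if etcdctl is absent.
backup_etcd() {
    endpoint=$1
    target=$(snapshot_path "${2:-/tmp}")
    if command -v etcdctl >/dev/null 2>&1; then
        etcdctl --endpoints="$endpoint" snapshot save "$target"
    else
        echo "etcdctl not found in PATH; would have saved to $target" >&2
        return 1
    fi
}

# Usage: backup_etcd http://192.168.63.134:2379 /tmp
```

Keeping the path logic separate makes it easy to reuse the same name when copying the file to other nodes.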
Restore from a backup
Preparation before restoring
Stop the etcd service on every node in the cluster
systemctl stop etcd
Move the existing data out of the etcd data directory on each node
[halo@A data]$ cd /u01/app/halo/product/shield/etcd/v3.5.2/data
[halo@A data]$ mv etcd_node1.etcd etcd_node1.etcd.bak
Copy the etcd backup file to the other nodes
scp etcd-snapshot-20230918.db root@192.168.63.135:/tmp/
scp etcd-snapshot-20230918.db root@192.168.63.138:/tmp/
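Restore the backup on each node, using the same --initial-cluster and --initial-cluster-token values everywhere and that node's own --name, --initial-advertise-peer-urls, and --data-dir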
etcdctl snapshot restore /tmp/etcd-snapshot-20230918.db \
--name etcd_node1 \
--initial-cluster "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380" \
--initial-cluster-token halo-etcd-cluster \
--initial-advertise-peer-urls http://192.168.63.134:2380 \
--data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node1.etcd
etcdctl snapshot restore /tmp/etcd-snapshot-20230918.db \
--name etcd_node2 \
--initial-cluster "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380" \
--initial-cluster-token halo-etcd-cluster \
--initial-advertise-peer-urls http://192.168.63.135:2380 \
--data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node2.etcd
etcdctl snapshot restore /tmp/etcd-snapshot-20230918.db \
--name etcd_node4 \
--initial-cluster "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380" \
--initial-cluster-token halo-etcd-cluster \
--initial-advertise-peer-urls http://192.168.63.138:2380 \
--data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node4.etcd
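Since the three restore commands differ only in node name and IP, one way to avoid copy-and-paste slips is to generate them from a template and review them before running. The sketch below only prints the command; print_restore_cmd is a hypothetical helper, and the token and data directory are taken from the commands above.

```shell
# Print (not run) the restore command for one node.
# Arguments: node name, node IP, snapshot file, initial-cluster string.
print_restore_cmd() {
    printf 'etcdctl snapshot restore %s --name %s --initial-cluster "%s" --initial-cluster-token halo-etcd-cluster --initial-advertise-peer-urls http://%s:2380 --data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/%s.etcd\n' \
        "$3" "$1" "$4" "$2" "$1"
}

# Example: print the command for etcd_node2 with a two-member cluster string.
print_restore_cmd etcd_node2 192.168.63.135 /tmp/etcd-snapshot-20230918.db \
    "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380"
```

Passing the same initial-cluster string to every call keeps that value identical across nodes, which the restored members rely on to recognise each other.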
After the restore completes, start the etcd service on each node in turn
systemctl start etcd
Check the etcd cluster health
etcdctl --endpoints=http://192.168.63.134:2379,http://192.168.63.135:2379,http://192.168.63.138:2379 endpoint health
http://192.168.63.134:2379 is healthy: successfully committed proposal: took = 9.491216ms
http://192.168.63.138:2379 is healthy: successfully committed proposal: took = 9.750314ms
http://192.168.63.135:2379 is healthy: successfully committed proposal: took = 9.46672ms
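In a script, the health output can be reduced to a pass or fail result by counting the lines that report healthy. A minimal sketch, assuming the line format shown above; check_health is a hypothetical helper, and depending on the etcdctl version the output may need to be captured with 2>&1.

```shell
# Read `etcdctl endpoint health` output on stdin and succeed only if at
# least the expected number of endpoints report healthy.
check_health() {
    expected=$1
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
    healthy=$(grep -c ' is healthy' || true)
    echo "healthy endpoints: $healthy of $expected expected"
    [ "$healthy" -ge "$expected" ]
}

# Usage (hypothetical):
#   etcdctl --endpoints=... endpoint health 2>&1 | check_health 3
```

The leading space in the pattern keeps lines ending in "is unhealthy" from being counted.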
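Note: when backing up the etcd cluster, a snapshot taken from a single etcd member is sufficient; when restoring, restore every node from that same snapshot file.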