Version: 1.0.14

High Availability

Common Patroni + etcd Commands

On every server where the etcd service is configured, start the service as root:

systemctl start etcd

To have the etcd service start automatically at boot, run the following command as root:

systemctl enable etcd

Check the etcd running status:

$HALO_BASE/product/shield/etcd/v3.5.2/etcdctl endpoint status --cluster=true -w table

Add an etcd member:

etcdctl member add etcd_node4 --peer-urls="http://192.168.63.138:2380"

Remove an etcd member (ID is the member ID shown by etcdctl member list):

etcdctl member remove ID
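Since the remove command expects a member ID rather than a name, a small lookup helper can reduce copy-and-paste mistakes. The sketch below is illustrative only: the function name member_id_by_name is hypothetical, and it assumes the default comma-separated output of etcdctl member list (ID, status, name, peer URLs, client URLs, learner flag).

```shell
# Print the ID of the etcd member whose NAME column equals $1.
# Reads the default (comma-separated) `etcdctl member list` output on stdin.
# Assumption: fields are separated by ", " and the name is the third field.
member_id_by_name() {
    awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Example usage (commented out so nothing is removed by accident):
# id=$(etcdctl member list | member_id_by_name etcd_node3)
# [ -n "$id" ] && etcdctl member remove "$id"
```

Checking that the ID is non-empty before calling member remove avoids running the command with a missing argument when the name is misspelled.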

On all primary and standby servers, start the patroni service as root:

systemctl start patroni

To have the patroni service start automatically at boot, run the following command as root:

systemctl enable patroni

List the primary and standby nodes:

patronictl list

Pause automatic primary/standby failover:

patronictl pause

Resume automatic primary/standby failover:

patronictl resume
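Pause and resume are normally used as a pair around maintenance work. The following is a minimal sketch of a wrapper that pauses failover, runs a command, and then resumes regardless of whether the command succeeded. The function name with_failover_paused and the PATRONICTL override variable are assumptions made for illustration, not part of Patroni.

```shell
# Command used to talk to Patroni; can be overridden for dry runs (assumption).
PATRONICTL="${PATRONICTL:-patronictl}"

# Pause automatic failover, run the given command, then always resume.
# Returns the exit status of the wrapped command.
with_failover_paused() {
    "$PATRONICTL" pause || return 1
    "$@"
    status=$?
    "$PATRONICTL" resume
    return "$status"
}

# Example usage (commented out):
# with_failover_paused systemctl restart some-maintenance-task
```

Resuming unconditionally keeps the cluster from being left in paused mode after a failed maintenance step.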

Perform a manual switchover:

patronictl switchover

Master [node1]:

Candidate ['node2', 'node3'] []: node2

When should the switchover take place (e.g. 2022-05-10T13:57 ) [now]:

Current cluster topology

Restart one of the nodes:

patronictl restart halo-cluster node3
When should the restart take place (e.g. 2022-05-10T13:58) [now]:
Are you sure you want to restart members node3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2) []:
Success: restart on member node3

Change the VIP address:
vi /u01/app/halo/product/shield/patroni/scripts/patroni_callback.sh

Adding a Patroni Replica Node

Add a replica node to the high-availability environment

Set up streaming replication for the new replica

pg_basebackup -F p -X stream -v -P -h 192.168.63.134 -p 1921 -U replica -D /data/halo -R -C --slot d
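The base backup command above packs several options into one line. As a reading aid, the sketch below prints the same command from its parts and annotates each option in comments. The function name basebackup_cmd is hypothetical, and the option descriptions assume the bundled pg_basebackup behaves like the standard PostgreSQL tool.

```shell
# Print (without running) a pg_basebackup command for a new replica.
# Arguments: primary host, port, replication user, data directory, slot name.
basebackup_cmd() {
    # -F p       plain (directory) output format
    # -X stream  stream WAL while the backup is running
    # -v -P      verbose output with progress reporting
    # -R         write standby (replication) settings into the data directory
    # -C --slot  create the named replication slot on the primary and use it
    printf 'pg_basebackup -F p -X stream -v -P -h %s -p %s -U %s -D %s -R -C --slot %s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the command used in this guide.
basebackup_cmd 192.168.63.134 1921 replica /data/halo d
```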

Configure the Patroni service on the new replica node

[root@D ~]# su - halo -c "/u01/app/halo/product/shield/patroni/conf/patroni_config.sh"

Starting to run patroni configuration setup

Please set HALO_BASE, HALO_HOME, PGDATA environment variables before proceed

Press y/Y to continue, any other key to cancel

y

Please input IP list where etcd cluster will be running

e.g. 192.168.1.1,192.168.1.2,192.168.1.3

192.168.63.134,192.168.63.135,192.168.63.136,192.168.63.138

Input current node name

Default: D

Input VIP for the HA cluster

192.168.63.137

192.168.63.137 will be used as VIP

Input network interface to bind the VIP

Default: ens33

ens33 will be used as network interface

Input VIP netmask

Default: 255.255.255.0

255.255.255.0 will be used as VIP netmask

Input VIP broadcast address

Default: 192.168.63.255

192.168.63.255 will be used as VIP broadcast address

Initialize python ...

Python initializing done.

Initializing done.

Following steps to be done manually.

1. Add the following line into the end of .bash_profile of user halo

export PATH=/u01/app/halo/product/shield/patroni/python/bin:$PATH

export PATRONICTL_CONFIG_FILE=/u01/app/halo/product/shield/patroni/conf/patroni_halo.yml

2. Run following commands as root

ln -s /u01/app/halo/product/shield/patroni/conf/patroni.service /usr/lib/systemd/system/patroni.service

3. Add the following line to /etc/sudoers

halo ALL=(ALL) NOPASSWD: /usr/sbin/ip, /usr/bin/arping, /usr/sbin/iptables

Start the Patroni service as root

systemctl start patroni

Check the Patroni status from a primary or replica node

[root@D ~]# su - halo -c "patronictl list"

+--------------+---------------------+---------+---------+----+-----------+
| Member       | Host                | Role    | State   | TL | Lag in MB |
+ Cluster: halo-cluster (7278223077580441725) -+---------+----+-----------+
| D            | 192.168.63.138:1921 | Replica | running |  8 |         0 |
| halo_node431 | 192.168.63.134:1921 | Leader  | running |  8 |           |
| halo_node768 | 192.168.63.135:1921 | Replica | running |  8 |         0 |
| halo_node864 | 192.168.63.136:1921 | Replica | running |  8 |         0 |
+--------------+---------------------+---------+---------+----+-----------+
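When the status listing is consumed by a script, for example to confirm which node is the leader after adding a replica, the table can be parsed with awk. This is a rough sketch only: the function name leader_from_list is hypothetical, and it assumes the pipe-delimited table layout shown above, where Role is the third column.

```shell
# Print the member name of the row whose Role column is "Leader".
# Reads `patronictl list` table output on stdin.
# Assumption: columns are delimited by "|" in the order Member, Host, Role, ...
leader_from_list() {
    awk -F'|' '{
        role = $4; gsub(/ /, "", role)
        if (role == "Leader") {
            name = $2; gsub(/ /, "", name)
            print name
        }
    }'
}

# Example usage (commented out):
# su - halo -c "patronictl list" | leader_from_list
```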

Removing a Patroni Replica Node

systemctl stop patroni --simply stop the Patroni service on that node

Adding and Removing etcd Cluster Nodes

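For leader election to work reliably, an etcd cluster needs at least three members, and the member count should be odd. etcd can usually be deployed on the same servers as the Halo database.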
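The preference for an odd member count follows from majority arithmetic: a cluster of n members needs n/2 + 1 (integer division) members to agree, so a fourth member raises the quorum without tolerating any more failures than three. A small illustrative calculation (the function names are hypothetical):

```shell
# Number of members that must agree for a cluster of $1 members.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while the cluster keeps a majority.
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5; do
    echo "members=$n quorum=$(quorum_size "$n") tolerated_failures=$(fault_tolerance "$n")"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```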

Add a new member to the existing cluster

[halo@A ~]$ etcdctl member add etcd_node4 --peer-urls="http://192.168.63.138:2380"

Member 733ab0a74571fc23 added to cluster 5a516dc7fa11b364

ETCD_NAME="etcd_node4"

ETCD_INITIAL_CLUSTER="etcd_node4=http://192.168.63.138:2380,etcd_node1=http://192.168.63.134:2380,etcd_node3=http://192.168.63.136:2380,etcd_node2=http://192.168.63.135:2380"

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.63.138:2380"

ETCD_INITIAL_CLUSTER_STATE="existing"

The etcd configuration file on the new node must include the output above.

View the current cluster information

[halo@A ~]$ etcdctl member list --write-out=table

+------------------+-----------+------------+----------------------------+----------------------------+------------+
| ID               | STATUS    | NAME       | PEER ADDRS                 | CLIENT ADDRS               | IS LEARNER |
+------------------+-----------+------------+----------------------------+----------------------------+------------+
| 733ab0a74571fc23 | unstarted |            | http://192.168.63.138:2380 |                            | false      |
| a0220f1a96be1741 | started   | etcd_node1 | http://192.168.63.134:2380 | http://192.168.63.134:2379 | false      |
| b159f239465eb878 | started   | etcd_node3 | http://192.168.63.136:2380 | http://192.168.63.136:2379 | false      |
| ff561bac97ec17fc | started   | etcd_node2 | http://192.168.63.135:2380 | http://192.168.63.135:2379 | false      |
+------------------+-----------+------------+----------------------------+----------------------------+------------+

Configure etcd on the new node

[root@D conf]# su - halo -c "/u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd_config.sh"

Starting to run etcd configuration setup

Please set HALO_BASE environment variables before proceed

Press y/Y to continue, any other key to cancel

y

Please input IP list where etcd cluster will be running

e.g. 192.168.1.1,192.168.1.2,192.168.1.3

192.168.63.134,192.168.63.135,192.168.63.136,192.168.63.138

192.168.63.138 will be used as node IP

Initializing done.

Following steps to be done manually.

Run following commands as root

ln -s /u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd.service /usr/lib/systemd/system/etcd.service

Edit the etcd configuration file on the new node

[root@D conf]# cat /u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd.conf

ETCD_DATA_DIR="/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node4.etcd"

ETCD_LISTEN_PEER_URLS="http://192.168.63.138:2380"

ETCD_LISTEN_CLIENT_URLS="http://192.168.63.138:2379,http://127.0.0.1:2379"

ETCD_NAME="etcd_node4"

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.63.138:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://192.168.63.138:2379"

ETCD_INITIAL_CLUSTER="etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380"

ETCD_INITIAL_CLUSTER_TOKEN="halo-etcd-cluster"

ETCD_INITIAL_CLUSTER_STATE="new" --this setting must be changed to "existing"

# ETCD_ENABLE_V2="true"

ETCD_LOG_OUTPUTS="/u01/app/halo/product/shield/etcd/v3.5.2/logs/etcd.log"

ETCD_LOG_LEVEL="warn"

ETCD_ENABLE_LOG_ROTATION="true"

ETCD_LOG_ROTATION_CONFIG_JSON='{"maxsize": 50, "maxage": 10, "maxbackups": 0, "localtime": false, "compress": true}'
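The generated configuration defaults to a cluster state of "new", which has to become "existing" before the node joins a running cluster. One way to make that change without opening an editor is sketched below. The function name set_cluster_state_existing is hypothetical, and the sketch assumes the setting sits on its own line starting with ETCD_INITIAL_CLUSTER_STATE=.

```shell
# Rewrite ETCD_INITIAL_CLUSTER_STATE to "existing" in the given config file.
# Writes through a temporary file and copies the result back, so the
# original file keeps its owner and permissions.
set_cluster_state_existing() {
    conf="$1"
    tmp="$conf.tmp.$$"
    sed 's/^ETCD_INITIAL_CLUSTER_STATE=.*/ETCD_INITIAL_CLUSTER_STATE="existing"/' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example usage (commented out):
# set_cluster_state_existing /u01/app/halo/product/shield/etcd/v3.5.2/conf/etcd.conf
```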

Start the etcd service on the new node

systemctl start etcd

Check the cluster status again

[halo@A ~]$ etcdctl member list --write-out=table

+------------------+---------+------------+----------------------------+----------------------------+------------+
| ID               | STATUS  | NAME       | PEER ADDRS                 | CLIENT ADDRS               | IS LEARNER |
+------------------+---------+------------+----------------------------+----------------------------+------------+
| 733ab0a74571fc23 | started | etcd_node4 | http://192.168.63.138:2380 | http://192.168.63.138:2379 | false      |
| a0220f1a96be1741 | started | etcd_node1 | http://192.168.63.134:2380 | http://192.168.63.134:2379 | false      |
| b159f239465eb878 | started | etcd_node3 | http://192.168.63.136:2380 | http://192.168.63.136:2379 | false      |
| ff561bac97ec17fc | started | etcd_node2 | http://192.168.63.135:2380 | http://192.168.63.135:2379 | false      |
+------------------+---------+------------+----------------------------+----------------------------+------------+

Remove the original etcd node

[halo@A ~]$ etcdctl member remove b159f239465eb878

Member b159f239465eb878 removed from cluster 5a516dc7fa11b364

[halo@A ~]$ etcdctl member list --write-out=table

+------------------+---------+------------+----------------------------+----------------------------+------------+
| ID               | STATUS  | NAME       | PEER ADDRS                 | CLIENT ADDRS               | IS LEARNER |
+------------------+---------+------------+----------------------------+----------------------------+------------+
| 733ab0a74571fc23 | started | etcd_node4 | http://192.168.63.138:2380 | http://192.168.63.138:2379 | false      |
| a0220f1a96be1741 | started | etcd_node1 | http://192.168.63.134:2380 | http://192.168.63.134:2379 | false      |
| ff561bac97ec17fc | started | etcd_node2 | http://192.168.63.135:2380 | http://192.168.63.135:2379 | false      |
+------------------+---------+------------+----------------------------+----------------------------+------------+

etcd Backup and Restore

Back up the etcd cluster

etcdctl --endpoints="http://192.168.63.134:2379" --debug snapshot save /tmp/etcd-snapshot-`date +%Y%m%d`.db
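For scheduled backups, the date-stamped file name and the save command can be wrapped in small functions. The sketch below is one possible arrangement, not a prescribed procedure. The function names snapshot_path and backup_etcd are hypothetical, and etcdctl is assumed to be on the PATH.

```shell
# Build the snapshot file name used in this guide.
# $1 = target directory, $2 = date stamp (defaults to today, YYYYMMDD).
snapshot_path() {
    echo "$1/etcd-snapshot-${2:-$(date +%Y%m%d)}.db"
}

# Save a snapshot from one endpoint and print the resulting file name.
# $1 = client endpoint URL, $2 = target directory.
backup_etcd() {
    file=$(snapshot_path "$2") || return 1
    etcdctl --endpoints="$1" snapshot save "$file" && echo "$file"
}

# Example usage (commented out):
# backup_etcd http://192.168.63.134:2379 /tmp
```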

Restore from backup

Preparation before restoring
Stop the etcd service on every node in the cluster

systemctl stop etcd

Move aside the existing data in the etcd data directory on each node
[halo@A data]$ cd /u01/app/halo/product/shield/etcd/v3.5.2/data
[halo@A data]$ mv etcd_node1.etcd etcd_node1.etcd.bak

Copy the etcd backup file to the other nodes
scp etcd-snapshot-20230918.db root@192.168.63.135:/tmp/
scp etcd-snapshot-20230918.db root@192.168.63.138:/tmp/

Restore the backup on each node
etcdctl snapshot restore /tmp/etcd-snapshot-20230918.db \
--name etcd_node1 \
--initial-cluster "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380" \
--initial-cluster-token halo-etcd-cluster \
--initial-advertise-peer-urls http://192.168.63.134:2380 \
--data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node1.etcd

etcdctl snapshot restore /tmp/etcd-snapshot-20230918.db \
--name etcd_node2 \
--initial-cluster "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380" \
--initial-cluster-token halo-etcd-cluster \
--initial-advertise-peer-urls http://192.168.63.135:2380 \
--data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node2.etcd

etcdctl snapshot restore /tmp/etcd-snapshot-20230918.db \
--name etcd_node4 \
--initial-cluster "etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380" \
--initial-cluster-token halo-etcd-cluster \
--initial-advertise-peer-urls http://192.168.63.138:2380 \
--data-dir=/u01/app/halo/product/shield/etcd/v3.5.2/data/etcd_node4.etcd
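The restore commands differ only in the member name, the advertised peer URL, and the data directory. To reduce copy-and-paste mistakes, they can be generated from a short list of name and IP pairs. The sketch below only prints the commands for review and does not run them. The function name restore_cmd and the DATA_ROOT variable are hypothetical, and the initial-cluster string is copied unchanged from the commands above.

```shell
# Directory that holds each member's data (path taken from the steps above).
DATA_ROOT=/u01/app/halo/product/shield/etcd/v3.5.2/data

# Print (without running) the restore command for one member.
# $1 = snapshot file, $2 = member name, $3 = member IP, $4 = initial-cluster string
restore_cmd() {
    printf 'etcdctl snapshot restore %s --name %s --initial-cluster "%s" --initial-cluster-token halo-etcd-cluster --initial-advertise-peer-urls http://%s:2380 --data-dir=%s/%s.etcd\n' \
        "$1" "$2" "$4" "$3" "$DATA_ROOT" "$2"
}

CLUSTER="etcd_node1=http://192.168.63.134:2380,etcd_node2=http://192.168.63.135:2380,etcd_node3=http://192.168.63.136:2380,etcd_node4=http://192.168.63.138:2380"

# One line per node: member name and IP.
while read -r name ip; do
    restore_cmd /tmp/etcd-snapshot-20230918.db "$name" "$ip" "$CLUSTER"
done <<EOF
etcd_node1 192.168.63.134
etcd_node2 192.168.63.135
etcd_node4 192.168.63.138
EOF
```

Each printed command is meant to be run on the node it names, since the data directory is local to that node.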

After the restore completes, start the etcd service on each node in turn
systemctl start etcd

Check the etcd cluster status
etcdctl --endpoints=http://192.168.63.134:2379,http://192.168.63.135:2379,http://192.168.63.138:2379 endpoint health
http://192.168.63.134:2379 is healthy: successfully committed proposal: took = 9.491216ms
http://192.168.63.138:2379 is healthy: successfully committed proposal: took = 9.750314ms
http://192.168.63.135:2379 is healthy: successfully committed proposal: took = 9.46672ms
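In a script, the health output can be reduced to a pass or fail result by counting the endpoints reported as healthy. This is a minimal sketch that assumes the "is healthy" wording shown in the output above. The function names count_healthy and all_healthy are hypothetical.

```shell
# Count the endpoints reported as healthy in `etcdctl endpoint health` output.
# Reads the output on stdin.
count_healthy() {
    grep -c 'is healthy'
}

# Succeed only if exactly $1 endpoints are healthy.
all_healthy() {
    [ "$(count_healthy)" -eq "$1" ]
}

# Example usage (commented out; 2>&1 is used in case messages go to stderr):
# etcdctl --endpoints=http://192.168.63.134:2379,http://192.168.63.135:2379,http://192.168.63.138:2379 \
#     endpoint health 2>&1 | all_healthy 3 && echo "cluster OK"
```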

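Note: when backing up an etcd cluster, a snapshot taken from a single etcd member is sufficient. When restoring, use that same snapshot file on every node.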