搭建pgreplication集群

作者：wen-1225 | 来源：互联网 | 2023-09-25 18:59

[1]https:wiki.postgresql.orgwikiBinary_Replication_Tutorial#5_Minutes_to_Simple_Replicatio

[1] https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial#5_Minutes_to_Simple_Replication

[2] https://wiki.postgresql.org/wiki/Hot_Standby

[3] postgresql96 remark

[4] https://www.postgresql.org/docs/13/runtime-config-replication.html, 背下来

hot standby

本节参考[2].

primary:

initdb -D /mnt/sdb/pg13data/12345 # vi postgresql.conf port = 12345 wal_level = hot_standby # In releases prior to 9.6, this parameter also allowed the values archive and hot_standby. These are still accepted but mapped to replica. -- [PostgreSQL: Documentation: 14: 20.5. Write Ahead Log](https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL) archive_mode = on archive_command = 'cp -i %p /mnt/sdb/pg13data/12345-archive/%f' # -i, 如果真有重复会怎样todo max_wal_senders = 3 mkdir /mnt/sdb/pg13data/12345-archive pg_ctl -D /mnt/sdb/pg13data/12345 -l /mnt/sdb/pg13data/12345.log start

standby, 在同一机器:

# https://www.postgresql.org/docs/13/app-pgbasebackup.html pg_basebackup -D /mnt/sdb/pg13data/12345-standby -w -R --wal-method=stream --dbname="host=127.0.0.1 user=wang port=12345" # vi postgresql.conf hot_standby = on port = 12346 # standby_mode = 'on' # The standby_mode setting has been removed. A standby.signal file in the data directory is used instead. See Standby Server Operation for details. -- [PostgreSQL: Documentation: 13: O.1. recovery.conf file merged into postgresql.conf](https://www.postgresql.org/docs/13/recovery-config.html) restore_command = 'cp -i /mnt/sdb/pg13data/12345-archive/%f %p' pg_ctl -D /mnt/sdb/pg13data/12345-standby/ -l /mnt/sdb/pg13data/12345-standby.log start psql -p 12346 postgres postgres=# select datid,pid,application_name,client_addr,wait_event_type,wait_event,state,query from pg_stat_activity where application_name <> 'psql'; datid | pid | application_name | client_addr | wait_event_type | wait_event | state | query -------+-------+------------------+-------------+-----------------+-------------------+-------+------- | 15927 | | | Activity | RecoveryWalStream | | | 15932 | | | Activity | BgWriterHibernate | | | 15930 | | | Activity | CheckpointerMain | | | 15937 | | | Activity | WalReceiverMain | | (4 rows)

standby log:

2022-02-09 17:50:59.764 CST [14490] LOG: database system is shut down 2022-02-09 17:51:06.168 CST [15314] LOG: starting PostgreSQL 13.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit 2022-02-09 17:51:06.169 CST [15314] LOG: listening on IPv6 address "::1", port 12346 2022-02-09 17:51:06.169 CST [15314] LOG: listening on IPv4 address "127.0.0.1", port 12346 2022-02-09 17:51:06.176 CST [15314] LOG: listening on Unix socket "/tmp/.s.PGSQL.12346" 2022-02-09 17:51:06.183 CST [15315] LOG: database system was shut down in recovery at 2022-02-09 17:50:59 CST cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/00000002.history’: No such file or directory 2022-02-09 17:51:06.186 CST [15315] LOG: entering standby mode 2022-02-09 17:51:06.198 CST [15315] LOG: restored log file "000000010000000000000004" from archive 2022-02-09 17:51:06.236 CST [15315] LOG: consistent recovery state reached at 0/40000A0 2022-02-09 17:51:06.236 CST [15315] LOG: redo starts at 0/40000A0 2022-02-09 17:51:06.237 CST [15314] LOG: database system is ready to accept read only connections 2022-02-09 17:51:06.248 CST [15315] LOG: restored log file "000000010000000000000005" from archive cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/000000010000000000000006’: No such file or directory 2022-02-09 17:51:06.270 CST [15315] LOG: invalid record length at 0/6000060: wanted 24, got 0 2022-02-09 17:51:06.274 CST [15326] LOG: started streaming WAL from primary at 0/6000000 on timeline 1

warm standby

Hot Standby is identical to Warm Standby, except that the Standby is available to run read-only queries. This offers all of the advantages of Warm Standby, plus the ability to distribute some business workload to the Standby server(s). Hot Standby by itself requires Log Shipping.[1]

standby设置:

hot_standby = off

这个选项需要重启standby. reload不管用.

log:

2022-04-19 11:07:15.187 CST [30472] LOG: received SIGHUP, reloading configuration files 2022-04-19 11:07:15.189 CST [30472] LOG: parameter "hot_standby" cannot be changed without restarting the server

warm standby无法连接, 提示错误:

$ psql -p 12346 postgres psql: error: FATAL: the database system is starting up

测试standby不在线一段时间后, 再启动standby

先把standby停止.

primary设置, 并reload, 否则会等待standby确认:

synchronous_commit = on # synchronization level; synchronous_standby_names = '' # standby servers that provide sync rep

pg_ctl -D 12345 -l 12345.log reload pgbench -p12345 -i -s10 postgres pgbench -p12345 -c2 -j1 -T180 -P1 postgres

可以看到有wal被archive了

$ ll 12345-archive/ | wc 402 3611 28501 $ ll 12345-archive/ | wc 403 3620 28572 $ ll 12345-archive/ | wc 406 3647 28785

启动standby: pg_ctl -D 12345-standby -l 12345-standby.log restart

standby.log

2022-04-19 11:28:31.250 CST [30877] LOG: listening on Unix socket "/tmp/.s.PGSQL.12346" 2022-04-19 11:28:31.267 CST [30878] LOG: database system was shut down in recovery at 2022-04-19 11:17:11 CST cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/00000002.history’: No such file or directory 2022-04-19 11:28:31.283 CST [30878] LOG: entering standby mode 2022-04-19 11:28:31.314 CST [30878] LOG: restored log file "000000010000000100000082" from archive 2022-04-19 11:28:31.332 CST [30878] DEBUG: got WAL segment from archive 2022-04-19 11:28:31.332 CST [30878] DEBUG: checkpoint record is at 1/82000028 2022-04-19 11:28:31.332 CST [30878] DEBUG: redo record is at 1/82000028; shutdown true 2022-04-19 11:28:31.333 CST [30878] DEBUG: next transaction ID: 61904; next OID: 29111 2022-04-19 11:28:31.333 CST [30878] DEBUG: next MultiXactId: 1; next MultiXactOffset: 0 2022-04-19 11:28:31.333 CST [30878] DEBUG: oldest unfrozen transaction ID: 478, in database 1 2022-04-19 11:28:31.333 CST [30878] DEBUG: oldest MultiXactId: 1, in database 1 2022-04-19 11:28:31.333 CST [30878] DEBUG: commit timestamp Xid oldest/newest: 0/0 2022-04-19 11:28:31.333 CST [30878] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1 2022-04-19 11:28:31.333 CST [30878] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1 2022-04-19 11:28:31.333 CST [30878] DEBUG: starting up replication slots 2022-04-19 11:28:31.340 CST [30878] DEBUG: resetting unlogged relations: cleanup 1 init 0 2022-04-19 11:28:31.341 CST [30878] LOG: consistent recovery state reached at 1/820000A0 2022-04-19 11:28:31.341 CST [30878] LOG: redo starts at 1/820000A0 2022-04-19 11:28:31.367 CST [30878] LOG: restored log file "000000010000000100000083" from archive 2022-04-19 11:28:31.385 CST [30878] DEBUG: got WAL segment from archive 2022-04-19 11:28:31.385 CST [30878] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1 2022-04-19 11:28:31.385 CST [30878] CONTEXT: WAL redo at 1/83000028 for XLOG/CHECKPOINT_SHUTDOWN: redo 1/83000028; tli 1; prev tli 1; fpw true; xid 0:61904; oid 29111; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown 2022-04-19 11:28:31.415 CST [30878] LOG: restored log file "000000010000000100000084" from archive 2022-04-19 11:28:31.434 CST [30878] DEBUG: got WAL segment from archive 2022-04-19 11:28:31.434 CST [30878] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1 2022-04-19 11:28:31.434 CST [30878] CONTEXT: WAL redo at 1/84000028 for XLOG/CHECKPOINT_SHUTDOWN: redo 1/84000028; tli 1; prev tli 1; fpw true; xid 0:61904; oid 29111; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown 2022-04-19 11:28:31.602 CST [30878] LOG: restored log file "000000010000000100000085" from archive 2022-04-19 11:28:31.613 CST [30878] DEBUG: got WAL segment from archive ...此处省略N个restored log file... 2022-04-19 11:28:34.070 CST [30878] LOG: restored log file "000000010000000100000094" from archive 2022-04-19 11:28:34.098 CST [30878] DEBUG: got WAL segment from archive cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/000000010000000100000095’: No such file or directory 2022-04-19 11:28:34.364 CST [30906] LOG: started streaming WAL from primary at 1/95000000 on timeline 1 2022-04-19 11:33:31.603 CST [30882] DEBUG: performing replication slot checkpoint

restore_command, archive_cleanup_command

如果主备不在同一天机器, 应该使用scp等命令. 或其他方式可以让standby访问primary的archive.

primary:

-archive_command = 'cp -i %p /mnt/sdb/pg13data/12345-archive/%f' +archive_command = 'scp %p vm101:/mnt/sdb/pg13data/12345-archive/%f'

standby:

restore_command = 'cp -i /mnt/sdb/pg13data/12345-archive/%f %p' +archive_cleanup_command = 'pg_archivecleanup /mnt/sdb/pg13data/12345-archive %r' # command to execute at every restartpoint

standby启动后, 会先找archive, 再找pg_wal, 再启用流复制.

archive由standby来清理. 如果有多个standby, 不能这么做, 因为各个standby同步进度不一样.

pg_archivecleanup is designed to be used as an archive_cleanup_command to clean up WAL file archives when running as a standby server (see Section 26.2).

-- PostgreSQL: Documentation: 13: pg_archivecleanup

The pg_archivecleanup utility is designed specifically to be used with archive_cleanup_command in typical single-standby configurations.

-- PostgreSQL: Documentation: 13: 26.2. Log-Shipping Standby Servers

replication slot

https://www.postgresql.org/docs/13/view-pg-replication-slots.html

https://www.postgresql.org/docs/13/warm-standby.html#STREAMING-REPLICATION-SLOTS

primary创建slot:

standby配置, reload:

primary_slot_name = 'slot1'

primary:

使用slot,在stadnby不在线时, wal不会被archive

-rw-------. 1 fang fang 16777216 Apr 20 14:35 0000000100000001000000DD -rw-------. 1 fang fang 16777216 Apr 20 14:36 0000000100000001000000DE -rw-------. 1 fang fang 16777216 Apr 20 14:36 0000000100000001000000DF -rw-------. 1 fang fang 16777216 Apr 20 14:37 0000000100000001000000E0 -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E1 -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E2 -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E3 -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E4 drwx------. 2 fang fang 4096 Apr 20 14:39 archive_status -rw-------. 1 fang fang 16777216 Apr 20 14:42 0000000100000001000000E5 $ ll -lrt 12345/pg_wal/ | wc 39 344 2717

把standby起来后, 等待同步完成, 在primary做一次checkpoint, 就可以把wal归档:

-rw-------. 1 fang fang 16777216 Apr 20 14:29 0000000100000001000000EC -rw-------. 1 fang fang 16777216 Apr 20 14:30 0000000100000001000000E9 -rw-------. 1 fang fang 16777216 Apr 20 14:33 0000000100000001000000E7 -rw-------. 1 fang fang 16777216 Apr 20 14:34 0000000100000001000000EB -rw-------. 1 fang fang 16777216 Apr 20 14:34 0000000100000001000000F0 -rw-------. 1 fang fang 16777216 Apr 20 14:35 0000000100000001000000F1 -rw-------. 1 fang fang 16777216 Apr 20 14:35 0000000100000001000000EF -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000ED -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000F4 -rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E8 -rw-------. 1 fang fang 16777216 Apr 20 15:56 0000000100000001000000F6 drwx------. 2 fang fang 4096 Apr 20 15:58 archive_status -rw-------. 1 fang fang 16777216 Apr 20 15:58 0000000100000001000000E6 $ ll -lrt 12345/pg_wal/ | wc 20 173 1368

restartpoint

The checkpointer process and the background writer process are active during recovery. The checkpointer process will perform restartpoints (similar to checkpoints on the primary) and the background writer process will perform normal block cleaning activities. This can include updates of the hint bit information stored on the standby server. The CHECKPOINT command is accepted during recovery, though it performs a restartpoint rather than a new checkpoint.

-- PostgreSQL: Documentation: 13: 26.5. Hot Standby

If executed during recovery, the CHECKPOINT command will force a restartpoint (see Section 29.4) rather than writing a new checkpoint.

-- PostgreSQL: Documentation: 13: CHECKPOINT

In archive recovery or standby mode, the server periodically performs restartpoints, which are similar to checkpoints in normal operation: the server forces all its state to disk, updates the pg_control file to indicate that the already-processed WAL data need not be scanned again, and then recycles any old log segment files in the pg_wal directory. Restartpoints can't be performed more frequently than checkpoints in the master because restartpoints can only be performed at checkpoint records. A restartpoint is triggered when a checkpoint record is reached if at least checkpoint_timeout seconds have passed since the last restartpoint, or if WAL size is about to exceed max_wal_size. However, because of limitations on when a restartpoint can be performed, max_wal_size is often exceeded during recovery, by up to one checkpoint cycle's worth of WAL. (max_wal_size is never a hard limit anyway, so you should always leave plenty of headroom to avoid running out of disk space.)

-- PostgreSQL: Documentation: 13: 29.4. WAL Configuration

Continuous Archiving in Standby

https://www.postgresql.org/docs/13/warm-standby.html#CONTINUOUS-ARCHIVING-IN-STANDBY

sync and async replication

[3]p248

If synchronous_standby_names is empty, the only meaningful settings are on and off; remote_apply, remote_write and local all provide the same local synchronization level as on.

-- PostgreSQL: Documentation: 13: 19.5. Write Ahead Log

todo

synchronous_standby_names = '*',到底是什么意思? 任何一个standby确认了就行? 代码搜sync_standby_priority, SyncRepStandbyNames, SyncRepGetCandidateStandbys. SyncRepInitConfig-->SyncRepGetStandbyPriority; 看了代码,大致逻辑是,根据_names来确定priority,第一个是1,第二个是2,..., replication walsender启动时,会检查自己的名字在第一个,即确定自己的priority. 如果_names是'*', 那么所有walsender的priority都是1. 找不到自己的名字, 0. 解析_names还有一个专门的grammar: syncrep_gram.y.

The special entry * matches any standby name.

-- PostgreSQL: Documentation: 13: 19.6. Replication

推荐阅读

replace
PTArchiver工作原理详解与应用分析

PTArchiver工作原理及其应用分析本文详细解析了PTArchiver的工作机制，探讨了其在数据归档和管理中的应用。PTArchiver通过高效的压缩算法和灵活的存储策略，实现了对大规模数据的高效管理和长期保存。文章还介绍了其在企业级数据备份、历史数据迁移等场景中的实际应用案例，为用户提供了实用的操作建议和技术支持。 ... [详细]

蜡笔小新 2024-11-11 13:40:49
string
IOS Run loop详解

为什么80%的码农都做不了架构师？转自http:blog.csdn.netztp800201articledetails9240913感谢作者分享Objecti ... [详细]

蜡笔小新 2024-11-13 12:14:35
string
《程序员的数学》：一本适合初学者的数学指南

结城浩（1963年7月出生），日本资深程序员和技术作家，居住在东京武藏野市。他开发了著名的YukiWiki软件，并在杂志上发表了大量程序入门文章和技术翻译作品。结城浩著有30多本关于编程和数学的书籍，其中许多被翻译成英文和韩文。 ... [详细]

蜡笔小新 2024-11-12 21:47:50
char
网站访问全流程解析

本文详细介绍了从用户在浏览器中输入一个域名（如www.yy.com）到页面完全展示的整个过程，包括DNS解析、TCP连接、请求响应等多个步骤。 ... [详细]

蜡笔小新 2024-11-12 18:13:16
char
Pythonmysql数据库

importpymysql#一、直接连接mysql数据库'''coonpymysql.connect(host'192.168.*.*',u ... [详细]

蜡笔小新 2024-11-12 16:51:59
web
从0到1搭建大数据平台

从0到1搭建大数据平台 ... [详细]

蜡笔小新 2024-11-12 15:26:03
replace
javascript分页类支持页码格式

前端时间因为项目需要，要对一个产品下所有的附属图片进行分页显示，没考虑ajax一张张请求，所以干脆一次性全部把图片out，然 ... [详细]

蜡笔小新 2024-11-12 14:58:57
web
探讨HTTP隧道技术在RDP暴力破解中的应用

本文介绍了如何利用HTTP隧道技术在受限网络环境中绕过IDS和防火墙等安全设备，实现RDP端口的暴力破解攻击。文章详细描述了部署过程、攻击实施及流量分析，旨在提升网络安全意识。 ... [详细]

蜡笔小新 2024-11-12 12:08:47
char
php更新数据库字段的函数是,php更新数据库字段的函数是

php更新数据库字段的函数是,php更新数据库字段的函数是 ... [详细]

蜡笔小新 2024-11-12 11:37:31
bash
CentOS 7 中 iptables 过滤表实例与 NAT 表应用详解

在 CentOS 7 系统中，iptables 的过滤表和 NAT 表具有重要的应用价值。本文通过具体实例详细介绍了如何配置 iptables 的过滤表，包括编写脚本文件 `/usr/local/sbin/iptables.sh`，并使用 `iptables -F` 清空现有规则。此外，还深入探讨了 NAT 表的配置方法，帮助读者更好地理解和应用这些网络防火墙技术。 ... [详细]

蜡笔小新 2024-11-11 18:33:22
char
彻底清除数据库中的所有用户表（包括删除相关表结构）

在处理数据库中所有用户表的彻底清除时，目前尚未发现单一命令能够实现这一目标。因此，需要采用一种较为繁琐的方法来逐个删除相关表及其结构。具体操作可以通过编写PL/SQL脚本来实现，该脚本将动态生成并执行删除表的SQL语句。尽管这种方法相对复杂，但在缺乏更简便手段的情况下，仍是一种有效的解决方案。未来或许可以通过数据库管理工具或更高版本的数据库系统提供更简洁的处理方式。 ... [详细]

蜡笔小新 2024-11-11 18:30:49
web
技术分享：使用 Flask、AngularJS 和 Jinja2 构建高效前后端交互系统

技术分享：使用 Flask、AngularJS 和 Jinja2 构建高效前后端交互系统 ... [详细]

蜡笔小新 2024-11-11 15:24:24
string
您的数据库配置是否安全？DBSAT工具助您一臂之力！

本文探讨了Oracle提供的免费工具DBSAT，该工具能够有效协助用户检测和优化数据库配置的安全性。通过全面的分析和报告，DBSAT帮助用户识别潜在的安全漏洞，并提供针对性的改进建议，确保数据库系统的稳定性和安全性。 ... [详细]

蜡笔小新 2024-11-11 14:44:47
char
如何将TS文件转换为M3U8直播流：HLS与M3U8格式详解

在视频传输领域，MP4虽然常见，但在直播场景中直接使用MP4格式存在诸多问题。例如，MP4文件的头部信息（如ftyp、moov）较大，导致初始加载时间较长，影响用户体验。相比之下，HLS（HTTP Live Streaming）协议及其M3U8格式更具优势。HLS通过将视频切分成多个小片段，并生成一个M3U8播放列表文件，实现低延迟和高稳定性。本文详细介绍了如何将TS文件转换为M3U8直播流，包括技术原理和具体操作步骤，帮助读者更好地理解和应用这一技术。 ... [详细]

蜡笔小新 2024-11-11 12:12:04
string
Keepalived 的健康检查机制与配置方法

Keepalived 提供了多种强大且灵活的后端健康检查机制，包括 HTTP_GET、SSL_GET、TCP_CHECK、SMTP_CHECK 和 MISC_CHECK 等多种检测方法。这些健康检查功能确保了高可用性环境中的服务稳定性和可靠性。通过合理配置这些检查方式，可以有效监测后端服务器的状态，及时发现并处理故障，从而提高系统的整体性能和可用性。 ... [详细]

蜡笔小新 2024-11-09 15:46:26

wen-1225

这个家伙很懒，什么也没留下！

Tags | 热门标签

RankList | 热门文章