热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

搭建pgreplication集群

[1]https:wiki.postgresql.orgwikiBinary_Replication_Tutorial#5_Minutes_to_Simple_Replicatio

[1] https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial#5_Minutes_to_Simple_Replication

[2] https://wiki.postgresql.org/wiki/Hot_Standby

[3] postgresql96 remark

[4] https://www.postgresql.org/docs/13/runtime-config-replication.html, 背下来


hot standby

本节参考[2].

primary:

initdb -D /mnt/sdb/pg13data/12345
# vi postgresql.conf
port = 12345
wal_level = hot_standby # In releases prior to 9.6, this parameter also allowed the values archive and hot_standby. These are still accepted but mapped to replica. -- [PostgreSQL: Documentation: 14: 20.5. Write Ahead Log](https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL)
archive_mode = on
archive_command = 'cp -i %p /mnt/sdb/pg13data/12345-archive/%f' # -i, 如果真有重复会怎样todo
max_wal_senders = 3
mkdir /mnt/sdb/pg13data/12345-archive
pg_ctl -D /mnt/sdb/pg13data/12345 -l /mnt/sdb/pg13data/12345.log start

standby, 在同一机器:

# https://www.postgresql.org/docs/13/app-pgbasebackup.html
pg_basebackup -D /mnt/sdb/pg13data/12345-standby -w -R --wal-method=stream --dbname="host=127.0.0.1 user=wang port=12345"
# vi postgresql.conf
hot_standby = on
port = 12346
# standby_mode = 'on' # The standby_mode setting has been removed. A standby.signal file in the data directory is used instead. See Standby Server Operation for details. -- [PostgreSQL: Documentation: 13: O.1. recovery.conf file merged into postgresql.conf](https://www.postgresql.org/docs/13/recovery-config.html)
restore_command = 'cp -i /mnt/sdb/pg13data/12345-archive/%f %p'
pg_ctl -D /mnt/sdb/pg13data/12345-standby/ -l /mnt/sdb/pg13data/12345-standby.log start
psql -p 12346 postgres
postgres=# select datid,pid,application_name,client_addr,wait_event_type,wait_event,state,query from pg_stat_activity where application_name <> 'psql';
datid | pid | application_name | client_addr | wait_event_type | wait_event | state | query
-------+-------+------------------+-------------+-----------------+-------------------+-------+-------
| 15927 | | | Activity | RecoveryWalStream | |
| 15932 | | | Activity | BgWriterHibernate | |
| 15930 | | | Activity | CheckpointerMain | |
| 15937 | | | Activity | WalReceiverMain | |
(4 rows)

standby log:

2022-02-09 17:50:59.764 CST [14490] LOG: database system is shut down
2022-02-09 17:51:06.168 CST [15314] LOG: starting PostgreSQL 13.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2022-02-09 17:51:06.169 CST [15314] LOG: listening on IPv6 address "::1", port 12346
2022-02-09 17:51:06.169 CST [15314] LOG: listening on IPv4 address "127.0.0.1", port 12346
2022-02-09 17:51:06.176 CST [15314] LOG: listening on Unix socket "/tmp/.s.PGSQL.12346"
2022-02-09 17:51:06.183 CST [15315] LOG: database system was shut down in recovery at 2022-02-09 17:50:59 CST
cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/00000002.history’: No such file or directory
2022-02-09 17:51:06.186 CST [15315] LOG: entering standby mode
2022-02-09 17:51:06.198 CST [15315] LOG: restored log file "000000010000000000000004" from archive
2022-02-09 17:51:06.236 CST [15315] LOG: consistent recovery state reached at 0/40000A0
2022-02-09 17:51:06.236 CST [15315] LOG: redo starts at 0/40000A0
2022-02-09 17:51:06.237 CST [15314] LOG: database system is ready to accept read only connections
2022-02-09 17:51:06.248 CST [15315] LOG: restored log file "000000010000000000000005" from archive
cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/000000010000000000000006’: No such file or directory
2022-02-09 17:51:06.270 CST [15315] LOG: invalid record length at 0/6000060: wanted 24, got 0
2022-02-09 17:51:06.274 CST [15326] LOG: started streaming WAL from primary at 0/6000000 on timeline 1

warm standby


Hot Standby is identical to Warm Standby, except that the Standby is available to run read-only queries. This offers all of the advantages of Warm Standby, plus the ability to distribute some business workload to the Standby server(s). Hot Standby by itself requires Log Shipping.[1]


standby设置:

hot_standby = off

这个选项需要重启standby. reload不管用.

log:

2022-04-19 11:07:15.187 CST [30472] LOG: received SIGHUP, reloading configuration files
2022-04-19 11:07:15.189 CST [30472] LOG: parameter "hot_standby" cannot be changed without restarting the server

warm standby无法连接, 提示错误:

$ psql -p 12346 postgres
psql: error: FATAL: the database system is starting up

测试standby不在线一段时间后, 再启动standby

先把standby停止.

primary设置, 并reload, 否则会等待standby确认:

synchronous_commit = on # synchronization level;
synchronous_standby_names = '' # standby servers that provide sync rep

pg_ctl -D 12345 -l 12345.log reload
pgbench -p12345 -i -s10 postgres
pgbench -p12345 -c2 -j1 -T180 -P1 postgres

可以看到有wal被archive了

$ ll 12345-archive/ | wc
402 3611 28501
$ ll 12345-archive/ | wc
403 3620 28572
$ ll 12345-archive/ | wc
406 3647 28785

启动standby: pg_ctl -D 12345-standby -l 12345-standby.log restart

standby.log

2022-04-19 11:28:31.250 CST [30877] LOG: listening on Unix socket "/tmp/.s.PGSQL.12346"
2022-04-19 11:28:31.267 CST [30878] LOG: database system was shut down in recovery at 2022-04-19 11:17:11 CST
cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/00000002.history’: No such file or directory
2022-04-19 11:28:31.283 CST [30878] LOG: entering standby mode
2022-04-19 11:28:31.314 CST [30878] LOG: restored log file "000000010000000100000082" from archive
2022-04-19 11:28:31.332 CST [30878] DEBUG: got WAL segment from archive
2022-04-19 11:28:31.332 CST [30878] DEBUG: checkpoint record is at 1/82000028
2022-04-19 11:28:31.332 CST [30878] DEBUG: redo record is at 1/82000028; shutdown true
2022-04-19 11:28:31.333 CST [30878] DEBUG: next transaction ID: 61904; next OID: 29111
2022-04-19 11:28:31.333 CST [30878] DEBUG: next MultiXactId: 1; next MultiXactOffset: 0
2022-04-19 11:28:31.333 CST [30878] DEBUG: oldest unfrozen transaction ID: 478, in database 1
2022-04-19 11:28:31.333 CST [30878] DEBUG: oldest MultiXactId: 1, in database 1
2022-04-19 11:28:31.333 CST [30878] DEBUG: commit timestamp Xid oldest/newest: 0/0
2022-04-19 11:28:31.333 CST [30878] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1
2022-04-19 11:28:31.333 CST [30878] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2022-04-19 11:28:31.333 CST [30878] DEBUG: starting up replication slots
2022-04-19 11:28:31.340 CST [30878] DEBUG: resetting unlogged relations: cleanup 1 init 0
2022-04-19 11:28:31.341 CST [30878] LOG: consistent recovery state reached at 1/820000A0
2022-04-19 11:28:31.341 CST [30878] LOG: redo starts at 1/820000A0
2022-04-19 11:28:31.367 CST [30878] LOG: restored log file "000000010000000100000083" from archive
2022-04-19 11:28:31.385 CST [30878] DEBUG: got WAL segment from archive
2022-04-19 11:28:31.385 CST [30878] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1
2022-04-19 11:28:31.385 CST [30878] CONTEXT: WAL redo at 1/83000028 for XLOG/CHECKPOINT_SHUTDOWN: redo 1/83000028; tli 1; prev tli 1; fpw true; xid 0:61904; oid 29111; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown
2022-04-19 11:28:31.415 CST [30878] LOG: restored log file "000000010000000100000084" from archive
2022-04-19 11:28:31.434 CST [30878] DEBUG: got WAL segment from archive
2022-04-19 11:28:31.434 CST [30878] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1
2022-04-19 11:28:31.434 CST [30878] CONTEXT: WAL redo at 1/84000028 for XLOG/CHECKPOINT_SHUTDOWN: redo 1/84000028; tli 1; prev tli 1; fpw true; xid 0:61904; oid 29111; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown
2022-04-19 11:28:31.602 CST [30878] LOG: restored log file "000000010000000100000085" from archive
2022-04-19 11:28:31.613 CST [30878] DEBUG: got WAL segment from archive
...此处省略N个restored log file...
2022-04-19 11:28:34.070 CST [30878] LOG: restored log file "000000010000000100000094" from archive
2022-04-19 11:28:34.098 CST [30878] DEBUG: got WAL segment from archive
cp: cannot stat ‘/mnt/sdb/pg13data/12345-archive/000000010000000100000095’: No such file or directory
2022-04-19 11:28:34.364 CST [30906] LOG: started streaming WAL from primary at 1/95000000 on timeline 1
2022-04-19 11:33:31.603 CST [30882] DEBUG: performing replication slot checkpoint

restore_command, archive_cleanup_command

如果主备不在同一天机器, 应该使用scp等命令. 或其他方式可以让standby访问primary的archive.

primary:

-archive_command = 'cp -i %p /mnt/sdb/pg13data/12345-archive/%f'
+archive_command = 'scp %p vm101:/mnt/sdb/pg13data/12345-archive/%f'

standby:

restore_command = 'cp -i /mnt/sdb/pg13data/12345-archive/%f %p'
+archive_cleanup_command = 'pg_archivecleanup /mnt/sdb/pg13data/12345-archive %r' # command to execute at every restartpoint

standby启动后, 会先找archive, 再找pg_wal, 再启用流复制.

archive由standby来清理. 如果有多个standby, 不能这么做, 因为各个standby同步进度不一样.


pg_archivecleanup is designed to be used as an archive_cleanup_command to clean up WAL file archives when running as a standby server (see Section 26.2).

-- PostgreSQL: Documentation: 13: pg_archivecleanup



The pg_archivecleanup utility is designed specifically to be used with archive_cleanup_command in typical single-standby configurations.

-- PostgreSQL: Documentation: 13: 26.2. Log-Shipping Standby Servers



replication slot

https://www.postgresql.org/docs/13/view-pg-replication-slots.html

https://www.postgresql.org/docs/13/warm-standby.html#STREAMING-REPLICATION-SLOTS

primary创建slot:

postgres=# SELECT * FROM pg_create_physical_replication_slot('slot1');
slot_name | lsn
-----------+-----
slot1 |
(1 row)
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------
slot1 | | physical | | | f | f | | | | | | |

standby配置, reload:

primary_slot_name = 'slot1'

primary:

postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------
slot1 | | physical | | | f | t | 5254 | | | 1/BC890AF0 | | reserved |

使用slot,在stadnby不在线时, wal不会被archive

-rw-------. 1 fang fang 16777216 Apr 20 14:35 0000000100000001000000DD
-rw-------. 1 fang fang 16777216 Apr 20 14:36 0000000100000001000000DE
-rw-------. 1 fang fang 16777216 Apr 20 14:36 0000000100000001000000DF
-rw-------. 1 fang fang 16777216 Apr 20 14:37 0000000100000001000000E0
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E1
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E2
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E3
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E4
drwx------. 2 fang fang 4096 Apr 20 14:39 archive_status
-rw-------. 1 fang fang 16777216 Apr 20 14:42 0000000100000001000000E5
$ ll -lrt 12345/pg_wal/ | wc
39 344 2717

把standby起来后, 等待同步完成, 在primary做一次checkpoint, 就可以把wal归档:

-rw-------. 1 fang fang 16777216 Apr 20 14:29 0000000100000001000000EC
-rw-------. 1 fang fang 16777216 Apr 20 14:30 0000000100000001000000E9
-rw-------. 1 fang fang 16777216 Apr 20 14:33 0000000100000001000000E7
-rw-------. 1 fang fang 16777216 Apr 20 14:34 0000000100000001000000EB
-rw-------. 1 fang fang 16777216 Apr 20 14:34 0000000100000001000000F0
-rw-------. 1 fang fang 16777216 Apr 20 14:35 0000000100000001000000F1
-rw-------. 1 fang fang 16777216 Apr 20 14:35 0000000100000001000000EF
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000ED
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000F4
-rw-------. 1 fang fang 16777216 Apr 20 14:39 0000000100000001000000E8
-rw-------. 1 fang fang 16777216 Apr 20 15:56 0000000100000001000000F6
drwx------. 2 fang fang 4096 Apr 20 15:58 archive_status
-rw-------. 1 fang fang 16777216 Apr 20 15:58 0000000100000001000000E6
$ ll -lrt 12345/pg_wal/ | wc
20 173 1368

restartpoint

The checkpointer process and the background writer process are active during recovery. The checkpointer process will perform restartpoints (similar to checkpoints on the primary) and the background writer process will perform normal block cleaning activities. This can include updates of the hint bit information stored on the standby server. The CHECKPOINT command is accepted during recovery, though it performs a restartpoint rather than a new checkpoint.

-- PostgreSQL: Documentation: 13: 26.5. Hot Standby

If executed during recovery, the CHECKPOINT command will force a restartpoint (see Section 29.4) rather than writing a new checkpoint.

-- PostgreSQL: Documentation: 13: CHECKPOINT

In archive recovery or standby mode, the server periodically performs restartpoints, which are similar to checkpoints in normal operation: the server forces all its state to disk, updates the pg_control file to indicate that the already-processed WAL data need not be scanned again, and then recycles any old log segment files in the pg_wal directory. Restartpoints can't be performed more frequently than checkpoints in the master because restartpoints can only be performed at checkpoint records. A restartpoint is triggered when a checkpoint record is reached if at least checkpoint_timeout seconds have passed since the last restartpoint, or if WAL size is about to exceed max_wal_size. However, because of limitations on when a restartpoint can be performed, max_wal_size is often exceeded during recovery, by up to one checkpoint cycle's worth of WAL. (max_wal_size is never a hard limit anyway, so you should always leave plenty of headroom to avoid running out of disk space.)

-- PostgreSQL: Documentation: 13: 29.4. WAL Configuration


Continuous Archiving in Standby

https://www.postgresql.org/docs/13/warm-standby.html#CONTINUOUS-ARCHIVING-IN-STANDBY


sync and async replication



[3]p248


If synchronous_standby_names is empty, the only meaningful settings are on and off; remote_apply, remote_write and local all provide the same local synchronization level as on.

-- PostgreSQL: Documentation: 13: 19.5. Write Ahead Log



todo

synchronous_standby_names = '*',到底是什么意思? 任何一个standby确认了就行? 代码搜sync_standby_priority, SyncRepStandbyNames, SyncRepGetCandidateStandbys. SyncRepInitConfig-->SyncRepGetStandbyPriority; 看了代码,大致逻辑是,根据_names来确定priority,第一个是1,第二个是2,..., replication walsender启动时,会检查自己的名字在第一个,即确定自己的priority. 如果_names是'*', 那么所有walsender的priority都是1. 找不到自己的名字, 0. 解析_names还有一个专门的grammar: syncrep_gram.y.


The special entry * matches any standby name.

-- PostgreSQL: Documentation: 13: 19.6. Replication




推荐阅读
  • Android实战——jsoup实现网络爬虫,糗事百科项目的起步
    本文介绍了Android实战中使用jsoup实现网络爬虫的方法,以糗事百科项目为例。对于初学者来说,数据源的缺乏是做项目的最大烦恼之一。本文讲述了如何使用网络爬虫获取数据,并以糗事百科作为练手项目。同时,提到了使用jsoup需要结合前端基础知识,以及如果学过JS的话可以更轻松地使用该框架。 ... [详细]
  • 本文比较了eBPF和WebAssembly作为云原生VM的特点和应用领域。eBPF作为运行在Linux内核中的轻量级代码执行沙箱,适用于网络或安全相关的任务;而WebAssembly作为图灵完备的语言,在商业应用中具有优势。同时,介绍了WebAssembly在Linux内核中运行的尝试以及基于LLVM的云原生WebAssembly编译器WasmEdge Runtime的案例,展示了WebAssembly作为原生应用程序的潜力。 ... [详细]
  • 向QTextEdit拖放文件的方法及实现步骤
    本文介绍了在使用QTextEdit时如何实现拖放文件的功能,包括相关的方法和实现步骤。通过重写dragEnterEvent和dropEvent函数,并结合QMimeData和QUrl等类,可以轻松实现向QTextEdit拖放文件的功能。详细的代码实现和说明可以参考本文提供的示例代码。 ... [详细]
  • Windows下配置PHP5.6的方法及注意事项
    本文介绍了在Windows系统下配置PHP5.6的步骤及注意事项,包括下载PHP5.6、解压并配置IIS、添加模块映射、测试等。同时提供了一些常见问题的解决方法,如下载缺失的msvcr110.dll文件等。通过本文的指导,读者可以轻松地在Windows系统下配置PHP5.6,并解决一些常见的配置问题。 ... [详细]
  • 本文详细介绍了在ASP.NET中获取插入记录的ID的几种方法,包括使用SCOPE_IDENTITY()和IDENT_CURRENT()函数,以及通过ExecuteReader方法执行SQL语句获取ID的步骤。同时,还提供了使用这些方法的示例代码和注意事项。对于需要获取表中最后一个插入操作所产生的ID或马上使用刚插入的新记录ID的开发者来说,本文提供了一些有用的技巧和建议。 ... [详细]
  • 个人学习使用:谨慎参考1Client类importcom.thoughtworks.gauge.Step;importcom.thoughtworks.gauge.T ... [详细]
  • 本文介绍了一些Java开发项目管理工具及其配置教程,包括团队协同工具worktil,版本管理工具GitLab,自动化构建工具Jenkins,项目管理工具Maven和Maven私服Nexus,以及Mybatis的安装和代码自动生成工具。提供了相关链接供读者参考。 ... [详细]
  • 本文由编程笔记#小编为大家整理,主要介绍了StartingzookeeperFAILEDTOSTART相关的知识,希望对你有一定的参考价值。下载路径:https://ar ... [详细]
  • 解决nginx启动报错epoll_wait() reported that client prematurely closed connection的方法
    本文介绍了解决nginx启动报错epoll_wait() reported that client prematurely closed connection的方法,包括检查location配置是否正确、pass_proxy是否需要加“/”等。同时,还介绍了修改nginx的error.log日志级别为debug,以便查看详细日志信息。 ... [详细]
  • Spring常用注解(绝对经典),全靠这份Java知识点PDF大全
    本文介绍了Spring常用注解和注入bean的注解,包括@Bean、@Autowired、@Inject等,同时提供了一个Java知识点PDF大全的资源链接。其中详细介绍了ColorFactoryBean的使用,以及@Autowired和@Inject的区别和用法。此外,还提到了@Required属性的配置和使用。 ... [详细]
  • CEPH LIO iSCSI Gateway及其使用参考文档
    本文介绍了CEPH LIO iSCSI Gateway以及使用该网关的参考文档,包括Ceph Block Device、CEPH ISCSI GATEWAY、USING AN ISCSI GATEWAY等。同时提供了多个参考链接,详细介绍了CEPH LIO iSCSI Gateway的配置和使用方法。 ... [详细]
  • 纠正网上的错误:自定义一个类叫java.lang.System/String的方法
    本文纠正了网上关于自定义一个类叫java.lang.System/String的错误答案,并详细解释了为什么这种方法是错误的。作者指出,虽然双亲委托机制确实可以阻止自定义的System类被加载,但通过自定义一个特殊的类加载器,可以绕过双亲委托机制,达到自定义System类的目的。作者呼吁读者对网上的内容持怀疑态度,并带着问题来阅读文章。 ... [详细]
  • 本文介绍了在MFC下利用C++和MFC的特性动态创建窗口的方法,包括继承现有的MFC类并加以改造、插入工具栏和状态栏对象的声明等。同时还提到了窗口销毁的处理方法。本文详细介绍了实现方法并给出了相关注意事项。 ... [详细]
  • Postgresql备份和恢复的方法及命令行操作步骤
    本文介绍了使用Postgresql进行备份和恢复的方法及命令行操作步骤。通过使用pg_dump命令进行备份,pg_restore命令进行恢复,并设置-h localhost选项,可以完成数据的备份和恢复操作。此外,本文还提供了参考链接以获取更多详细信息。 ... [详细]
  • JavaWeb中读取文件资源的路径问题及解决方法
    在JavaWeb开发中,读取文件资源的路径是一个常见的问题。本文介绍了使用绝对路径和相对路径两种方法来解决这个问题,并给出了相应的代码示例。同时,还讨论了使用绝对路径的优缺点,以及如何正确使用相对路径来读取文件。通过本文的学习,读者可以掌握在JavaWeb中正确找到和读取文件资源的方法。 ... [详细]
author-avatar
wen-1225
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有