ClickHouse 副本的目的主要是保障数据的高可用性,即使一台ClickHouse节点宕机,那么也可以从其他服务器获得相同的数据。官网地址如下:
https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
启动zk集群
[root@hadoop201 hui]# jps.sh
-----------hadoop201---------------
2885 Jps
2247 QuorumPeerMain
-----------hadoop202---------------
1559 QuorumPeerMain
2124 Jps
-----------hadoop203---------------
1561 QuorumPeerMain
2123 Jps
在hadoop201的/etc/clickhouse-server/config.d目录下创建一个名为metrika.xml的配置文件,内容如下:
[root@hadoop201 hui]# vim /etc/clickhouse-server/config.d/metrika.xml
<?xml version="1.0"?>
<yandex>
<zookeeper-servers>
<node index="1">
<host>hadoop201host>
<port>2181port>
node>
<node index="2">
<host>hadoop202host>
<port>2181port>
node>
<node index="3">
<host>hadoop203host>
<port>2181port>
node>
zookeeper-servers>
yandex>
修改文件所属组
[root@hadoop201 config.d]# chown clickhouse:clickhouse metrika.xml
分发
[root@hadoop201 hui]# sudo /home/hui/bin/sxync.sh /etc/clickhouse-server/config.d/metrika.xml
在 hadoop201的/etc/clickhouse-server/config.xml中增加 大约719行左右
<include_from>/etc/clickhouse-server/config.d/metrika.xmlinclude_from>
分发 config.xml
[root@hadoop201 hui]# /home/hui/bin/sxync.sh /etc/clickhouse-server/config.xml
分别在hadoop201 hadoop202和hadoop203 上启动ClickHouse服务
[hui@hadoop201 ~]$ chkall.sh sudo systemctl start clickhouse-server
在hadoop201和hadoop202上分别建表,副本只能同步数据,不能同步表结构,所以我们需要在每台机器上自己手动建表
hadoop201 :) ccreate table t_order_rep (
:-] id UInt32,
:-] sku_id String,
:-] total_amount Decimal(16,2),
:-] create_time Datetime
:-] ) engine =ReplicatedMergeTree('/clickhouse/tables/01/t_order_rep','rep_201')
:-] partition by toYYYYMMDD(create_time)
:-] primary key (id)
:-] order by (id,sku_id);
CREATE TABLE t_order_rep
(
`id` UInt32,
`sku_id` String,
`total_amount` Decimal(16, 2),
`create_time` Datetime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/t_order_rep', 'rep_202')
PARTITION BY toYYYYMMDD(create_time)
PRIMARY KEY id
ORDER BY (id, sku_id)
Ok.
0 rows in set. Elapsed: 0.173 sec.
hadoop202 建表
hadoop202 :) ccreate table t_order_rep (
:-] id UInt32,
:-] sku_id String,
:-] total_amount Decimal(16,2),
:-] create_time Datetime
:-] ) engine =ReplicatedMergeTree('/clickhouse/tables/01/t_order_rep','rep_202')
:-] partition by toYYYYMMDD(create_time)
:-] primary key (id)
:-] order by (id,sku_id);
CREATE TABLE t_order_rep
(
`id` UInt32,
`sku_id` String,
`total_amount` Decimal(16, 2),
`create_time` Datetime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/t_order_rep', 'rep_203')
PARTITION BY toYYYYMMDD(create_time)
PRIMARY KEY id
ORDER BY (id, sku_id)
ReplicatedMergeTree 中,第一个参数是分片的zk_path一般按照: /clickhouse/table/{shard}/{table_name} 的格式写,如果只有一个分片就写01即可。第二个参数是副本名称,相同的分片副本名称不能相同
hadoop201 :) ((102,'sku_002',2000.00,'2022-05-18 13:00:00'),
:-] ((102,'sku_002',12000.00,'2022-05-18 13:00:00'),
:-] ((102,'sku_002',600.00,'2022-05-19 12:00:00');
Syntax error: failed at position 2:
(102,'sku_002',2000.00,'2022-05-18 13:00:00'), (102,'sku_002',12000.00,'2022-05-18 13:00:00'), (102,'sku_002',600.00,'2022-05-19 12:00:00');
Expected one of: SELECT subquery, SELECT query, SELECT, list of elements, WITH, SELECT query, possibly with UNION, SELECT query, subquery, possibly with UNION
hadoop201 :)
hadoop201 :) iinsert into t_order_rep values
:-] ((101,'sku_001',1000.00,'2022-05-18 12:00:00'),
:-] ((102,'sku_002',2000.00,'2022-05-18 12:00:00'),
:-] ((103,'sku_004',2500.00,'2022-05-18 12:00:00'),
:-] ((104,'sku_002',2000.00,'2022-05-18 12:00:00'),
:-] ((105,'sku_003',600.00,'2022-05-19 12:00:00');(105,'sku_003',600.00,'2022-05-19 12:00:00');
INSERT INTO t_order_rep VALUES
Ok.
5 rows in set. Elapsed: 0.289 sec.
在hadoop202上执行select,可以查询出结果,说明副本配置正确
hadoop202 :) select * from t_order_rep;
SELECT *
FROM t_order_rep
┌──id─┬─sku_id──┬─total_amount─┬─────────create_time─┐
│ 105 │ sku_003 │ 600.00 │ 2022-05-19 12:00:00 │
└─────┴─────────┴──────────────┴─────────────────────┘
┌──id─┬─sku_id──┬─total_amount─┬─────────create_time─┐
│ 101 │ sku_001 │ 1000.00 │ 2022-05-18 12:00:00 │
│ 102 │ sku_002 │ 2000.00 │ 2022-05-18 12:00:00 │
│ 103 │ sku_004 │ 2500.00 │ 2022-05-18 12:00:00 │
│ 104 │ sku_002 │ 2000.00 │ 2022-05-18 12:00:00 │
└─────┴─────────┴──────────────┴─────────────────────┘
5 rows in set. Elapsed: 0.010 sec.