
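How to Install and Deploy a GaussDB T Distributed Cluster Without Hitting the Pitfalls

A detailed walkthrough of the installation steps and the installation process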



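About the author

Wei Bin (魏斌) is a senior database expert at 新炬网络 who has long served carriers, financial institutions, manufacturers, and government and enterprise customers. He has worked with everything from traditional commercial databases to open-source distributed ones. Having spent his career on the customer front line, he has extensive experience in emergency fault handling and performance tuning, and specializes in disaster recovery, multi-datacenter builds, and heterogeneous data migration.


This article walks through the installation of a GaussDB T (formerly GaussDB 100) distributed cluster. The example installs a cluster with 2 CNs and 2 DNs in a single-site disaster-tolerant deployment.


Folks, here comes the main event. Line up and let's go, one step at a time.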


Environment


  • OS version: RedHat 7.5 x86_64

  • Database version: GaussDB100 V1.0.0

  • Number of nodes: 4

  • Deployment layout: (diagram not preserved)



  • IP addresses and host names:



192.168.57.21 gaussdb11.localdomain  gaussdb11

192.168.57.22 gaussdb12.localdomain  gaussdb12

192.168.57.23 gaussdb13.localdomain  gaussdb13

192.168.57.24 gaussdb14.localdomain  gaussdb14
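
The address list above is already in /etc/hosts format. The following is a minimal sketch, assuming these entries are meant to be present in /etc/hosts on every node (the article does not state this explicitly). `add_host_entry` is a hypothetical helper name; it appends an entry only when the IP is not already listed, so it is safe to run more than once.

```shell
# Hypothetical helper: append "IP FQDN SHORTNAME" to a hosts file
# unless a line whose first field is that IP already exists.
add_host_entry() {
    hosts_file=$1; ip=$2; fqdn=$3; short=$4
    # awk compares the first field exactly, so 192.168.57.2 does not
    # match 192.168.57.21. A missing file counts as "not found".
    if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' \
            "$hosts_file" 2>/dev/null; then
        printf '%s %s  %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
    fi
}

# Usage (as root, on each node):
#   add_host_entry /etc/hosts 192.168.57.21 gaussdb11.localdomain gaussdb11
#   add_host_entry /etc/hosts 192.168.57.22 gaussdb12.localdomain gaussdb12
#   add_host_entry /etc/hosts 192.168.57.23 gaussdb13.localdomain gaussdb13
#   add_host_entry /etc/hosts 192.168.57.24 gaussdb14.localdomain gaussdb14
```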


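I. Enable remote login for root and disable SELinux


1. Edit the sshd_config file


vi /etc/ssh/sshd_config


2. Change the PermitRootLogin setting to allow root to log in remotely


Either of the following two approaches works:


1) Comment out "PermitRootLogin no"


#PermitRootLogin no


2) Set PermitRootLogin to yes


PermitRootLogin yes


3. Change the Banner setting to suppress the welcome message that the system displays on connection


Comment out the line containing "Banner":


#Banner none


4. Change the PasswordAuthentication setting to allow password authentication at login, then save and exit


Set PasswordAuthentication to yes:


PasswordAuthentication yes


5. Restart the sshd service and log in again as root


# service sshd restart


If the command prints the message "Redirecting to /bin/systemctl restart sshd.service", run the following command instead:


# /bin/systemctl restart sshd.service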
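
Since the same sshd_config edits have to be repeated on every node, they can be scripted. The sketch below is one possible way to do that, not the article's method: `set_sshd_option` is a hypothetical helper that rewrites an existing (possibly commented-out) line for a keyword, or appends the setting if no such line exists. It sets `Banner none` rather than commenting the line out, on the assumption that the two have the same effect here.

```shell
# Hypothetical helper: force "KEY VALUE" in an sshd_config-style file.
set_sshd_option() {
    cfg=$1; key=$2; value=$3
    if grep -Eq "^[[:space:]]*#?[[:space:]]*$key[[:space:]]" "$cfg"; then
        # Rewrite every existing line for this keyword, commented or not.
        tmp=$(mktemp) || return 1
        sed -E "s|^[[:space:]]*#?[[:space:]]*$key[[:space:]].*|$key $value|" \
            "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
        rm -f "$tmp"
    else
        # Keyword not present at all: append it.
        printf '%s %s\n' "$key" "$value" >> "$cfg"
    fi
}

# Usage (as root):
#   cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   set_sshd_option /etc/ssh/sshd_config PasswordAuthentication yes
#   set_sshd_option /etc/ssh/sshd_config Banner none
#   sshd -t && systemctl restart sshd.service   # validate before restarting
```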


6. Disable SELinux


# vi /etc/selinux/config

SELINUX=disabled
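
The same change can be made without an editor. This is a small sketch under the assumption that the file contains a single `SELINUX=` line, as the stock RHEL file does; `disable_selinux_in` is a hypothetical helper name. The setting in the file only takes effect after a reboot, while `setenforce 0` switches the running system to permissive mode in the meantime.

```shell
# Hypothetical helper: set SELINUX=disabled in an SELinux config file.
# The pattern is anchored on "SELINUX=" so SELINUXTYPE= is left alone.
disable_selinux_in() {
    cfg=$1
    tmp=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Usage (as root):
#   disable_selinux_in /etc/selinux/config
#   setenforce 0      # permissive until the next reboot
#   getenforce        # shows the current runtime mode
```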


II. Stop the system firewall and disable it at boot



# systemctl stop firewalld.service

# systemctl disable firewalld.service



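III. Install system packages


This walkthrough uses the ISO installation media as the yum repository for installing the packages the database depends on.


Append one line to the end of /etc/rc.local:


mount /dev/cdrom /mnt


This ensures that the disc contents are mounted at /mnt every time the system boots. On RHEL 7, /etc/rc.d/rc.local must also be executable (chmod +x /etc/rc.d/rc.local) for the line to run at boot.


1. Configure the yum repository


Back up the existing yum repository files and create a new one:


cd /etc/yum.repos.d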

mkdir bak

mv redhat* ./bak

vi iso.repo

[root@gaussdb11 yum.repos.d]# cat iso.repo 

[rhel-iso]

name=Red Hat Enterprise Linux - Source


baseurl=file:///mnt

enabled=1

gpgcheck=0

gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
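
To avoid retyping the repository file on each node, it can be generated from a here-document. A minimal sketch, with `write_iso_repo` as a hypothetical helper name; the stanza itself is copied from the listing above.

```shell
# Hypothetical helper: write the ISO-backed repo definition to the given path.
write_iso_repo() {
    repo_file=$1
    cat > "$repo_file" <<'EOF'
[rhel-iso]
name=Red Hat Enterprise Linux - Source
baseurl=file:///mnt
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
EOF
}

# Usage (as root, with the ISO mounted at /mnt):
#   write_iso_repo /etc/yum.repos.d/iso.repo
#   yum clean all && yum repolist
```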


2. List the available packages and install the dependencies


# yum list


yum install -y zlib readline gcc

yum install -y python python-devel

yum install -y perl-ExtUtils-Embed

yum install -y readline-devel

yum install -y zlib-devel

yum install -y lsof


3. Verify that the packages are installed


rpm -qa --queryformat "%{NAME}-%{VERSION}-%{RELEASE} (%{ARCH})\n" | grep -E "zlib|readline|gcc|python|python-devel|perl-ExtUtils-Embed|readline-devel|zlib-devel"
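
The grep above matches substrings, so it can be hard to tell at a glance whether a specific package is missing. As an alternative sketch (not part of the original procedure), the hypothetical helper `missing_packages` asks `rpm -q` about each package by exact name and prints only the ones that are not installed.

```shell
# Hypothetical helper: print each named package that rpm does not know about.
missing_packages() {
    for pkg in "$@"; do
        # rpm -q exits non-zero when the package is not installed.
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Usage: an empty result means every dependency is present.
#   missing_packages zlib readline gcc python python-devel \
#       perl-ExtUtils-Embed readline-devel zlib-devel lsof
```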


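IV. Preparation and installation


1. Create a directory for the installation package and extract the package (on any one host)


su - root

mkdir -p /opt/software/gaussdb

cd /opt/software/gaussdb

tar -zxvf GaussDB_100_1.0.0-CLUSTER-REDHAT7.5-64bit.tar.gz

vi clusterconfig.xml                 --create the cluster configuration file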

Its contents are as follows (the XML listing was not preserved in this copy of the article):

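Set permissions on the directory:

chmod -R 755 /opt/software


2. Make sure the root password is the same on every node in the cluster, because the script that configures mutual trust requires identical passwords. If the passwords cannot be changed, set up mutual trust for the root user manually beforehand


3. Prepare the installation environment with gs_preinstall


su - root

cd /opt/software/gaussdb/script

--pre-install: configure the environment

./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml


Example: (screenshot not preserved)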



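4. The pre-install log shows a warning that the clocks in the installation environment are not synchronized, so NTP must be configured



5. Configure NTP: node 1 acts as the NTP server and the other nodes synchronize with node 1


1) Install ntp



yum -y install ntp


2) On node 1, add the following to /etc/ntp.conf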



server 127.0.0.1

fudge 127.0.0.1 stratum 10

restrict 192.168.57.21 nomodify notrap nopeer noquery                       <<==== IP address of the current node

restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap                  <<==== gateway of the cluster's network segment (Gateway) and its subnet mask (Genmask)


3) On the other nodes, add the following to /etc/ntp.conf


Node 2:


server 192.168.57.21                                                        <<==== IP address of the NTP server to synchronize with

fudge 192.168.57.21 stratum 10                                              <<==== IP address of the NTP server to synchronize with

restrict 192.168.57.22 nomodify notrap nopeer noquery

restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap


Node 3:


server 192.168.57.21

fudge 192.168.57.21 stratum 10

restrict 192.168.57.23 nomodify notrap nopeer noquery

restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap


Node 4:


server 192.168.57.21

fudge 192.168.57.21 stratum 10

restrict 192.168.57.24 nomodify notrap nopeer noquery

restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap


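4) Start the ntp service


service ntpd start


5) Check whether the NTP server has connected to its upstream NTP source


ntpstat


6) Check the status of the NTP server relative to its upstream NTP source


ntpq -p


7) Enable the ntp service at boot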



systemctl enable ntpd  
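
In `ntpq -p` output, the peer that ntpd has currently selected for synchronization is marked with a leading `*`. The sketch below uses that to check the result on a client node; `ntp_selected_peer` is a hypothetical helper that reads `ntpq -p` output on standard input, prints the selected peer, and returns non-zero when no peer has been selected yet (which is normal for the first few minutes after ntpd starts).

```shell
# Hypothetical helper: print the peer marked "*" in `ntpq -p` output.
# Exit status is 0 if such a peer exists, 1 otherwise.
ntp_selected_peer() {
    awk '
        /^\*/ { sub(/^\*/, "", $1); print $1; found = 1; exit }
        END   { exit !found }
    '
}

# Usage on nodes 2 to 4 (expected peer: 192.168.57.21):
#   if peer=$(ntpq -p | ntp_selected_peer); then
#       echo "synchronizing with $peer"
#   else
#       echo "no peer selected yet"
#   fi
```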



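6. Use gs_checkos to check whether the environment meets the installation requirements



7. Start installing the database


su - omm

cd /opt/software/gaussdb/script

./gs_install -X /opt/software/gaussdb/clusterconfig.xml


Addendum:


Uninstall the database cluster with gs_uninstall:


gs_uninstall --delete-data


Or run a local uninstall on every node in the cluster:


gs_uninstall --delete-data -L


If the cluster is in an abnormal state and the cluster information cannot be retrieved, uninstall the cluster with the following command:


gs_uninstall --delete-data -X /opt/software/gaussdb/clusterconfig.xml


Or run a local uninstall on every node in the cluster:


gs_uninstall --delete-data -L -X /opt/software/gaussdb/clusterconfig.xml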



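8. Verify that the cluster was installed successfully



Note: because the host machine did not have enough memory, the four virtual machines were reduced to three, and the networking mode was changed from paxos to ha.


Addendum:


1) Check the cluster status


gs_om -t status


2) Stop all instances on a given host


gs_om -t stop -h gaussdb13


3) Start all instances on a given host


gs_om -t start -h gaussdb13


4) DN primary/standby switchover. gaussdb13 is the name of the host where the standby DN resides, and DB2_3 is the name of the standby DN to be switched over


gs_om -t switch -h gaussdb13 -I DB2_3


5) CM primary/standby switchover. gaussdb12 is the name of the host where the current standby CM resides, and CM2 is the name of the CM instance on host gaussdb12


gs_om -t switch -h gaussdb12 -I CM2


6) Start and stop the cluster


gs_om -t start

gs_om -t stop


7) Start and stop etcd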



gs_om -t startetcd

gs_om -t stopetcd


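V. High-availability test


This test simulates node 3 going down.


1. Check the primary/standby DN status. The primary DNs are DB1_1 on node 2 and DB2_3 on node 3



2. Simulate node 3 going down by stopping all instances on node 3



3. The standby DN DB2_4 on node 2 becomes the primary DN



4. Start all instances on node 3



5. The primary and standby databases catch up automatically



6. Switch the standby DN DB2_3 back to primary



7. The switchover succeeds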
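
The seven steps above map onto the gs_om commands listed in the addendum of Section IV. The sketch below strings them together; `ha_failover_drill` is a hypothetical wrapper name, and it assumes gs_om is on the PATH of the user running it (which user that should be is not stated in the article). In practice the status output should be inspected between steps rather than run straight through.

```shell
# Hypothetical wrapper around the gs_om commands used in the HA test.
#   $1: host whose instances are stopped (node 3 here, gaussdb13)
#   $2: DN to switch back to primary afterwards (DB2_3 here)
ha_failover_drill() {
    host=$1
    standby_dn=$2
    gs_om -t status                               # 1. note which DNs are primary
    gs_om -t stop -h "$host"                      # 2. simulate the node going down
    gs_om -t status                               # 3. the standby elsewhere should now be primary
    gs_om -t start -h "$host"                     # 4. bring the node back
    gs_om -t status                               # 5. check that primary and standby have caught up
    gs_om -t switch -h "$host" -I "$standby_dn"   # 6. switch the DN back to primary
    gs_om -t status                               # 7. confirm the switchover
}

# Usage:
#   ha_failover_drill gaussdb13 DB2_3
```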



VI. Roundup of installation problems


Problem 1: pre-install reports that the package type does not match the CPU type



[root@gaussdb11 script]# ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml

Parsing the configuration file.

Successfully parsed the configuration file.

Installing the tools>Successfully installed the tools>Are you sure you want to create trust for root (yes/no)? yes

Please enter password for root.

Password: 

Creating SSH trust for the root permission user.

Checking network information.

All nodes in the network are Normal.

Successfully checked network information.

Creating SSH trust.

Creating the local key file.

Successfully created the local key files.

Appending local ID to authorized_keys.

Successfully appended local ID to authorized_keys.

Updating the known_hosts file.

Successfully updated the known_hosts file.

Appending authorized_key>Successfully appended authorized_key>Checking common authentication file content.

Successfully checked common authentication content.

Distributing SSH trust file to all node.

Successfully distributed SSH trust file to all node.

Verifying SSH trust>Successfully verified SSH trust>Successfully created SSH trust.

Successfully created SSH trust for the root permission user.

[GAUSS-52406] : The package type "" is inconsistent with the Cpu type "X86".

[root@gaussdb11 script]#


Solution:


1) Check the gs_preinstall run log. It is located under the path set by the gaussdbLogPath parameter in clusterconfig.xml; in that directory, om/gs_preinstall*.log shows the following error:



[2019-11-28 22:50:08.335532][gs_preinstall][LOG]:Successfully created SSH trust for the root permission user.

[2019-11-28 22:50:08.992537][gs_preinstall][ERROR]:[GAUSS-52406] : The package type "" is inconsistent with the Cpu type "X86".

Traceback (most recent call last)

   File "./gs_preinstall", line 507, in

   File "/opt/software/gaussdb/script/impl/preinstall/PreinstallImpl.py", line 1861, in run


2) Edit /opt/software/gaussdb/script/impl/preinstall/PreinstallImpl.py and comment out the following line



#self.getAllCpu()


Problem 2: pre-install reports a clock synchronization warning



A12.[ Time consistency status ]                             : Warning


Solution: configure NTP synchronization as described in Section IV, step 5.


Problem 3: database installation reports that SYSDBA login failed because of a privilege problem



[omm@gaussdb11 script]$ ./gs_install -X /opt/software/gaussdb/clusterconfig.xml

Parsing the configuration file.

Check preinstall>Successfully checked preinstall>Creating the backup directory.

Successfully created the backup directory.

Check the time difference between hosts in the cluster.

Installing the cluster.

Installing applications>Successfully installed APP.

Distribute etcd communication keys.

Successfully distrbute etcd communication keys.

Initializing cluster instances

.............193s

[FAILURE] gaussdb11:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize GTS1 instance

[GAUSS-51607] : Failed to start zenith instance..Output: 

ZS-00001: no privilege is found

ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect



SQL> 

ZS-00001: connection is not established

SQL> 


[FAILURE] gaussdb12:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize GTS2 instance

Successfully Initialize GTS2 instance.

Initialize cn_402 instance

[GAUSS-51607] : Failed to start zenith instance..Output: 

ZS-00001: no privilege is found

ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect



SQL> 

ZS-00001: connection is not established

SQL> 


[FAILURE] gaussdb13:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize DB1_1 instance

[GAUSS-51607] : Failed to start zenith instance..Output: 

ZS-00001: no privilege is found

ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect



SQL> 

ZS-00001: connection is not established

SQL> 


[FAILURE] gaussdb14:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize DB2_3 instance

[GAUSS-51607] : Failed to start zenith instance..Output: 

ZS-00001: no privilege is found

ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect



SQL> 

ZS-00001: connection is not established

SQL> 



.[omm@gaussdb11 script]$


Analysis and resolution steps:


1) Check the install log. Path:



cd /opt/gaussdb/log/omm/om

[root@gaussdb11 om]# ls -lrt

total 52

-rw-------. 1 omm dbgrp 42006 Dec  1 21:43 gs_local-2019-12-01_213124.log

-rw-------. 1 omm dbgrp  5240 Dec  1 21:44 gs_install-2019-12-01_213118.log

[root@gaussdb11 om]# tail -25 gs_local-2019-12-01_213124.log

ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect



SQL> 

ZS-00001: connection is not established

SQL> 


[2019-12-01 21:43:26.533606][Install][ERROR]:[GAUSS-51607] : Failed to start zenith instance..Output: 

ZS-00001: no privilege is found

ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect



SQL> 

ZS-00001: connection is not established

SQL> 


Traceback (most recent call last)

   File "/opt/software/gaussdb/script/local/Install.py", line 704, in

   File "/opt/software/gaussdb/script/local/Install.py", line 625, in initInstance

   File "/opt/software/gaussdb/script/local/Install.py", line 614, in __tpInitInstance

   File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/Zenith.py", line 308, in initialize

   File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/CN_OLTP/Zsharding.py", line 62, in initDbInstance

   File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/CN_OLTP/Zsharding.py", line 100, in initZenithInstance

   File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/Zenith.py", line 406, in startInstance


2) /opt/gaussdb/log/omm/db_log/GTS1/run/zengine.rlog shows that the failure is caused by insufficient memory.



UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|INFO>[PARAM] LOG_HOME             = /opt/gaussdb/log/omm/db_log/GTS1

UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|206158456515|INFO>starting instance(nomount)

UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|ERROR>GS-00001 : Failed to allocate 4592381952 bytes for sga [srv_sga.c:170]

UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|ERROR>failed to create sga

UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|ERROR>Instance Startup Failed


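3) Increase the memory of all the virtual machines


The virtual machine memory configuration used in this test is listed below for reference:


  • gaussdb11: 3.9G

  • gaussdb12: 4.9G

  • gaussdb13: 4.9G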
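
The failed allocation in zengine.rlog was 4592381952 bytes (about 4.3 GiB) for the SGA. A quick way to see in advance whether a node can satisfy a request of that size is to compare it with the MemAvailable figure in /proc/meminfo. The sketch below does that; `sga_fits` is a hypothetical helper, and using MemAvailable as the yardstick is an assumption rather than something the installer is known to check.

```shell
# Hypothetical helper: succeed if MemAvailable (reported in kB) covers the
# requested number of bytes. The meminfo path is a parameter for testing.
sga_fits() {
    need_bytes=$1
    meminfo=${2:-/proc/meminfo}
    awk -v need="$need_bytes" '
        /^MemAvailable:/ { ok = ($2 * 1024 >= need) }
        END { exit !ok }
    ' "$meminfo"
}

# Usage on each node, with the size taken from the log above:
#   if sga_fits 4592381952; then
#       echo "enough memory for the SGA"
#   else
#       echo "not enough memory, increase the VM memory"
#   fi
```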


Problem 4: installation reports GAUSS-50601


1) Installation progress log:



[omm@gaussdb11 script]$ ./gs_install -X /opt/software/gaussdb/clusterconfig.xml

Parsing the configuration file.

Check preinstall>Successfully checked preinstall>Creating the backup directory.

Successfully created the backup directory.

Check the time difference between hosts in the cluster.

Installing the cluster.

Installing applications>Successfully installed APP.

Distribute etcd communication keys.

Successfully distrbute etcd communication keys.

Initializing cluster instances

390s

[SUCCESS] gaussdb11:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize cn_401 instance

Successfully Initialize cn_401 instance.

Modifying user's environmental variable $GAUSS_ENV.

Successfully modified user's environmental variable $GAUSS_ENV.

[FAILURE] gaussdb12:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize DB1_1 instance

Successfully Initialize DB1_1 instance.

Initialize DB2_4 instance

[GAUSS-50601] : The port [40001] is occupied.

[SUCCESS] gaussdb13:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize DB1_2 instance

Successfully Initialize DB1_2 instance.

Initialize DB2_3 instance

Successfully Initialize DB2_3 instance.

Modifying user's environmental variable $GAUSS_ENV.

Successfully modified user's environmental variable $GAUSS_ENV.


2) The installation log shows that the port is already in use



[omm@gaussdb11 omm]$ tail -300 om/gs_install-2019-12-09_161757.log

[2019-12-09 16:18:15.998104][gs_install][LOG]:Initializing cluster instances

[2019-12-09 16:18:15.999396][gs_install][DEBUG]:Init instance by cmd: source /etc/profile; source /home/omm/.bashrc;python '/opt/software/gaussdb/script/local/Install.py' -t init_instance -U omm:dbgrp -X /opt/software/gaussdb/clusterconfig.xml -l /opt/gaussdb/log/omm/om/gs_local.log  --autostart=yes  --alarm=/opt/huawei/snas/bin/snas_cm_cmd 

[2019-12-09 16:24:49.689716][gs_install][ERROR]:[SUCCESS] gaussdb11:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize cn_401 instance

Successfully Initialize cn_401 instance.

Modifying user's environmental variable $GAUSS_ENV.

Successfully modified user's environmental variable $GAUSS_ENV.

[FAILURE] gaussdb12:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize DB1_1 instance

Successfully Initialize DB1_1 instance.

Initialize DB2_4 instance

[GAUSS-50601] : The port [40001] is occupied.

[SUCCESS] gaussdb13:

Using omm:dbgrp to install database.

Using installation program path : /home/omm

Initialize DB1_2 instance

Successfully Initialize DB1_2 instance.

Initialize DB2_3 instance

Successfully Initialize DB2_3 instance.

Modifying user's environmental variable $GAUSS_ENV.

Successfully modified user's environmental variable $GAUSS_ENV.


Traceback (most recent call last)

   File "./gs_install", line 281, in

   File "/opt/software/gaussdb/script/impl/install/InstallImpl.py", line 93, in run

   File "/opt/software/gaussdb/script/impl/install/InstallImpl.py", line 193, in doDeploy

   File "/opt/software/gaussdb/script/impl/install/InstallImpl.py", line 291, in doInstall


[root@gaussdb12 om]# netstat -na |grep 40001

tcp        0      0 192.168.57.22:40001     0.0.0.0:*               LISTEN     

tcp        0      0 127.0.0.1:40001         0.0.0.0:*               LISTEN


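3) Uninstall, then edit clusterconfig.xml to change the DN port on node 3 to 50000 and continue. Be sure to check whether port 50000 is already in use on any node.



su - omm

./gs_uninstall --delete-data -X /opt/software/gaussdb/clusterconfig.xml

vi clusterconfig.xml


 

                                          <<================= port changed from 40001 to 50000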
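
Checking the new port on every node can be done with the same `netstat -na` output used in step 2. A minimal sketch: `port_listening` is a hypothetical helper that reads netstat output on standard input and succeeds if any socket in LISTEN state is bound to the given port.

```shell
# Hypothetical helper: succeed if `netstat -na` output on stdin contains a
# LISTEN socket whose local address ends in the given port.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")      # local address is field 4
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage (relies on the root SSH trust created by gs_preinstall):
#   for h in gaussdb11 gaussdb12 gaussdb13; do
#       if ssh "$h" netstat -na | port_listening 50000; then
#           echo "$h: port 50000 is already in use"
#       fi
#   done
```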

 


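Problem 5: during installation, the sha256 file on node 1 is reported missing and the cluster installation fails


Solution: copy the file over from another node with scp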



su - omm

cd /opt/software/gaussdb

scp *.sha256 gaussdb11:/opt/software/gaussdb


More practical GaussDB T material will follow. You are welcome to follow along and exchange ideas.





