Author: 手机用户2502870105 | Source: Internet | 2023-05-19 08:44
I. Cluster components and framework versions
flume-1.5.0-cdh5.3.6-bin
hbase-0.98.6-cdh5.3.6
hue-3.7.0-cdh5.3.6
sqoop-1.4.5-cdh5.3.6 (mysql-connector-java-5.1.31.jar)
hadoop-2.5.0-cdh5.3.6
hive-0.13.1-cdh5.3.6 (mysql-connector-java-5.1.31.jar)
jdk1.7.0_67
zookeeper-3.4.5-cdh5.3.6
II. Cluster layout
rainbow.com.cn01 rainbow.com.cn02 rainbow.com.cn03
zookeeper zookeeper zookeeper
namenode namenode
datanode datanode datanode
resourcemanager resourcemanager
nodemanager nodemanager nodemanager
historyserver
HMaster (HMaster) (HMaster)
hregionserver hregionserver hregionserver
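III. Setting up Hue
1. Prerequisite: all of the frameworks listed above are already installed and working.
2. Unpack and install hue-3.7.0-cdh5.3.6.
3. Edit the configuration file hue.ini.
(1) hue.ini: the [desktop] section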
[desktop]
# Set this to a random string, the longer the better.
# This is used for secure hashing in the session store.
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn
# Webserver listens on this address and port
http_host=rainbow.com.cn01
http_port=8888
# Time zone name
time_zone=Asia/Shanghai
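The comment on `secret_key` asks for a random string, the longer the better. One way to produce such a value is sketched below. The helper name `generate_secret_key`, the default length of 50 and the chosen alphabet are assumptions made for illustration; they do not come from Hue. Characters such as `#`, `=`, `,`, `%` and quotes are deliberately left out so the value is unlikely to be misread when the ini file is parsed.

```python
import secrets
import string

# Assumed alphabet: letters, digits and a few punctuation marks that are
# unlikely to be treated specially inside an ini file.
ALPHABET = string.ascii_letters + string.digits + "!@^&*-_+."


def generate_secret_key(length: int = 50) -> str:
    """Return a random string that could be pasted after secret_key=."""
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print("secret_key=" + generate_secret_key())
```

Running the script prints one `secret_key=...` line that can be copied into the [desktop] section.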
(2) Integration with HDFS
hue.ini: the [[hdfs_clusters]] section (nameservice ns1)
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://ns1:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://ns1:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# Default umask for file and directory creation, specified in an octal value.
## umask=022
# Directory of the Hadoop configuration
hadoop_conf_dir=/opt/modules/hadoop-2.5.0-cdh5.3.6/etc/hadoop
hadoop_hdfs_home=/opt/modules/hadoop-2.5.0-cdh5.3.6
hadoop_bin=/opt/modules/hadoop-2.5.0-cdh5.3.6/bin
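hue.ini expresses nesting through bracket depth: `[[hdfs_clusters]]` is a child of the top-level `[hadoop]` section (not shown in the excerpt above), and `[[[default]]]` is a child of `[[hdfs_clusters]]`. The sketch below makes that structure visible by reading a fragment into nested dictionaries. It is a simplified stand-in written for illustration, not the loader Hue itself uses, and `parse_nested_ini` is a hypothetical name.

```python
def parse_nested_ini(text: str) -> dict:
    """Parse a hue.ini-style fragment into nested dicts (simplified sketch)."""
    root: dict = {}
    stack = [root]  # stack[d] is the dict for the section open at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and '#'/'##' comments carry no settings
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))
            name = line.strip("[]").strip()
            if depth > len(stack):
                raise ValueError(f"section {name!r} skips a nesting level")
            del stack[depth:]  # close sections that are deeper or equal
            section = stack[-1].setdefault(name, {})
            stack.append(section)
        elif "=" in line:
            key, _, value = line.partition("=")  # split on the first '=' only
            stack[-1][key.strip()] = value.strip()
    return root


SAMPLE = """
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://ns1:8020
      ## logical_name=
      webhdfs_url=http://ns1:50070/webhdfs/v1
"""

if __name__ == "__main__":
    cfg = parse_nested_ini(SAMPLE)
    print(cfg["hadoop"]["hdfs_clusters"]["default"])
```

A section header that jumps more than one level deeper than its parent raises `ValueError`, which is a quick way to spot a missing or still-commented parent section.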
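Edit etc/hadoop/core-site.xml in the Hadoop installation:
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>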
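The two proxy-user settings above let the `hue` user act on behalf of other users. A quick way to confirm they actually made it into core-site.xml is to read the file's `<property>` entries and look for both keys. The following is a minimal sketch using only the standard library; the function names `read_hadoop_properties` and `missing_proxyuser_settings` are hypothetical helpers, not part of Hadoop or Hue.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text: str) -> dict:
    """Return {name: value} for every <property> in a Hadoop *-site.xml."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_proxyuser_settings(props: dict, user: str = "hue") -> list:
    """List the proxy-user keys for `user` that are absent from props."""
    required = [
        f"hadoop.proxyuser.{user}.hosts",
        f"hadoop.proxyuser.{user}.groups",
    ]
    return [key for key in required if key not in props]


SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    props = read_hadoop_properties(SAMPLE_CORE_SITE)
    print(missing_proxyuser_settings(props))
```

To check a real file, pass the text of core-site.xml to `read_hadoop_properties` instead of the sample string; an empty list from `missing_proxyuser_settings` means both keys are present.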
Edit etc/hadoop/hdfs-site.xml in the Hadoop installation:
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
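With WebHDFS enabled, Hue talks to HDFS through REST calls under the `webhdfs_url` configured earlier. The sketch below builds the URL of a `LISTSTATUS` request, including the `doas` parameter that relies on the proxy-user settings. The function name `webhdfs_liststatus_url` and the example user `alice` are assumptions for illustration; `op`, `user.name` and `doas` are standard WebHDFS query parameters.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def webhdfs_liststatus_url(
    webhdfs_url: str,
    path: str = "/",
    user: str = "hue",
    doas: Optional[str] = None,
) -> str:
    """Build a WebHDFS LISTSTATUS URL for `path` under the given base URL."""
    params = {"op": "LISTSTATUS", "user.name": user}
    if doas:
        # Ask the NameNode to run the call on behalf of another user;
        # this only works if hadoop.proxyuser.<user>.* allows it.
        params["doas"] = doas
    return webhdfs_url.rstrip("/") + quote(path) + "?" + urlencode(params)


if __name__ == "__main__":
    base = "http://ns1:50070/webhdfs/v1"  # value taken from hue.ini above
    print(webhdfs_liststatus_url(base))
    print(webhdfs_liststatus_url(base, "/user/alice", doas="alice"))
```

Opening the printed URL in a browser or with an HTTP client should return a JSON directory listing if WebHDFS is reachable at that address.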
Copy hdfs-site.xml, core-site.xml and yarn-site.xml from hadoop/etc/hadoop into Hue's conf directory:
$ cp hdfs-site.xml core-site.xml yarn-site.xml /opt/modules/hue-3.7.0-cdh5.3.6/desktop/conf/
After configuring each item, restart Hue and the corresponding framework, then check the page at http://rainbow.com.cn03:8888.
(3) Integration with YARN
hue.ini: the [[yarn_clusters]] section
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=rainbow.com.cn03
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Resource Manager logical name (required for HA)
## logical_name=
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://rainbow.com.cn03:8088
# URL of the ProxyServer API
proxy_api_url=http://rainbow.com.cn03:8088
# URL of the HistoryServer API
history_server_api_url=http://rainbow.com.cn03:19888
# In secure mode (HTTPS), if SSL certificates from Resource Manager's
# Rest Server have to be verified against certificate authority
## ssl_cert_ca_verify=False
# HA support by specifying multiple clusters
# e.g.
[[[ha]]]
# Resource Manager logical name (required for HA). Note: this matters for the hot-standby setup.
logical_name=rmcluster
# Configuration for MapReduce (MR1)
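Before restarting Hue it can help to confirm that the ResourceManager and HistoryServer addresses entered above really answer. Both services expose a small REST info endpoint (`/ws/v1/cluster/info` and `/ws/v1/history/info`). The sketch below derives those URLs from the hue.ini values and offers an optional fetch helper; the function names `yarn_check_urls` and `fetch_json` are hypothetical.

```python
import json
from urllib.request import urlopen


def yarn_check_urls(resourcemanager_api_url: str, history_server_api_url: str) -> dict:
    """Map each service to the REST info URL derived from its hue.ini value."""
    return {
        "resourcemanager": resourcemanager_api_url.rstrip("/") + "/ws/v1/cluster/info",
        "historyserver": history_server_api_url.rstrip("/") + "/ws/v1/history/info",
    }


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the JSON body (needs network access to the cluster)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    urls = yarn_check_urls(
        "http://rainbow.com.cn03:8088",   # resourcemanager_api_url
        "http://rainbow.com.cn03:19888",  # history_server_api_url
    )
    for service, url in urls.items():
        print(service, url)
```

Calling `fetch_json` on either URL from a machine that can reach the cluster should return a JSON document describing the service; a connection error suggests the host or port in hue.ini is wrong.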
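(4) Integration with Hive
1. hive-site.xml
Remote access configuration:
<property>
  <name>hive.server2.long.polling.timeout</name>
  <value>5000</value>
  <description>Time in milliseconds that HiveServer2 will wait, before responding to asynchronous calls that use long polling</description>
</property>
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
  <description>Port number of HiveServer2 Thrift interface.
  Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>rainbow.com.cn01</value>
  <description>Bind host on which to run the HiveServer2 Thrift interface.
  Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>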
2. hue.ini
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=rainbow.com.cn01
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/opt/modules/hive-0.13.1-cdh5.3.6/conf/
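Hue can only reach Hive if HiveServer2 is actually listening on the host and port given in `hive_server_host` and `hive_server_port`. A plain TCP connection attempt is enough to tell a stopped server or a wrong port apart from a Hue-side problem. The helper below is a minimal sketch; `port_is_open` is a hypothetical name, and the check says nothing about whether the Thrift service behind the port is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Values taken from the [beeswax] section above.
    print(port_is_open("rainbow.com.cn01", 10000))
```

The same helper can be pointed at other ports from this guide, for example 8888 for the Hue web server.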
(5) Integration with an RDBMS
Edit hue.ini:
[librdbms]
# The RDBMS app can have any number of databases configured in the databases
# section. A database is known by its section name
# (IE sqlite, mysql, psql, and oracle in the list below).
[[databases]]
# sqlite configuration.
[[[sqlite]]]
# Name to show in the UI.
nice_name=SQLite
# For SQLite, name defines the path to the database.
name=/opt/modules/hue-3.7.0-cdh5.3.6/desktop/desktop.db
# Database backend to use.
engine=sqlite
# Database options to send to the server when connecting.
# https://docs.djangoproject.com/en/1.4/ref/databases/
## options={}
# mysql, oracle, or postgresql configuration.
# Note: the [[[mysql]]] line below must be uncommented.
[[[mysql]]]
# Name to show in the UI.
nice_name="My SQL DB"
# For MySQL and PostgreSQL, name is the name of the database.
# For Oracle, Name is instance of the Oracle server. For express edition
# this is 'xe' by default.
##name=mysqldb
# Database backend to use. This can be:
# 1. mysql
# 2. postgresql
# 3. oracle
engine=mysql
# IP or hostname of the database to connect to.
## host=localhost
# Port the database server is listening to. Defaults are:
# 1. MySQL: 3306
# 2. PostgreSQL: 5432
# 3. Oracle Express Edition: 1521
port=3306
# Username to authenticate with when connecting to the database.
user=root
# Password matching the username to authenticate with when
# connecting to the database.
password=root
# Database options to send to the server when connecting.
# https://docs.djangoproject.com/en/1.4/ref/databases/
## options={}
(6) Integration with Oozie
(7) Integration with HBase