Author: 刘自龙Sophisten | Source: Internet | 2023-08-17 11:45
1. Relationship between Spark, Hadoop, and YARN
Spark: computation
Hadoop: storage
YARN: resource management
Here we mainly configure HDFS and YARN.
HDFS
YARN
MapReduce (compute framework; Spark fills this role)
YARN: main process: ResourceManager
Start YARN: sbin/start-yarn.sh
Stop YARN: sbin/stop-yarn.sh
Web UI URL: http://localhost:8088
HDFS:
NameNode process
DataNode process
Start and stop DFS: sbin/start-dfs.sh, sbin/stop-dfs.sh
Web UI URL: http://localhost:50070
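After running the start scripts above, one way to confirm that the daemons came up is to compare the output of `jps` (the JDK tool that lists running JVM processes) against the process names listed above. The following is a minimal sketch under stated assumptions: `missing_daemons` is a hypothetical helper name, not part of Hadoop, and it assumes each `jps` line has the form `<pid> <ProcessName>`.

```shell
# Hypothetical helper (not part of Hadoop): reads jps-style lines
# ("<pid> <ProcessName>") on stdin and prints every expected daemon
# name, given as an argument, that does not appear in the input.
missing_daemons() {
    listing=$(cat)
    for daemon in "$@"; do
        # Match the process name as the whole second field of a line,
        # so that SecondaryNameNode does not count as NameNode.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

For example, `jps | missing_daemons NameNode DataNode ResourceManager` should print nothing when all three processes are running, and otherwise prints the names that are missing.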
2. Setting up a Hadoop 2.6 cluster environment
Download and extract Hadoop
Set the environment variables
2.1 Set HADOOP_HOME
2.2 Set HADOOP_CONF_DIR to $HADOOP_HOME/etc/hadoop
2.3 Set YARN_CONF_DIR to $HADOOP_HOME/etc/hadoop
In detail:
vim ~/.bashrc
export JAVA_HOME=/usr/lib/java/jdk1.8.0_45
export JRE_HOME=${JAVA_HOME}/jre
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
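Once the exports above are in place and the shell profile has been reloaded, a quick check that nothing was missed can save time later. The following is a minimal sketch; `unset_vars` is a hypothetical helper name introduced here for illustration, not a Hadoop command.

```shell
# Hypothetical helper (not part of Hadoop): prints the name of every
# listed environment variable that is unset or empty.
unset_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX sh compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
        fi
    done
}
```

Running `unset_vars JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR YARN_CONF_DIR` should print nothing if all four variables are set; any name it prints still needs an export line.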
- core-site.xml settings
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
  </property>
</configuration>
- hdfs-site.xml settings
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
  </property>
</configuration>
- mapred-site.xml settings
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
- hadoop-env.sh: set $JAVA_HOME
export JAVA_HOME=/usr/lib/java/jdk1.8.0_45
- yarn-env.sh: set $JAVA_HOME
export JAVA_HOME=/usr/lib/java/jdk1.8.0_45
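The three *-site.xml files above share one shape: a configuration root element holding property elements, each with a name and a value. As an illustration of that structure, the sketch below prints such a file from name=value arguments. `site_xml` is a hypothetical helper name, not a Hadoop tool, and it assumes the values contain no XML special characters such as & or <.

```shell
# Hypothetical helper (not part of Hadoop): prints a minimal Hadoop
# *-site.xml document built from "name=value" arguments.
# Values are written verbatim, so they must not contain XML special
# characters such as & or <.
site_xml() {
    echo '<configuration>'
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first "="
        value=${pair#*=}    # text after the first "="
        echo '  <property>'
        printf '    <name>%s</name>\n' "$name"
        printf '    <value>%s</value>\n' "$value"
        echo '  </property>'
    done
    echo '</configuration>'
}
```

For instance, `site_xml dfs.replication=1` prints a one-property document; redirecting the output to a file under $HADOOP_CONF_DIR would overwrite that file, so it is worth backing up the existing one first.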