Notes and video location:
Link: https://pan.baidu.com/s/11IXcvZZm9DOulUaZVC8l-A  Password: p029
1. Before you start: add a hostname mapping
vim /etc/hosts
192.168.126.129 master
2. Turn off the firewall
How you turn it off depends on your Linux version.
- Check a service's status: systemctl status firewalld.service
- Start a service: systemctl start firewalld.service
- Stop a service: systemctl stop firewalld.service
- Restart a service: systemctl restart firewalld.service
- Enable a service at boot: systemctl enable firewalld.service
- Disable a service at boot: systemctl disable firewalld.service
- Check whether a service is enabled at boot: systemctl is-enabled firewalld.service; echo $?
- List the enabled services: systemctl list-unit-files | grep enabled
3. Create a user
- adduser flume: add the user
- userdel -r flume: delete the user and its home directory
- passwd flume: set the password
4. Switch to the new user
su - flume
5. Create the directory layout
|--bigdata
   |--install  : unpacked installs
   |--software : packages waiting to be installed
   |--test     : data files
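The layout above can be created in one go; a minimal sketch, assuming $HOME as the base directory:

```shell
# Create the bigdata working tree under $HOME (base path is an assumption)
BASE="$HOME/bigdata"
mkdir -p "$BASE/install" "$BASE/software" "$BASE/test"
ls "$BASE"
```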
6. Install the JDK
1. Unpack the JDK:
tar -zxvf jdk-8u102-linux-x64.tar.gz -C ~/bigdata/install/
2. Configure the environment variables
vim ~/.bash_profile
JAVA_HOME=/home/flume/bigdata/install/jdk1.8.0_102
export PATH=$JAVA_HOME/bin:$PATH
3. Apply the changes: source ~/.bash_profile
4. How many ways are there to set environment variables?
There are four common places to set them:
1). The root user can set them in /etc/profile.
2). Other users: ~/.bash_profile: each user can put shell settings meant only for themselves here; it is executed exactly once, at login. By default it sets a few environment variables and then runs the user's .bashrc.
3). Other users: ~/.bashrc: this file holds settings specific to your bash shell; it is read at login and again every time a new shell is opened.
4). In a script: variables can also be set inside a script (equivalent in effect to ~/.bashrc).
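A quick way to check that the variables took effect; a sketch using the JDK path assumed above:

```shell
# Point JAVA_HOME at the unpacked JDK and put its bin/ first on PATH
JAVA_HOME=/home/flume/bigdata/install/jdk1.8.0_102   # assumed install path
export PATH="$JAVA_HOME/bin:$PATH"
# Verify: the JDK's bin directory should now appear on PATH
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME is on PATH" ;;
  *)                    echo "JAVA_HOME is missing from PATH" ;;
esac
```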
7. Install Flume
1. Unpack
tar -zxvf ~/bigdata/software/apache-flume-1.7.0-bin.tar.gz -C ~/bigdata/install/
2. Add Flume to the environment variables
vim ~/.bash_profile
FLUME_HOME=/home/flume/bigdata/install/apache-flume-1.7.0-bin
export PATH=$FLUME_HOME/bin:$PATH
source ~/.bash_profile
3. Configure conf/flume-env.sh
cp flume-env.sh.template flume-env.sh
export JAVA_HOME=/home/flume/bigdata/install/jdk1.8.0_102
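Step 3 can be sketched end to end with a throwaway directory standing in for $FLUME_HOME/conf (the JAVA_HOME value is the assumed path from section 6):

```shell
# Stand-in directory for $FLUME_HOME/conf so the steps can be tried anywhere
CONF_DIR=$(mktemp -d)
echo '# Flume environment settings' > "$CONF_DIR/flume-env.sh.template"
# Copy the template and pin JAVA_HOME for the agent
cp "$CONF_DIR/flume-env.sh.template" "$CONF_DIR/flume-env.sh"
echo 'export JAVA_HOME=/home/flume/bigdata/install/jdk1.8.0_102' >> "$CONF_DIR/flume-env.sh"
grep JAVA_HOME "$CONF_DIR/flume-env.sh"
```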
8. Tests
Test model 1:
source: NetCat TCP
sink: logger
channel: memory
Config file # netcat.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Start:
flume-ng agent --conf conf --conf-file $FLUME_HOME/conf/netcat.conf \
--name a1 -Dflume.root.logger=INFO,console
Open another window and test with:
telnet localhost 44444
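A typo in the channel wiring is a common reason an agent starts cleanly but delivers nothing. A small sketch that writes the config above to a temporary file and greps for the two wiring lines:

```shell
# Write netcat.conf to a temp file (stand-in for $FLUME_HOME/conf/netcat.conf)
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
# Every source needs .channels and every sink needs .channel, or events go nowhere
grep -E '^a1\.(sources\.r1\.channels|sinks\.k1\.channel) = c1' "$CONF"
```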
Test model 2:
source: exec
sink: logger
channel: memory
Config file # exec.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/flume/bigdata/test/flume_t.txt
a1.sources.r1.shell = /bin/bash -c
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Start:
flume-ng agent --conf conf --conf-file $FLUME_HOME/conf/exec.conf \
--name a1 -Dflume.root.logger=INFO,console
Open another window and test with:
echo "ABC">>/home/flume/bigdata/test/flume_t.txt
Test model 3: pull log output from a web application and print it on the console
source: avro
sink: logger
channel: memory
Config file # web-loggor.conf
On the web side you only need to add the log4j jars to the pom:
##### logging dependency
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
</dependency>
##### bridge between Flume and log4j
<dependency>
    <groupId>org.apache.flume.flume-ng-clients</groupId>
    <artifactId>flume-ng-log4jappender</artifactId>
    <version>1.7.0</version>
</dependency>
Add log4j.properties:
log4j.rootCategory=INFO,stdout,flume
log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target = System.out
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern = %-d{yyyy-MM-dd HH:mm:ss,SSS} [%t] - [%p] %m%n
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = 192.168.126.129    ---- the Linux host's IP
log4j.appender.flume.Port = 4141                   ---- any port, as long as it matches the Flume config below
log4j.appender.flume.UnsafeMode = true
Test program:
package com.itstar;

import org.apache.log4j.Logger;

public class FlumeLog {
    private static final Logger log = Logger.getLogger(FlumeLog.class);

    public static void main(String[] args) throws Exception {
        // Emit one INFO event every 5 seconds
        while (true) {
            Thread.sleep(5000);
            log.info("hi");
        }
    }
}
Config file # web-loggor.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = master   ### equivalent to 192.168.126.129
a1.sources.r1.port = 4141
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Start:
flume-ng agent --conf conf --conf-file $FLUME_HOME/conf/web-loggor.conf \
--name a1 -Dflume.root.logger=INFO,console
Test:
Run main to generate logs.
Test model 4: pull log output from the web application, ship it to another Linux host, and print it on that host's console
agent1: a1
source: avro
sink: avro
channel: memory
Config file # web-agent1.conf
agent2: agent2
source: avro
sink: logger
channel: memory
Config file # agent1-logger.conf
# web-agent1.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 192.168.126.129
a1.sources.r1.port = 41414
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 192.168.126.128
a1.sinks.k1.port = 4545
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
# agent1-logger.conf
agent2.sources = r1
agent2.sinks = k1
agent2.channels = c1
agent2.sources.r1.type = avro
agent2.sources.r1.bind = 192.168.126.128
agent2.sources.r1.port = 4545
agent2.sinks.k1.type = logger
agent2.channels.c1.type = memory
agent2.sources.r1.channels = c1
agent2.sinks.k1.channel = c1
Start (order matters: bring up the downstream agent2 first, so agent1's avro sink has something to connect to):
flume-ng agent --conf conf --conf-file $FLUME_HOME/conf/agent1-logger.conf \
--name agent2 -Dflume.root.logger=INFO,console
flume-ng agent --conf conf --conf-file $FLUME_HOME/conf/web-agent1.conf \
--name a1 -Dflume.root.logger=INFO,console
Test: start the program on the web side
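The thing that most often breaks this two-hop chain is a mismatch between agent1's avro sink and agent2's avro source. A sketch of the check, with the values copied from the two configs above:

```shell
# agent1's avro sink endpoint (from web-agent1.conf)
SINK_HOST=192.168.126.128
SINK_PORT=4545
# agent2's avro source endpoint (from agent1-logger.conf)
SRC_BIND=192.168.126.128
SRC_PORT=4545
# The sink must point exactly at the source, or agent1 cannot connect
if [ "$SINK_HOST" = "$SRC_BIND" ] && [ "$SINK_PORT" = "$SRC_PORT" ]; then
  echo "hop is wired correctly"
else
  echo "hop mismatch"
fi
```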
Test model 5: pull log output from the web application, receive it on Linux, and write it to HDFS
Prerequisite: HDFS is running
source: avro
sink: hdfs
channel: memory
Config file # avro-hdfs.conf
avro-hdfs.sources =avro1
avro-hdfs.sinks = k1
avro-hdfs.channels = c1
### define the source
avro-hdfs.sources.avro1.type = avro
avro-hdfs.sources.avro1.bind = 192.168.126.129
avro-hdfs.sources.avro1.port = 4141
### define the sink
avro-hdfs.sinks.k1.type = hdfs
avro-hdfs.sinks.k1.hdfs.path = /output/flume/
avro-hdfs.sinks.k1.hdfs.fileType = DataStream
### define the channel
avro-hdfs.channels.c1.type = memory
### wire them together
avro-hdfs.sources.avro1.channels = c1
avro-hdfs.sinks.k1.channel = c1
Start:
bin/flume-ng agent --conf conf --conf-file conf/avro-hdfs.conf --name avro-hdfs -Dflume.root.logger=INFO,console
Test:
Run the program
Check the data in Hadoop:
hadoop fs -ls /output/flume
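With only hdfs.path and hdfs.fileType set, the sink rolls files on its defaults, which tends to produce many small files in /output/flume/. The properties below are standard Flume hdfs-sink settings but were not in the original config; the values are illustrative only:

```properties
# Roll a new file every 30 s instead of by size/count (0 disables those triggers)
avro-hdfs.sinks.k1.hdfs.rollInterval = 30
avro-hdfs.sinks.k1.hdfs.rollSize = 0
avro-hdfs.sinks.k1.hdfs.rollCount = 0
# Prefix for the files written under /output/flume/
avro-hdfs.sinks.k1.hdfs.filePrefix = events-
```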