Preface
Global Command
Add the following to your environment configuration so that the `bd` command quickly switches to /data/tools/bigdata:

```shell
cd /etc/profile.d/
vi bd.sh
```

Content:

```shell
alias bd='cd /data/tools/bigdata'
```

Apply the configuration.
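One way to apply it without logging out is to source the new profile script in the current shell (a minimal sketch; bd.sh is the file created above):

```shell
# Reload the new profile script so the alias takes effect in this shell
source /etc/profile.d/bd.sh
```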
Installing Flume
Download
https://dlcdn.apache.org/flume/1.9.0/
Upload the archive to the virtual machine and extract it:

```shell
tar -zxvf apache-flume-1.9.0-bin.tar.gz -C /data/tools/bigdata
```
Environment Variables
Create the configuration file:

```shell
vi /etc/profile.d/flume.sh
```

Set the content to:

```shell
export FLUME_HOME=/data/tools/bigdata/apache-flume-1.9.0-bin
export PATH=$PATH:$FLUME_HOME/bin
```

Apply the configuration.
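As with bd.sh, the new profile script can be sourced to apply it in the current shell:

```shell
# Reload the Flume profile script so FLUME_HOME and PATH are set
source /etc/profile.d/flume.sh
```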
Check whether it took effect
Check the Flume version
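A quick way to verify both the installation and the PATH update is Flume's built-in version command:

```shell
flume-ng version
```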
Testing Flume
Monitor a directory and print its data to the console.
Configuration file:

```shell
vi $FLUME_HOME/conf/spoolingtest.conf
```

Content:
```
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /root/data
a1.sources.r1.fileHeader = true
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sinks.k1.type = logger

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```
Create the /root/data directory.
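For example (the path matches `spoolDir` in the config above):

```shell
# Create the directory the spooling source will watch
mkdir -p /root/data
```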
Start the agent:

```shell
flume-ng agent -n a1 -f $FLUME_HOME/conf/spoolingtest.conf -Dflume.root.logger=DEBUG,console
```

Create a file in the /root/data/ directory, write some content into it, and watch the log output of the Flume process.
For example, put some arbitrary content into test.txt.
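A minimal sketch (the sample text is a placeholder). Note that the spooling directory source expects files to be complete when they appear, so write the file in one step rather than editing it in place afterwards:

```shell
# Drop a finished file into the watched directory
mkdir -p /root/data
echo "hello flume" > /root/data/test.txt
```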
We can observe that Flume:
- prints every line of the file
- renames the file to test.txt.COMPLETED once it has finished processing it
Using Flume
Local files to HDFS
Create the configuration file:

```shell
vi $FLUME_HOME/conf/spoolingToHDFS.conf
```

Configuration:
```
a.sources = r1
a.sinks = k1
a.channels = c1

a.sources.r1.type = spooldir
a.sources.r1.spoolDir = /root/data
a.sources.r1.fileHeader = true
a.sources.r1.interceptors = i1
a.sources.r1.interceptors.i1.type = timestamp

a.sinks.k1.type = hdfs
a.sinks.k1.hdfs.path = /flume/data/dir1
a.sinks.k1.hdfs.filePrefix = student
a.sinks.k1.hdfs.rollSize = 102400
a.sinks.k1.hdfs.rollCount = 1000
a.sinks.k1.hdfs.fileType = DataStream
a.sinks.k1.hdfs.writeFormat = text
a.sinks.k1.hdfs.fileSuffix = .txt

a.channels.c1.type = memory
a.channels.c1.capacity = 1000
a.channels.c1.transactionCapacity = 100

a.sources.r1.channels = c1
a.sinks.k1.channel = c1
```
Prepare data in the /root/data/ directory with the following content:

```
good good study
day day up
```
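One way to create the sample file (the file name here is an arbitrary choice; any complete file dropped into the spooling directory works):

```shell
# Write the sample data as a single finished file
mkdir -p /root/data
cat > /root/data/student.txt <<'EOF'
good good study
day day up
EOF
```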
Start the agent (if HDFS is not running, start it first):

```shell
flume-ng agent -n a -f $FLUME_HOME/conf/spoolingToHDFS.conf -Dflume.root.logger=DEBUG,console
```
Start HDFS:

```shell
$HADOOP_HOME/sbin/start-dfs.sh
```

Stop HDFS:

```shell
$HADOOP_HOME/sbin/stop-dfs.sh
```
You can check the saved files in the web UI (192.168.7.101 is my server's IP):
http://192.168.7.101:50070/explorer.html#/
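The same check can be done from the command line (the path matches `hdfs.path` in the sink configuration above):

```shell
# List the files Flume has rolled into HDFS
hdfs dfs -ls /flume/data/dir1
```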
HBase logs to HDFS

```shell
vi $FLUME_HOME/conf/hbaseLogToHDFS.conf
```

Configuration:
```
a.sources = r1
a.sinks = k1
a.channels = c1

a.sources.r1.type = exec
a.sources.r1.command = tail -f /data/tools/bigdata/hbase-2.4.11/logs/hbase-root-master-master.log

a.sinks.k1.type = hdfs
a.sinks.k1.hdfs.path = /flume/data/dir2
a.sinks.k1.hdfs.filePrefix = hbaselog
a.sinks.k1.hdfs.rollSize = 102400
a.sinks.k1.hdfs.rollCount = 1000
a.sinks.k1.hdfs.fileType = DataStream
a.sinks.k1.hdfs.writeFormat = text
a.sinks.k1.hdfs.fileSuffix = .txt

a.channels.c1.type = memory
a.channels.c1.capacity = 1000
a.channels.c1.transactionCapacity = 100

a.sources.r1.channels = c1
a.sinks.k1.channel = c1
```
Start:

```shell
flume-ng agent -n a -f $FLUME_HOME/conf/hbaseLogToHDFS.conf -Dflume.root.logger=DEBUG,console
```
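As before, the output directory can be inspected on HDFS (dir2 matches `hdfs.path` in this sink):

```shell
# List the rolled log files on HDFS
hdfs dfs -ls /flume/data/dir2
```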
HBase logs to HBase
Create the configuration file:

```shell
vi $FLUME_HOME/conf/hbaselogToHBase.conf
```

Configuration:
```
a.sources = r1
a.sinks = k1
a.channels = c1

a.sources.r1.type = exec
a.sources.r1.command = cat /data/tools/bigdata/hbase-2.4.11/logs/hbase-root-master-master.log

a.sinks.k1.type = hbase
a.sinks.k1.table = t_log
a.sinks.k1.columnFamily = cf1

a.channels.c1.type = memory
a.channels.c1.capacity = 100000
a.channels.c1.transactionCapacity = 100

a.sources.r1.channels = c1
a.sinks.k1.channel = c1
```
Create the t_log table in HBase.
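A sketch using the HBase shell; the table name and column family come from `a.sinks.k1.table` and `a.sinks.k1.columnFamily` in the config above:

```shell
# Create the target table with the column family the sink writes to
echo "create 't_log', 'cf1'" | hbase shell
```

(If the `hbase` sink type fails against HBase 2.x, note that Flume 1.9 also ships an `hbase2` sink type.)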
Start:

```shell
flume-ng agent -n a -f $FLUME_HOME/conf/hbaselogToHBase.conf -Dflume.root.logger=DEBUG,console
```
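To verify that events arrived, scan a few rows of the table (assuming the HBase shell is available):

```shell
# Show up to 5 rows written by the sink
echo "scan 't_log', {LIMIT => 5}" | hbase shell
```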
Network log capture
Create the configuration file:

```shell
vi $FLUME_HOME/conf/netcatLogger.conf
```

Configuration:
```
a.sources = r1
a.sinks = k1
a.channels = c1

a.sources.r1.type = netcat
a.sources.r1.bind = 0.0.0.0
a.sources.r1.port = 8888

a.sinks.k1.type = logger

a.channels.c1.type = memory
a.channels.c1.capacity = 1000
a.channels.c1.transactionCapacity = 100

a.sources.r1.channels = c1
a.sinks.k1.channel = c1
```
Start
Start the agent first:

```shell
flume-ng agent -n a -f $FLUME_HOME/conf/netcatLogger.conf -Dflume.root.logger=DEBUG,console
```
Open a new console.
Install telnet.
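For example, on a CentOS-style system (the package manager is an assumption about the environment):

```shell
yum install -y telnet
```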
Then start telnet.
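Connect to the port the netcat source is listening on (8888, from the config above); each line typed is delivered to the agent and printed by the logger sink:

```shell
telnet localhost 8888
```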
HTTP request logs
Create the configuration file:

```shell
vi $FLUME_HOME/conf/httpToLogger.conf
```

Configuration:
```
a.sources = r1
a.sinks = k1
a.channels = c1

a.sources.r1.type = http
a.sources.r1.port = 6666

a.sinks.k1.type = logger

a.channels.c1.type = memory
a.channels.c1.capacity = 1000
a.channels.c1.transactionCapacity = 100

a.sources.r1.channels = c1
a.sinks.k1.channel = c1
```
Start the agent first:

```shell
flume-ng agent -n a -f $FLUME_HOME/conf/httpToLogger.conf -Dflume.root.logger=DEBUG,console
```
Then use curl to send an HTTP request.
Open a new console:

```shell
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "hello~http~flume~"}]' http://localhost:6666
```