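Author: HelloMsLin你好_林小姐 | Source: Internet | 2023-06-08 23:15
Kafka serves as the Flume Channel and stores the collected data in a topic. Flink acts as a Kafka consumer and reads the data from that topic to perform real-time analysis.
Flink program: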
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, createTypeInformation}
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

import java.util.Properties

/**
 * DATE: 2022/10/3 21:49
 * AUTHOR: GX
 */
object SourceKafkaTest {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    // Kafka connection settings
    val properties = new Properties()
    properties.setProperty("bootstrap.servers", "master:9092")
    properties.setProperty("group.id", "consumer-group")

    // Read the "clicks" topic as plain strings
    val stream = env.addSource(
      new FlinkKafkaConsumer[String]("clicks", new SimpleStringSchema(), properties))

    stream.print()
    env.execute()
  }
}
Kafka as the Flume Channel
Collection config:
a1.sources=s1
a1.channels=c1
a1.sources.s1.type=exec
a1.sources.s1.command=tail -F /opt/flinkDemo/data/logs/logs.log
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.kafka.bootstrap.servers=master:9092
a1.channels.c1.kafka.topic=clicks
a1.sources.s1.channels=c1
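A minimal sketch of how this config might be saved to a file and the agent started. The file location /tmp/kafka-channel.conf and the CONF_FILE and FLUME_HOME variables are assumptions, not part of the original. The exec source here tails the file /opt/flinkDemo/data/logs/logs.log. The extra parseAsFlumeEvent=false line is an optional addition: it asks the Kafka channel to store the raw event body rather than an Avro-wrapped Flume event, which is what a plain string consumer such as SimpleStringSchema would expect.

```shell
# Assumed location for the agent config; override CONF_FILE to change it.
CONF_FILE="${CONF_FILE:-/tmp/kafka-channel.conf}"

# Write the agent definition: one exec source feeding one Kafka channel, no sink.
cat > "$CONF_FILE" <<'EOF'
a1.sources=s1
a1.channels=c1
a1.sources.s1.type=exec
a1.sources.s1.command=tail -F /opt/flinkDemo/data/logs/logs.log
a1.sources.s1.channels=c1
a1.channels.c1.type=org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.kafka.bootstrap.servers=master:9092
a1.channels.c1.kafka.topic=clicks
# Optional (not in the original): write the raw line to the topic
# instead of an Avro-serialized Flume event.
a1.channels.c1.parseAsFlumeEvent=false
EOF

# Start agent "a1" only when FLUME_HOME (assumed variable) points at a Flume install.
if [ -x "${FLUME_HOME:-}/bin/flume-ng" ]; then
  "$FLUME_HOME/bin/flume-ng" agent --name a1 \
    --conf "$FLUME_HOME/conf" --conf-file "$CONF_FILE"
else
  echo "Config written to $CONF_FILE; set FLUME_HOME to start the agent."
fi
```

The agent name passed with --name has to match the a1 prefix used in the config.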
Periodically append data to the file (this simulates log generation by appending the current timestamp to the target file):
while true;do echo $(date "+%Y%m%d%H%M%S") >> /opt/flinkDemo/data/logs/logs.log;sleep 0.5;done
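The loop above runs until it is interrupted. For a quick test run, a bounded variant can be handier. The sketch below is one way to do that; the function name generate_logs and its three parameters are hypothetical, not from the original.

```shell
# Append `count` timestamp lines to `file`, sleeping `interval` seconds after each line.
# Variables are global because POSIX sh has no `local`.
generate_logs() {
  gl_file="$1"
  gl_count="$2"
  gl_interval="${3:-0.5}"   # default matches the 0.5 s pause of the endless loop

  # Create the parent directory if it does not exist yet.
  mkdir -p "$(dirname "$gl_file")"

  gl_i=0
  while [ "$gl_i" -lt "$gl_count" ]; do
    # Same line format as the endless loop: yyyyMMddHHmmss
    date "+%Y%m%d%H%M%S" >> "$gl_file"
    sleep "$gl_interval"
    gl_i=$((gl_i + 1))
  done
}

# Example: write 20 lines, one every half second
# generate_logs /opt/flinkDemo/data/logs/logs.log 20 0.5
```

Fractional sleep values such as 0.5 depend on the sleep implementation (GNU coreutils accepts them), so an integer interval may be needed on other systems.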
Alternatively, start a console producer and insert data into the topic by hand:
bin/kafka-console-producer.sh --bootstrap-server master:9092 --topic clicks
Start a console consumer to check whether data is arriving in real time:
bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic clicks