当前位置: 首页 > news >正文

数据采集工具之Flume

本文主要实现数据到datahub的采集过程

1、下载

Index of /dist/flume/1.11.0

datahub插件下载

https://aliyun-datahub.oss-cn-hangzhou.aliyuncs.com/tools/aliyun-flume-datahub-sink-2.0.9.tar.gz

2、安装

$ tar aliyun-flume-datahub-sink-x.x.x.tar.gz
$ cd aliyun-flume-datahub-sink-x.x.x
$ mkdir ${FLUME_HOME}/plugins.d
$ mv aliyun-flume-datahub-sink ${FLUME_HOME}/plugins.d

3、编写配置文件 

# A single-node Flume configuration for DataHub
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /soft/data/test.csv
# Describe the sink
a1.sinks.k1.type = com.aliyun.datahub.flume.sink.DatahubSink
a1.sinks.k1.datahub.accessId = 2Z8tAOpDPBm5LEkA
a1.sinks.k1.datahub.accessKey = Tlupsw2G0PdKGCRyPLucHjeESqoCla
a1.sinks.k1.datahub.endPoint = https://datahub.cn-beijing-tbdg-d01.dh.res.bigdata.tbea.com
a1.sinks.k1.datahub.project = bigdata
a1.sinks.k1.datahub.topic = txt_flume
a1.sinks.k1.serializer = DELIMITED
a1.sinks.k1.serializer.delimiter = ,
a1.sinks.k1.serializer.fieldnames = id,name,gender,salary,my_time,decimal
a1.sinks.k1.serializer.charset = UTF-8
a1.sinks.k1.datahub.retryTimes = 5
a1.sinks.k1.datahub.retryInterval = 5
a1.sinks.k1.datahub.batchSize = 100
a1.sinks.k1.datahub.batchTimeout = 5
a1.sinks.k1.datahub.enablePb = true
a1.sinks.k1.datahub.compressType = DEFLATE
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

4、启动

flume-ng agent -n a1 -c conf -f ./conf/flume-txt2datahub.conf -Dflume.root.logger=INFO,console

Q:启动报错

[root@hadoop2 apache-flume-1.11.0-bin]# flume-ng agent -n a1 -c conf -f ./conf/flume-txt2datahub.conf -Dflume.root.logger=INFO,console
Info: Including Hive libraries found via () for Hive access
+ exec /soft/jdk1.8.0_421/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/soft/apache-flume-1.11.0-bin/conf:/soft/apache-flume-1.11.0-bin/lib/*:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/lib/*:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/libext/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -n a1 -f ./conf/flume-txt2datahub.conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.11.0-bin/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/libext/slf4j-log4j12-1.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkNotNull(Ljava/lang/Object;Ljava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;at com.aliyun.datahub.flume.sink.DatahubSink.configure(DatahubSink.java:59)at org.apache.flume.conf.Configurables.configure(Configurables.java:41)at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:456)at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:109)at org.apache.flume.node.Application.main(Application.java:491)

A:删除Flume lib文件夹中的guava  jar包文件,重新启动

http://www.lryc.cn/news/418411.html

相关文章:

  • 【24年最新】AI大模型零基础入门到精通学习资料大全,学完你就是LLM大师!
  • 使用RabbitMQ死信交换机实现延迟消息
  • overleaf上latex表格的使用,latex绘制三线表
  • 聚焦光热型太阳光模拟器助力多晶硅均匀加热
  • 【Android】四大组件(Activity、Service、Broadcast Receiver、Content Provider)、结构目录
  • 前端开发:创建可拖动的固定位置 `<div>` 和自动隐藏悬浮按钮
  • Java Bean Validation 注解:@NotEmpty、@NotBlank 和 @NotNull 的区别
  • Java | Leetcode Java题解之第322题零钱兑换
  • Linux初启征程指南:攻克常见系统指令与权限初理解
  • 第十九节、野猪受伤死亡逻辑动画
  • vue 开发工具 Hbuilder 简介及应用
  • 【杂谈】-MQTT与HTTP在物联网中的比较:为什么MQTT是更好的选择
  • 冠豪猪优化算法(CPO)、卷积神经网络(CNN)与支持向量机(SVM)结合的预测模型(CPO-CNN-SVM)及其Python和MATLAB实现
  • 【通信原理】
  • 有序数组的平方(LeetCode)
  • Python配置镜像
  • Python新手错误集锦(PyCharm)
  • CTFHUB-web-RCE-php://input
  • Python网络爬虫核心面试题
  • DSL domain specific language of Kola
  • 【RISC-V设计-05】- RISC-V处理器设计K0A之GPR
  • Linux/C 高级——shell脚本
  • SpringBoot面试题整理(1)
  • LVS原理及实例
  • Spring统一功能处理:拦截器、响应与异常的统一管理
  • 深入理解小程序的渲染机制与性能优化策略
  • Linux:多线程(二.理解pthread_t、线程互斥与同步、基于阻塞队列的生产消费模型)
  • Pandas中`str`对象解析与应用实例
  • C语言典型例题29
  • Docker 常规安装简介