1. Flink Custom Sources

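I. Introduction to Sources

DataStream is Flink's lower-level API for processing data in real time. A Flink program is made up of three parts: Source, Transformation, and Sink, as shown in the figure below.
[Figure: the Flink programming model: Source, Transformation, Sink]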

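Flink ships with a large number of built-in Sources. Common ones include:

  • File-based Sources
  • Socket-based Sources
  • Collection-based Sources
  • Sources based on the Kafka message queue

When none of the built-in Sources meets the business requirement, you can implement a custom Source.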

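The inheritance relationships among Flink's Source interfaces and classes are as follows:
[Figure: inheritance hierarchy of the Source interfaces and classes]

  • SourceFunction: base interface for a non-parallel (parallelism 1) Source
  • RichSourceFunction: abstract base class for a non-parallel rich Source
  • ParallelSourceFunction: base interface for a parallel Source
  • RichParallelSourceFunction: abstract base class for a parallel rich Source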

II. Custom Non-Parallel Source

A custom non-parallel Source implements the SourceFunction interface.

Implementation

MySource.java

package flink.basic.source;

import org.apache.flink.streaming.api.functions.source.SourceFunction;

import java.util.Random;

public class MySource implements SourceFunction<String> {

    // volatile: cancel() is called from a different thread than run()
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        Random random = new Random();
        while (running) {
            // Emit "Num: " followed by a random integer in [0, 100)
            ctx.collect("Num: " + random.nextInt(100));
            Thread.sleep(1000);
        }
    }

    @Override
    public void cancel() {
        // Ends the loop in run()
        running = false;
    }
}
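
The run()/cancel() pair in MySource is a cooperative stop: run() loops until a flag is cleared, and cancel() clears that flag from another thread. The following is a minimal plain-Java sketch of that contract. It does not use any Flink classes, and the names CancellableEmitter and runThenCancel are hypothetical, introduced only for illustration.

```java
import java.util.List;
import java.util.Random;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Plain-Java model of the run()/cancel() contract; it does not use any Flink classes. */
class CancellableEmitter {

    // volatile: cancel() is called from a different thread than run(),
    // so the write must be visible to the emitting loop.
    private volatile boolean running = true;

    /** Emits "Num: n" (n in 0..99) until cancel() is called. */
    void run(Consumer<String> collector, long pauseMillis) {
        Random random = new Random();
        while (running) {
            collector.accept("Num: " + random.nextInt(100));
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    void cancel() {
        running = false;
    }

    /** Runs the emitter on a worker thread, cancels it after runMillis, and returns what was emitted. */
    static List<String> runThenCancel(long runMillis) {
        CancellableEmitter emitter = new CancellableEmitter();
        List<String> out = new CopyOnWriteArrayList<>();
        Thread worker = new Thread(() -> emitter.run(out::add, 10));
        worker.start();
        try {
            Thread.sleep(runMillis);
            emitter.cancel();
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (worker.isAlive()) {
            throw new IllegalStateException("emitter did not stop after cancel()");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> out = runThenCancel(100);
        System.out.println("emitted " + out.size() + " records before cancel");
    }
}
```

Without volatile, the Java memory model does not guarantee that the loop ever observes the write made by cancel(), which is why the flag is declared that way in this sketch.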

SourceDemo.java

package flink.basic.source;

import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SourceDemo {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<String> source = env.addSource(new MySource());
        source.print();
        env.execute("source_demo");
    }
}

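Output (the N> prefix is the index of the print sink subtask that wrote the line; the source itself runs with parallelism 1)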

5> Num: 62
6> Num: 91
7> Num: 13
8> Num: 53

III. Custom Parallel Source

A custom parallel Source implements the ParallelSourceFunction interface.

Implementation

MyParallelSource.java

package flink.basic.source;

import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;

import java.util.Random;

public class MyParallelSource implements ParallelSourceFunction<String> {

    // volatile: cancel() is called from a different thread than run()
    private volatile boolean running = true;

    // Each parallel subtask runs its own copy of this loop
    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        Random random = new Random();
        while (running) {
            ctx.collect("Num: " + random.nextInt(100));
            Thread.sleep(1000);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

SourceDemo.java

package flink.basic.source;

import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SourceDemo {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<String> source = env.addSource(new MyParallelSource());
        source.print();
        env.execute("source_demo");
    }
}

Output

7> Num: 43
8> Num: 30
1> Num: 92
2> Num: 50
5> Num: 39
6> Num: 6
4> Num: 20
3> Num: 2
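
One way to picture the difference from the non-parallel case: a parallel Source runs one independent instance of the same emit loop per subtask, so each round yields as many records as there are subtasks. The following is a rough plain-Java sketch of that idea, not Flink code. The class ParallelEmitSketch, the method runAll, and the fixed number of rounds are assumptions made for illustration, and the `id>` prefix here simply labels the emitting instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Plain-Java model of "one source instance per subtask"; no Flink classes involved. */
class ParallelEmitSketch {

    /** Runs `parallelism` independent emitters for `rounds` rounds each and returns all records. */
    static List<String> runAll(int parallelism, int rounds) {
        ConcurrentLinkedQueue<String> out = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        for (int subtask = 1; subtask <= parallelism; subtask++) {
            final int id = subtask;
            pool.submit(() -> {
                Random random = new Random(); // each instance keeps its own state
                for (int i = 0; i < rounds; i++) {
                    out.add(id + "> Num: " + random.nextInt(100));
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("emitters did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        System.out.println(runAll(1, 1).size()); // 1: a single instance emits one record per round
        System.out.println(runAll(8, 1).size()); // 8: eight instances emit eight records per round
    }
}
```

The order in which the instances write their records is not deterministic, which matches the interleaved subtask prefixes in the output above.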
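
IV. Custom Non-Parallel Rich Source

A rich Source additionally provides the open and close methods, which can be used for initialization and cleanup of the custom Source. A non-parallel rich Source extends the RichSourceFunction abstract class. The example below implements a non-parallel Source that reads a MySQL table.

Create a student table in MySQL and insert three rows.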

create table student (
    id int primary key,
    name varchar(50),
    age int
);

insert into student values (1, 'name1', 20), (2, 'name2', 30), (3, 'name3', 15);

Implementation

Student.java

package flink.basic.source;

public class Student {

    private int id;
    private String name;
    private int age;

    public Student(int id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    public Student() {
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    @Override
    public String toString() {
        return "Student{" +
                "id=" + id +
                ", name='" + name + '\'' +
                ", age=" + age +
                '}';
    }
}

MysqlSource.java

package flink.basic.source;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MysqlSource extends RichSourceFunction<Student> {

    // JDBC handles are created in open(), so they are not part of the serialized function
    private transient Connection conn;
    private transient Statement stmt;
    // Cleared by cancel() so that run() can stop reading early
    private volatile boolean running = true;

    // Initialization: open the JDBC connection before run() is called
    @Override
    public void open(Configuration parameters) throws Exception {
        Class.forName("com.mysql.cj.jdbc.Driver");
        String url = "jdbc:mysql://192.168.47.130:3306/test";
        String user = "root";
        String password = "root";
        conn = DriverManager.getConnection(url, user, password);
        stmt = conn.createStatement();
    }

    // Read the table once and emit one Student per row
    @Override
    public void run(SourceContext<Student> ctx) throws Exception {
        try (ResultSet rs = stmt.executeQuery("select * from student")) {
            while (running && rs.next()) {
                int id = rs.getInt("id");
                String name = rs.getString("name");
                int age = rs.getInt("age");
                ctx.collect(new Student(id, name, age));
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }

    // Cleanup: release the JDBC resources
    @Override
    public void close() throws Exception {
        if (stmt != null) {
            stmt.close();
        }
        if (conn != null) {
            conn.close();
        }
    }
}
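
MysqlSource relies on the lifecycle order open, then run, then close: the connection must exist before the query runs and must be released afterwards. The following is a small plain-Java sketch of that ordering, under the assumption that cleanup is attempted even when run() fails. It involves no Flink or JDBC classes, and LifecycleSketch, RichLikeSource, drive, and trace are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the open() -> run() -> close() order; no Flink or JDBC classes involved. */
class LifecycleSketch {

    /** Hypothetical stand-in for a rich source: acquire in open(), emit in run(), release in close(). */
    interface RichLikeSource {
        void open() throws Exception;
        void run() throws Exception;
        void close() throws Exception;
    }

    /** Calls the three methods in order; close() runs even if run() throws. */
    static void drive(RichLikeSource source) {
        try {
            source.open();
            try {
                source.run();
            } finally {
                source.close();
            }
        } catch (Exception e) {
            // A real runtime would fail or restart the job; this sketch only reports the error.
            System.out.println("source failed: " + e.getMessage());
        }
    }

    /** Records which lifecycle methods were called, optionally simulating a failure in run(). */
    static List<String> trace(boolean failInRun) {
        List<String> events = new ArrayList<>();
        drive(new RichLikeSource() {
            @Override
            public void open() {
                events.add("open");
            }

            @Override
            public void run() {
                events.add("run");
                if (failInRun) {
                    throw new IllegalStateException("query failed");
                }
            }

            @Override
            public void close() {
                events.add("close");
            }
        });
        return events;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // [open, run, close]
        System.out.println(trace(true));  // prints "source failed: query failed", then [open, run, close]
    }
}
```

Because close() may be reached after a failure, a real close() should tolerate resources that were never created, for example by checking for null before closing them.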

SourceDemo.java

package flink.basic.source;

import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SourceDemo {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Add the MySQL Source
        DataStreamSource<Student> source = env.addSource(new MysqlSource());
        source.print();
        env.execute("source_demo");
    }
}

Output (the rows are printed by different print sink subtasks, so their order is not deterministic)

1> Student{id=3, name='name3', age=15}
8> Student{id=2, name='name2', age=30}
7> Student{id=1, name='name1', age=20}