
Finding People You May Know

We are given a file named friend.txt.

Each line of the file contains two names separated by a space. (The figure below shows the file contents.)

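Task: find the people who may know each other.

The code is as follows. It runs as two MapReduce jobs: the Relation job marks every pair of names as either real friends (1) or possible friends (0), and the Friend job keeps only the pairs that were never marked as real friends.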

RelationMapper:

package com.fesco.friend;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class RelationMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, Text>.Context context) throws IOException, InterruptedException {
        // Split the line into the two names
        String[] arr = value.toString().split(" ");
        context.write(new Text(arr[0]), new Text(arr[1]));
    }
}
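The mapper above emits each line in one direction only, so the friend list a reducer sees for a person holds only the names written to that person's right in friend.txt. If the file lists each friendship once rather than in both directions (an assumption, since the file contents are not reproduced here), some shared friends would be missed. A minimal plain-Java sketch of emitting both directions, without Hadoop; `EdgeEmitter` and `emitBothDirections` are hypothetical names, not part of the original code:

```java
import java.util.ArrayList;
import java.util.List;

final class EdgeEmitter {

    /** Returns the (key, value) pairs for one input line, in both directions. */
    static List<String[]> emitBothDirections(String line) {
        List<String[]> out = new ArrayList<>();
        // Split on any run of whitespace so extra spaces do not break parsing
        String[] arr = line.trim().split("\\s+");
        if (arr.length < 2) {
            return out; // skip blank or malformed lines
        }
        out.add(new String[] {arr[0], arr[1]});
        out.add(new String[] {arr[1], arr[0]});
        return out;
    }
}
```

In the real mapper this would correspond to a second `context.write` call with the two names swapped. Duplicate records are harmless here, because the second job only checks whether a 1 ever appears for a pair.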

RelationReducer:

package com.fesco.friend;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RelationReducer extends Reducer<Text, Text, Text, IntWritable> {

    // The two people really know each other
    private static final IntWritable trueFriend = new IntWritable(1);
    // The two people may know each other
    private static final IntWritable fakeFriend = new IntWritable(0);

    @Override
    protected void reduce(Text key, Iterable<Text> values, Reducer<Text, Text, Text, IntWritable>.Context context) throws IOException, InterruptedException {
        // key = tom
        // values = rose jim smith lucy
        String name = key.toString();
        // values can only be iterated once, so copy the friend list
        // into our own collection (ArrayList gives constant-time get(i))
        List<String> fs = new ArrayList<>();
        // Record the real friendships; the smaller name always goes first
        for (Text value : values) {
            String f = value.toString();
            fs.add(f);
            if (name.compareTo(f) <= 0) context.write(new Text(name + "-" + f), trueFriend);
            else context.write(new Text(f + "-" + name), trueFriend);
        }
        // Infer possible friendships: any two friends of the same person
        for (int i = 0; i < fs.size() - 1; i++) {
            String f1 = fs.get(i);
            for (int j = i + 1; j < fs.size(); j++) {
                String f2 = fs.get(j);
                if (f1.compareTo(f2) <= 0) context.write(new Text(f1 + "-" + f2), fakeFriend);
                else context.write(new Text(f2 + "-" + f1), fakeFriend);
            }
        }
    }
}
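The reducer's two loops can be tried out without a cluster. The following is a minimal plain-Java sketch of the same pairing logic, not the Hadoop job itself; `PairEmitter`, `orderedPair`, and `emit` are hypothetical names, and each returned string imitates one output line with the key and the flag separated by a tab:

```java
import java.util.ArrayList;
import java.util.List;

final class PairEmitter {

    /** Joins two names with "-", putting the lexicographically smaller one first. */
    static String orderedPair(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "-" + b : b + "-" + a;
    }

    /** Mirrors reduce(): flag 1 for real friendships, flag 0 for inferred ones. */
    static List<String> emit(String name, List<String> friends) {
        List<String> out = new ArrayList<>();
        // Every listed friend is a real friend of `name`
        for (String f : friends) {
            out.add(orderedPair(name, f) + "\t1");
        }
        // Any two friends of `name` share `name` as a common friend
        for (int i = 0; i < friends.size() - 1; i++) {
            for (int j = i + 1; j < friends.size(); j++) {
                out.add(orderedPair(friends.get(i), friends.get(j)) + "\t0");
            }
        }
        return out;
    }
}
```

For the key tom with friends rose and jim, this yields rose-tom and jim-tom with flag 1, and jim-rose with flag 0. Ordering the two names means the same pair always produces the same key, whichever person's friend list it was found in, so the second job can group all flags for that pair together.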

RelationDriver:

package com.fesco.friend;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class RelationDriver {

    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(RelationDriver.class);
        job.setMapperClass(RelationMapper.class);
        job.setReducerClass(RelationReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path("hdfs://10.16.3.181:9000/txt/friend.txt"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://10.16.3.181:9000/result/relation"));

        job.waitForCompletion(true);
    }
}

FriendMapper: 

package com.fesco.friend;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class FriendMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, LongWritable>.Context context) throws IOException, InterruptedException {
        // Split the record: the first job wrote each line as pair<TAB>flag
        String[] arr = value.toString().split("\t");
        context.write(new Text(arr[0]), new LongWritable(Long.parseLong(arr[1])));
    }
}

FriendReducer: 

package com.fesco.friend;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class FriendReducer extends Reducer<Text, LongWritable, Text, Text> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Reducer<Text, LongWritable, Text, Text>.Context context) throws IOException, InterruptedException {
        // Check whether the two people already know each other.
        // If a 1 appears, they are real friends, so this is not a pair we are looking for.
        // If every value is 0, they do not know each other but share a common friend.
        for (LongWritable value : values) {
            if (value.get() == 1) return;
        }
        // The loop finished without returning, so every value was 0
        String[] arr = key.toString().split("-");
        context.write(new Text(arr[0]), new Text(arr[1]));
    }
}
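The filtering step of the second job can also be checked in plain Java. This is a minimal in-memory sketch under the assumption that its input lines look like the first job's output (pair, tab, flag); `CandidateFilter` and `possibleFriends` are hypothetical names, and the `TreeMap` stands in for the grouping and key sorting that the shuffle performs in a real job:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CandidateFilter {

    /** Returns "name1<TAB>name2" for every pair that never carries the flag 1. */
    static List<String> possibleFriends(List<String> lines) {
        // pair -> true if any record marks the pair as real friends
        Map<String, Boolean> direct = new TreeMap<>();
        for (String line : lines) {
            String[] arr = line.split("\t");
            boolean isDirect = Long.parseLong(arr[1]) == 1;
            direct.merge(arr[0], isDirect, Boolean::logicalOr);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : direct.entrySet()) {
            if (!e.getValue()) {
                // Only zeros were seen: a shared friend, but no direct friendship
                String[] names = e.getKey().split("-");
                out.add(names[0] + "\t" + names[1]);
            }
        }
        return out;
    }
}
```

Given the records rose-tom 1, jim-tom 1, jim-rose 0, jim-rose 0, and jim-tom 0, only jim and rose are reported: jim-tom carries a 1, so the 0 recorded for it is ignored. Like the reducer above, the sketch splits the key on "-", so it assumes that names themselves contain no hyphen.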

FriendDriver: 

package com.fesco.friend;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class FriendDriver {

    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(FriendDriver.class);
        job.setMapperClass(FriendMapper.class);
        job.setReducerClass(FriendReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path("hdfs://10.16.3.181:9000/result/relation"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://10.16.3.181:9000/result/friend"));

        job.waitForCompletion(true);
    }
}
