
Object Detection Series Tutorial 16: YOLOv5 Source Code Walkthrough 6 (the mosaic data-augmentation function load_mosaic)

😎😎😎 Object Detection Series: full table of contents

Feel free to leave any questions in the comments below.
All code in this article was run in PyCharm.
The companion source code for this article has been uploaded.
Click here to download the source code

9. The load_mosaic function

Mosaic data augmentation: stitch four different images into one large image to increase scene complexity and diversity.

9.1 The load_mosaic function

def load_mosaic(self, index):
    labels4, segments4 = [], []
    s = self.img_size
    yc, xc = [int(random.uniform(-x, 2 * s + x)) for x in self.mosaic_border]  # mosaic center x, y
    indices = [index] + random.choices(self.indices, k=3)  # 3 additional image indices
    for i, index in enumerate(indices):
        img, _, (h, w) = load_image(self, index)
        if i == 0:  # top left
            img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8)
            x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
        elif i == 3:  # bottom right
            x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
        img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]
        padw = x1a - x1b
        padh = y1a - y1b
        labels, segments = self.labels[index].copy(), self.segments[index].copy()
        if labels.size:
            labels[:, 1:] = xywhn2xyxy(labels[:, 1:], w, h, padw, padh)
            segments = [xyn2xy(x, w, h, padw, padh) for x in segments]
        labels4.append(labels)
        segments4.extend(segments)
    labels4 = np.concatenate(labels4, 0)
    for x in (labels4[:, 1:], *segments4):
        np.clip(x, 0, 2 * s, out=x)
    img4, labels4 = random_perspective(img4, labels4, segments4,
                                       degrees=self.hyp['degrees'],
                                       translate=self.hyp['translate'],
                                       scale=self.hyp['scale'],
                                       shear=self.hyp['shear'],
                                       perspective=self.hyp['perspective'],
                                       border=self.mosaic_border)
    return img4, labels4
  1. Define the function, which takes an image index as its argument.
  2. labels4, segments4: lists that will hold the labels and segmentation data of the stitched image.
  3. s: the target size of a single image.
  4. yc, xc: the coordinates of the mosaic center, sampled randomly within a fixed range. Because the center is random, the four images may overlap one another, and anything beyond the canvas boundary is cropped away.
  5. indices: randomly pick three additional image indices and combine them with the current index into one list.
  6. indices now holds four image indices; iterate over this list.
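
The center sampling in step 4 can be sketched in isolation. The values of s and mosaic_border below are assumptions for illustration (YOLOv5's usual defaults are img_size 640 and a border of -s // 2):

```python
import random

s = 640                              # assumed single-image target size
mosaic_border = [-s // 2, -s // 2]   # assumed default, as in the YOLOv5 dataset class

# uniform(-x, 2*s + x) with x = -s//2 samples from [s/2, 3*s/2] on each axis,
# so the mosaic center always lands well inside the 2s x 2s canvas
yc, xc = [int(random.uniform(-x, 2 * s + x)) for x in mosaic_border]
```

With these defaults both coordinates fall in [320, 960], which guarantees every quadrant of the canvas receives at least part of one image.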

For each of the four images, compute its placement on the big canvas and the region to crop from it: initialize a large canvas, work out where the current small image goes on the canvas, and work out which part of it is pasted there. Regions not covered by any image keep the fill value 114. If an image or its labels extend beyond the canvas, the out-of-bounds part of the image is discarded and the boxes are clipped accordingly.

  1. img, _, (h, w): load the image at the current index with load_image, which returns the resized image along with its height and width.
  2. If this is the 1st image, i.e. top left:
  3. Create a canvas of size (s * 2, s * 2) with the same number of channels as img, with every pixel set to 114.
  4. Compute where the 1st image is placed on the mosaic canvas.
  5. Compute the region to crop from the 1st image.
  6. If this is the 2nd image, i.e. top right:
  7. Compute where the 2nd image is placed on the mosaic canvas.
  8. Compute the region to crop from the 2nd image.
  9. If this is the 3rd image, i.e. bottom left:
  10. Compute where the 3rd image is placed on the mosaic canvas.
  11. Compute the region to crop from the 3rd image.
  12. If this is the 4th image, i.e. bottom right:
  13. Compute where the 4th image is placed on the mosaic canvas.
  14. Compute the region to crop from the 4th image.
  15. Paste the cropped region of the current image into the canvas.
  16. padw: the horizontal offset between canvas coordinates and image coordinates.
  17. padh: the vertical offset between canvas coordinates and image coordinates.
  18. Copy the labels and segmentation data for the current image index.
  19. If the current image has labels:
  20. Convert the labels from normalized xywh format to pixel-level xyxy format with xywhn2xyxy, applying the padding offsets.
  21. Apply the same conversion and offsets to the segmentation data with xyn2xy.
  22. Append the current image's labels to the labels4 list.
  23. Extend the segments4 list with the current image's segmentation data.
  24. labels4: concatenate the labels of all images into a single ndarray.
  25. Iterate over all label and segment coordinates in preparation for clipping.
  26. Use np.clip to keep coordinate values within the bounds of the mosaic canvas.
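
The placement arithmetic for the top-left quadrant (steps 4 and 5 above) can be checked with concrete numbers. The center and image size here are hypothetical:

```python
s = 640
xc, yc = 500, 700   # hypothetical mosaic center
h, w = 480, 640     # hypothetical image size after load_image

# Top-left quadrant: the image is anchored so its bottom-right corner sits at (xc, yc);
# max(..., 0) clips whatever would fall off the left/top edge of the canvas.
x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc        # region on the big canvas
x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h        # region cropped from the image

# The destination and source regions always have the same shape, so the
# assignment img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b] is valid.
dest_shape = (x2a - x1a, y2a - y1a)
src_shape = (x2b - x1b, y2b - y1b)
```

Here xc - w = -140, so the canvas region starts at x1a = 0 and the leftmost 140 pixel columns of the image are discarded, exactly the out-of-bounds cropping described above.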

Once the big image is assembled, further augmentation can be applied to it (here via a helper function); an alternative design augments each small image first and stitches them afterwards.

  1. Apply a random perspective transform to the mosaic image and its labels with random_perspective, for further data augmentation.
  2. Return the mosaic image and its corresponding labels.

9.2 The load_image function

def load_image(self, index):
    # loads 1 image from dataset, returns img, original hw, resized hw
    img = self.imgs[index]
    if img is None:  # not cached
        path = self.img_files[index]
        img = cv2.imread(path)  # BGR
        assert img is not None, 'Image Not Found ' + path
        h0, w0 = img.shape[:2]  # orig hw
        r = self.img_size / max(h0, w0)  # resize image to img_size
        if r != 1:  # always resize down, only resize up if training with augmentation
            interp = cv2.INTER_AREA if r < 1 and not self.augment else cv2.INTER_LINEAR
            img = cv2.resize(img, (int(w0 * r), int(h0 * r)), interpolation=interp)
        return img, (h0, w0), img.shape[:2]  # img, hw_original, hw_resized
    else:
        return self.imgs[index], self.img_hw0[index], self.img_hw[index]  # img, hw_original, hw_resized
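
The resize logic can be traced without OpenCV, since only the ratio r matters. The original image size below is hypothetical:

```python
img_size = 640               # assumed dataset img_size
h0, w0 = 1080, 1920          # hypothetical original image size

r = img_size / max(h0, w0)   # scale so the longer side becomes img_size
new_w, new_h = int(w0 * r), int(h0 * r)
# 1920 -> 640 and 1080 -> 360: the aspect ratio is preserved,
# and the image is only ever downscaled here because r < 1
```

Because r is computed from the longer side, neither dimension of the resized image can exceed img_size.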

9.3 The xywhn2xyxy function

def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):
    # Convert nx4 boxes from [x, y, w, h] normalized to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = w * (x[:, 0] - x[:, 2] / 2) + padw  # top left x
    y[:, 1] = h * (x[:, 1] - x[:, 3] / 2) + padh  # top left y
    y[:, 2] = w * (x[:, 0] + x[:, 2] / 2) + padw  # bottom right x
    y[:, 3] = h * (x[:, 1] + x[:, 3] / 2) + padh  # bottom right y
    return y
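
A single-box trace of this conversion may make the formulas concrete. The image size, offsets, and label values below are hypothetical:

```python
w, h, padw, padh = 640, 480, 100, 50    # hypothetical image size and mosaic offsets
xc, yc, bw, bh = 0.5, 0.5, 0.25, 0.5    # hypothetical normalized [x, y, w, h] label

# Same arithmetic as xywhn2xyxy, written out for one box:
x1 = w * (xc - bw / 2) + padw   # top-left x
y1 = h * (yc - bh / 2) + padh   # top-left y
x2 = w * (xc + bw / 2) + padw   # bottom-right x
y2 = h * (yc + bh / 2) + padh   # bottom-right y
print(x1, y1, x2, y2)           # 340.0 170.0 500.0 410.0
```

A box centered in the image with a quarter of its width and half of its height maps to pixel corners (340, 170) and (500, 410) after the (100, 50) mosaic offset is applied.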

9.4 The xyn2xy function

def xyn2xy(x, w=640, h=640, padw=0, padh=0):
    # Convert normalized segments into pixel segments, shape (n,2)
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = w * x[:, 0] + padw  # top left x
    y[:, 1] = h * x[:, 1] + padh  # top left y
    return y

9.5 The random_perspective function

def random_perspective(img, targets=(), segments=(), degrees=10, translate=.1, scale=.1, shear=10,
                       perspective=0.0, border=(0, 0)):
    height = img.shape[0] + border[0] * 2  # shape(h,w,c)
    width = img.shape[1] + border[1] * 2

    # Center
    C = np.eye(3)
    C[0, 2] = -img.shape[1] / 2  # x translation (pixels)
    C[1, 2] = -img.shape[0] / 2  # y translation (pixels)

    # Perspective
    P = np.eye(3)
    P[2, 0] = random.uniform(-perspective, perspective)  # x perspective (about y)
    P[2, 1] = random.uniform(-perspective, perspective)  # y perspective (about x)

    # Rotation and Scale
    R = np.eye(3)
    a = random.uniform(-degrees, degrees)
    s = random.uniform(1 - scale, 1 + scale)
    R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)

    # Shear
    S = np.eye(3)
    S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # x shear (deg)
    S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # y shear (deg)

    # Translation
    T = np.eye(3)
    T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width  # x translation (pixels)
    T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height  # y translation (pixels)

    # Combined transform
    M = T @ S @ R @ P @ C  # order of operations (right to left) is IMPORTANT
    if (border[0] != 0) or (border[1] != 0) or (M != np.eye(3)).any():  # image changed
        if perspective:
            img = cv2.warpPerspective(img, M, dsize=(width, height), borderValue=(114, 114, 114))
        else:  # affine
            img = cv2.warpAffine(img, M[:2], dsize=(width, height), borderValue=(114, 114, 114))

    # Transform label coordinates
    n = len(targets)
    if n:
        use_segments = any(x.any() for x in segments)
        new = np.zeros((n, 4))
        if use_segments:  # warp segments
            segments = resample_segments(segments)  # upsample
            for i, segment in enumerate(segments):
                xy = np.ones((len(segment), 3))
                xy[:, :2] = segment
                xy = xy @ M.T  # transform
                xy = xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2]  # perspective rescale or affine
                new[i] = segment2box(xy, width, height)
        else:  # warp boxes
            xy = np.ones((n * 4, 3))
            xy[:, :2] = targets[:, [1, 2, 3, 4, 1, 4, 3, 2]].reshape(n * 4, 2)  # x1y1, x2y2, x1y2, x2y1
            xy = xy @ M.T  # transform
            xy = (xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2]).reshape(n, 8)  # perspective rescale or affine

            # new bounding boxes from the four transformed corners
            x = xy[:, [0, 2, 4, 6]]
            y = xy[:, [1, 3, 5, 7]]
            new = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T

            # clip to the output canvas
            new[:, [0, 2]] = new[:, [0, 2]].clip(0, width)
            new[:, [1, 3]] = new[:, [1, 3]].clip(0, height)

        # filter degenerate boxes
        i = box_candidates(box1=targets[:, 1:5].T * s, box2=new.T, area_thr=0.01 if use_segments else 0.10)
        targets = targets[i]
        targets[:, 1:5] = new[i]

    return img, targets
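
How the five matrices compose can be shown with fixed toy values in place of the random draws (shear and perspective left as the identity here; the sizes, angle, and scale are assumptions):

```python
import math
import numpy as np

w0, h0 = 1280, 1280                       # hypothetical input mosaic size
border = (-320, -320)                     # as passed from load_mosaic
height, width = h0 + border[0] * 2, w0 + border[1] * 2   # output canvas: 640 x 640

C = np.eye(3)
C[0, 2], C[1, 2] = -w0 / 2, -h0 / 2       # move the image center to the origin

a, s = 10.0, 1.1                          # hypothetical rotation angle and scale
R = np.eye(3)                             # same matrix cv2.getRotationMatrix2D builds
R[:2] = [[s * math.cos(math.radians(a)), s * math.sin(math.radians(a)), 0],
         [-s * math.sin(math.radians(a)), s * math.cos(math.radians(a)), 0]]

S = np.eye(3)                             # shear: identity in this sketch
P = np.eye(3)                             # perspective: identity in this sketch

T = np.eye(3)
T[0, 2], T[1, 2] = 0.5 * width, 0.5 * height   # translate back to the canvas center

M = T @ S @ R @ P @ C   # applied right to left: center, rotate/scale, translate

# The image center is a fixed point of rotation about the origin, so it
# maps exactly to the center of the output canvas, (320, 320).
center = M @ np.array([w0 / 2, h0 / 2, 1.0])
```

Composing the matrices first and warping once is what lets the real function apply a single cv2.warpAffine (or cv2.warpPerspective) instead of five separate warps.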