Daily Attention Study 7: Frequency-Perception Module

Source

[link] [code] [ACM MM 23] Frequency Perception Network for Camouflaged Object Detection


Module name

Frequency-Perception Module (FPM)


Purpose

Captures frequency-domain information to better identify camouflaged objects.


Module structure

[Figure: FPM structure diagram (image not available)]

Module code
import torch
import torch.nn as nn
import torch.nn.functional as F


class FirstOctaveConv(nn.Module):
    # Splits one input feature map into a high-frequency branch (full resolution)
    # and a low-frequency branch (half resolution).
    def __init__(self, in_channels, out_channels, kernel_size, alpha=0.5, stride=1, padding=1, dilation=1,
                 groups=1, bias=False):
        super(FirstOctaveConv, self).__init__()
        self.stride = stride
        kernel_size = kernel_size[0]
        self.h2g_pool = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
        self.h2l = torch.nn.Conv2d(in_channels, int(alpha * in_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2h = torch.nn.Conv2d(in_channels, in_channels - int(alpha * in_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)

    def forward(self, x):
        if self.stride == 2:
            x = self.h2g_pool(x)
        X_h2l = self.h2g_pool(x)  # downsample before the low-frequency convolution
        X_h = x
        X_h = self.h2h(X_h)
        X_l = self.h2l(X_h2l)
        return X_h, X_l


class OctaveConv(nn.Module):
    # Exchanges information between the two branches: h2h, l2h, h2l, l2l.
    def __init__(self, in_channels, out_channels, kernel_size, alpha=0.5, stride=1, padding=1, dilation=1,
                 groups=1, bias=False):
        super(OctaveConv, self).__init__()
        kernel_size = kernel_size[0]
        self.h2g_pool = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
        self.upsample = torch.nn.Upsample(scale_factor=2, mode='nearest')
        self.stride = stride
        self.l2l = torch.nn.Conv2d(int(alpha * in_channels), int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.l2h = torch.nn.Conv2d(int(alpha * in_channels), out_channels - int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2l = torch.nn.Conv2d(in_channels - int(alpha * in_channels), int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2h = torch.nn.Conv2d(in_channels - int(alpha * in_channels),
                                   out_channels - int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)

    def forward(self, x):
        X_h, X_l = x
        if self.stride == 2:
            X_h, X_l = self.h2g_pool(X_h), self.h2g_pool(X_l)
        X_h2l = self.h2g_pool(X_h)
        X_h2h = self.h2h(X_h)
        X_l2h = self.l2h(X_l)
        X_l2l = self.l2l(X_l)
        X_h2l = self.h2l(X_h2l)
        # Resize the low-to-high output to the high-frequency resolution before summing.
        X_l2h = F.interpolate(X_l2h, (int(X_h2h.size()[2]), int(X_h2h.size()[3])), mode='bilinear')
        X_h = X_l2h + X_h2h
        X_l = X_h2l + X_l2l
        return X_h, X_l


class LastOctaveConv(nn.Module):
    # Merges the two branches back into a single full-resolution feature map.
    def __init__(self, in_channels, out_channels, kernel_size, alpha=0.5, stride=1, padding=1, dilation=1,
                 groups=1, bias=False):
        super(LastOctaveConv, self).__init__()
        self.stride = stride
        kernel_size = kernel_size[0]
        self.h2g_pool = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
        self.l2h = torch.nn.Conv2d(int(alpha * out_channels), out_channels,
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2h = torch.nn.Conv2d(out_channels - int(alpha * out_channels),
                                   out_channels,
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.upsample = torch.nn.Upsample(scale_factor=2, mode='nearest')

    def forward(self, x):
        X_h, X_l = x
        if self.stride == 2:
            X_h, X_l = self.h2g_pool(X_h), self.h2g_pool(X_l)
        X_h2h = self.h2h(X_h)
        X_l2h = self.l2h(X_l)
        X_l2h = F.interpolate(X_l2h, (int(X_h2h.size()[2]), int(X_h2h.size()[3])), mode='bilinear')
        X_h = X_h2h + X_l2h
        return X_h


class FPM(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=(3, 3)):
        super(FPM, self).__init__()
        self.fir = FirstOctaveConv(in_channels, out_channels, kernel_size)
        self.mid1 = OctaveConv(in_channels, in_channels, kernel_size)
        self.mid2 = OctaveConv(in_channels, out_channels, kernel_size)
        self.lst = LastOctaveConv(in_channels, out_channels, kernel_size)

    def forward(self, x):
        x_h, x_l = self.fir(x)
        x_h_1, x_l_1 = self.mid1((x_h, x_l))
        # mid1 is applied a second time, so both passes share the same weights.
        x_h_2, x_l_2 = self.mid1((x_h_1, x_l_1))
        x_h_5, x_l_5 = self.mid2((x_h_2, x_l_2))
        x_ret = self.lst((x_h_5, x_l_5))
        return x_ret


if __name__ == '__main__':
    x = torch.randn([3, 256, 16, 16])
    fpm = FPM(in_channels=256, out_channels=64)
    out = fpm(x)
    print(out.shape)  # 3, 64, 16, 16
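To make the channel and resolution bookkeeping in the module code easier to follow, here is a minimal pure-Python sketch that traces the per-stage shapes without running any convolution. The helper names `split_channels` and `trace_fpm_shapes` are hypothetical and not part of the original repository; the sketch assumes the defaults used above (3x3 kernels with padding 1, stride 1, `alpha=0.5`), under which each convolution keeps the spatial size and the low-frequency branch runs at half resolution.

```python
def split_channels(channels, alpha=0.5):
    """Return (high, low) channel counts, computed as in the module code:
    low = int(alpha * channels), high = channels - low."""
    low = int(alpha * channels)
    return channels - low, low


def trace_fpm_shapes(in_channels, out_channels, height, width, alpha=0.5):
    """Return (stage, high_shape, low_shape) per FPM stage; shapes are (C, H, W).

    Hypothetical helper for illustration only. Assumes stride 1 and
    size-preserving 3x3 convolutions, as in the defaults of the code above.
    """
    in_high, in_low = split_channels(in_channels, alpha)
    out_high, out_low = split_channels(out_channels, alpha)
    full = (height, width)
    half = (height // 2, width // 2)  # AvgPool2d(2, stride=2) floors odd sizes
    return [
        ("fir", (in_high, *full), (in_low, *half)),
        ("mid1", (in_high, *full), (in_low, *half)),
        ("mid1 (second pass)", (in_high, *full), (in_low, *half)),
        ("mid2", (out_high, *full), (out_low, *half)),
        ("lst", (out_channels, *full), None),  # branches merged into one map
    ]


if __name__ == "__main__":
    for stage, high, low in trace_fpm_shapes(256, 64, 16, 16):
        print(stage, "high:", high, "low:", low)
```

For the test input in the module code (256 input channels, 64 output channels, 16x16 features), the last entry agrees with the `3, 64, 16, 16` output noted there, minus the batch dimension.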

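Description in the paper

Specifically, we adopt octave convolution to automatically perceive high-frequency and low-frequency information in an end-to-end manner, enabling online learning for camouflaged object detection. Octave convolution effectively avoids the blocking artifacts caused by the DCT and takes advantage of the GPU's computation speed. In addition, it can be easily plugged into arbitrary networks.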
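As a rough intuition for why the pooled branch is treated as the low-frequency path, the following pure-Python sketch applies 2x2 average pooling (the operation `h2g_pool` performs in the module code) to two toy patterns. It is only an illustration under simple assumptions, not something taken from the paper: the helper names `avg_pool_2x2` and `value_range` and both toy images are hypothetical.

```python
def avg_pool_2x2(image):
    """2x2 average pooling with stride 2 on a list-of-lists image (even sizes)."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def value_range(image):
    """Difference between the largest and smallest pixel value."""
    flat = [v for row in image for v in row]
    return max(flat) - min(flat)


# High-frequency toy pattern: pixels alternate between 0 and 1 (checkerboard).
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
# Low-frequency toy pattern: left half 0, right half 1.
half_split = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

pooled_checker = avg_pool_2x2(checkerboard)
pooled_split = avg_pool_2x2(half_split)

print("checkerboard range after pooling:", value_range(pooled_checker))
print("half-split range after pooling:", value_range(pooled_split))
```

The checkerboard collapses to a constant map after pooling, while the coarse left/right split survives, which is why the half-resolution branch mostly carries slowly varying content and the full-resolution branch is left to model fine detail.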
