
[C++] Computing the Global Index of a CUDA Thread

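Contents

  • 1. 1D grid, 1D thread blocks
  • 2. 2D grid, 2D thread blocks
  • 3. 3D grid, 3D thread blocks
  • 4. All grid/block combinations
    • 4.1 1D grid, 1D thread blocks
    • 4.2 1D grid, 2D thread blocks
    • 4.3 1D grid, 3D thread blocks
    • 4.4 2D grid, 1D thread blocks
    • 4.5 2D grid, 2D thread blocks
    • 4.6 2D grid, 3D thread blocks
    • 4.7 3D grid, 1D thread blocks
    • 4.8 3D grid, 2D thread blocks
    • 4.9 3D grid, 3D thread blocks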

1. 1D grid, 1D thread blocks

Define the grid and block sizes:

dim3 grid_size(4);
dim3 block_size(8);

Launch the kernel:

kernal_fun<<<grid_size, block_size>>>(...);

The figure below shows the indexing scheme: $blockIdx.x$ ranges from 0 to 3, and $threadIdx.x$ ranges from 0 to 7.

[Figure: thread indexing for a 1D grid of 4 blocks with 8 threads each]

The global index is computed as:

$$
int \ id = blockIdx.x * blockDim.x + threadIdx.x
$$
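The formula can be checked on the host without a GPU. The following is a minimal sketch in plain C++, not device code: the function names are hypothetical, and the parameters only stand in for the built-in variables `blockIdx.x`, `blockDim.x`, and `threadIdx.x`.

```cpp
#include <cassert>

// Host-side stand-in for the 1D formula. The parameters mirror
// blockIdx.x, blockDim.x and threadIdx.x (names are illustrative).
int global_id_1d(int block_idx_x, int block_dim_x, int thread_idx_x) {
    // A thread index is always smaller than the block size.
    assert(thread_idx_x >= 0 && thread_idx_x < block_dim_x);
    return block_idx_x * block_dim_x + thread_idx_x;
}

// Walks every (block, thread) pair in index order and checks that the
// formula yields the consecutive ids 0, 1, ..., grid_dim_x * block_dim_x - 1.
bool ids_are_consecutive_1d(int grid_dim_x, int block_dim_x) {
    int expected = 0;
    for (int b = 0; b < grid_dim_x; ++b) {
        for (int t = 0; t < block_dim_x; ++t) {
            if (global_id_1d(b, block_dim_x, t) != expected) return false;
            ++expected;
        }
    }
    return true;
}
```

With the sizes above (4 blocks of 8 threads), thread 5 of block 2 gets id 2 * 8 + 5 = 21, and the last thread of the last block gets id 31.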

2. 2D grid, 2D thread blocks

Define the grid and block sizes:

dim3 grid_size(2,2);
dim3 block_size(4,4);

Launch the kernel:

kernal_fun<<<grid_size, block_size>>>(...);

The figure below shows the thread indexing scheme: $blockIdx.x$ and $blockIdx.y$ range from 0 to 1, and $threadIdx.x$ and $threadIdx.y$ range from 0 to 3.

[Figure: thread indexing for a 2x2 grid of 4x4 thread blocks]

The global index is computed as:

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x \\
&int \ threadId = threadIdx.y * blockDim.x + threadIdx.x \\
&int \ id = blockId * (blockDim.x * blockDim.y) + threadId
\end{align*}
$$
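One way to convince yourself that the three-step formula is right is to enumerate every thread on the host and confirm that no two threads share an id. The sketch below is plain C++ rather than a kernel: the function names are hypothetical, and the parameters stand in for the CUDA built-in variables of the same meaning.

```cpp
#include <cassert>
#include <vector>

// Host-side stand-in for the 2D formula: first flatten the block position
// inside the grid, then the thread position inside the block, then combine.
int global_id_2d(int block_idx_x, int block_idx_y, int grid_dim_x,
                 int thread_idx_x, int thread_idx_y,
                 int block_dim_x, int block_dim_y) {
    assert(thread_idx_x < block_dim_x && thread_idx_y < block_dim_y);
    int block_id = block_idx_x + block_idx_y * grid_dim_x;
    int thread_id = thread_idx_y * block_dim_x + thread_idx_x;
    return block_id * (block_dim_x * block_dim_y) + thread_id;
}

// Returns true when every thread of a grid_x * grid_y grid of
// block_x * block_y blocks receives a distinct id in [0, total).
bool ids_are_unique_2d(int grid_x, int grid_y, int block_x, int block_y) {
    const int total = grid_x * grid_y * block_x * block_y;
    std::vector<bool> seen(total, false);
    for (int by = 0; by < grid_y; ++by)
        for (int bx = 0; bx < grid_x; ++bx)
            for (int ty = 0; ty < block_y; ++ty)
                for (int tx = 0; tx < block_x; ++tx) {
                    int id = global_id_2d(bx, by, grid_x, tx, ty, block_x, block_y);
                    if (id < 0 || id >= total || seen[id]) return false;
                    seen[id] = true;
                }
    return true;
}
```

For the 2x2 grid of 4x4 blocks above, thread (2, 1) of block (1, 0) has blockId = 1 and threadId = 1 * 4 + 2 = 6, so its id is 1 * 16 + 6 = 22.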

3. 3D grid, 3D thread blocks

Define the grid and block sizes:

dim3 grid_size(2,2,2);
dim3 block_size(4,4,2);

Launch the kernel:

kernal_fun<<<grid_size, block_size>>>(...);

The figure below shows the thread indexing scheme:

[Figure: thread indexing for a 2x2x2 grid of 4x4x2 thread blocks]

  • $blockIdx.x$ ranges from 0 to 1
  • $blockIdx.y$ ranges from 0 to 1
  • $blockIdx.z$ ranges from 0 to 1
  • $threadIdx.x$ ranges from 0 to 3
  • $threadIdx.y$ ranges from 0 to 3
  • $threadIdx.z$ ranges from 0 to 1

The global index is computed as:

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x + gridDim.x * gridDim.y * blockIdx.z \\
&int \ threadId = threadIdx.z * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x \\
&int \ id = blockId * (blockDim.x * blockDim.y * blockDim.z) + threadId
\end{align*}
$$
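The 3D case can be checked on the host in the same way. This is a minimal sketch in plain C++, not device code: `Vec3` is a hypothetical stand-in for CUDA's three-component index and size types, and the function names are illustrative.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side stand-in for a three-component index or size.
struct Vec3 { int x, y, z; };

// Host-side version of the 3D formula above.
int global_id_3d(Vec3 block_idx, Vec3 grid_dim, Vec3 thread_idx, Vec3 block_dim) {
    assert(thread_idx.x < block_dim.x && thread_idx.y < block_dim.y &&
           thread_idx.z < block_dim.z);
    int block_id = block_idx.x + block_idx.y * grid_dim.x
                 + block_idx.z * grid_dim.x * grid_dim.y;
    int thread_id = thread_idx.z * block_dim.x * block_dim.y
                  + thread_idx.y * block_dim.x + thread_idx.x;
    return block_id * (block_dim.x * block_dim.y * block_dim.z) + thread_id;
}

// Returns true when every thread in the launch receives a distinct id
// in [0, total), where total is the number of threads in the whole grid.
bool ids_are_unique_3d(Vec3 grid_dim, Vec3 block_dim) {
    const int total = grid_dim.x * grid_dim.y * grid_dim.z
                    * block_dim.x * block_dim.y * block_dim.z;
    std::vector<bool> seen(total, false);
    for (int bz = 0; bz < grid_dim.z; ++bz)
     for (int by = 0; by < grid_dim.y; ++by)
      for (int bx = 0; bx < grid_dim.x; ++bx)
       for (int tz = 0; tz < block_dim.z; ++tz)
        for (int ty = 0; ty < block_dim.y; ++ty)
         for (int tx = 0; tx < block_dim.x; ++tx) {
             int id = global_id_3d({bx, by, bz}, grid_dim, {tx, ty, tz}, block_dim);
             if (id < 0 || id >= total || seen[id]) return false;
             seen[id] = true;
         }
    return true;
}
```

With the sizes above there are 8 blocks of 32 threads, so the last thread, (3, 3, 1) in block (1, 1, 1), has blockId = 7 and threadId = 31, giving id 7 * 32 + 31 = 255.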

4. All grid/block combinations

4.1 1D grid, 1D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x \\
&int \ id = blockIdx.x * blockDim.x + threadIdx.x
\end{align*}
$$

4.2 1D grid, 2D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x \\
&int \ id = blockIdx.x * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x
\end{align*}
$$

4.3 1D grid, 3D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x \\
&int \ id = blockIdx.x * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x
\end{align*}
$$

4.4 2D grid, 1D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x \\
&int \ id = blockId * blockDim.x + threadIdx.x
\end{align*}
$$

4.5 2D grid, 2D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x \\
&int \ id = blockId * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x
\end{align*}
$$

4.6 2D grid, 3D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x \\
&int \ id = blockId * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x
\end{align*}
$$

4.7 3D grid, 1D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x + blockIdx.z * gridDim.x * gridDim.y \\
&int \ id = blockId * blockDim.x + threadIdx.x
\end{align*}
$$

4.8 3D grid, 2D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x + blockIdx.z * gridDim.x * gridDim.y \\
&int \ id = blockId * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x
\end{align*}
$$

4.9 3D grid, 3D thread blocks

$$
\begin{align*}
&int \ blockId = blockIdx.x + blockIdx.y * gridDim.x + blockIdx.z * gridDim.x * gridDim.y \\
&int \ id = blockId * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x
\end{align*}
$$
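The nine cases are all specializations of the 3D/3D formula in 4.9. A `dim3` component that is not specified defaults to 1, and the matching index component is then always 0, so the unused terms drop out. The sketch below illustrates this on the host in plain C++. `Extent3` and `Index3` are hypothetical stand-ins for the CUDA types, and the two special-case functions are written only to compare against the formulas in 4.4 and 4.2.

```cpp
#include <cassert>

// Hypothetical stand-in for dim3: unspecified sizes default to 1.
struct Extent3 {
    int x, y, z;
    Extent3(int x_ = 1, int y_ = 1, int z_ = 1) : x(x_), y(y_), z(z_) {}
};

// Hypothetical stand-in for an index triple: unused components are 0.
struct Index3 {
    int x, y, z;
    Index3(int x_ = 0, int y_ = 0, int z_ = 0) : x(x_), y(y_), z(z_) {}
};

// The 3D grid / 3D block formula from 4.9.
int general_id(Index3 block_idx, Extent3 grid_dim,
               Index3 thread_idx, Extent3 block_dim) {
    assert(thread_idx.x < block_dim.x && thread_idx.y < block_dim.y &&
           thread_idx.z < block_dim.z);
    int block_id = block_idx.x + block_idx.y * grid_dim.x
                 + block_idx.z * grid_dim.x * grid_dim.y;
    return block_id * block_dim.x * block_dim.y * block_dim.z
         + thread_idx.z * block_dim.x * block_dim.y
         + thread_idx.y * block_dim.x + thread_idx.x;
}

// The 2D grid / 1D block formula from 4.4.
int id_2d_grid_1d_block(int block_idx_x, int block_idx_y, int grid_dim_x,
                        int thread_idx_x, int block_dim_x) {
    int block_id = block_idx_x + block_idx_y * grid_dim_x;
    return block_id * block_dim_x + thread_idx_x;
}

// The 1D grid / 2D block formula from 4.2.
int id_1d_grid_2d_block(int block_idx_x, int thread_idx_x, int thread_idx_y,
                        int block_dim_x, int block_dim_y) {
    return block_idx_x * block_dim_x * block_dim_y
         + thread_idx_y * block_dim_x + thread_idx_x;
}
```

For example, `general_id(Index3(1, 1), Extent3(2, 2), Index3(5), Extent3(8))` and `id_2d_grid_1d_block(1, 1, 2, 5, 8)` both evaluate to 3 * 8 + 5 = 29. In practice this means a kernel can use the 4.9 formula for any launch configuration, at the cost of a few multiplications by 1.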
