
SparseConv Study Notes

news 2025/8/21 23:43:13

Installation

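The environment is set up inside sdfstudio on 74.183. The torchsparse version recommended by SparseNeus is 2.0.0.

The commands are as follows:

Installing the C++ dependency system-wide requires sudo privileges: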


## Install dependencies
conda install -c conda-forge sparsehash
sudo apt-get install libsparsehash-dev

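Download the source code from the official repository and install torchsparse from it. It is best to enable external network access on the server during the build, because 7 to 8 Python wheels have to be downloaded; expect to wait about 20 minutes: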

git clone --recursive https://github.com/mit-han-lab/torchsparse
cd torchsparse
python setup.py install
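After the build finishes, it can be worth checking that the package is actually visible to the interpreter and that the version matches the recommended one. The helper below is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical name, and the distribution name `torchsparse` is assumed to be what `setup.py` registers.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "torchsparse" is assumed to be the registered distribution name.
    print("torchsparse:", installed_version("torchsparse"))
```

If this prints `None`, the install most likely went into a different conda environment than the one currently active.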

Learning (using the `SparseNeus` code as an example)

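A spatial volume is divided into 96*96*96 small voxels. Each voxel has a (homogeneous) coordinate in the current camera frame, stored in the variable up_coords with shape (241788,4). Each voxel is projected onto the reference images and a 2D feature is obtained by interpolation. The result is the variable multiview_features with shape (241788,6,16): every point is projected onto 6 reference images, yielding a feature with C=16 channels per view.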

multiview_features, multiview_masks = back_project_sparse_type(up_coords, partial_vol_origin, self.voxel_size, feats,KRcam, sizeH=sizeH, sizeW=sizeW)  # (num of voxels, num_of_views, c), (num of voxels, num_of_views)
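The projection-and-interpolation step can be illustrated with a small pure-Python sketch. This is not the `back_project_sparse_type` implementation. It assumes one 3x4 projection matrix per view (K [R | t]), scalar feature maps instead of C-channel ones, and pixel coordinates where (0, 0) is the centre of the top-left pixel. The names `project_point`, `bilinear_sample`, and `back_project` are hypothetical.

```python
import math


def project_point(P, xyz):
    """Project a 3D point with a 3x4 matrix P; return (u, v, depth) or None."""
    x, y, z = xyz
    h = (x, y, z, 1.0)  # homogeneous coordinates
    u, v, d = (sum(P[r][c] * h[c] for c in range(4)) for r in range(3))
    if d <= 0:
        return None  # point lies behind the camera
    return u / d, v / d, d


def bilinear_sample(feat, u, v):
    """Bilinearly interpolate an H x W grid of scalars at (u, v); None if outside."""
    H, W = len(feat), len(feat[0])
    if not (0 <= u <= W - 1 and 0 <= v <= H - 1):
        return None
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, W - 1), min(v0 + 1, H - 1)
    au, av = u - u0, v - v0
    top = feat[v0][u0] * (1 - au) + feat[v0][u1] * au
    bottom = feat[v1][u0] * (1 - au) + feat[v1][u1] * au
    return top * (1 - av) + bottom * av


def back_project(points, P_list, feat_list):
    """Return (features, masks), each laid out as [num_points][num_views]."""
    features, masks = [], []
    for xyz in points:
        row_f, row_m = [], []
        for P, feat in zip(P_list, feat_list):
            proj = project_point(P, xyz)
            val = bilinear_sample(feat, proj[0], proj[1]) if proj else None
            row_f.append(0.0 if val is None else val)  # invalid samples are zero-filled
            row_m.append(val is not None)              # mask records visibility
        features.append(row_f)
        masks.append(row_m)
    return features, masks
```

The mask plays the same role as `multiview_masks`: it records which views actually see a voxel, so that later statistics can ignore the zero-filled entries.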

For each voxel, the variance of its features across the views is computed to obtain the cost volume:

volume = self.aggregate_multiview_features(multiview_features, multiview_masks)

The shape of volume is (241788,6,32).
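The variance computation itself can be sketched for a single voxel in plain Python. This is an illustration under stated assumptions, not the `aggregate_multiview_features` code: it reduces over the view axis using only the views in which the voxel is visible, and returns both the per-channel mean and variance. The name `aggregate_views` is hypothetical, and the real function may lay out its output differently.

```python
def aggregate_views(features, mask):
    """Masked per-channel mean and variance over views for one voxel.

    features: per-view feature vectors, laid out as [num_views][C]
    mask:     booleans, True where the voxel is visible in that view
    """
    count = sum(1 for m in mask if m)
    if count == 0:
        raise ValueError("voxel is not visible in any view")
    C = len(features[0])
    mean = [0.0] * C
    sq_mean = [0.0] * C
    for feat, visible in zip(features, mask):
        if not visible:
            continue  # masked-out views do not contribute
        for c in range(C):
            mean[c] += feat[c] / count
            sq_mean[c] += feat[c] ** 2 / count
    # Var[x] = E[x^2] - E[x]^2, computed independently for each channel
    var = [sq_mean[c] - mean[c] ** 2 for c in range(C)]
    return mean, var
```

A low variance means the views agree on the feature at that voxel, which is the usual cue that the voxel lies near a surface.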

Use SparseConv to convolve over the points:

sparse_feat = SparseTensor(feat, r_coords.to(torch.int32))  # - directly use sparse tensor to avoid point2voxel operations
feat = self.sparse_costreg_net(sparse_feat)

How SparseConv works

Tutorial URL: https://towardsdatascience.com/how-does-sparse-convolution-work-3257a0a8fd1

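SparseConv is typically used to process 3D point clouds. Most of the space a 3D point cloud occupies is empty, so running a dense 3D CNN over it is very inefficient. In essence, SparseConv builds a hash table that stores the results computed at specific locations. Each sparse element can be represented by its data and its corresponding index. SparseConv stores all active input sites in a hash table and then maps them to the outputs through a data structure called the RuleBook.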
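The hash table and RuleBook idea can be made concrete with a toy example. The sketch below is a simplified illustration, not torchsparse's implementation. It assumes the submanifold variant (output sites are exactly the active input sites), a single feature channel, and one scalar weight per kernel offset. `build_rulebook` and `sparse_conv` are hypothetical names, and coordinates must be hashable tuples.

```python
from itertools import product


def build_rulebook(coords, kernel_size=3):
    """Map each kernel offset to the (input index, output index) pairs it connects."""
    table = {c: i for i, c in enumerate(coords)}  # hash table: coordinate -> row index
    r = kernel_size // 2
    rulebook = {}
    for offset in product(range(-r, r + 1), repeat=3):
        pairs = []
        for out_idx, (x, y, z) in enumerate(coords):
            neighbour = (x + offset[0], y + offset[1], z + offset[2])
            in_idx = table.get(neighbour)
            if in_idx is not None:  # empty sites are never visited
                pairs.append((in_idx, out_idx))
        if pairs:
            rulebook[offset] = pairs
    return rulebook


def sparse_conv(feats, coords, weights, kernel_size=3):
    """Apply a single-channel convolution by walking the rulebook."""
    out = [0.0] * len(coords)
    for offset, pairs in build_rulebook(coords, kernel_size).items():
        w = weights.get(offset, 0.0)  # missing offsets act as zero weights
        for in_idx, out_idx in pairs:
            out[out_idx] += w * feats[in_idx]
    return out
```

The cost scales with the number of active sites and their occupied neighbours rather than with the full grid volume, which is the efficiency argument made above.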

Usage

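Like an ordinary Conv3d, SparseConv can only process regular, discretized data, so an index must also be supplied for every valid element. Here r_coords holds the index of each point. Its dtype is torch.int32, and its column layout is [X,Y,Z,B] (V2.0) or [B,X,Y,Z] (V2.1).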

Pay close attention to the torchsparse version here, because it determines where the batch index is placed.
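When code written for one layout has to run against the other, the columns only need to be reordered. The helpers below are a minimal sketch on plain Python tuples, using the two layouts described in these notes; the function names are hypothetical.

```python
def xyzb_to_bxyz(coords):
    """Reorder rows from [X, Y, Z, B] (V2.0 layout) to [B, X, Y, Z] (V2.1 layout)."""
    return [(b, x, y, z) for (x, y, z, b) in coords]


def bxyz_to_xyzb(coords):
    """Reorder rows from [B, X, Y, Z] (V2.1 layout) to [X, Y, Z, B] (V2.0 layout)."""
    return [(x, y, z, b) for (b, x, y, z) in coords]


# For an [N, 4] torch tensor the same reordering can be written with column
# indexing, e.g. coords[:, [3, 0, 1, 2]] for the first direction.
```

Getting this wrong does not necessarily raise an error: a spatial column may silently be read as the batch index, so checking the layout explicitly is safer than relying on a crash.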

Converting to the SparseTensor structure

# feat: [N,16]; r_coords: [N,4], whose 4 columns store [X,Y,Z,B]
sparse_feat = SparseTensor(feat, r_coords.to(torch.int32))

Use the sparse CNN to convolve the SparseTensor

feat = self.sparse_costreg_net(sparse_feat)

An actual test example that constructs and uses a SparseTensor:

import numpy as np
import torch
from torchsparse import SparseTensor
# Assumed import path: SparseCostRegNet comes from the SparseNeus code base
from tsparse.modules import SparseCostRegNet

resolution = np.array([101, 101, 301])
X_range = np.array([0, 100])
Y_range = np.array([0, 100])
Z_range = np.array([0, 300])

# build a dense integer grid of voxel coordinates
xs = np.linspace(X_range[0], X_range[1], resolution[0])
ys = np.linspace(Y_range[0], Y_range[1], resolution[1])
zs = np.linspace(Z_range[0], Z_range[1], resolution[2])
xx, yy, zz = np.meshgrid(xs, ys, zs, indexing="ij")
coord = torch.tensor(np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T, dtype=torch.float).contiguous().cuda()
point_nums = coord.shape[0]

# batch index is in the last position in v2.0
zeros = torch.zeros(point_nums, 1).cuda()
coord = torch.cat((coord, zeros), dim=1).to(torch.int32)
feat = torch.randn(point_nums, 3).cuda()
sparse_feat = SparseTensor(feat, coords=coord)

net = SparseCostRegNet(d_in=3, d_out=16).cuda()
ans = net(sparse_feat)
print(ans.shape)