Encoding of each cell: Similar to the work in [4], in each grid cell we compute the mean and maximum elevation, the average reflectivity (i.e. intensity) value, and the number of projected points.
Compared to [4], we avoid using the minimum and standard deviation of the height as additional features, since our experiments showed no significant contribution from those channels.
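A minimal numpy sketch of this per-cell encoding (the grid size and ranges below are illustrative choices, not values from the paper; heights are assumed non-negative for simplicity):

```python
import numpy as np

def encode_bev_cells(points, intensities, grid_size=(256, 256),
                     x_range=(0.0, 50.0), y_range=(-25.0, 25.0)):
    """Per-cell BEV features: mean height, max height, mean intensity,
    point count. Grid resolution and ranges are illustrative."""
    h, w = grid_size
    feats = np.zeros((h, w, 4), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.int32)
    # map x/y coordinates to grid cell indices
    xi = ((points[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * h).astype(int)
    yi = ((points[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * w).astype(int)
    valid = (xi >= 0) & (xi < h) & (yi >= 0) & (yi < w)
    for i, j, z, r in zip(xi[valid], yi[valid], points[valid, 2], intensities[valid]):
        counts[i, j] += 1
        n = counts[i, j]
        feats[i, j, 0] += (z - feats[i, j, 0]) / n      # running mean height
        feats[i, j, 1] = max(feats[i, j, 1], z)         # max height (assumes z >= 0)
        feats[i, j, 2] += (r - feats[i, j, 2]) / n      # running mean intensity
    feats[..., 3] = counts                              # number of projected points
    return feats
```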
Front view
1) The same spherical projection as SqueezeSeg.
2) Drawbacks of the front view: occlusion, bending, and deformation. Although SFV returns a denser representation than BEV, SFV has certain distortion and deformation effects on small objects, e.g. vehicles. Objects in SFV are also more likely to occlude each other. We therefore employ the BEV representation as the main input to our network.
The authors explain where dropout should be placed, citing [26]: We here emphasize that dropout needs to be placed right after batch normalization. As shown in [26], an early application of dropout can otherwise lead to a shift in the weight distribution and thus diminish the effect of batch normalization during training.
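A dependency-free numpy sketch of why the ordering matters: dropout applied before normalisation inflates the variance the network sees at training time relative to test time (a toy illustration of the variance-shift argument in [26], not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(v):
    # normalise to zero mean, unit variance using batch statistics
    return (v - v.mean()) / v.std()

def dropout(v, p, rng):
    # inverted dropout: zero with probability p, rescale the survivors
    mask = rng.random(v.shape) >= p
    return v * mask / (1.0 - p)

x = rng.normal(0.0, 1.0, size=100_000)

# recommended: dropout AFTER batch norm -> BN sees the raw statistics
after = dropout(batch_norm(x), 0.5, rng)

# discouraged: dropout BEFORE batch norm -> BN normalises a signal whose
# variance dropout has inflated (roughly by 1/(1-p)), so the statistics it
# learns no longer match the dropout-free signal seen at test time
before = batch_norm(dropout(x, 0.5, rng))
```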
2.3 Class imbalance
Use the square root of each class's frequency to scale the loss weights (i.e., weights inversely proportional to the square root of the class frequency).
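A small sketch of this weighting scheme (the final normalisation is my own choice; the note only specifies the square-root relationship):

```python
import numpy as np

def class_weights(class_counts):
    """Loss weights inversely proportional to the square root of class
    frequency, a common remedy for class imbalance. The normalisation to
    sum 1 is an illustrative choice, not taken from the paper."""
    counts = np.asarray(class_counts, dtype=np.float64)
    freq = counts / counts.sum()          # class frequency
    w = 1.0 / np.sqrt(freq)               # rarer classes get larger weights
    return w / w.sum()                    # normalise so weights sum to 1
```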
2.4 Training hyperparameter configuration
Special data augmentation: adding random pixel noise with a probability of 0.5, and random rotation in [-5°, 5°].
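A sketch of this augmentation, assuming the noise is Gaussian and the rotation is applied to the point cloud around the vertical axis (both are my assumptions; noise_std is an illustrative value):

```python
import numpy as np

def augment_points(points, rng, noise_prob=0.5, noise_std=0.01, max_angle_deg=5.0):
    """Per the note: add random noise with probability 0.5 and apply a random
    rotation in [-5, 5] degrees. Here the rotation is around the z-axis of
    the point cloud; noise_std is an illustrative value, not from the paper."""
    pts = points.copy()
    if rng.random() < noise_prob:
        pts += rng.normal(0.0, noise_std, size=pts.shape)   # random noise
    theta = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return pts @ rot.T                                      # rotate about z
```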
3. Experimental results
Performance comparison (against external methods)
Performance comparison between bird's-eye view and front view (internal ablation)
Runtime comparison
4. Takeaways
Reference [11] below is worth studying: how to do semi-automatic labeling.
This paper reveals that we do not necessarily have to label all of the ground; labeling only the freespace can be enough.
The loss-weight design in this paper is worth referring to.
Performance is compared not only with IoU but also with precision and recall.
3. Online Inference and Detection of Curbs in Partially Occluded Scenes with Sparse LIDAR
[11] uses range and intensity information from 3D LIDAR to detect visible curbs on elevation data, which fails in the presence of occluding obstacles.
[12] presents a LIDAR-based method to detect visible curbs using sliding-beam segmentation followed by segment-specific curb detection, but fails to detect curbs behind obstacles.
How the curb curves are generated
In this work, we used images acquired by a Point Grey Bumblebee XB3 camera, mounted on the front of the platform facing towards the direction of motion. In particular, our implementation of VO uses FAST corners [16] combined with BRIEF descriptors [17], RANSAC [18] for outlier rejection, and nonlinear least-squares refinement.
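The RANSAC outlier-rejection plus least-squares refinement steps can be illustrated on a toy 2D line-fitting problem (the actual VO pipeline estimates camera motion from FAST/BRIEF matches; this sketch only demonstrates the robust-fitting idea):

```python
import numpy as np

def ransac_line(points, rng, n_iters=200, inlier_tol=0.05):
    """Illustrative RANSAC + least-squares refinement for a 2D line,
    mirroring the outlier-rejection/refinement pattern of the VO pipeline."""
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        norm = np.linalg.norm(d)
        if norm < 1e-9:
            continue
        n = np.array([-d[1], d[0]]) / norm         # unit normal of candidate line
        dist = np.abs((points - p) @ n)            # point-to-line distances
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # least-squares refinement on the inlier set: fit y = a*x + b
    x, y = points[best_inliers, 0], points[best_inliers, 1]
    a, b = np.polyfit(x, y, 1)
    return a, b, best_inliers
```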
Point heights are restricted to within 3.55 m, to prevent water on the ground from producing abnormally low point-cloud points.
Separating visible lines from occluded lines
To determine which points are visible and which are occluded, we use the hidden point removal operator described in [20]. The operator determines all visible points in a pointcloud when observed from a given viewpoint. This is achieved by extracting all points residing on the convex hull of a transformed pointcloud. These points correspond to the visible points; all other (labeled) points are considered hidden (or occluded). We take the previously trimmed pointclouds and create binary bird's-eye view images by taking the height of points from the ground into account. The points that are within a predefined height difference from the LIDAR roughly correspond to the points (obstacles) that are blocking the view. By putting together raw labels and binary masks of obstacles, obtained by running the hidden point removal algorithm, we obtain separate masks for visible and occluded road boundaries. (To be studied further.)
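A sketch of the hidden-point-removal operator of [20] (spherical flipping followed by a convex hull, per Katz et al.; radius_factor is a tuning choice, not a value from the paper):

```python
import numpy as np
from scipy.spatial import ConvexHull

def hidden_point_removal(points, viewpoint, radius_factor=100.0):
    """Spherically flip the points around the viewpoint, then take the
    convex hull; points landing on the hull are the visible ones."""
    p = points - viewpoint
    norms = np.linalg.norm(p, axis=1, keepdims=True)
    R = norms.max() * radius_factor
    # spherical flip: closer points are pushed farther out, so a point
    # hidden behind another ends up inside the hull
    flipped = p + 2.0 * (R - norms) * (p / norms)
    hull = ConvexHull(np.vstack([flipped, np.zeros(3)]))  # include viewpoint
    visible = np.zeros(len(points), dtype=bool)
    idx = hull.vertices[hull.vertices < len(points)]
    visible[idx] = True
    return visible
```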
Network architecture
Analysis of why U-Net-like models cannot detect occluded curbs well: first, the network's limited receptive field, which is not big enough to capture context around large obstacles to estimate the position of curbs behind them; and second, the lack of structure (model-free), which prevents the network from inferring the very thin curves of occluded road boundaries within an image.
How are the lines in each grid cell predicted? Lines in each grid cell are parameterised in a discrete-continuous form: first, fitted lines are assigned to one of four types of anchor lines; second, offsets between fitted and anchor lines are calculated. Anchor lines pass through the centre of a grid cell at different angles (22.5°, 67.5°, 112.5°, and 157.5°). During fitting, lines are assigned to the closest anchor line. Once a fitted line is discretised, two continuous parameters are calculated: (1) the angle offset between the fitted line and the respective anchor line ($w^k_{i,j,gt}$), and (2) the distance from the centre of the cell to the fitted line ($\beta^k_{i,j,gt}$). As a result, we obtain 16 numbers for each grid cell: 4 numbers ($w$, $\beta$, and a classification of whether a line of that category exists) for each of the four line categories.
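The discrete-continuous parameterisation can be sketched as follows (the angle-wrapping detail is my interpretation of the note, since line orientations are only defined modulo 180°):

```python
import numpy as np

# Anchor lines pass through the cell centre at these four angles.
ANCHOR_ANGLES = np.deg2rad([22.5, 67.5, 112.5, 157.5])

def parameterise_line(line_angle, line_dist):
    """line_angle: orientation of the fitted line in [0, pi);
    line_dist: distance from the cell centre to the fitted line.
    Returns (anchor index k, angle offset w, distance beta)."""
    diffs = line_angle - ANCHOR_ANGLES
    # wrap to (-pi/2, pi/2] since line orientations are modulo pi
    diffs = (diffs + np.pi / 2) % np.pi - np.pi / 2
    k = int(np.argmin(np.abs(diffs)))   # assign to the closest anchor line
    w = diffs[k]                        # angle offset to the chosen anchor
    beta = line_dist                    # distance from cell centre to line
    return k, w, beta
```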
To increase the receptive field of the model we added intra-layer convolutions [23] before the multi-scale parameter estimation layers. Traditional layer-by-layer convolutions are applied between feature maps, but intra-layer convolutions are slice-by-slice convolutions within feature maps. Hence, intra-layer convolutions capture aspects across the whole image and can thereby capture spatial relationships over longer distances. For example, there is a strong correlation between the length of the occluded curbs and the size of objects which are obstructing the view (ranging from 10-15 pixels through occlusions by traffic cones to 200-300 pixels through occlusions by several parked cars).
Cross-entropy loss is used to predict whether a cell contains a curb; smooth L1 loss is used to predict $w$ and $\beta$.
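A per-cell loss sketch combining the two terms (reg_weight and masking the regression terms to positive cells are my assumptions; the paper may balance the terms differently):

```python
import numpy as np

def curb_loss(cls_logit, w_pred, b_pred, cls_gt, w_gt, b_gt, reg_weight=1.0):
    """Binary cross-entropy on the line/no-line prediction, smooth L1 on
    the w and beta regressions (applied only where a line is present)."""
    # binary cross-entropy from a logit
    p = 1.0 / (1.0 + np.exp(-cls_logit))
    bce = -(cls_gt * np.log(p + 1e-12) + (1 - cls_gt) * np.log(1 - p + 1e-12))

    def smooth_l1(err):
        a = np.abs(err)
        return np.where(a < 1.0, 0.5 * a * a, a - 0.5)

    # regression loss only contributes on cells that contain a line
    reg = cls_gt * (smooth_l1(w_pred - w_gt) + smooth_l1(b_pred - b_gt))
    return np.mean(bce + reg_weight * reg)
```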
Post-processing
Temporal information is used, i.e., tracking across consecutive frames, which brings two benefits: filtering out false positives and tracking true positives.
Filtering: we transform the last three output masks of detected road boundaries into a common reference frame attached to the current frame. Then we construct a histogram over the output mask (size 480x960) by counting the number of overlapping pixels with a value greater than a threshold of 0.7 (determined experimentally). Presumably, if all three frames have a value at the same location, the histogram count is high and the point is kept; otherwise the point is discarded.
Tracking: In the second step, we perform a similar procedure as outlined above. However, this time we consider road boundary masks from the last three frames that were generated by the first step (as shown in Figure 9). By taking the union of these masks we track the detected road boundaries over time. Integrating temporal information helps to close gaps between boundary segments.
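The two post-processing steps can be sketched as follows (the ego-motion alignment is omitted, masks are assumed already registered to the current frame, and min_votes is my guess at the voting rule, which the note does not fully specify):

```python
import numpy as np

def temporal_filter(masks, threshold=0.7, min_votes=2):
    """Filtering step: a pixel is kept if at least min_votes of the last
    three (aligned) masks exceed the 0.7 threshold."""
    votes = sum((m > threshold).astype(np.int32) for m in masks)
    return votes >= min_votes

def temporal_track(filtered_masks):
    """Tracking step: union of the filtered masks from the last three
    frames, which helps close gaps between boundary segments."""
    out = filtered_masks[0]
    for m in filtered_masks[1:]:
        out = out | m
    return out
```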
3. Experimental results
Summary performance figures
Comparison between visible and occluded curbs
Effect after adding post-processing
4. Takeaways
A method that directly predicts curb lines
Using anchor lines to predict occluded lines
Line-fitting pipeline: FAST corners [16] → BRIEF descriptors [17] → RANSAC [18] for outlier rejection, followed by nonlinear least-squares refinement.
The official dataset provides three kinds of point-cloud data: pts, intensity, and category. Following the idea of Complex-YOLO: Real-time 3D Object Detection on Point Clouds, we process the pts and intensity data into maximum reflectivity, maximum height, and normalized density channels, normalize each to the range 0~1, and stack them into a three-channel image array used as our training images.
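A sketch of this three-channel encoding (grid size, ranges, and the density normalisation constant are illustrative; Complex-YOLO implementations commonly normalise density as log(N+1)/log(64)):

```python
import numpy as np

def bev_three_channel(points, intensity, grid=(608, 608),
                      x_range=(0.0, 60.8), y_range=(-30.4, 30.4),
                      density_norm=64.0):
    """Three-channel BEV image in the Complex-YOLO style: max intensity,
    max height, normalised density, each scaled to [0, 1]. Assumes
    heights >= 0 after ground alignment (illustrative simplification)."""
    h, w = grid
    img = np.zeros((h, w, 3), dtype=np.float32)
    xi = ((points[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * h).astype(int)
    yi = ((points[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * w).astype(int)
    ok = (xi >= 0) & (xi < h) & (yi >= 0) & (yi < w)
    z_max = points[:, 2].max() if len(points) else 1.0
    counts = np.zeros((h, w), dtype=np.float32)
    for i, j, z, r in zip(xi[ok], yi[ok], points[ok, 2], intensity[ok]):
        img[i, j, 0] = max(img[i, j, 0], r)                     # max intensity
        img[i, j, 1] = max(img[i, j, 1], z / max(z_max, 1e-6))  # max height, scaled
        counts[i, j] += 1
    # log-scaled, capped density channel
    img[..., 2] = np.minimum(1.0, np.log(counts + 1) / np.log(density_norm))
    return img
```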