Vision-Vibration Fusion

Introduction

Multimodal fusion, such as combining LiDAR with vision or infrared with vision, has attracted growing attention from researchers. However, methods that fuse LiDAR and vision are limited by the sensors themselves: LiDAR performs poorly in rain and fog, while vision performs poorly at night.

We propose, to the best of our knowledge, the first vibration dataset, named the VBLane dataset, which records the vibration of each wheel through vibration sensors mounted on the wheels. The raised vibration marking on the road signals whether the wheel is currently pressing the lane line, so this weak supervision signal can be exploited by continual learning, reinforcement learning, and other methods. On the acquisition vehicle we also retain the usual layout of LiDAR and visual sensors to facilitate multimodal fusion. The VBLane dataset consists of two parts: the vibration signal dataset and the corresponding time-synchronized camera dataset.

VBLane data set

Fig.1 The experimental vehicle

The cameras on the left and right sides are aligned with the wheels, so they record when and how a wheel presses the line and allow us to verify whether the vibration data has been segmented correctly. In addition, a 32-line LiDAR is mounted at the front of the vehicle to facilitate perception of the surrounding environment.

Acquisition devices. The data acquisition equipment configuration is shown in Fig.1. The experimental vehicle is a BaiQi Lite, the cameras are Basler acA1920-40 models, and the vibration sensor is a Siemens PCB three-axis ICP acceleration sensor, shown in Fig.2. The sensors are installed as follows: the front-view camera is mounted on the front hood, and the left and right cameras are mounted on the corresponding rear-view mirrors, respectively.

Fig.2 Siemens PCB 3-way ICP acceleration sensor

Fig.3. A raised oscillation marking line

Production process. From May 2021 to July 2021, we recorded lane-line video between 17:00 and 19:00 using the cameras (20 Hz, 1920 x 1080) and the vibration sensor (128 Hz). We collected camera and vibration data in scenes such as turning intersections and speed bumps in Beijing, China. The lane lines we selected when collecting the data were raised oscillation marking lines, as shown in Fig.3.
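Because the cameras run at 20 Hz and the vibration sensor at 128 Hz, each video frame corresponds to a short window of vibration samples. The sketch below shows one simple way to pair them by index; the windowing scheme and function names are our illustrative assumptions, not part of the released dataset tools.

```python
import numpy as np

CAM_HZ = 20    # camera frame rate from the text
VIB_HZ = 128   # vibration sampling rate from the text

def vibration_window(frame_idx, vibration):
    """Return the vibration samples recorded during one camera frame.

    `vibration` is an (n_samples, 3) array of three-axis acceleration.
    """
    start = int(round(frame_idx * VIB_HZ / CAM_HZ))
    stop = int(round((frame_idx + 1) * VIB_HZ / CAM_HZ))
    return vibration[start:stop]

# Example: 2 seconds of three-axis vibration data (256 samples x 3 axes).
vib = np.zeros((2 * VIB_HZ, 3))
win = vibration_window(0, vib)
print(win.shape)  # each 20 Hz frame spans roughly 6-7 vibration samples
```

In practice one would align by recorded timestamps rather than sample indices, since the two sensors are not guaranteed to start at exactly the same instant.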

Vibration signal dataset. Since existing datasets such as KITTI and A2D2 do not provide vibration data, we collected vibration and visual data synchronously. The vibration signal is visualized in Fig.4.

Fig.4. Vibration data visualization

Fig.5. Images dataset visualization

Images dataset. Most current lane-detection datasets focus on the front-view camera's record of the lane ahead, while little attention is paid to the lane line from a lateral perspective, even though it is exactly this view that clearly records whether a wheel is pressing the line. We therefore release a new lane-detection dataset recorded with lateral cameras. The dataset covers a variety of scenarios, such as straight lines (top left), dashed lines (top middle), curves (top right), occlusion (bottom left), daytime (bottom middle), and dusk (bottom right). We split the videos into frames, 8 GB in total, divided the frames into two categories, pressure-line and non-pressure-line, and built the dataset by categorized sampling. During sampling we organized the data into 158 folders, each containing at least 4 consecutive video frames, to facilitate subsequent work that takes continuous frames as input. The images dataset is visualized in Fig.5.
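The folder layout described above (one folder per clip, each holding at least 4 consecutive frames, grouped by pressure-line / non-pressure-line class) can be iterated as sketched below. The directory and file names are assumptions for illustration, not the dataset's actual naming scheme.

```python
import os

def load_clips(root, min_frames=4):
    """Yield (class_name, sorted frame paths) for each clip folder.

    Assumed layout: root/<class_name>/<clip_id>/<frame>.jpg, where
    class_name is e.g. "pressure" or "non_pressure" (hypothetical names).
    """
    clips = []
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue
        for clip in sorted(os.listdir(class_dir)):
            clip_dir = os.path.join(class_dir, clip)
            frames = sorted(os.listdir(clip_dir))
            if len(frames) >= min_frames:  # keep clips usable as frame sequences
                clips.append((class_name,
                              [os.path.join(clip_dir, f) for f in frames]))
    return clips
```

Sorting the frame names keeps the temporal order intact, which matters for methods that consume consecutive frames as input.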


Annotation

In this dataset we provide two kinds of labels: classification labels produced by the vibration signal classification network, and mask labels that we annotate manually on the RGB images by observing whether a wheel is pressing the line. Note that the mask labels use two different colors to distinguish horizontal lane lines from longitudinal lane lines. This split lets us train the model on horizontal or longitudinal lane lines separately, so that the model's detection performance on each type can be judged more precisely.
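The vibration-derived classification label can serve as a weak per-clip supervision signal for the image model. The sketch below derives such a binary label from a vibration window; the RMS-energy feature and the threshold value are illustrative assumptions, not the dataset's actual labelling rule.

```python
import numpy as np

def weak_label(vib_window, threshold=1.5):
    """Return 1 if the wheel is likely on the raised marking, else 0.

    Uses the RMS energy of the three-axis acceleration window; both the
    feature and the threshold are hypothetical choices for illustration.
    """
    rms = np.sqrt(np.mean(vib_window ** 2))
    return int(rms > threshold)

# A quiet window (smooth road) vs. a high-energy window (raised marking).
quiet = np.random.default_rng(0).normal(0.0, 0.1, size=(128, 3))
noisy = np.random.default_rng(0).normal(0.0, 3.0, size=(128, 3))
print(weak_label(quiet), weak_label(noisy))  # low energy -> 0, high energy -> 1
```

Such automatically derived labels are noisier than the manual masks, which is why methods like continual or reinforcement learning that tolerate weak supervision are a natural fit.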

Download

The VBLane dataset is published under the CC BY-NC-SA 4.0 license and can be downloaded free of charge for academic research.

Thanks to Datatang (Stock Code: 831428) for providing us with professional data annotation services.

Tsinghua University

Haidian District, Beijing, 100084, P. R. China

Email: xyzhang@tsinghua.edu.cn

© Copyright 2016-2020 www.OpenMPD.com. All Rights Reserved