Argoverse 2 Lidar Dataset Overview
Table of Contents
Overview
The Argoverse 2 Lidar Dataset is intended to support research into self-supervised learning in the lidar domain as well as point cloud forecasting. The AV2 Lidar Dataset is mined with the same criteria as the Forecasting Dataset to ensure that each scene is interesting. While the Lidar Dataset does not have 3D object annotations, each scenario carries an HD map with rich, 3D information about the scene.
Dataset Size
Our dataset is the largest such collection to date with 20,000 thirty second sequences.
Sensor Suite
Lidar sweeps are collected at 10 Hz. In addition, 6-DOF ego-vehicle pose in a global coordinate system are provided. Lidar returns are captured by two 32-beam lidars, spinning at 10 Hz in the same direction, but separated in orientation by 180°.
We aggregate all returns from the two stacked 32-beam sensors into a single sweep. These sensors each have different, overlapping fields-of-view. Both lidars have their own reference frame, and we refer to them as up_lidar
and down_lidar
, respectively. We have egomotion-compensated the lidar sensor data to the egovehicle reference nanosecond timestamp. All lidar returns are provided in the egovehicle reference frame, not the individual lidar reference frame.
Dataset Structure Format
Tabular data (lidar sweeps, poses) are provided as Apache Feather Files with the file extension .feather
.
Maps: A local vector map is provided per log, please refer to the Map README for additional details.
Directory structure:
av2
└───lidar
└───train
| └───LyIXwbWeHWPHYUZjD1JPdXcvvtYumCWG
| └───sensors
| | └───lidar
| | └───15970913559644000.feather
| | .
| | .
| | .
| └───calibration
| | └───egovehicle_SE3_sensor.feather
| └───map
| | └───log_map_archive_LyIXwbWeHWPHYUZjD1JPdXcvvtYumCWG__Summer____PIT_city_77257.json
| └───city_SE3_egovehicle.feather
└───val
└───test
An example sweep sensors/lidar/15970913559644000.feather
, meaning a reference timestamp of 15970913559644000 nanoseconds:
x y z intensity laser_number offset_ns
0 -1.291016 2.992188 -0.229370 24 31 3318000
1 -25.921875 25.171875 0.992188 5 14 3318000
2 -15.500000 18.937500 0.901855 34 16 3320303
3 -3.140625 4.593750 -0.163696 12 30 3320303
4 -4.445312 6.535156 -0.109802 14 29 3322607
... ... ... ... ... ... ...
98231 18.312500 -38.187500 3.279297 26 50 106985185
98232 23.109375 -34.437500 3.003906 20 49 106987490
98233 4.941406 -5.777344 -0.162720 12 32 106987490
98234 6.640625 -8.257812 -0.157593 6 33 106989794
98235 20.015625 -37.062500 2.550781 12 47 106989794
[98236 rows x 6 columns]
Lidar Dataset splits
We randomly partition the dataset into the following splits:
- Train (16,000 logs)
- Validation (2,000 logs)
- Test (2,000 logs)