3D Object Detection

Overview

The Argoverse 3D Object Detection task differentiates itself with its 26-category taxonomy and long-range (150 m) detection evaluation. The task, metrics, evaluation protocol, and object taxonomy are described below.

Baselines

Task Definition

For each unique tuple, (log_id, timestamp_ns), produce a ranked set of predictions, each of which describes an object’s location, size, and orientation in the 3D scene.
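As an illustration, a single prediction could be represented as a small record and the set ranked by confidence. This is only a sketch: the class name `Detection`, its field names, and the reduction of orientation to one yaw angle are hypothetical choices, not the required submission format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One predicted object (hypothetical field names, for illustration only)."""

    log_id: str        # identifies the log the sweep belongs to
    timestamp_ns: int  # identifies the sweep within the log
    center_m: tuple    # (x, y, z) location in meters
    size_m: tuple      # (length, width, height) in meters
    yaw_rad: float     # orientation about the vertical axis (assumed yaw only)
    category: str      # one of the taxonomy categories
    score: float       # confidence used to rank the predictions


def rank(detections):
    """Return the detections ordered from most to least confident."""
    return sorted(detections, key=lambda d: d.score, reverse=True)


detections = [
    Detection("log_a", 0, (10.0, 2.0, 0.5), (4.5, 2.0, 1.6), 0.1, "REGULAR_VEHICLE", 0.4),
    Detection("log_a", 0, (30.0, -1.0, 0.9), (0.8, 0.8, 1.8), 1.2, "PEDESTRIAN", 0.9),
]
ranked = rank(detections)
```

Both records share the same (log_id, timestamp_ns), so they belong to one sweep and are ranked against each other.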

3D Object Detection Taxonomy

| Category | Description |
| --- | --- |
| REGULAR_VEHICLE | Any conventionally sized passenger vehicle used for the transportation of people and cargo. This includes cars, vans, pickup trucks, SUVs, etc. |
| PEDESTRIAN | Person that is not driving or riding in/on a vehicle. They can be walking, standing, sitting, prone, etc. |
| BOLLARD | Bollards are short, sturdy posts installed in the roadway or sidewalk to control the flow of traffic. These may be temporary or permanent and are sometimes decorative. |
| CONSTRUCTION_CONE | Movable traffic cone that is used to alert drivers to a hazard. These will typically be orange and white striped and may or may not have a blinking light attached to the top. |
| CONSTRUCTION_BARREL | Movable traffic barrel that is used to alert drivers to a hazard. These will typically be orange and white striped and may or may not have a blinking light attached to the top. |
| STOP_SIGN | Red octagonal traffic sign displaying the word STOP, used to notify drivers that they must come to a complete stop and make sure no other road users are coming before proceeding. |
| BICYCLE | Non-motorized vehicle that typically has two wheels and is propelled by human power pushing pedals in a circular motion. |
| LARGE_VEHICLE | Large motorized vehicles (four wheels or more) which do not fit into any more specific subclass. Examples include extended passenger vans, fire trucks, RVs, etc. |
| WHEELED_DEVICE | Objects involved in the transportation of a person that do not fit a more specific class. Examples range from skateboards, non-motorized scooters, and segways to golf carts. |
| BUS | Standard city buses designed to carry a large number of people. |
| BOX_TRUCK | Chassis cab truck with an enclosed cube-shaped cargo area. The cargo area is rigidly attached to the cab and does not articulate. |
| SIGN | Official road signs placed by the Department of Transportation (DOT signs) which are of interest to us. This includes yield signs, speed limit signs, directional control signs, construction signs, and other signs that provide required traffic control information. Note that Stop Sign is captured separately, and informative signs such as street signs, parking signs, bus stop signs, etc. are not included in this class. |
| TRUCK | Vehicles that are clearly defined as trucks but do not fit into the subclasses of Box Truck or Truck Cab. Examples include common delivery vehicles (UPS, FedEx), mail trucks, garbage trucks, utility trucks, ambulances, dump trucks, etc. |
| MOTORCYCLE | Motorized vehicle with two wheels where the rider straddles the engine. These are capable of high speeds similar to a car. |
| BICYCLIST | Person actively riding a bicycle, non-pedaling passengers included. |
| VEHICULAR_TRAILER | Non-motorized, wheeled vehicle towed behind a motorized vehicle. |
| TRUCK_CAB | Heavy truck commonly known as “Semi cab”, “Tractor”, or “Lorry”. This refers only to the front part of an articulated tractor trailer. |
| MOTORCYCLIST | Person actively riding a motorcycle or a moped, including passengers. |
| DOG | Any member of the canine family. |
| SCHOOL_BUS | Bus that primarily holds school children (typically yellow) and can control the flow of traffic via the use of an articulating stop sign and loading/unloading flasher lights. |
| WHEELED_RIDER | Person actively riding or being carried by a wheeled device. |
| STROLLER | Push-cart with wheels meant to hold a baby or toddler. |
| ARTICULATED_BUS | Articulated buses perform the same function as a standard city bus, but are able to bend (articulate) towards the center. These will also have a third set of wheels not present on a typical bus. |
| MESSAGE_BOARD_TRAILER | Trailer carrying a large, mounted, electronic sign to display messages. Often found around construction sites or large events. |
| MOBILE_PEDESTRIAN_SIGN | Movable sign designating an area where pedestrians may cross the road. |
| WHEELCHAIR | Chair fitted with wheels for use as a means of transport by a person who is unable to walk as a result of illness, injury, or disability. This includes both motorized and non-motorized wheelchairs as well as low-speed seated scooters not intended for use on the roadway. |

Metrics

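All of our reported metrics require assigning predictions to ground truth annotations, written as $a_{\text{pd},\text{gt}}$, to compute true positives (TP), false positives (FP), and false negatives (FN).

Formally, we define a true positive as:

$$\text{TP}_{\text{pd},\text{gt}} = \left\{ a_{\text{pd},\text{gt}} : \lVert v_{\text{pd}} - v_{\text{gt}} \rVert_2 \leq d \right\},$$

where $v_{\text{pd}}$ and $v_{\text{gt}}$ are the centers of the prediction and the ground truth annotation, and $d$ is a distance threshold in meters.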

Important

Duplicate assignments are considered false positives.
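The following sketch shows how the distance threshold and the duplicate rule could interact. It assumes that each prediction is assigned to its nearest annotation by Euclidean distance between centers, and that predictions arrive sorted by descending confidence. The function name `label_predictions` is hypothetical.

```python
import math


def label_predictions(pred_centers, gt_centers, d):
    """Label predictions (sorted by descending confidence) as "TP" or "FP".

    Assumption: a prediction is a true positive only if its nearest annotation
    lies within d meters and has not been claimed by an earlier prediction.
    """
    claimed = set()
    labels = []
    for p in pred_centers:
        if not gt_centers:
            labels.append("FP")
            continue
        # Index of the nearest annotation.
        j = min(range(len(gt_centers)), key=lambda k: math.dist(p, gt_centers[k]))
        if math.dist(p, gt_centers[j]) <= d and j not in claimed:
            claimed.add(j)
            labels.append("TP")
        else:
            labels.append("FP")  # too far away, or a duplicate assignment
    false_negatives = len(gt_centers) - len(claimed)
    return labels, false_negatives


gts = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
preds = [(0.3, 0.0, 0.0), (0.5, 0.1, 0.0), (30.0, 0.0, 0.0)]
labels, fn = label_predictions(preds, gts, d=2.0)
```

Here the second prediction is close to an annotation that the first prediction already claimed, so it is a false positive, and the annotation at 10 m remains unmatched.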

Average Precision

Average precision measures the area underneath the precision / recall curve across different true positive distance thresholds.
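As a rough sketch, average precision at one distance threshold can be computed from the ranked TP/FP labels and then averaged over thresholds. The interpolation rule (best precision at any equal or higher recall), the 100 evenly spaced recall samples, and the function names are assumptions made for illustration.

```python
def average_precision(is_tp, num_gt, num_samples=100):
    """Area under the interpolated precision/recall curve for one threshold.

    is_tp: TP/FP flags for predictions sorted by descending confidence.
    num_gt: number of ground truth annotations.
    """
    if num_gt == 0:
        return 0.0
    precisions, recalls = [], []
    tp = 0
    for rank, flag in enumerate(is_tp, start=1):
        tp += int(flag)
        precisions.append(tp / rank)
        recalls.append(tp / num_gt)
    # Interpolate: precision at recall r is the best precision at any recall >= r.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    total = 0.0
    for s in range(1, num_samples + 1):
        r = s / num_samples
        # First operating point whose recall reaches r (precision 0 if none does).
        total += next((p for p, rc in zip(precisions, recalls) if rc >= r), 0.0)
    return total / num_samples


def mean_over_thresholds(flags_by_threshold, num_gt):
    """Average AP over several distance thresholds (keys are thresholds in meters)."""
    aps = [average_precision(flags, num_gt) for flags in flags_by_threshold.values()]
    return sum(aps) / len(aps)


ap_strict = average_precision([True, False], num_gt=2)
ap_loose = average_precision([True, True], num_gt=2)
ap_mean = mean_over_thresholds({0.5: [True, False], 2.0: [True, True]}, num_gt=2)
```

A looser threshold turns more predictions into true positives, which is why the same ranked list can score differently at each threshold.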

True Positive Metrics

All true positive metrics are computed at a 2 m threshold.

Average Translation Error (ATE)

ATE measures the distance between true positive assignments.
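A minimal sketch of this error, assuming the distance is the Euclidean distance between the 3D centers of a matched prediction and annotation (function names are hypothetical):

```python
import math


def translation_error(center_pd, center_gt):
    """Euclidean distance in meters between a matched prediction and annotation."""
    return math.dist(center_pd, center_gt)


def average_translation_error(pairs):
    """Mean translation error over (prediction_center, annotation_center) pairs."""
    return sum(translation_error(p, g) for p, g in pairs) / len(pairs)


pairs = [
    ((3.0, 4.0, 0.0), (0.0, 0.0, 0.0)),
    ((0.0, 0.0, 1.0), (0.0, 0.0, 0.0)),
]
ate = average_translation_error(pairs)
```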

Average Scale Error (ASE)

ASE measures the shape misalignment for true positive assignments.
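One common way to quantify this is one minus the intersection over union of the two boxes after aligning their centers and orientations. The sketch below uses that definition as an assumption; the exact formula used by the benchmark may differ, and the function name is hypothetical.

```python
def scale_error(size_pd, size_gt):
    """1 - IoU of two boxes that share the same center and orientation.

    size_pd, size_gt: (length, width, height) in meters.
    """
    # With aligned boxes, the overlap along each axis is the smaller extent.
    intersection = 1.0
    for a, b in zip(size_pd, size_gt):
        intersection *= min(a, b)
    volume_pd = size_pd[0] * size_pd[1] * size_pd[2]
    volume_gt = size_gt[0] * size_gt[1] * size_gt[2]
    union = volume_pd + volume_gt - intersection
    return 1.0 - intersection / union


same = scale_error((4.0, 2.0, 1.5), (4.0, 2.0, 1.5))
doubled = scale_error((2.0, 2.0, 2.0), (1.0, 1.0, 1.0))
```

Identical sizes give an error of 0, and the error approaches 1 as the sizes diverge.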

Average Orientation Error (AOE)

AOE measures the minimum angle between true positive assignments.
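A minimal sketch of the minimum angle, assuming orientation is compared as a single yaw angle in radians (the function name is hypothetical). The difference is wrapped so that headings on either side of ±π are treated as close.

```python
import math


def orientation_error(yaw_pd, yaw_gt):
    """Smallest absolute angle in radians between two headings, in [0, pi]."""
    diff = (yaw_pd - yaw_gt) % (2.0 * math.pi)  # wrap into [0, 2*pi)
    return min(diff, 2.0 * math.pi - diff)


small = orientation_error(0.1, -0.1)
wrapped = orientation_error(math.pi - 0.1, -math.pi + 0.1)
opposite = orientation_error(0.0, math.pi)
```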

Composite Detection Score (CDS)

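CDS measures the overall performance across all previously introduced metrics.

$$\text{CDS} = \text{mAP} \cdot \sum_{x \in \mathcal{X}} (1 - x),$$

where $\mathcal{X} = \{\text{mATE}_{\text{unit}}, \text{mASE}_{\text{unit}}, \text{mAOE}_{\text{unit}}\}$ are the normalized true positive errors.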

Note

ATE, ASE, and AOE are upper bounded by 2 m, 1, and π rad, respectively.

Important

CDS is the ranking metric.
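To make the composition concrete, the sketch below clips each mean error to an upper bound, rescales it to the unit interval, and multiplies mAP by the sum of the resulting scores. The bounds (2 m, 1, π rad), the sum-based aggregate, and the function names are assumptions for illustration. An implementation that averages the three scores instead would differ only by a constant factor of three.

```python
import math

# Assumed upper bounds for (ATE in meters, ASE, AOE in radians).
UPPER_BOUNDS = (2.0, 1.0, math.pi)


def normalize(errors, bounds=UPPER_BOUNDS):
    """Clip each mean error to its bound and rescale it to the unit interval."""
    return tuple(min(e, b) / b for e, b in zip(errors, bounds))


def composite_detection_score(m_ap, errors):
    """mAP multiplied by the summed scores (1 - normalized error)."""
    return m_ap * sum(1.0 - x for x in normalize(errors))


unit_errors = normalize((1.0, 0.5, math.pi))
cds = composite_detection_score(0.5, (1.0, 0.5, math.pi / 2.0))
```

Because mAP is a multiplicative factor, a detector with zero mAP scores zero regardless of how small its true positive errors are.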

Evaluation

The 3D object detection evaluation consists of the following steps:

  1. Partition the predictions and ground truth objects by a unique id, (log_id: str, timestamp_ns: uint64), which corresponds to a single sweep.

  2. For the predictions and ground truth objects associated with a single sweep, greedily assign the predictions to the ground truth objects in descending order by likelihood.

  3. Compute the true positives, false positives, and false negatives.

  4. Compute the true positive metrics.

  5. True positive, false positive, and false negative computation.
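The first three steps can be sketched end to end as follows. This is an illustrative outline rather than the official evaluation code: records are plain dictionaries with hypothetical keys (`center`, `score`), assignment is to the nearest annotation by center distance, and the 2.0 m default threshold is an assumption.

```python
import math
from collections import defaultdict


def partition(records):
    """Step 1: group records by their sweep id, (log_id, timestamp_ns)."""
    sweeps = defaultdict(list)
    for r in records:
        sweeps[(r["log_id"], r["timestamp_ns"])].append(r)
    return sweeps


def evaluate(predictions, annotations, d=2.0):
    """Steps 2 and 3: greedy assignment per sweep, then TP/FP/FN counts."""
    preds, gts = partition(predictions), partition(annotations)
    tp = fp = fn = 0
    for sweep_id in set(preds) | set(gts):
        sweep_gts = gts.get(sweep_id, [])
        claimed = set()
        # Most confident predictions are assigned first.
        ordered = sorted(preds.get(sweep_id, []), key=lambda r: r["score"], reverse=True)
        for p in ordered:
            dists = [math.dist(p["center"], g["center"]) for g in sweep_gts]
            j = min(range(len(dists)), key=dists.__getitem__, default=None)
            if j is not None and dists[j] <= d and j not in claimed:
                claimed.add(j)
                tp += 1
            else:
                fp += 1  # no annotation in range, or a duplicate assignment
        fn += len(sweep_gts) - len(claimed)
    return tp, fp, fn


predictions = [
    {"log_id": "a", "timestamp_ns": 1, "center": (0.2, 0.0, 0.0), "score": 0.9},
    {"log_id": "a", "timestamp_ns": 1, "center": (0.4, 0.0, 0.0), "score": 0.8},
    {"log_id": "a", "timestamp_ns": 2, "center": (5.0, 5.0, 0.0), "score": 0.7},
]
annotations = [
    {"log_id": "a", "timestamp_ns": 1, "center": (0.0, 0.0, 0.0)},
    {"log_id": "a", "timestamp_ns": 2, "center": (5.0, 5.5, 0.0)},
    {"log_id": "b", "timestamp_ns": 1, "center": (1.0, 1.0, 0.0)},
]
counts = evaluate(predictions, annotations)
```

Partitioning first keeps predictions from one sweep from being matched against annotations in another, and a sweep with annotations but no predictions contributes only false negatives.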