Benchmarks Multimodal

Relational Map Perception

Online HD map construction from sensor data, targeting full Lanelet2 fidelity — including traffic signs, traffic lights, bike lanes, and complete topological connectivity. KITScenes Multimodal is the first public dataset to provide all of these elements simultaneously.

Leaderboard

Stay tuned for the KITScenes Multimodal Challenges!

Community leaderboard coming soon.

Preview the dataset on HuggingFace

Paper Results

Average precision (AP, higher is better) per map element category. Classes are grouped into Lane Markings (LM), Lane Centerlines (LC), Road Infrastructure (RI), Traffic Lights (TL), and Traffic Signs (TS). For MapQR-Topo, the topology score (APtopo) is also reported.

Model       APavg  APLM  APLC  APRI  APTL  APTS  APtopo
MapTRv2       5.1  18.0   6.7   5.8   3.0   8.1       -
SDTagNet      4.5  19.4   7.1   6.3   2.4   9.0       -
MapQR-Topo    4.1  16.0   5.9   3.6   1.9   5.6    16.4

APtopo is reported only for MapQR-Topo, the only method of the three designed to predict full topological connectivity. The Road Markings (RM) category is not reported.
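For readers who want to compare these numbers programmatically, the table can be held in a small structure that treats an unreported score as missing rather than as zero. The following is a minimal sketch, not part of the benchmark tooling: the names COLUMNS, RESULTS, and best_per_column are hypothetical, and it assumes that each row's values follow the header order as printed and that the absent seventh value for MapTRv2 and SDTagNet is the unreported APtopo.

```python
from typing import Dict, List, Optional

# Column order as printed in the table header (assumed to match the row values).
COLUMNS = ["APavg", "APLM", "APLC", "APRI", "APTL", "APTS", "APtopo"]

# Values copied from the table; None marks a score that is not reported.
RESULTS: Dict[str, List[Optional[float]]] = {
    "MapTRv2":    [5.1, 18.0, 6.7, 5.8, 3.0, 8.1, None],
    "SDTagNet":   [4.5, 19.4, 7.1, 6.3, 2.4, 9.0, None],
    "MapQR-Topo": [4.1, 16.0, 5.9, 3.6, 1.9, 5.6, 16.4],
}


def best_per_column(results: Dict[str, List[Optional[float]]]) -> Dict[str, str]:
    """Return the model with the highest reported score in each column."""
    best = {}
    for i, column in enumerate(COLUMNS):
        # Skip models that do not report this column instead of counting them as 0.
        reported = {model: row[i] for model, row in results.items() if row[i] is not None}
        best[column] = max(reported, key=reported.get)
    return best


if __name__ == "__main__":
    for column, model in best_per_column(RESULTS).items():
        print(f"{column}: {model}")
```

Keeping missing scores as None avoids ranking a method last on a metric it was never evaluated on, which matters here because only one method reports APtopo.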

KIT, FZI, TU Delft, UC3M, UPM, University of Toronto