UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
- Yuping Wang*
- Xiangyu Huang*
- Xiaokang Sun*
- Mingxuan Yan
- Shuo Xing
- Zhengzhong Tu
- Jiachen Li‡
*Equal contribution, ‡Corresponding author
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
A comprehensive, open-source benchmark unifying 2D/3D occupancy labels and per-voxel flow annotations across multiple real-world and synthetic datasets.
Our UniOcc framework enables three representative tasks (occupancy prediction, occupancy forecasting with optional flow, and cooperative occupancy prediction and forecasting) by unifying:
➤ Data Format and Features
File Structure
- scene_infos.pkl
A list of dictionaries, each containing the scene name, start and end frame, and other metadata.
- scene_XXX
A directory containing the data for a single scenario.
  - YYY.npz
  A NumPy file containing the following data for a single time step.
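The layout above can be walked with the standard library alone. The following is a minimal sketch, not the benchmark's own loader: the function name `list_frames`, the `scene_name`, `start_frame`, and `end_frame` keys, and the synthetic directory built for the demo are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def list_frames(dataset_root):
    """Return (scene_infos, {scene directory name: sorted .npz paths})."""
    root = Path(dataset_root)
    with open(root / "scene_infos.pkl", "rb") as f:
        scene_infos = pickle.load(f)
    frames = {}
    for scene_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        frames[scene_dir.name] = sorted(scene_dir.glob("*.npz"))
    return scene_infos, frames


# Build a tiny synthetic dataset so the sketch runs without any download.
# The dictionary keys below are hypothetical; check the real scene_infos.pkl.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "scene_001").mkdir()
for step in range(2):
    np.savez(demo_root / "scene_001" / f"{step:03d}.npz",
             occ_label=np.zeros((4, 4, 2), dtype=np.uint8))
with open(demo_root / "scene_infos.pkl", "wb") as f:
    pickle.dump([{"scene_name": "scene_001", "start_frame": 0, "end_frame": 1}], f)

scene_infos, frames = list_frames(demo_root)
for scene_name, paths in frames.items():
    print(scene_name, [p.name for p in paths])
```

Sorting the file names keeps time steps in order only if the step index is zero-padded, which is another assumption about the naming scheme.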
Core Data (`.npz` file contents)
- occ_label
A 3D occupancy grid (L x W x H) with semantic labels.
- occ_mask_camera
A 3D grid (L x W x H) with binary values, with 1 indicating the voxel is in the camera FOV and 0 otherwise.
- occ_flow_forward
A 3D flow field (L x W x H x 3) with voxel flow vectors pointing to each voxel’s next frame coordinate.
- occ_flow_backward
A 3D flow field pointing to each voxel’s previous frame coordinate.
- ego_to_world_transformation
A 4x4 transformation matrix from the ego vehicle to the world coordinate system.
- cameras
List of camera objects.
  - name
  e.g. CAM_FRONT
  - filename
  Relative path to image.
  - intrinsics
  3x3 intrinsic matrix.
  - extrinsics
  4x4 camera-to-ego matrix.
- annotations
List of object annotations.
  - token
  Original object token.
  - agent_to_ego
  4x4 object-to-ego matrix.
  - agent_to_world
  4x4 object-to-world matrix.
  - size
  Bounding box in meters (L,W,H).
  - category_id
  Semantic class ID.
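As an illustration of how the fields listed above fit together, the sketch below reads one `.npz` file and applies the 4x4 and 3x3 matrices to points. It is a hedged example under stated assumptions: the helper names are hypothetical, `allow_pickle=True` is assumed to be needed for the object-valued `cameras` and `annotations` entries, and the camera frame is assumed to follow the common pinhole convention with z pointing forward.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_frame(npz_path):
    """Read every array stored in one time-step file into a dict."""
    with np.load(npz_path, allow_pickle=True) as data:
        return {key: data[key] for key in data.files}


def apply_transform(points, transform):
    """Apply a 4x4 homogeneous transform to (N, 3) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (transform @ homogeneous.T).T[:, :3]


def project_to_image(points_ego, intrinsics, extrinsics):
    """Project ego-frame points to pixels; extrinsics is camera-to-ego."""
    points_cam = apply_transform(points_ego, np.linalg.inv(extrinsics))
    pixels = (intrinsics @ points_cam.T).T
    # Assumes a z-forward camera frame; the third column is depth.
    return pixels[:, :2] / pixels[:, 2:3], points_cam[:, 2]


# Synthetic stand-in for scene_XXX/YYY.npz so the sketch runs offline.
pose = np.eye(4)
pose[:3, 3] = [10.0, 5.0, 0.0]  # ego sits at (10, 5, 0) in the world frame
frame_path = Path(tempfile.mkdtemp()) / "000.npz"
np.savez(frame_path,
         occ_label=np.zeros((4, 4, 2), dtype=np.uint8),
         occ_flow_forward=np.zeros((4, 4, 2, 3), dtype=np.float32),
         ego_to_world_transformation=pose)

frame = load_frame(frame_path)
world_points = apply_transform(np.array([[1.0, 0.0, 0.0]]),
                               frame["ego_to_world_transformation"])

# Hypothetical camera: identity extrinsics, 100 px focal length, center (50, 50).
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
pixels, depth = project_to_image(np.array([[0.0, 0.0, 2.0]]), K, np.eye(4))
print(world_points, pixels, depth)
```

Points with non-positive depth lie behind the camera and should be discarded before using the pixel coordinates.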
➤ Datasets
| Dataset | Data Source | Length | Scenarios | Sampling Rate | Resolution | Flow | Obj Categories |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Occ3D nuScenes | nuScenes | 9.5 hrs | 1110 | 2 Hz | 0.2 / 0.4 m | - | 17 |
| Occ3D Waymo | Waymo | 4.0 hrs | 998 | 10 Hz | 0.2 / 0.4 m | - | 15 |
| SurroundOcc | nuScenes | 9.5 hrs | 1110 | 2 Hz | 0.5 m | - | 17 |
| OpenOccupancy | nuScenes | 9.5 hrs | 1110 | 2 Hz | 0.1 m | - | 17 |
| CoHFF | OpenCOOD | 0.69 hrs | 44 | 10 Hz | 1.0 m | - | 10 |
| UniOcc (Ours) | nuScenes, Waymo, CARLA, OpenCOOD | 14.2+ hrs | 2152 | 2 Hz / 10 Hz | 0.2 / 0.4 m | Voxel Level | 10, 15, 17 |
➤ Occupancy Processing Toolkit
Occupancy Space Utilities
- GetVoxelCoordinates
Compute voxel indices in a 3D grid occupied by a transformed bounding box.
- VoxelToCorners
Convert voxel indices to 3D bounding-box corner coordinates.
- OccFrameToEgoFrame
Transform voxel coordinates between occupancy and ego-centric frames.
- AlignToCentroid
Recenter voxel coordinates by subtracting their centroid.
- RasterizeCoordsToGrid
Convert a list of voxel coordinates into a binary 3D occupancy grid.
- AlignWithPCA
Rotate voxel point clouds to align with principal axes using PCA.
- SegmentVoxels
Perform 3D connected-component labeling (CCL) on an occupancy grid.
- EstimateEgoMotionFromFlows
Estimate ego-motion from voxel flow fields using RANSAC.
- TrackOccObjects
Track objects across frames using voxel flows and estimated ego-motion.
- BipartiteMatch
Solve the optimal assignment problem (Hungarian algorithm) to associate objects.
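The toolkit's own implementations are not reproduced on this page. As a rough NumPy sketch of what three of the utilities above describe (rasterizing coordinates, centroid alignment, and PCA alignment), the snake_case functions below are hypothetical stand-ins rather than the toolkit's signatures, and they assume voxel coordinates arrive as an (N, 3) array.

```python
import numpy as np


def rasterize_coords_to_grid(coords, grid_shape):
    """Mark integer (N, 3) voxel indices as occupied in a boolean grid."""
    grid = np.zeros(grid_shape, dtype=bool)
    coords = np.asarray(coords, dtype=int)
    # Drop indices that fall outside the grid instead of wrapping around.
    in_bounds = np.all((coords >= 0) & (coords < np.array(grid_shape)), axis=1)
    kept = coords[in_bounds]
    grid[kept[:, 0], kept[:, 1], kept[:, 2]] = True
    return grid


def align_to_centroid(coords):
    """Subtract the mean so the point set is centered at the origin."""
    coords = np.asarray(coords, dtype=float)
    return coords - coords.mean(axis=0)


def align_with_pca(coords):
    """Rotate centered points so the direction of largest variance maps to x."""
    centered = align_to_centroid(coords)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt.T


# A diagonal line of four voxels, plus one index outside a 4 x 4 x 1 grid.
line = np.array([[0, 0, 0], [1, 1, 0], [2, 2, 0], [3, 3, 0]])
grid = rasterize_coords_to_grid(np.vstack([line, [[5, 5, 5]]]), (4, 4, 1))
aligned = align_with_pca(line)
extent_x = aligned[:, 0].max() - aligned[:, 0].min()
print(grid.sum(), extent_x)
```

The sign of each principal axis is arbitrary in an SVD, so downstream comparisons should rely on extents or on a fixed sign convention rather than on raw coordinates.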
Voxel Flow Computation
- ComputeFlowsForObjects
Compute flow vectors for all foreground (dynamic) objects.
- ComputeFlowsForBackground
Compute static scene flow due to ego-motion for background voxels.
- ComputeFlowsForOccupancyGrid
Combine dynamic and static flow to produce a complete scene flow grid.
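The split between background and object flow described above follows from rigid-body geometry. The sketch below shows one way to derive both from the pose matrices stored in each frame; it is an illustration under assumptions, not the toolkit's code. In particular, the function names are hypothetical, flow is expressed in meters in the ego frame (the toolkit may use voxel units), and each object is treated as a rigid body.

```python
import numpy as np


def apply_transform(points, transform):
    """Apply a 4x4 homogeneous transform to (N, 3) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (transform @ homogeneous.T).T[:, :3]


def background_flow(points_ego, ego_to_world_t, ego_to_world_next):
    """Apparent displacement of static points caused by ego-motion alone."""
    # Maps frame-t ego coordinates into frame-(t+1) ego coordinates.
    t_to_next = np.linalg.inv(ego_to_world_next) @ ego_to_world_t
    return apply_transform(points_ego, t_to_next) - points_ego


def object_flow(points_ego, agent_to_ego_t, agent_to_ego_next):
    """Displacement of points rigidly attached to a moving object."""
    # Go from ego frame t into the object frame, then out to ego frame t+1.
    motion = agent_to_ego_next @ np.linalg.inv(agent_to_ego_t)
    return apply_transform(points_ego, motion) - points_ego


def translation(x, y, z):
    """Helper for the demo: a pure-translation 4x4 matrix."""
    matrix = np.eye(4)
    matrix[:3, 3] = [x, y, z]
    return matrix


point = np.array([[5.0, 0.0, 0.0]])
# Ego drives 1 m forward: a static point appears to move 1 m backward.
static_flow = background_flow(point, translation(0, 0, 0), translation(1, 0, 0))
# An object centered 5 m ahead is 6 m ahead in the next ego frame.
moving_flow = object_flow(point, translation(5, 0, 0), translation(6, 0, 0))
print(static_flow, moving_flow)
```

A full scene flow grid would then assign `object_flow` to voxels inside each annotated box and `background_flow` to the remaining occupied voxels.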
➤ Evaluation Metrics
- FindGMMForCategory
Fit a Gaussian-Mixture Model (GMM) to object dimensions for a given category.
- ComputeObjectLikelihoods
Score object shapes against the pretrained GMM to get plausibility probabilities.
- ComputeTemporalShapeConsistencyByTracking
Track objects and report mean IoU of consecutive shapes for temporal consistency.
- ComputeStaticConsistency
Warp static voxels between frames and compute IoU for background stability.
- ComputeIoU
Compute standard intersection-over-union between two occupancy grids.
- ComputeIoUForCategory
Compute IoU restricted to voxels of a single semantic class.
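For the two IoU metrics above, a minimal NumPy sketch follows. The function names, the `free_label` parameter, and the optional `mask` argument (for example `occ_mask_camera`) are assumptions for illustration; the label ID that marks free space depends on the dataset and is not stated here.

```python
import numpy as np


def compute_iou(pred, gt, free_label, mask=None):
    """Geometric IoU: any non-free label counts as occupied."""
    pred_occ = pred != free_label
    gt_occ = gt != free_label
    return _iou(pred_occ, gt_occ, mask)


def compute_iou_for_category(pred, gt, category_id, mask=None):
    """IoU restricted to voxels carrying one semantic class."""
    return _iou(pred == category_id, gt == category_id, mask)


def _iou(a, b, mask):
    if mask is not None:
        visible = np.asarray(mask, dtype=bool)
        a, b = a & visible, b & visible
    union = np.logical_or(a, b).sum()
    if union == 0:
        return float("nan")  # undefined when neither grid has the class
    return float(np.logical_and(a, b).sum() / union)


FREE = 0  # hypothetical free-space label for this toy example
pred = np.array([[[1], [0]], [[2], [0]]])
gt = np.array([[[1], [1]], [[0], [0]]])

iou_all = compute_iou(pred, gt, free_label=FREE)
iou_class_1 = compute_iou_for_category(pred, gt, category_id=1)
print(iou_all, iou_class_1)
```

In the toy grids, one voxel is occupied in both, and three are occupied in at least one, so the geometric IoU is 1/3; for class 1 the ratio is 1/2.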
➤ Visualization API
- FillRoadInOcc
Ensure the bottom-most slice of the occupancy grid contains labeled road voxels.
- CreateOccHandle
Build a full Open3D visualizer for rendering occupancy grids.
- AddFlowToVisHandle
Draw flow vectors on the Open3D visualizer for motion inspection.
- AddCenterEgoToVisHandle
Add the ego vehicle model to the visualizer.
- VisualizeOcc
Create a visualizer for a static occupancy grid.
- VisualizeOccFlow
Visualize both occupancy and voxel-level flow vectors in 3D.
- VisualizeOccFlowFile
Load and visualize data from an .npz file.
- RotateO3DCamera
Load camera parameters and apply them to the current visualizer view.
Abstract
We introduce UniOcc, a comprehensive, unified benchmark for occupancy forecasting (i.e., predicting future occupancies based on historical information) and current-frame occupancy prediction from camera images. UniOcc unifies data from multiple real-world datasets (i.e., nuScenes, Waymo) and high-fidelity driving simulators (i.e., CARLA, OpenCOOD), and provides 2D/3D occupancy labels with per-voxel flow annotations and support for cooperative autonomous driving. Unlike existing studies that rely on suboptimal pseudo labels for evaluation, UniOcc incorporates novel metrics that do not depend on ground-truth occupancy, enabling robust assessment of additional aspects of occupancy quality. Through extensive experiments on state-of-the-art models, we demonstrate that large-scale, diverse training data and explicit flow information significantly enhance occupancy prediction and forecasting performance.
Key Ideas and Contributions
1) A first-of-its-kind unified 2D/3D occupancy forecasting and prediction benchmark that includes flow information for conventional and cooperative driving, unifying real data from nuScenes and Waymo with synthetic data from CARLA and OpenCOOD.
2) A user-friendly platform for current-frame occupancy prediction and multi-frame occupancy forecasting, enabling easy setup, cross-dataset augmentation, and comprehensive occupancy evaluation with or without reference to ground-truth labels.
3) State-of-the-art performance of our pipeline and evaluation metrics on leading occupancy forecasting/prediction models, showing that (1) incorporating flow information yields performance gains in occupancy forecasting and (2) existing methods face challenges in cross-domain generalization, highlighting avenues for future research.
Qualitative Results of Occupancy Prediction
Quantitative Results of Cross Data Source Training and Evaluation for Occupancy Forecasting