PEPR: Privileged Event-based Predictive Regularization

Abstract

Deep neural networks for visual perception are highly susceptible to domain shift, which limits their deployment under conditions that differ from the training data. We address this problem with a cross-modal Learning Using Privileged Information (LUPI) framework, in which event cameras are available only during training and the final model remains RGB-only at inference.

RGB streams are semantically dense but domain-dependent, while event streams are sparse yet more domain-invariant. Direct feature alignment between these modalities is therefore suboptimal: it can force RGB representations to mimic sparse event features and lose semantic detail. To overcome this, we introduce Privileged Event-based Predictive Regularization (PEPR): RGB features are trained to predict event-derived latent representations in a shared feature space, transferring event robustness without direct alignment or input reconstruction.

PEPR improves robustness to day-to-night and adverse domain shifts across object detection and semantic segmentation, while preserving a standard RGB-only inference pipeline.

TL;DR: PEPR uses events only during training. Instead of aligning RGB and event features, it makes RGB features predict event latents, improving robustness while keeping RGB-only inference.

Key Ideas

Prediction, not alignment

PEPR avoids forcing dense RGB features to directly match sparse event features. Instead, the RGB encoder predicts event-derived latent targets.
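The difference between the two objectives can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the feature sizes, the random data, the mean-squared-error loss, and the linear least-squares predictor are all assumptions chosen for brevity. It only shows that an alignment loss compares RGB features to event latents directly, whereas a predictive loss compares the output of a separate predictor to them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 64 patches with 16-dim dense RGB features.
rgb_feats = rng.normal(size=(64, 16))

# Hypothetical event latents: zero on most patches to mimic event sparsity.
active = rng.random(64) < 0.25
event_latents = rng.normal(size=(64, 16)) * active[:, None]


def mse(a, b):
    """Mean squared error between two equally shaped arrays."""
    return float(np.mean((a - b) ** 2))


# Direct alignment: the RGB features themselves must equal the event latents.
alignment_loss = mse(rgb_feats, event_latents)

# Predictive regularization: a separate predictor (here a linear map fit by
# least squares, an assumption made for brevity) carries part of the
# RGB-to-event mapping instead of the RGB features alone.
predictor, *_ = np.linalg.lstsq(rgb_feats, event_latents, rcond=None)
predictive_loss = mse(rgb_feats @ predictor, event_latents)

print(f"alignment loss:  {alignment_loss:.3f}")
print(f"predictive loss: {predictive_loss:.3f}")
```

Because the identity map is one of the linear predictors the least-squares fit can choose, the predictive loss in this toy setting can never exceed the alignment loss, and the RGB features are left unchanged.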

Events as privileged information

Events are used only during training as an additional supervisory signal. They are discarded after training.

RGB-only deployment

At test time, PEPR uses the original RGB model without event input, additional sensors, or extra inference modules.
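One way to picture RGB-only deployment is as filtering a training checkpoint so that only the RGB encoder and task head remain. The sketch below is a minimal illustration under assumed names: the parameter prefixes `event_encoder.` and `predictor.` and the checkpoint layout are hypothetical, not taken from the PEPR code release.

```python
# Hypothetical module name prefixes for the training-only components.
TRAINING_ONLY_PREFIXES = ("event_encoder.", "predictor.")


def strip_privileged_modules(state_dict, prefixes=TRAINING_ONLY_PREFIXES):
    """Return a copy of state_dict without parameters of training-only modules."""
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}


# Toy checkpoint with placeholder values instead of real tensors.
checkpoint = {
    "rgb_encoder.weight": [0.1, 0.2],
    "task_head.weight": [0.3],
    "event_encoder.weight": [0.4],
    "predictor.weight": [0.5],
}

deploy = strip_privileged_modules(checkpoint)
print(sorted(deploy))  # ['rgb_encoder.weight', 'task_head.weight']
```

The resulting dictionary has the same keys a plain RGB model would expect, which is why no additional inference module is needed.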

Method

During training, PEPR uses an RGB encoder, a task-specific prediction head, a privileged event encoder, and a predictor. The RGB stream is optimized with the standard supervised task loss, while the predictor maps RGB features to event latent patches produced by the event encoder. The event encoder and predictor are removed at inference, leaving a robust RGB-only model.
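The training setup described above could be sketched in PyTorch roughly as follows. This is a hedged illustration rather than the released code: the single-layer stand-ins for each component, the two-channel event tensor, the mean-squared-error predictive loss, the weight `reg_weight`, and treating the event encoder as a fixed target network (no gradient) are all assumptions not stated in the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEPRSketch(nn.Module):
    """Toy stand-in for the four training-time components (all layers hypothetical)."""

    def __init__(self, dim: int = 32, num_classes: int = 5):
        super().__init__()
        self.rgb_encoder = nn.Conv2d(3, dim, kernel_size=8, stride=8)    # RGB -> patch features
        self.task_head = nn.Conv2d(dim, num_classes, kernel_size=1)      # per-patch class logits
        self.event_encoder = nn.Conv2d(2, dim, kernel_size=8, stride=8)  # privileged branch
        self.predictor = nn.Conv2d(dim, dim, kernel_size=1)              # RGB features -> event latents

    def forward(self, rgb):
        # Inference path: RGB only, without event encoder or predictor.
        return self.task_head(self.rgb_encoder(rgb))

    def training_loss(self, rgb, events, labels, reg_weight: float = 1.0):
        feats = self.rgb_encoder(rgb)
        # Standard supervised task loss on the RGB stream.
        task_loss = F.cross_entropy(self.task_head(feats), labels)
        # Assumption: event latents act as fixed targets (no gradient).
        with torch.no_grad():
            targets = self.event_encoder(events)
        # Predictive regularization: predict event latent patches from RGB features.
        pred_loss = F.mse_loss(self.predictor(feats), targets)
        return task_loss + reg_weight * pred_loss


model = PEPRSketch()
rgb = torch.randn(2, 3, 64, 64)           # batch of RGB frames
events = torch.randn(2, 2, 64, 64)        # hypothetical 2-channel event representation
labels = torch.randint(0, 5, (2, 8, 8))   # per-patch class labels

loss = model.training_loss(rgb, events, labels)
loss.backward()

logits = model(rgb)                        # RGB-only forward pass
print(logits.shape)                        # torch.Size([2, 5, 8, 8])
```

In this sketch the predictor and event encoder appear only in `training_loss`, so the `forward` method used at inference is identical to that of an ordinary RGB model.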

Results

PEPR improves RGB-only detection and segmentation robustness under domain shift, outperforming direct alignment-based regularization.

Semantic Segmentation

Object Detection — FRED

Object Detection — Hard-DSEC

Datasets

FRED: Hugging Face dataset
DSEC: official dataset website
Hard-DSEC-DET: EA-DETR repository
Cityscapes: official website
Cityscapes Adverse: Hugging Face dataset

To simulate the event version of Cityscapes, please refer to the official VID2E repository.
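For intuition only, the idea behind event simulation can be sketched with a simplified frame-differencing model: an event fires at a pixel whenever its log intensity changes by more than a contrast threshold, with the sign of the change giving the polarity. The function below is a hypothetical illustration of that principle, not the VID2E API, and it omits the frame interpolation and timestamping a full simulator performs; the threshold and `eps` values are arbitrary assumptions.

```python
import numpy as np


def frames_to_event_counts(prev_frame, next_frame, threshold=0.2, eps=1e-3):
    """Approximate per-pixel event counts between two grayscale frames in [0, 1].

    Returns an array of shape (2, H, W): channel 0 holds positive-polarity
    counts, channel 1 holds negative-polarity counts.
    """
    log_prev = np.log(prev_frame.astype(np.float64) + eps)
    log_next = np.log(next_frame.astype(np.float64) + eps)
    delta = log_next - log_prev
    # Number of whole threshold crossings; the sign encodes polarity.
    counts = np.trunc(delta / threshold).astype(np.int64)
    positive = np.clip(counts, 0, None)
    negative = np.clip(-counts, 0, None)
    return np.stack([positive, negative], axis=0)


prev_frame = np.full((4, 4), 0.5)
next_frame = prev_frame.copy()
next_frame[0, 0] = 1.0    # pixel gets brighter
next_frame[1, 1] = 0.25   # pixel gets darker

event_counts = frames_to_event_counts(prev_frame, next_frame)
print(event_counts.shape)  # (2, 4, 4)
```

Pixels whose intensity does not change produce no events, which is the source of the sparsity discussed in the abstract.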

Code

Code for semantic segmentation and object detection is coming soon.

GitHub Repository

BibTeX

@inproceedings{magrini2026pepr,
  title={PEPR: Privileged Event-based Predictive Regularization for Domain Generalization},
  author={Magrini, Gabriele and Becattini, Federico and Biondi, Niccol{\`o} and Pala, Pietro},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}