Small, fast, and lightweight drones are difficult to capture with traditional RGB cameras, which struggle with fast-moving objects, especially under challenging lighting conditions. Event cameras are well suited to this setting, providing high temporal resolution and high dynamic range, yet existing benchmarks often lack fine temporal resolution or drone-specific motion patterns, hindering progress in these areas. This paper introduces the Florence RGB-Event Drone dataset (FRED), a novel multimodal dataset specifically designed for drone detection, tracking, and trajectory forecasting, combining RGB video and event streams. FRED features more than 7 hours of densely annotated drone trajectories, recorded with 5 different drone models and including challenging scenarios such as rain and adverse lighting conditions. We provide detailed evaluation protocols and standard metrics for each task, facilitating reproducible benchmarking. We hope FRED will advance research in high-speed drone perception and multimodal spatiotemporal understanding.
The FRED dataset is spatiotemporally synchronized, meaning that RGB and event frames can be overlaid directly on one another.
FRED provides detection annotations for more than 7 hours of drone footage, with more than 700,000 annotated frames per modality.
FRED presents challenging tasks such as tracking multiple drones within a scene.
FRED also explores drone trajectory forecasting as a separate task, with a collection of complex drone motions arising from the multiple drone models and challenging scenarios.
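Because the RGB and event frames are spatiotemporally synchronized, a paired frame can be inspected by blending one modality over the other. The following is a minimal sketch of such an overlay, not FRED's own tooling: the function name `overlay_events`, the `alpha` parameter, and the representation of an event frame as a signed 2D array (positive values for positive-polarity events, negative for negative-polarity, zero for no event) are all assumptions made for illustration, and the dataset's actual storage format may differ.

```python
import numpy as np


def overlay_events(rgb, event_frame, alpha=0.6):
    """Blend an event frame onto an RGB frame of identical height and width.

    rgb:         (H, W, 3) uint8 image.
    event_frame: (H, W) signed array (assumed encoding): >0 positive events,
                 <0 negative events, 0 no event.
    alpha:       blend weight of the event colour, in [0, 1].
    """
    if rgb.shape[:2] != event_frame.shape:
        raise ValueError("frames must share the same height and width")

    out = rgb.astype(np.float32)
    red = np.array([255, 0, 0], dtype=np.float32)   # positive polarity
    blue = np.array([0, 0, 255], dtype=np.float32)  # negative polarity

    pos = event_frame > 0
    neg = event_frame < 0
    # Pixels without events are left untouched.
    out[pos] = (1 - alpha) * out[pos] + alpha * red
    out[neg] = (1 - alpha) * out[neg] + alpha * blue
    return np.clip(out, 0, 255).astype(np.uint8)


# Synthetic 4x4 example standing in for one synchronized frame pair.
rgb = np.full((4, 4, 3), 100, dtype=np.uint8)
events = np.zeros((4, 4), dtype=np.int8)
events[1, 2] = 1    # one positive event
events[3, 0] = -1   # one negative event
blended = overlay_events(rgb, events, alpha=0.5)
```

The shape check makes the synchronization assumption explicit: the blend is only meaningful when both frames cover the same pixel grid at the same instant, so mismatched inputs are rejected rather than silently resized.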
Under Review