SCIEPublish

MUGI-Net: A Group-Aware Pedestrian Trajectory Prediction Model for Autonomous Vehicles from First-Person View

Article Open Access

Author Information
Wang Zheng Institute of Micro-Electronics, Changzhou University, Changzhou 213000, China
* Authors to whom correspondence should be addressed.

Received: 12 March 2026; Revised: 26 March 2026; Accepted: 10 April 2026; Published: 22 April 2026

© 2026 The authors. This is an open access article under the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).

Drones Auton. Veh. 2026, 3(2), 10012; DOI: 10.70322/dav.2026.10012
ABSTRACT: With the rapid development of autonomous driving, first-person view (FPV) pedestrian trajectory prediction has emerged as a key research direction for improving the safety and operational efficiency of transportation systems. However, existing studies overlook group information among pedestrians as well as short- and long-term temporal dependencies, which leads to error accumulation at medium and long prediction horizons. To address these problems, we propose MUGI-Net (Mixture of Universals and Group Interaction Network), an FPV pedestrian trajectory prediction model. MUGI-Net uses a group pooling mechanism to adaptively aggregate group nodes and builds sparse intra- and inter-group interaction graphs to fuse group interaction information. It then employs a Mixture of Universals (MoU) structure, which combines a Mixture of Feature Extractors (MoF) and a Mixture of Architectures (MoA), to capture short-term dynamics and long-term dependencies simultaneously. Extensive experiments on the JAAD and PIE datasets show that MUGI-Net reduces the mean squared error (MSE) of the 1.5 s prediction by 5% compared with the state-of-the-art AANet and achieves the best performance on multiple key metrics, indicating its potential for autonomous driving in mixed traffic scenarios.
Keywords: First-person view; Trajectory prediction; Group interaction; Hybrid temporal encoding
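
The headline result in the abstract is a relative reduction in MSE at the 1.5 s horizon. The sketch below illustrates how such a figure could be computed. It is not the authors' evaluation code. It assumes bounding boxes stored as (x1, y1, x2, y2) in pixels and an MSE averaged over all coordinates of all frames up to the horizon. The function names `horizon_mse` and `relative_reduction` and the toy values are hypothetical.

```python
from typing import List, Sequence

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Sequence[float]


def horizon_mse(pred: List[List[Box]], truth: List[List[Box]], steps: int) -> float:
    """Mean squared error over the first `steps` predicted frames.

    `pred` and `truth` are indexed as [sample][frame][coordinate].
    """
    total, count = 0.0, 0
    for p_traj, t_traj in zip(pred, truth):
        for p_box, t_box in zip(p_traj[:steps], t_traj[:steps]):
            for p, t in zip(p_box, t_box):
                total += (p - t) ** 2
                count += 1
    if count == 0:
        raise ValueError("no frames to evaluate")
    return total / count


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fractional improvement of `candidate` over `baseline` (0.05 means 5%)."""
    return (baseline - candidate) / baseline


# Toy data: one pedestrian, two future frames (values are made up).
truth = [[(10.0, 20.0, 30.0, 60.0), (12.0, 20.0, 32.0, 60.0)]]
pred = [[(11.0, 20.0, 30.0, 60.0), (12.0, 22.0, 32.0, 60.0)]]

mse = horizon_mse(pred, truth, steps=2)
```

Under this convention, a 1.5 s horizon on video sampled at 30 frames per second would correspond to `steps=45`. That frame rate is an assumption here, since the abstract does not state it.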