https://doi.org/10.1140/epjp/s13360-025-07234-8
Regular Article
A two-stage convolutional gated recurrent unit with multi-head attention and regularization for multi-step ahead prediction of particulate air pollutants
1 Department of Computer & Information Sciences, Pakistan Institute of Engineering & Applied Sciences (PIEAS), 45650, Nilore, Islamabad, Pakistan
2 Department of Physics, AIR University, PAF Complex, E-9, 44000, Islamabad, Pakistan
3 Future Technology Research Center, National Yunlin University of Science and Technology, 123 University Road, Section 3, 64002, Douliou, Yunlin, Taiwan, R.O.C.
Received: 27 May 2025
Accepted: 18 December 2025
Published online: 8 January 2026
The rapid pace of industrialization and urbanization has significantly intensified the presence of harmful air pollutants, particularly particulate matter, posing serious risks to public health. Accurate and timely prediction of these pollutants is critical for developing effective mitigation and control strategies. In this study, a two-stage framework is introduced for particulate matter prediction that integrates a random forest regressor for spatial components with a deep temporal model, the convolutional gated recurrent unit. The random forest generates initial predictions that capture intricate, nonlinear patterns in the input features, thereby enriching the input representation for downstream learning. In the downstream model, the convolutional layer extracts localized spatial features and the gated recurrent unit layers capture long-term temporal dependencies. To further refine the learned representations and emphasize the most critical temporal and spatial features, a multi-head attention layer is applied to the outputs of the convolutional gated recurrent unit. Additionally, regularization techniques are employed within the proposed model to enhance generalization and mitigate overfitting. The triangular approach explores the interactions among pollutant concentrations, meteorological variables, and station locations, providing a comprehensive spatiotemporal representation of pollutant dynamics. Extensive experiments are carried out on real-world Beijing hourly datasets, covering both single-step and multi-step forecasting. A case study on the benchmark Indian dataset, together with statistical significance testing at the 95% confidence level, validates the model's effectiveness. The results consistently show that the proposed framework outperforms existing methods.
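As a rough illustration of the data preparation that the two-stage, multi-step setup implies, the sketch below appends a stage-one prediction to each hourly feature row and then pairs a fixed-length input window with the next several target values. It is only a sketch under stated assumptions: the helper names `augment_with_stage_one` and `make_windows`, the `lookback` and `horizon` parameters, and the toy values are hypothetical and are not taken from the article, which does not specify its window sizes or feature layout in the abstract.

```python
from typing import List, Sequence, Tuple

Window = Tuple[List[List[float]], List[float]]


def augment_with_stage_one(
    rows: Sequence[Sequence[float]], stage_one_preds: Sequence[float]
) -> List[List[float]]:
    """Append a stage-one prediction (e.g. from a random forest) to each hourly row.

    Hypothetical helper: the stage-one model itself is not implemented here.
    """
    if len(rows) != len(stage_one_preds):
        raise ValueError("one stage-one prediction is required per row")
    return [list(row) + [pred] for row, pred in zip(rows, stage_one_preds)]


def make_windows(
    rows: Sequence[Sequence[float]],
    targets: Sequence[float],
    lookback: int,
    horizon: int,
) -> List[Window]:
    """Pair each `lookback`-hour input window with the next `horizon` target values."""
    if len(rows) != len(targets):
        raise ValueError("rows and targets must have the same length")
    samples: List[Window] = []
    last_start = len(rows) - lookback - horizon
    for start in range(last_start + 1):
        x = [list(row) for row in rows[start:start + lookback]]
        y = list(targets[start + lookback:start + lookback + horizon])
        samples.append((x, y))
    return samples


# Toy data: six hours, one raw feature per hour, made-up target values.
raw_rows = [[float(i)] for i in range(6)]
pm_targets = [10.0 * i for i in range(6)]
rf_preds = [0.5] * 6  # placeholder stage-one outputs, not real predictions

augmented = augment_with_stage_one(raw_rows, rf_preds)
windows = make_windows(augmented, pm_targets, lookback=3, horizon=2)
```

With `horizon=1` the same function yields single-step samples; a larger `horizon` yields the multi-step case. Each window would then be passed to the temporal model, which is outside the scope of this sketch.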
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
© The Author(s), under exclusive licence to Società Italiana di Fisica and Springer-Verlag GmbH Germany, part of Springer Nature 2026

