https://doi.org/10.1140/epjp/s13360-023-04775-8
Regular Article
Group Attack Dingo Optimizer for enhancing speech recognition in noisy environments
1 Department of Electronics and Communication Engineering, National Institute of Technology Karnataka, Surathkal, 575025, Mangalore, Karnataka, India
2 Department of Mathematics, Nitte (Deemed to be University), NMAM Institute of Technology, Nitte, 574110, Karkala, Karnataka, India
3 Department of Electronics and Communication Engineering, Indian Institute of Information Technology, 580029, Dharwad, Karnataka, India
Received: 3 August 2023
Accepted: 3 December 2023
Published online: 21 December 2023
Speech recognition has become a vital technology for seamless human–computer interaction, even in noisy public places. Speech enhancement (SE) techniques play a crucial role in improving the performance of applications such as machine translation, natural language processing, spoken language understanding, and text generation. In this study, we introduce a novel approach, termed the Group Attack Dingo Optimizer (GA-DOA), for optimizing speech enhancement tasks. Our method combines an improved short-time Fourier transform (STFT) with an optimized deep U-Net, using GA-DOA to fine-tune the parameters. Features are extracted using Mel-frequency cepstral coefficients (MFCCs), spectral features, and a one-dimensional convolutional neural network (1D-CNN), and GA-DOA-assisted feature selection then retains the most effective ones. The selected features are fed into our proposed hybrid model for speech recognition (HMSR), which integrates bidirectional long short-term memory (BiLSTM) with a gated recurrent unit (GRU). Experimental results show that the proposed model achieves superior recognition rates and a significantly lower word error rate (WER), demonstrating enhanced system performance even in noisy environments.
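For readers unfamiliar with the evaluation metric named above, WER is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn the recognized transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation code used in this study; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Hypothetical helper for illustration; tokenization is a plain
    whitespace split, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deletion ("the") and one substitution ("off" -> "of")
    # against a four-word reference.
    print(word_error_rate("turn the lights off", "turn lights of"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.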
© The Author(s), under exclusive licence to Società Italiana di Fisica and Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.