Feature fusion-based hand gesture classification with time-domain descriptors and multi-level deep attention network


ALÇİN Ö. F., Korkmaz D., Acikgoz H.

Applied Soft Computing, vol. 178, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 178
  • Publication Date: 2025
  • DOI: 10.1016/j.asoc.2025.113375
  • Journal Name: Applied Soft Computing
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Keywords: Attention mechanism, Convolutional neural network, Hand gesture classification, Human-robot interaction, sEMG
  • İnönü University Affiliated: Yes

Abstract

In conventional human-robot interaction (HRI), it is difficult to achieve adaptability with systems located on the human body. Surface electromyography (sEMG) signals have the potential to provide this adaptability in HRI by directly representing movements, and classifying hand gestures from sEMG can be an effective solution to the increasing needs of these applications. In this paper, a hybrid, multi-scale convolutional neural network (CNN) model is proposed for efficient sEMG-based classification of human hand gestures. The proposed method includes an effective feature extraction stage comprising spectral moments, sparseness, irregularity factor, Teager–Kaiser energy, Shannon entropy, Katz fractal dimension, Higuchi's fractal dimension, and waveform length. The extracted features are then converted to RGB images. The designed network is built on multi-scale convolutional blocks with residual learning and on convolutional blocks that include the convolutional block attention module (CBAM), which improves network performance by focusing on channel and spatial features. Furthermore, a pyramid non-local block is utilized at the end of the network to learn more powerful features and their correlations. Five comprehensive publicly available datasets are evaluated in the experiments, and the obtained results are compared with benchmark CNN models and with network variants using different attention mechanisms. In the comparative evaluations, the CBAM variant achieves classification accuracies between 84.62% and 97.56%, while the other attention mechanisms yield accuracies between 82.88% and 97.17%. The experiments show that the proposed method gives more accurate and robust classification performance than the other variants and benchmark models.
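The abstract names the time-domain descriptors but not their exact parameterization (window length, entropy discretization, Higuchi's maximum scale, etc.). The following is a minimal NumPy sketch of a few of them using common textbook definitions; the function names and default parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def waveform_length(x):
    """Waveform length: cumulative absolute first difference of the signal."""
    return np.sum(np.abs(np.diff(x)))

def teager_kaiser_energy(x):
    """Mean Teager-Kaiser energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    return np.mean(x[1:-1] ** 2 - x[:-2] * x[2:])

def shannon_entropy(x, bins=64):
    """Shannon entropy of the amplitude histogram (bin count is an assumption)."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def katz_fd(x):
    """Katz fractal dimension: D = log10(n) / (log10(n) + log10(d / L))."""
    n = len(x) - 1                      # number of steps in the waveform
    L = np.sum(np.abs(np.diff(x)))      # total curve length
    d = np.max(np.abs(x - x[0]))        # maximum distance from the first sample
    return np.log10(n) / (np.log10(n) + np.log10(d / L))

def higuchi_fd(x, k_max=8):
    """Higuchi's fractal dimension from curve-length scaling L(k) ~ k^(-D)."""
    n = len(x)
    ks = np.arange(1, k_max + 1)
    L = []
    for k in ks:
        lm = []
        for m in range(k):
            idx = np.arange(m, n, k)    # subsampled series x[m], x[m+k], ...
            dist = np.sum(np.abs(np.diff(x[idx])))
            norm = (n - 1) / ((len(idx) - 1) * k)
            lm.append(dist * norm / k)
        L.append(np.mean(lm))
    # D is the slope of log L(k) against log(1/k)
    D, _ = np.polyfit(np.log(1.0 / ks), np.log(L), 1)
    return D

# Example: one window of a simulated single-channel sEMG recording
rng = np.random.default_rng(0)
window = rng.standard_normal(400)
print([waveform_length(window), teager_kaiser_energy(window),
       shannon_entropy(window), katz_fd(window), higuchi_fd(window)])
```

In the paper these per-channel, per-window descriptors are arranged into feature maps and converted to RGB images; the abstract does not specify that mapping, so it is not sketched here.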
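CBAM itself is a published module (Woo et al., 2018) with a standard structure: channel attention computed from average- and max-pooled descriptors through a shared MLP, followed by spatial attention from a convolution over channel-wise average and max maps. Below is a compact PyTorch sketch of that standard form; how the paper wires CBAM into its multi-scale residual blocks is not specified by the abstract.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Shared MLP over global average- and max-pooled channel descriptors."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))          # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))           # global max pooling
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    """Conv over the concatenated channel-wise average and max maps."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class CBAM(nn.Module):
    """Channel attention followed by spatial attention, as in Woo et al. (2018)."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.sa(self.ca(x))

# The module preserves tensor shape, so it drops into a residual block
# after its convolutions without further changes.
x = torch.randn(8, 64, 32, 32)
print(CBAM(64)(x).shape)  # torch.Size([8, 64, 32, 32])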