https://doi.org/10.1140/epjp/s13360-024-05220-0
Regular Article
ARiViT: attention-based residual-integrated vision transformer for noisy brain medical image classification
1 Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan
2 Department of Computer System Engineering, The University of Engineering and Applied Science (UEAS), Swat, Pakistan
3 Technology Research Center, National Yunlin University of Science and Technology, 123 University Road, Section 3, 64002, Douliou, Yunlin, Taiwan, R.O.C.
Received: 28 February 2024
Accepted: 23 April 2024
Published online: 24 May 2024
Brain tumor detection in medical image processing is challenging because tumors vary widely in shape and texture, reflecting their diverse cellular origins. Noisy images compound this difficulty, and overlapping image intensities in tumor and non-tumor regions make it hard for predictive models to extract meaningful information from raw data. In response, we introduce ARiViT, a novel framework based on residual learning that merges vision transformers, convolutions, and adversarial learning. ARiViT’s core generator is built from RiT blocks, which combine residual convolutions and transformers with residual connections, channel compression, and strategic weight sharing to improve capability and computational efficiency. ARiViT is robust to noisy images and adapts well to various modality setups. The dataset is split into training, testing, and validation sets, using an 80:20 split and subsequently a 70:30 split, to support rigorous evaluation. Our data preprocessing strategies are tailored to handle noisy and low-quality images, as well as those with unusual tumor shapes, and the GAN architecture embedded in ARiViT manages the complexities inherent in such image variations, supporting reliable tumor detection even in challenging scenarios. In comparative evaluations, ARiViT outperforms existing CNN and transformer techniques in both qualitative and quantitative analyses, achieving an F1-score of 98.09%.
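The two evaluation ingredients named in the abstract, ratio-based dataset splitting and the F1-score, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say whether the splits are stratified, how the 80:20 and 70:30 ratios relate to each other, or how F1 is averaged across classes. The function names `stratified_split` and `f1_score`, the fixed seed, the per-class stratification, and the binary (single positive class) F1 are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction, seed=0):
    """Split sample indices into (train, test), class by class.

    Stratification and the fixed seed are assumptions; the abstract
    only states the 80:20 and 70:30 ratios.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 = 2*TP / (2*TP + FP + FN) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical toy labels: 50 non-tumor (0) and 50 tumor (1) samples.
    labels = [0] * 50 + [1] * 50
    for fraction in (0.8, 0.7):
        train, test = stratified_split(labels, fraction)
        print(f"train fraction {fraction}: {len(train)} train, {len(test)} test")
    print("toy F1:", f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

Here each ratio is applied independently to the same label list, which is only one possible reading of "80:20 and subsequently 70:30"; a nested scheme (for example, splitting the training portion again to obtain a validation set) would reuse the same function on the returned training indices.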
© The Author(s), under exclusive licence to Società Italiana di Fisica and Springer-Verlag GmbH Germany, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.