https://doi.org/10.1140/epjp/s13360-025-06686-2
Regular Article
An adaptive recalibrative contextual squeeze-and-excitation self-attention V-Net for kidney tumor segmentation in RCC imaging
1 Department of Computer Science and Engineering, SRM Valliammai Engineering College, Kanchipuram, India
2 School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
Received: 28 March 2025
Accepted: 24 July 2025
Published online: 14 August 2025
Abstract
Accurate and efficient kidney tumor segmentation in renal cell carcinoma (RCC) imaging is essential for early diagnosis and surgical intervention. However, existing models struggle with class imbalance, small tumor detection, boundary irregularities, and imaging variations across CT protocols, which limits their clinical applicability and generalization. To address these challenges, we propose an advanced segmentation framework called the Adaptive Recalibrative Contextual Squeeze-and-Excitation Self-Attention V-Net (ARCSAV-Net). ARCSAV-Net integrates several innovations into the traditional V-Net architecture to segment kidney tumors in RCC images more effectively. First, Adaptive Recalibrative Contextual Squeeze-and-Excitation (AR-CSE) Blocks improve feature prioritization by using radiomic biomarkers such as entropy and vascular features, mitigating the effects of class imbalance and tumor heterogeneity. Second, the Self-Attention V-Net Mechanism sharpens boundary definition by suppressing redundant features and increasing focus on low-contrast and small tumors, which improves segmentation accuracy. Third, Task-Switching Self-Supervision (TSSS) reinforces feature learning by alternating between the primary segmentation task and secondary tasks such as rotation and intensity prediction, which mitigates overfitting and improves model robustness. Fourth, Context-Based Confidence Estimation (CBCT) refines uncertain predictions to keep segmentation consistent across varying imaging protocols. Lastly, Bayesian Hyperparameter Optimization (ML-TPE) tunes model parameters with low computational overhead while preserving generalization.
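For readers unfamiliar with the squeeze-and-excitation idea that the AR-CSE blocks build on, the following is a minimal pure-Python sketch of a standard squeeze-and-excitation channel recalibration. It is not the authors' AR-CSE block: the radiomic inputs, the adaptive recalibration, and the self-attention stage are not specified in the abstract and are omitted. The function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the omission of bias terms are all assumptions made for illustration.

```python
import math
from typing import List


def squeeze_excite(
    features: List[List[float]],
    w1: List[List[float]],
    w2: List[List[float]],
) -> List[List[float]]:
    """Standard squeeze-and-excitation recalibration (illustrative only).

    features: C channels, each a flattened list of voxel activations.
    w1: bottleneck weights with shape (C // r) x C (hypothetical values).
    w2: expansion weights with shape C x (C // r) (hypothetical values).
    Bias terms are left out to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(channel) / len(channel) for channel in features]

    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]

    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Scale: multiply every activation in a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, features)]


if __name__ == "__main__":
    feats = [[1.0, 3.0], [2.0, 2.0]]   # two channels, two voxels each
    w1 = [[0.5, -0.5]]                 # reduction to one hidden unit
    w2 = [[1.0], [-1.0]]               # expansion back to two channels
    print(squeeze_excite(feats, w1, w2))
```

In a trained network the gate values would be learned so that informative channels are amplified and redundant ones suppressed; here the weights are arbitrary placeholders.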
Experimental results on the KiTS19 and KiTS21 datasets demonstrate that ARCSAV-Net achieves a Dice Similarity Coefficient (DSC) of 0.985, a Volumetric Overlap Error (VOE) of 0.16, and a Mean Surface Distance (MSD) of 0.6 mm, significantly outperforming existing methods in both accuracy and inference speed.
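As a reminder of how the two overlap metrics quoted above are commonly defined, the sketch below computes DSC and VOE for a pair of flattened binary masks, using DSC = 2|A∩B| / (|A| + |B|) and VOE = 1 − |A∩B| / |A∪B|. This is a minimal illustration under those textbook definitions, not the authors' evaluation code; the function name `dice_and_voe` and the convention of returning perfect agreement for two empty masks are assumptions. MSD is not shown because it requires surface extraction and voxel spacing.

```python
from typing import Iterable, Tuple


def dice_and_voe(pred: Iterable[int], truth: Iterable[int]) -> Tuple[float, float]:
    """Return (DSC, VOE) for two flattened binary segmentation masks."""
    p = [bool(v) for v in pred]
    t = [bool(v) for v in truth]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of voxels")

    intersection = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)          # |A| + |B|
    union = total - intersection     # |A ∪ B|

    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0, 0.0

    dsc = 2.0 * intersection / total
    voe = 1.0 - intersection / union
    return dsc, voe


if __name__ == "__main__":
    predicted = [1, 1, 1, 0, 0, 0]
    reference = [0, 1, 1, 1, 0, 0]
    dsc, voe = dice_and_voe(predicted, reference)
    print(f"DSC = {dsc:.3f}, VOE = {voe:.3f}")
```

For 3D CT volumes the same function can be applied after flattening the predicted and reference label volumes to one dimension.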
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
© The Author(s), under exclusive licence to Società Italiana di Fisica and Springer-Verlag GmbH Germany, part of Springer Nature 2025

