| Issue | ITM Web Conf., Volume 84, 2026; 2026 International Conference on Advent Trends in Computational Intelligence and Data Science (ATCIDS 2026) | |
|---|---|---|
| Article Number | 01003 | |
| Number of page(s) | 5 | |
| Section | Intelligent Computing in Healthcare and Bioinformatics | |
| DOI | https://doi.org/10.1051/itmconf/20268401003 | |
| Published online | 06 April 2026 | |
Swin-UNet: A Unified Transformer–CNN Framework for Multi-Organ Medical Image Segmentation
School of Information Science and Engineering, Lanzhou University, Lanzhou, China
Abstract
Transformer-based architectures have shown significant promise in medical image segmentation because they model long-range contextual relationships well. However, the standard Vision Transformer (ViT) modules used in hybrid networks such as TransUNet are limited in their ability to represent both fine-grained and coarse features. To overcome this limitation, this paper introduces Swin-UNet, a hybrid framework that combines a hierarchical Swin Transformer encoder with a U-Net-inspired decoder. The encoder uses shifted-window self-attention for efficient local-global feature learning, while the decoder integrates residual convolutional paths and multi-scale patch embeddings for improved reconstruction and scale robustness. Evaluated on the Synapse multi-organ CT dataset, the model achieves competitive Dice scores and lower Hausdorff distances than U-Net and TransUNet, highlighting its potential as a robust and generalizable approach to medical image segmentation. These results suggest that Swin-UNet balances computational efficiency with segmentation accuracy, offering a strong foundation for future medical imaging applications.
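For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how a Dice score and a Hausdorff distance can be computed for a single binary mask. It is illustrative only and is not the authors' evaluation code: the function names (`foreground`, `dice_score`, `hausdorff_distance`), the toy 4x4 masks, the handling of empty masks, and the use of the plain symmetric Hausdorff distance over foreground pixel coordinates are all assumptions, since the abstract does not state which variant was used.

```python
import math


def foreground(mask):
    """Return the set of (row, col) coordinates whose value is truthy."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    a, b = foreground(pred), foreground(target)
    if not a and not b:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance between the foreground pixel sets."""
    a, b = foreground(pred), foreground(target)
    if not a or not b:
        return math.inf  # assumption: undefined case reported as infinity

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Toy example: a 2x2 predicted square shifted one column from the target.
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

print(dice_score(pred, target))          # 0.5
print(hausdorff_distance(pred, target))  # 1.0
```

In a multi-organ setting, one plausible approach is to compute these values separately for each organ label and then average them; a higher Dice score and a lower Hausdorff distance both indicate closer agreement with the reference segmentation.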
© The Authors, published by EDP Sciences, 2026
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

