International Journal of Emerging Research in Engineering, Science, and Management
Vol. 5, Issue 2, pp. 96-118, Apr-Jun 2026.
https://doi.org/10.58482/ijeresm.v5i2.6
Received: 12 Mar 2026 | Revised: 29 May 2026 | Accepted: 10 Jun 2026 | Published: 16 Jun 2026
This work is licensed under a Creative Commons Attribution 4.0 International License.
Feedback-Guided Parallel Transformer Framework for Remote Sensing Image Semantic Segmentation
K. S. Raghavendra Reddy¹
B. Sudhakar¹
K. Venkata Ramanaiah²
¹ Department of Electronics and Communication Engineering, Faculty of Engineering and Technology, Annamalai University, Tamil Nadu 608002, India
² Department of Electronics and Communication Engineering, Y.S.R. Engineering College of Yogi Vemana University, Proddattur, Andhra Pradesh 516360, India
Abstract: Accurate semantic segmentation of remote sensing images is essential for applications such as urban planning, environmental monitoring, and land-use analysis. This study proposes a parallel-branch, feedback-guided transformer framework for semantic segmentation of remote sensing images in complex scenes. Images from publicly available datasets are first pre-processed with bilateral filtering to reduce noise while preserving important edge details. The pre-processed images are passed to a hierarchical transformer encoder, which progressively extracts multi-level feature representations. These features are then processed in parallel by two modules: a lightweight Atrous Spatial Pyramid Pooling-based Densely connected Residual (ASPP-DR) module, which captures multi-scale contextual information across multiple receptive fields, and a Dual Attention Mechanism (DAM) module, which captures spatial and channel-wise feature dependencies. The enhanced features are forwarded to a multi-stage decoder that progressively reconstructs the spatial resolution. A feedback pyramid module within the decoder iteratively refines the feature representations using previously generated outputs, while a multi-scale feature aggregation strategy combines features from different levels into a more discriminative representation. Finally, a cascaded upsampling decoder generates a high-resolution semantic segmentation map with pixel-level class labels. The proposed framework was evaluated on the LoveDA and WHU Building datasets, where it achieved mIoU values of 95.4% and 96.1%, Dice scores of 97.7% and 98.0%, and precision values of 97.8% and 98.1%, respectively. The results indicate that the framework handles multi-scale variations while maintaining high segmentation accuracy.
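For readers who want to check scores of the kind reported above, the following sketch shows one common way to derive per-class IoU, Dice, and precision from a pixel-level confusion matrix, with mIoU taken as the unweighted mean over classes. The function names and the toy matrix are illustrative assumptions, not the authors' evaluation code, and the averaging convention used in the paper is not stated in the abstract.

```python
def per_class_metrics(conf):
    """Per-class IoU, Dice and precision.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j (hypothetical layout chosen for this sketch).
    """
    n = len(conf)
    rows = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                         # true c, predicted as another class
        fp = sum(conf[r][c] for r in range(n)) - tp    # another class, predicted as c
        union = tp + fp + fn
        iou = tp / union if union else 0.0
        dice = 2 * tp / (2 * tp + fp + fn) if union else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        rows.append({"iou": iou, "dice": dice, "precision": precision})
    return rows


def mean_metric(rows, key):
    """Unweighted mean of one metric over all classes (e.g. mIoU)."""
    return sum(r[key] for r in rows) / len(rows)


# Toy two-class confusion matrix, made up for illustration only.
toy = [[50, 2],
       [3, 45]]
rows = per_class_metrics(toy)
miou = mean_metric(rows, "iou")
mdice = mean_metric(rows, "dice")
```

For a single class, the two overlap measures are tied by Dice = 2·IoU / (1 + IoU). The identity does not hold in general once the values are averaged over classes, so class-mean Dice and mIoU should be compared with that in mind.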
Keywords: Remote sensing image segmentation, Vision transformer, Dual attention mechanism, Multi-scale feature learning, Feedback refinement, Semantic segmentation.
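The bilateral pre-processing step named in the abstract can be illustrated with a brute-force, single-channel sketch. Each output pixel is a weighted average of its neighbours, where the weight is the product of a spatial Gaussian and a Gaussian on the intensity difference, so neighbours across a strong edge contribute almost nothing. The function name, the window radius, and the two sigma values are assumptions made for this example; the abstract does not give the authors' filter settings, and a practical pipeline would normally use an optimized library routine instead.

```python
import math


def bilateral_filter(img, radius=1, sigma_s=1.0, sigma_r=0.1):
    """Edge-preserving smoothing of a 2-D grayscale image (list of rows).

    sigma_s controls the spatial falloff, sigma_r the intensity falloff.
    All parameter values here are illustrative, not taken from the paper.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            centre = img[y][x]
            num = 0.0
            den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue  # skip neighbours outside the image
                    spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    diff = img[yy][xx] - centre
                    rng = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                    weight = spatial * rng
                    num += weight * img[yy][xx]
                    den += weight
            # den >= 1 because the centre pixel always has weight 1
            out[y][x] = num / den
    return out


# A vertical step edge: the filter should leave it essentially intact.
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
smoothed = bilateral_filter(edge)
```

With a small `sigma_r`, pixels on either side of the step keep their original values almost exactly, whereas a plain Gaussian blur of the same radius would smear the boundary.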
© 2026 The Author(s). Published by IJERESM.
