Road Segmentation in High-Resolution Overhead Imagery Using Deep Encoder–Decoder Architecture

Tavakoli, MohammadReza; Abedini, Abbas

doi:10.22059/eoge.2026.411573.1219

Document Type : Original Article

Authors

¹ University of Tehran

² Corresponding author, School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran.

Abstract

This study develops and comparatively evaluates deep-learning-based road segmentation approaches for the automated extraction of road networks from high-resolution overhead imagery, motivated by the need for scalable road-map updating and reliable mask generation for downstream GIS workflows. Experiments were conducted on the Massachusetts Roads Dataset, which comprises 1,171 aerial image tiles of 1500 × 1500 pixels at approximately 1 m spatial resolution. Three segmentation architectures (U-Net, FPN, and MA-Net) were assessed under a controlled experimental setup in which all models share the same high-capacity feature extractor, which isolates the effects of decoder design and feature-fusion strategy. Performance was evaluated using the pixel-level metrics Accuracy, IoU, F1-score, Precision, and Recall, supported by qualitative visual inspection of predicted masks and by error-map localization analysis. Quantitative results indicate that U-Net achieves the strongest overall performance (Accuracy = 0.97, IoU = 0.89, F1 = 0.94, Precision = 0.99, Recall = 0.92), followed by FPN and MA-Net. Visual comparisons show that all methods recover the dominant road layout, while performance differences are more pronounced in thin streets, intersections, and visually complex regions. Error-map analysis further reveals that disagreements are concentrated around connectivity-critical structures such as junctions and narrow links, with MA-Net exhibiting the most widespread error patterns and U-Net showing more spatially localized discrepancies.
Overall, the findings indicate that, under a shared backbone capacity, decoder inductive bias and feature-integration strategy significantly influence road segmentation quality: U-Net provides the most reliable overlap and precision–recall balance, FPN remains highly competitive due to effective multi-scale fusion, and MA-Net appears more sensitive to background clutter, as reflected in its broader error-map disagreement patterns.
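
As an illustration of the pixel-level metrics and the error-map analysis named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names (`confusion_counts`, `pixel_metrics`, `error_map`), the nested-list mask representation, and the convention of returning 0.0 when a denominator is zero are all assumptions made for this example, and the tiny 3 × 3 masks are made up.

```python
from typing import Dict, List, Sequence

# Assumption: a mask is a sequence of rows, each pixel 0 (background) or 1 (road).
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Dict[str, int]:
    """Count true/false positives and negatives, treating road (1) as positive."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                counts["tp"] += 1
            elif p:
                counts["fp"] += 1
            elif t:
                counts["fn"] += 1
            else:
                counts["tn"] += 1
    return counts


def pixel_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Accuracy, IoU, F1, Precision, and Recall for one binary mask pair."""
    c = confusion_counts(pred, truth)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]

    def ratio(num: int, den: int) -> float:
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "iou": ratio(tp, tp + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


def error_map(pred: Mask, truth: Mask) -> List[List[int]]:
    """Mark each pixel where prediction and reference disagree with 1."""
    return [
        [int(bool(p) != bool(t)) for p, t in zip(pred_row, truth_row)]
        for pred_row, truth_row in zip(pred, truth)
    ]


# Made-up example: a small cross-shaped "junction" and an imperfect prediction.
truth = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
pred = [[0, 1, 0],
        [1, 1, 0],
        [0, 1, 1]]

metrics = pixel_metrics(pred, truth)
errors = error_map(pred, truth)
```

In this toy case one road pixel is missed and one background pixel is falsely marked, so the error map is non-zero only at those two locations. On real tiles, inspecting where such non-zero pixels cluster (for example, around junctions and narrow links) is one way the localization analysis described above could be approached.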

Main Subjects

Remote Sensing

Earth Observation and Geomatics Engineering

Volume 9, Issue 2
December 2025
Pages 118-131