Document Type : Original Article
Authors
1
University of Tehran
2
Corresponding author, School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran.
10.22059/eoge.2026.411573.1219
Abstract
This study develops and comparatively evaluates deep-learning-based road segmentation approaches for the automated extraction of road networks from high-resolution overhead imagery, motivated by the need for scalable road-map updating and reliable mask generation for downstream GIS workflows. Experiments were conducted on the Massachusetts Roads Dataset, which comprises 1,171 aerial image tiles of size 1500 × 1500 pixels at approximately 1 m spatial resolution. Three segmentation architectures (U-Net, FPN, and MA-Net) were assessed under a controlled experimental setup in which all models share the same high-capacity feature extractor, isolating the effects of decoder design and feature-fusion strategy. Performance was evaluated using the pixel-level metrics Accuracy, IoU, F1-score, Precision, and Recall, supported by qualitative visual inspection of predicted masks and by error-map localization analysis. Quantitative results indicate that U-Net achieves the strongest overall performance (Accuracy = 0.97, IoU = 0.89, F1 = 0.94, Precision = 0.99, Recall = 0.92), followed by FPN and MA-Net. Visual comparisons show that all methods successfully recover the dominant road layout, while performance differences are more pronounced in thin streets, intersections, and visually complex regions. Error-map analysis further reveals that disagreements are concentrated around connectivity-critical structures such as junctions and narrow links, with MA-Net exhibiting the most widespread error patterns and U-Net showing more spatially localized discrepancies.
Overall, the findings confirm that, under a shared backbone capacity, decoder inductive bias and feature integration strategy significantly influence road segmentation quality: U-Net provides the most reliable overlap and precision–recall balance, FPN remains highly competitive due to effective multi-scale fusion, and MA-Net appears more sensitive to background clutter, as reflected in its broader error-map disagreement patterns.
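For readers unfamiliar with the pixel-level metrics named in the abstract, the following is a minimal sketch of how Accuracy, IoU, F1-score, Precision, and Recall can be derived from a predicted and a reference binary road mask, together with a simple error map that marks false positives and false negatives. It is an illustration under stated assumptions, not the authors' evaluation code: the function names (`confusion_counts`, `pixel_metrics`, `error_map`), the 0/1 mask encoding, the error-map codes, and the convention of returning 0.0 for an empty denominator are all hypothetical choices made here.

```python
from typing import Dict, List, Sequence

# Assumed encoding: a mask is a 2D grid of ints, 1 = road, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Dict[str, int]:
    """Count TP, FP, FN, TN over all pixels of two same-sized binary masks."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                counts["tp"] += 1
            elif p:
                counts["fp"] += 1
            elif t:
                counts["fn"] += 1
            else:
                counts["tn"] += 1
    return counts


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero (assumed convention)."""
    return num / den if den else 0.0


def pixel_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Compute the five pixel-level metrics for the road (positive) class."""
    c = confusion_counts(pred, truth)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + fn + tn),
        "iou": _safe_div(tp, tp + fp + fn),
        "f1": _safe_div(2 * tp, 2 * tp + fp + fn),
        "precision": _safe_div(tp, tp + fp),
        "recall": _safe_div(tp, tp + fn),
    }


def error_map(pred: Mask, truth: Mask) -> List[List[int]]:
    """Per-pixel disagreement map: 0 = agreement, 1 = false positive, 2 = false negative."""
    result = []
    for pred_row, truth_row in zip(pred, truth):
        row = []
        for p, t in zip(pred_row, truth_row):
            if p and not t:
                row.append(1)
            elif t and not p:
                row.append(2)
            else:
                row.append(0)
        result.append(row)
    return result


# Toy 3 x 3 example: one spurious road pixel and one missed road pixel.
toy_pred = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]
toy_truth = [[1, 0, 0], [0, 1, 0], [0, 1, 0]]
toy_metrics = pixel_metrics(toy_pred, toy_truth)
toy_errors = error_map(toy_pred, toy_truth)
```

In the toy example, the false negative at the bottom of the vertical road is the kind of error that breaks connectivity while barely affecting Accuracy, which is why overlap metrics and error maps are reported alongside it. Whether the study aggregates counts over the whole test set or averages per-tile scores is not stated in the abstract, so the sketch leaves that choice open.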