Deep Residual U-Net for Building Footprint Extraction

Summary

This project is a reproducible, end-to-end PyTorch pipeline for extracting building footprints from high-resolution satellite imagery. The pipeline is evaluated on the SpaceNet Rio de Janeiro dataset. It converts GeoJSON building polygons into raster masks, trains a residual U-Net for semantic segmentation, and runs inference on individual image tiles as well as large satellite mosaics. The entire pipeline is driven by a single YAML configuration file that controls dataset splits, learning rate schedules, multi-loss weights, and sliding-window mosaic inference settings.
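
The sliding-window mosaic inference mentioned in the summary can be illustrated with a minimal NumPy sketch. This is not the project's implementation: the function names `window_starts` and `predict_mosaic`, the default `tile` and `stride` values, and the choice to average overlapping predictions are all assumptions made for illustration. The `predict_tile` callable stands in for a trained model that maps an image patch to per-pixel building probabilities.

```python
import numpy as np


def window_starts(size, tile, stride):
    """Return start offsets so that windows of length `tile` cover [0, size)."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        # Add a final window flush with the far edge so no pixels are missed.
        starts.append(size - tile)
    return starts


def predict_mosaic(image, predict_tile, tile=256, stride=128):
    """Run `predict_tile` over overlapping windows and average the overlaps.

    image:        (H, W, C) array.
    predict_tile: hypothetical callable mapping an (h, w, C) patch to an
                  (h, w) array of probabilities.
    """
    if stride <= 0 or stride > tile:
        raise ValueError("stride must be in the range 1..tile")
    height, width = image.shape[:2]
    prob_sum = np.zeros((height, width), dtype=np.float64)
    counts = np.zeros((height, width), dtype=np.float64)
    for top in window_starts(height, tile, stride):
        for left in window_starts(width, tile, stride):
            patch = image[top:top + tile, left:left + tile]
            h, w = patch.shape[:2]
            prob_sum[top:top + h, left:left + w] += predict_tile(patch)
            counts[top:top + h, left:left + w] += 1.0
    # Every pixel is covered at least once, so counts is never zero.
    return prob_sum / counts


if __name__ == "__main__":
    mosaic = np.random.rand(300, 500, 3)
    # Stand-in "model": the channel mean of each patch.
    probs = predict_mosaic(mosaic, lambda p: p.mean(axis=2), tile=128, stride=64)
    print(probs.shape)
```

Averaging overlapping windows is one common way to soften seams at tile borders. A real pipeline might instead weight tile centers more heavily or crop tile margins before stitching.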



Example Predictions

[Figure: Input image and ground truth building mask]

[Figure: Model prediction and building mask, shown as segmentation output and heatmap]


Model Architecture

The segmentation model is a Residual U-Net. The encoder extracts multi-scale spatial features, the decoder reconstructs a full-resolution mask, and skip connections preserve fine building boundaries.
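
To make the encoder, decoder, and skip-connection roles concrete, here is a deliberately small PyTorch sketch of a residual U-Net. It is not the project's actual network: the class names `ResidualBlock` and `TinyResUNet`, the two-level depth, the base width of 16 channels, and the use of batch normalization and transposed convolutions are assumptions chosen to keep the example short.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with an additive shortcut."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # A 1x1 convolution matches channel counts when they differ.
        if in_ch == out_ch:
            self.shortcut = nn.Identity()
        else:
            self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.shortcut(x))


class TinyResUNet(nn.Module):
    """Two-level residual U-Net producing one logit per pixel (illustrative)."""

    def __init__(self, in_ch=3, base=16):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        self.enc1 = ResidualBlock(in_ch, base)
        self.enc2 = ResidualBlock(base, base * 2)
        self.bottleneck = ResidualBlock(base * 2, base * 4)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = ResidualBlock(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = ResidualBlock(base * 2, base)
        self.head = nn.Conv2d(base, 1, kernel_size=1)

    def forward(self, x):
        s1 = self.enc1(x)                    # full resolution
        s2 = self.enc2(self.pool(s1))        # 1/2 resolution
        b = self.bottleneck(self.pool(s2))   # 1/4 resolution
        # Skip connections: concatenate encoder features with upsampled ones.
        d2 = self.dec2(torch.cat([self.up2(b), s2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), s1], dim=1))
        return self.head(d1)                 # raw logits; apply sigmoid for probabilities


if __name__ == "__main__":
    model = TinyResUNet().eval()
    with torch.no_grad():
        logits = model(torch.zeros(1, 3, 64, 64))
    print(tuple(logits.shape))
```

With two pooling steps, input height and width must be divisible by 4 so that the upsampled decoder features line up with the encoder features they are concatenated with.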

[Figure: Residual U-Net architecture]

Notebooks