Deep Residual U-Net for Building Footprint Extraction

Summary

This project is a reproducible, end-to-end PyTorch pipeline for extracting building footprints from high-resolution satellite imagery, evaluated on the SpaceNet Rio de Janeiro dataset. It converts GeoJSON building polygons into raster masks, trains a residual U-Net for semantic segmentation, and runs inference on individual image tiles as well as large satellite mosaics. The whole pipeline is driven by a single YAML configuration file, which controls everything from dataset splits and learning rate schedules to multi-loss weights and sliding-window mosaic inference settings.
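To make the polygon-to-mask step concrete, here is a minimal pure-Python sketch of scanline rasterization. It is not the project's implementation: the function name `rasterize_polygon` is hypothetical, and it assumes the polygon vertices have already been converted from geographic to pixel coordinates. A real pipeline would more likely rely on a geospatial library for this step.

```python
def rasterize_polygon(polygon, height, width):
    """Return a height x width mask (nested lists of 0/1) for one polygon.

    polygon: list of (x, y) vertices in pixel coordinates (assumption: the
    geographic-to-pixel conversion has already happened). The ring may be
    open or closed (first vertex repeated at the end).
    A pixel is set to 1 when its center lies inside the polygon (even-odd rule).
    """
    mask = [[0] * width for _ in range(height)]
    n = len(polygon)
    for row in range(height):
        yc = row + 0.5  # y coordinate of this row's pixel centers
        # Collect the x positions where polygon edges cross the scanline.
        crossings = []
        for i in range(n):
            (x0, y0), (x1, y1) = polygon[i], polygon[(i + 1) % n]
            if (y0 <= yc) != (y1 <= yc):  # edge straddles the scanline
                t = (yc - y0) / (y1 - y0)
                crossings.append(x0 + t * (x1 - x0))
        crossings.sort()
        # Fill pixels whose centers fall between successive pairs of crossings.
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    mask[row][col] = 1
    return mask


# A 3 x 2 pixel rectangle inside a 4 x 6 tile.
mask = rasterize_polygon([(1, 1), (4, 1), (4, 3), (1, 3)], height=4, width=6)
print(sum(map(sum, mask)))  # 6
```

Sampling at pixel centers keeps adjacent buildings from bleeding into each other along shared edges, at the cost of dropping slivers narrower than one pixel.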

Example Predictions

Input Image with Ground Truth Building Mask

Figure: Input image and ground truth building mask.

Model Segmentation Output & Heatmap

Figure: Model prediction and building mask.

Model Architecture

The segmentation model is a Residual U-Net. The encoder extracts multi-scale spatial features, the decoder reconstructs a full-resolution mask, and skip connections between matching encoder and decoder stages preserve fine building boundaries.

Figure: Residual U-Net architecture.
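The encoder, decoder, and skip-connection structure described above can be sketched in PyTorch as follows. This is a deliberately small illustration, not the project's model: the class names `ResidualBlock` and `TinyResUNet`, the depth (two encoder stages), the channel widths, and the use of batch normalization and transposed convolutions are all assumptions.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with an additive shortcut around them."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # A 1x1 convolution matches channel counts when they differ.
        self.shortcut = (
            nn.Identity() if in_ch == out_ch
            else nn.Conv2d(in_ch, out_ch, 1, bias=False)
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.shortcut(x))


class TinyResUNet(nn.Module):
    """Two-level residual U-Net; input height/width must be divisible by 4."""

    def __init__(self, in_ch=3, base=16):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        self.enc1 = ResidualBlock(in_ch, base)
        self.enc2 = ResidualBlock(base, base * 2)
        self.bottleneck = ResidualBlock(base * 2, base * 4)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.dec2 = ResidualBlock(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = ResidualBlock(base * 2, base)
        self.head = nn.Conv2d(base, 1, 1)  # one logit per pixel

    def forward(self, x):
        e1 = self.enc1(x)                   # full resolution
        e2 = self.enc2(self.pool(e1))       # 1/2 resolution
        b = self.bottleneck(self.pool(e2))  # 1/4 resolution
        # Skip connections: concatenate encoder features with upsampled ones.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                # raw logits, not probabilities


model = TinyResUNet().eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 1, 64, 64])
```

The model returns logits; applying `torch.sigmoid` and thresholding would give a binary building mask.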

Notebooks

building_mask.ipynb - Rasterization of building footprints into masks.
model.ipynb - Testing and validation of the U-Net architecture.
truth_coords.ipynb - Alignment of GeoJSON geographic coordinates.
segmentation_on_test.ipynb - Prediction analysis on the held-out test set.
segmentation_on_mosaic.ipynb - Sliding-window inference over large satellite mosaics.
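
As an illustration of the sliding-window idea behind segmentation_on_mosaic.ipynb, the sketch below computes overlapping tile boxes that cover a mosaic of arbitrary size. The function names and the default tile size and stride are hypothetical; the project's actual values come from its YAML configuration and are not reproduced here.

```python
def window_starts(length, tile, stride):
    """Start offsets of windows of size `tile` that cover [0, length)."""
    if length <= tile:
        return [0]  # a single (possibly short) window covers everything
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)  # extra window flush with the far edge
    return starts


def sliding_windows(height, width, tile=256, stride=128):
    """Yield (top, left, bottom, right) boxes covering a height x width mosaic."""
    for top in window_starts(height, tile, stride):
        for left in window_starts(width, tile, stride):
            yield top, left, min(top + tile, height), min(left + tile, width)


print(window_starts(600, 256, 128))  # [0, 128, 256, 344]
```

Each box would be cropped from the mosaic, passed through the model, and written back into a full-size prediction array. With a stride smaller than the tile size the windows overlap, and a common choice is to average the overlapping predictions to soften seams at tile borders; whether this project does so is not stated here.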

Harsh Shinde