Tree / Orchard Detection & Counting from High-Resolution Satellite Images
Summary
This project presents an end-to-end geospatial machine learning pipeline that detects and counts individual trees and orchards in very high-resolution multispectral satellite imagery. The pipeline reads raw 4-band GeoTIFF data (Blue, Green, Red, and NIR) and trains a custom PyTorch U-Net with density map regression. Instead of predicting bounding boxes as a traditional object detector would, it convolves point annotations with Gaussian kernels to produce smooth density maps; summing the pixels of a predicted map gives the tree count, including in densely planted orchards where individual crowns are hard to box. The framework supports tile slicing (512×512 px), automated conversion of VOC XML annotations to point masks, and preservation of full-image spatial metadata, for use in environmental monitoring and sustainable agricultural planning.
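The VOC XML-to-point conversion mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual converter: the function name `voc_boxes_to_points` is hypothetical, and it assumes each tree is annotated as a standard Pascal VOC `<object>` with a `<bndbox>` whose center is taken as the point annotation.

```python
import xml.etree.ElementTree as ET


def voc_boxes_to_points(xml_text):
    """Return (row, col) box centers from a Pascal VOC annotation string.

    Assumption: the point for each tree is the center of its bounding box.
    """
    root = ET.fromstring(xml_text)
    points = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        if box is None:
            continue  # skip objects without a bounding box
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))
        # VOC stores x (column) and y (row); return (row, col) for array indexing.
        points.append(((ymin + ymax) / 2.0, (xmin + xmax) / 2.0))
    return points
```

The returned `(row, col)` pairs can then be rasterized into a point mask of the same size as the tile.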
Model Architecture and Training
Approach - Density Map Regression
Instead of traditional object detection, we use density map regression. Each point annotation is convolved with a Gaussian kernel (sigma = 2) to produce a smooth density map. The integral (pixel sum) of the predicted density map directly gives the tree count.
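A minimal NumPy sketch of building such a density map is shown here. The function name `gaussian_density_map` is hypothetical, and two details are assumptions rather than the project's confirmed behavior: sigma is interpreted in pixels, and each Gaussian is renormalized inside the tile so that every annotated tree contributes exactly 1 to the sum, even near the border.

```python
import numpy as np


def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map from (row, col) point annotations.

    Each in-bounds point adds a Gaussian that sums to 1 within the tile,
    so density.sum() equals the number of in-bounds points.
    """
    density = np.zeros((height, width), dtype=np.float64)
    radius = int(np.ceil(3 * sigma))  # truncate the kernel at 3 sigma
    for r, c in points:
        r, c = int(round(r)), int(round(c))
        if not (0 <= r < height and 0 <= c < width):
            continue  # skip annotations that fall outside the tile
        r0, r1 = max(r - radius, 0), min(r + radius + 1, height)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, width)
        rr, cc = np.mgrid[r0:r1, c0:c1]
        kernel = np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma ** 2))
        density[r0:r1, c0:c1] += kernel / kernel.sum()  # renormalize the clipped kernel
    return density
```

For example, `gaussian_density_map([(10, 10), (30, 40)], 64, 64).sum()` is 2 up to floating-point error, which is the property the counting step relies on.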
U-Net Architecture
A lightweight 3-level U-Net built in PyTorch:
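As a rough illustration only, a U-Net of this kind might look like the sketch below. The class name `SmallUNet`, the channel widths, the use of batch normalization, and the final ReLU are all assumptions, and "3-level" is read here as three resolution levels (two encoder stages plus a bottleneck). The only details taken from the text are the 4-band input and the single-channel density output.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class SmallUNet(nn.Module):
    """Hypothetical 3-level U-Net: 4-band input, 1-channel density output."""

    def __init__(self, in_channels=4, base=16):
        super().__init__()
        self.enc1 = conv_block(in_channels, base)
        self.enc2 = conv_block(base, base * 2)
        self.bottleneck = conv_block(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        self.head = nn.Conv2d(base, 1, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                    # full resolution
        e2 = self.enc2(self.pool(e1))        # 1/2 resolution
        b = self.bottleneck(self.pool(e2))   # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection from e2
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection from e1
        return torch.relu(self.head(d1))     # densities are non-negative
```

With two pooling steps, the input height and width need to be divisible by 4, which holds for 512×512 tiles.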

Training Sample Visualization
Below is a sample from the training set showing the RGB image, ground-truth mask, and the corresponding Gaussian density map:

Training Curves

Predictions and Results
Prediction on Train Data

Prediction on Test Data

Each prediction panel shows:
- Original RGB - the input satellite tile rendered as true-color.
- Predicted Density Map - heatmap whose pixel sum gives the predicted tree count.
- Overlay - predicted density heatmap blended on top of the original image, highlighting detected tree locations.
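The three panels can be approximated with a few lines of NumPy. This is a sketch under stated assumptions, not the notebooks' plotting code: the helper names `to_true_color` and `overlay_density` are hypothetical, the bands are assumed to be stored in Blue, Green, Red, NIR order as listed in the summary, and the percentile stretch and red heat color are arbitrary display choices.

```python
import numpy as np


def to_true_color(tile, low=2, high=98):
    """Convert a (4, H, W) tile in B, G, R, NIR order to (H, W, 3) RGB in [0, 1]."""
    rgb = tile[[2, 1, 0]].astype(np.float64)          # reorder B, G, R -> R, G, B
    lo, hi = np.percentile(rgb, [low, high])          # percentile contrast stretch
    rgb = np.clip((rgb - lo) / max(hi - lo, 1e-12), 0.0, 1.0)
    return np.transpose(rgb, (1, 2, 0))


def overlay_density(rgb, density, alpha=0.5):
    """Blend a red heat layer onto the RGB image, stronger where density is high."""
    peak = density.max()
    norm = density / peak if peak > 0 else np.zeros_like(density)
    heat = np.zeros_like(rgb)
    heat[..., 0] = norm                                # red channel carries the density
    weight = alpha * norm[..., None]                   # per-pixel blend weight
    return (1.0 - weight) * rgb + weight * heat
```

The predicted count shown with the density panel is simply `float(density.sum())`.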
Notebooks
Kaggle - tree-orchard-detection
| Notebook | Description |
|---|---|
| code.ipynb | Complete end-to-end workflow - data loading, model definition, training, evaluation, and visualization |
| inference.ipynb | Load the pre-trained best_model.pth and run inference on any GeoTIFF - no training required |
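Full-image inference over 512×512 tiles could be organized along the lines of the sketch below. It is illustrative only: `predict_full_image` and `predict_fn` are hypothetical names, the tiles are assumed to be non-overlapping with zero padding at the right and bottom edges, and reading the GeoTIFF and loading `best_model.pth` are left out. In practice `predict_fn` would wrap the trained model.

```python
import numpy as np


def predict_full_image(image, predict_fn, tile=512):
    """Run a per-tile predictor over a (C, H, W) image and stitch the density map.

    predict_fn maps a (C, tile, tile) array to a (tile, tile) density map.
    Returns the (H, W) density map and the total count (its pixel sum).
    """
    _, h, w = image.shape
    pad_h = (-h) % tile  # rows needed to reach a multiple of the tile size
    pad_w = (-w) % tile
    padded = np.pad(image, ((0, 0), (0, pad_h), (0, pad_w)), mode="constant")
    density = np.zeros((h + pad_h, w + pad_w), dtype=np.float64)
    for r in range(0, h + pad_h, tile):
        for c in range(0, w + pad_w, tile):
            density[r:r + tile, c:c + tile] = predict_fn(padded[:, r:r + tile, c:c + tile])
    density = density[:h, :w]  # drop the padded margin before counting
    return density, float(density.sum())
```

Because the count is a sum, a tree whose density blob straddles a tile boundary still contributes its parts to both tiles, although predictions near tile edges may be less reliable than in the interior.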
