Architecture¶
This page explains how GeoMind works under the hood - without exposing internal code. Understanding the architecture helps you write better queries and troubleshoot issues.
Why GeoMind?¶
Traditional satellite imagery workflows are slow, expensive, and complex:
```
Full Scene Download  ->  Local Storage  ->  Manual Processing  ->  Result
~720 MB                  Disk I/O           Complex Code
10-30 min                Heavy Storage      Expert Knowledge Needed
```
GeoMind uses cloud-native data access to stream only the bytes you need:
```
HTTP Range Request  ->  Stream Chunks  ->  Process in Memory  ->  Result
~1-5 MB                 No disk            Fast
10-30 sec               Zero Storage       Plain English Query
```
**Key Benefits**
- Approximately 100x less data downloaded - stream only the chunks you need via Zarr
- No local storage required - everything is processed in memory
- Natural language interface - no coding or GIS expertise needed
- Free AI model - the default model (`nvidia/nemotron-3-nano-30b-a3b:free`) costs nothing
- Automatic geocoding - just say "Iceland" or "Central Park", no coordinates needed
- Multi-step reasoning - one query can trigger multiple tools automatically
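The streaming pattern behind these benefits boils down to HTTP range requests: ask the server for a byte span instead of the whole file. A minimal stdlib-only sketch (the URL and chunk path are placeholders, not real GeoMind endpoints):

```python
# Sketch of the cloud-native access pattern: instead of GET-ing a whole
# ~720 MB scene, request only the byte range covering one chunk.
# The URL below is a placeholder for illustration.
import urllib.request

def range_request(url: str, start: int, end: int) -> urllib.request.Request:
    """Build an HTTP request asking the server for bytes [start, end] only."""
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{end}")
    return req

req = range_request("https://example.com/scene.zarr/B04/0.0", 0, 1_048_575)
print(req.get_header("Range"))  # covers only the first 1 MiB of the object
```

A server that supports range requests replies with `206 Partial Content` and just that slice, which is what lets a whole workflow move ~1-5 MB instead of ~720 MB.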
How GeoMind Thinks¶
When you type a natural language query, GeoMind follows a structured decision loop:
```mermaid
flowchart TD
    A["You type a query"] --> B["LLM analyzes your intent"]
    B --> C{"Needs a tool call?"}
    C -->|Yes| D["Select appropriate tool"]
    D --> E["Execute tool with parameters"]
    E --> F["Process tool results"]
    F --> C
    C -->|No| G["Generate natural language response"]
    G --> H["Display result and open images"]
    style A fill:#009DD1,color:#fff
    style G fill:#007a8a,color:#fff
    style H fill:#0c4a6e,color:#fff
```
The key insight: the LLM can make multiple tool calls in sequence. A single query like "get me recent image of Scotland and its NDVI" triggers 3 tool calls automatically.
System Architecture¶
```mermaid
graph TD
    UserQuery["User Query (natural language)"]:::user
    UserQuery --> Agent
    subgraph GeoMind_Agent ["GeoMind Agent"]
        Agent["Function Calling Loop (max 10 iterations)"]
        LLM["OpenRouter API\nnvidia/nemotron-3-nano-30b-a3b:free\nUnderstands NL, Plans actions, Calls functions"]
        Agent <--> LLM
    end
    Agent -->|Tool Calls| Tools_Layer
    subgraph Tools_Layer ["Tools Layer"]
        Geocoding["Geocoding\ngeocoding_location\nget_bbox_from_location"]
        STAC["STAC Search\nsearch_imagery\nlist_recent_imagery\nget_item_details"]
        Processing["Processing\ncreate_rgb_composite\ncalculate_ndvi\nget_band_statistics"]
    end
    Geocoding --> OSM["Nominatim (OSM API)\nPlace name to coordinates"]
    STAC --> STAC_API["Sentinel-2 STAC API\nSearch by bbox, datetime, cloud cover"]
    Processing --> Zarr["Zarr Data (Cloud)\nvia HTTP/fsspec\nChunked access (~1-5 MB/band)"]
    Zarr --> Output["outputs/\nrgb_composite.png\nndvi.png"]
```
Data Flow - Step by Step¶
Here is what happens when you ask: "Create RGB for recent image of London"
Step 1: Geocoding¶
The agent converts "London" to geographic coordinates:
| Input | Output |
|---|---|
| `"London"` | `lat: 51.5074, lon: -0.1278` |
A bounding box is then computed around the point with a ~15 km buffer.
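As an illustration, such a buffer can be approximated with the ~111.32 km-per-degree rule (longitude degrees shrink with latitude). This is a hypothetical stand-in for the agent's internal bbox tool, so the exact numbers the agent produces may differ:

```python
import math

def bbox_from_point(lat: float, lon: float, buffer_km: float = 15.0) -> list:
    """Approximate [W, S, E, N] bbox around a point with a buffer in km.
    Illustrative only - not GeoMind's actual implementation."""
    dlat = buffer_km / 111.32                                   # km -> deg latitude
    dlon = buffer_km / (111.32 * math.cos(math.radians(lat)))   # wider at high latitude
    return [lon - dlon, lat - dlat, lon + dlon, lat + dlat]

bbox = bbox_from_point(51.5074, -0.1278)   # London
print([round(v, 2) for v in bbox])
```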
Step 2: STAC Catalog Search¶
The Sentinel-2 STAC catalog is queried for matching scenes:
```json
{
  "collection": "sentinel-2-l2a",
  "bbox": [-0.23, 51.41, -0.02, 51.60],
  "datetime": "2026-02-14/2026-02-28",
  "query": {"eo:cloud_cover": {"lt": 50}}
}
```
Response: a list of matching STAC items, each containing Zarr asset URLs pointing to cloud-hosted data.
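In Python, the same request body can be assembled as a dict; with pystac-client these parameters would be passed to a `Client.search(...)` call against the catalog endpoint. The live call is omitted here since it needs network access:

```python
# Assemble the Step 2 search parameters. With pystac-client, a dict like this
# would be passed to Client.open(<catalog-url>).search(**kwargs); the call
# itself is left out of this sketch because it requires a live endpoint.
def build_search_kwargs(bbox, start, end, max_cloud=50):
    """Mirror the JSON request body shown above."""
    return {
        "collections": ["sentinel-2-l2a"],
        "bbox": bbox,
        "datetime": f"{start}/{end}",            # STAC interval syntax
        "query": {"eo:cloud_cover": {"lt": max_cloud}},
    }

kwargs = build_search_kwargs([-0.23, 51.41, -0.02, 51.60],
                             "2026-02-14", "2026-02-28")
print(kwargs["datetime"])
```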
Step 3: Cloud-Native Data Access (Zarr)¶
Instead of downloading the full satellite scene (~720 MB), GeoMind uses HTTP range requests to stream only the specific chunks needed:
| | Traditional download | GeoMind (cloud-native) |
|---|---|---|
| Data transferred | ~720 MB per scene | ~1–5 MB (only needed chunks) |
| Wait time | 10–30 minutes | 10–30 seconds |
| Storage required | Full scene on disk | None - processed in memory |
Only the specific Zarr chunks for the requested bands are fetched via HTTP range requests - for example, a subset of /B04/, /B03/, /B02/ tiles rather than the entire scene.
Data is read directly from EODC object storage using fsspec + zarr. Nothing is written to your local disk.
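Why only a few chunks? A Zarr array is stored as a grid of chunk files keyed by index (e.g. `/B04/3.4`), so a small pixel window maps to a small set of keys. A stdlib-only sketch of that mapping (the 1024-pixel chunk size is an assumption for illustration):

```python
# Given a pixel window and the Zarr chunk size, list which chunk keys
# (row.col) intersect it - these are the only objects that get fetched.
def chunks_for_window(row_range, col_range, chunk=1024):
    """Return Zarr-style chunk keys covering a half-open pixel window."""
    rows = range(row_range[0] // chunk, (row_range[1] - 1) // chunk + 1)
    cols = range(col_range[0] // chunk, (col_range[1] - 1) // chunk + 1)
    return [f"{r}.{c}" for r in rows for c in cols]

# A ~1500x1500-pixel window over a 10980x10980 Sentinel-2 band touches
# only a 3x3 block of chunks, not the whole array:
keys = chunks_for_window((4000, 5500), (4000, 5500))
print(len(keys), keys)
```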
Step 4: Image Processing¶
Once band data is streamed, GeoMind processes it:
- Scale/offset correction: `Reflectance = (DN x 0.0001) + (-0.1)`
- Band stacking: B04 (Red) + B03 (Green) + B02 (Blue) to RGB array
- Percentile normalization: 2–98% stretch for optimal visual contrast
- Rendering: Matplotlib generates the PNG image
- Output: saved to `outputs/rgb_composite_XXXX.png`
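The first three steps can be sketched on synthetic data with numpy. This applies the documented scale/offset and a 2–98% stretch; in the real pipeline the DN arrays come from the Zarr store and the result goes to Matplotlib:

```python
import numpy as np

def to_reflectance(dn):
    """Apply the documented DN-to-reflectance conversion."""
    return dn * 0.0001 + (-0.1)

def stretch_2_98(band):
    """Clip a band to its 2nd-98th percentile range and rescale to [0, 1]."""
    lo, hi = np.percentile(band, [2, 98])
    return np.clip((band - lo) / (hi - lo), 0, 1)

rng = np.random.default_rng(0)
dn = rng.integers(1000, 12000, size=(3, 64, 64))   # fake B04, B03, B02 DNs
rgb = np.stack([stretch_2_98(to_reflectance(b)) for b in dn], axis=-1)
print(rgb.shape)   # height x width x 3, ready for plt.imshow / imsave
```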
The Agent Loop¶
The core of GeoMind is the function-calling loop - the same pattern used by modern AI assistants.
```mermaid
flowchart TD
    A["1. User message added to conversation history"]
    B["2. LLM receives full context + available tools"]
    C{"3. LLM decides"}
    D["Call a tool"]
    E["Execute tool"]
    F["Add result to history"]
    G["4. Generate final response"]
    H["Display result to user"]
    A --> B --> C
    C -->|Tool needed| D --> E --> F --> B
    C -->|No more tools| G --> H
    style A fill:#1e3a5f,color:#fff
    style H fill:#1e3a5f,color:#fff
```
**Max 10 Iterations**
The agent loop has a safety limit of 10 iterations per query. In practice, most queries resolve in 2–3 iterations.
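The loop's shape can be sketched offline with a stubbed "LLM" standing in for the OpenRouter call. The tool name and return values here are simplified stand-ins, not GeoMind's actual tool signatures:

```python
# Stripped-down function-calling loop. The real loop sends the history to
# OpenRouter and executes whichever tool the model selects; fake_llm and the
# geocode stub below are placeholders so this runs without network access.
MAX_ITERATIONS = 10   # safety cap from the docs

def fake_llm(history):
    """Stub: request a geocoding tool call first, then finish."""
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "geocode_location", "args": {"place": "London"}}
    return {"content": "Done: coordinates resolved."}

TOOLS = {"geocode_location": lambda place: {"lat": 51.5074, "lon": -0.1278}}

def agent_loop(query):
    history = [{"role": "user", "content": query}]
    for _ in range(MAX_ITERATIONS):
        decision = fake_llm(history)
        if "tool" in decision:                       # model chose a tool call
            result = TOOLS[decision["tool"]](**decision["args"])
            history.append({"role": "tool", "content": str(result)})
        else:                                        # no more tools: final answer
            return decision["content"], history
    return "Stopped at iteration limit.", history

answer, history = agent_loop("Where is London?")
print(answer)
```

Each tool result is appended to the history and fed back to the model, which is how a single query can chain geocoding, search, and processing calls.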
Technology Stack¶
| Layer | Technology | Purpose |
|---|---|---|
| AI Layer | OpenRouter API (OpenAI Python client) | LLM function calling and natural language understanding |
| | `nvidia/nemotron-3-nano-30b-a3b:free` (default) | Free, capable model for geospatial reasoning |
| Geocoding | geopy (Nominatim/OSM) | Convert place names to coordinates |
| STAC Catalog | pystac-client | Query Sentinel-2 imagery catalog |
| Data Access | zarr + fsspec + aiohttp + s3fs | Cloud-native HTTP range requests to stream chunked data |
| Processing | numpy | Band math, reflectance calculation, array operations |
| Visualization | matplotlib | Image rendering and NDVI colormaps |
| Data Framework | xarray + dask | Lazy loading and chunked computation |
| Configuration | python-dotenv | Environment variable management |
Installed Dependencies¶
Running pip install geomind-ai installs all required packages:
| Package | Purpose |
|---|---|
| `openai` | LLM function-calling via OpenRouter |
| `pystac-client` | STAC catalog search for Sentinel-2 |
| `xarray` + `zarr` + `dask` | Cloud-native chunked data access |
| `geopy` | Geocoding (place name to coordinates) |
| `fsspec` + `s3fs` + `aiohttp` | HTTP range requests for streaming |
| `matplotlib` + `numpy` | Image processing and rendering |
| `python-dotenv` | Environment variable management |
External APIs Used¶
| API | Endpoint | Purpose | Cost |
|---|---|---|---|
| OpenRouter | https://openrouter.ai/api/v1 | LLM inference (function calling) | Free (default model) |
| Nominatim | OpenStreetMap | Geocoding (place name to coordinates) | Free (rate-limited) |
| Sentinel-2 STAC API | - | Sentinel-2 catalog search | Free |
| Cloud Object Storage | - | Zarr data streaming (HTTP range requests) | Free |
**100% Free to Use**
GeoMind uses entirely free services by default. The default LLM model, Nominatim geocoding, STAC API, and EODC data access are all free of charge.
Configuration Defaults¶
| Setting | Default Value | Purpose |
|---|---|---|
| Default model | `nvidia/nemotron-3-nano-30b-a3b:free` | LLM for natural language processing |
| STAC collection | `sentinel-2-l2a` | Level-2A surface reflectance |
| Max cloud cover | 50% | Default cloud filter threshold |
| Buffer distance | 15 km | Bounding box buffer around geocoded points |
| Max search results | 20 | Maximum items returned per search |
| Search lookback | 14 days | Default time window for recent imagery |
| Output directory | `./outputs/` | Where generated images are saved |
| Reflectance scale | 0.0001 | DN-to-reflectance conversion factor |
| Reflectance offset | -0.1 | DN-to-reflectance offset |
| Config directory | `~/.geomind/` | Where the API key is stored |