Architecture

This page explains how GeoMind works under the hood - without exposing internal code. Understanding the architecture helps you write better queries and troubleshoot issues.


Why GeoMind?

Traditional satellite imagery workflows are slow, expensive, and complex:

Full Scene Download  ->  Local Storage  ->  Manual Processing  ->  Result
     ~720 MB              Disk I/O           Complex Code
    10-30 min          Heavy Storage     Expert Knowledge Needed

GeoMind uses cloud-native data access to stream only the bytes you need:

HTTP Range Request  ->  Stream Chunks  ->  Process in Memory  ->  Result
     ~1-5 MB             No disk             Fast
    10-30 sec         Zero Storage      Plain English Query

Key Benefits

  • Approximately 100x less data downloaded - stream only the chunks you need via Zarr
  • No local storage required - everything is processed in memory
  • Natural language interface - no coding or GIS expertise needed
  • Free AI model - default model (nvidia/nemotron-3-nano-30b-a3b:free) costs nothing
  • Automatic geocoding - just say "Iceland" or "Central Park", no coordinates needed
  • Multi-step reasoning - one query can trigger multiple tools automatically

How GeoMind Thinks

When you type a natural language query, GeoMind follows a structured decision loop:

flowchart TD
    A["You type a query"] --> B["LLM analyzes your intent"]
    B --> C{"Needs a tool call?"}
    C -->|Yes| D["Select appropriate tool"]
    D --> E["Execute tool with parameters"]
    E --> F["Process tool results"]
    F --> C
    C -->|No| G["Generate natural language response"]
    G --> H["Display result and open images"]

    style A fill:#009DD1,color:#fff
    style G fill:#007a8a,color:#fff
    style H fill:#0c4a6e,color:#fff

The key insight: the LLM can make multiple tool calls in sequence. A single query like "get me a recent image of Scotland and its NDVI" triggers three tool calls automatically.


System Architecture

graph TD
    UserQuery["User Query (natural language)"]:::user
    UserQuery --> Agent

    subgraph GeoMind_Agent ["GeoMind Agent"]
        Agent["Function Calling Loop (max 10 iterations)"]
        LLM["OpenRouter API\nnvidia/nemotron-3-nano-30b-a3b:free\nUnderstands NL, Plans actions, Calls functions"]
        Agent <--> LLM
    end

    Agent --> |Tool Calls| Tools

    subgraph Tools_Layer ["Tools Layer"]
        Geocoding["Geocoding\ngeocoding_location\nget_bbox_from_location"]
        STAC["STAC Search\nsearch_imagery\nlist_recent_imagery\nget_item_details"]
        Processing["Processing\ncreate_rgb_composite\ncalculate_ndvi\nget_band_statistics"]
    end

    Geocoding --> OSM["Nominatim (OSM API)\nPlace name to coordinates"]
    STAC --> STAC_API["Sentinel-2 STAC API\nSearch by bbox, datetime, cloud cover"]
    Processing --> Zarr["Zarr Data (Cloud)\nvia HTTP/fsspec\nChunked access (~1-5 MB/band)"]

    Zarr --> Output["outputs/\nrgb_composite.png\nndvi.png"]

Data Flow - Step by Step

Here is what happens when you ask: "Create RGB for recent image of London"

Step 1: Geocoding

The agent converts "London" to geographic coordinates:

Input      Output
"London"   lat: 51.5074, lon: -0.1278

Then a bounding box is computed with a ~15 km buffer:

bbox: [-0.23, 51.41, -0.02, 51.60]
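The buffer step can be sketched with plain geometry. This is an illustrative approximation, not GeoMind's internal code, so the numbers will not exactly match the bbox above (the exact geocoded point and buffer math are internal):

```python
import math

def bbox_from_point(lat, lon, buffer_km=15.0):
    """Compute a [min_lon, min_lat, max_lon, max_lat] bounding box around a
    point, using a simple equirectangular approximation (illustrative only)."""
    dlat = buffer_km / 111.0                                   # ~111 km per degree of latitude
    dlon = buffer_km / (111.0 * math.cos(math.radians(lat)))   # degrees of longitude shrink with latitude
    return [round(lon - dlon, 2), round(lat - dlat, 2),
            round(lon + dlon, 2), round(lat + dlat, 2)]

# London: lat 51.5074, lon -0.1278
print(bbox_from_point(51.5074, -0.1278))
```

Note that longitude degrees get shorter toward the poles, which is why the east-west buffer is divided by the cosine of the latitude.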

Step 2: STAC Search

The Sentinel-2 STAC catalog is queried for matching scenes:

{
    "collection": "sentinel-2-l2a",
    "bbox": [-0.23, 51.41, -0.02, 51.60],
    "datetime": "2026-02-14/2026-02-28",
    "query": {"eo:cloud_cover": {"lt": 50}}
}

Response: a list of matching STAC items, each containing Zarr asset URLs pointing to cloud-hosted data.
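A request like the one above could be assembled as follows. The helper function is a hypothetical sketch; with pystac-client, the resulting dict maps onto `Client.search(**params)` (the catalog URL itself is not shown here):

```python
def build_search_params(bbox, start, end, max_cloud=50):
    """Assemble STAC API search parameters mirroring the JSON request above."""
    return {
        "collections": ["sentinel-2-l2a"],
        "bbox": bbox,
        "datetime": f"{start}/{end}",          # ISO interval, e.g. "2026-02-14/2026-02-28"
        "query": {"eo:cloud_cover": {"lt": max_cloud}},
    }

params = build_search_params([-0.23, 51.41, -0.02, 51.60],
                             "2026-02-14", "2026-02-28")
# With pystac-client, this dict could then be passed as:
#   Client.open(STAC_URL).search(**params)   # STAC_URL is an assumption
```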

Step 3: Cloud-Native Data Access (Zarr)

Instead of downloading the full satellite scene (~720 MB), GeoMind uses HTTP range requests to stream only the specific chunks needed:

Traditional full-scene download:
  Data transferred    ~720 MB per scene
  Wait time           10–30 minutes
  Storage required    Full scene on disk

GeoMind cloud-native streaming:
  Data transferred    ~1–5 MB (only needed chunks)
  Wait time           10–30 seconds
  Storage required    None - processed in memory

Only the specific Zarr chunks for the requested bands are fetched via HTTP range requests - for example, a subset of /B04/, /B03/, /B02/ tiles rather than the entire scene.

Data is read directly from EODC object storage using fsspec + zarr. Nothing is written to your local disk.
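A rough illustration of why chunked access is so cheap: only the tiles that intersect the requested window are fetched. The 1024 x 1024 chunk size below is an assumption for illustration (actual Sentinel-2 Zarr chunking may differ); at uint16, one such chunk is ~2 MB uncompressed:

```python
def chunks_for_window(row0, row1, col0, col1, chunk=1024):
    """Return the set of (chunk_row, chunk_col) tiles a pixel window touches
    in a 2-D array stored as chunk x chunk tiles."""
    return {(r, c)
            for r in range(row0 // chunk, (row1 - 1) // chunk + 1)
            for c in range(col0 // chunk, (col1 - 1) // chunk + 1)}

# A 10980 x 10980 Sentinel-2 band in 1024 x 1024 tiles has 121 chunks;
# a 2000 x 2000 pixel window touches only 9 of them.
touched = chunks_for_window(4000, 6000, 4000, 6000)
print(len(touched))  # number of chunks actually fetched
```

Each untouched chunk is never requested at all, which is what keeps the transfer in the ~1–5 MB range instead of ~720 MB.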

Step 4: Image Processing

Once band data is streamed, GeoMind processes it:

  1. Scale/Offset correction: Reflectance = (DN x 0.0001) + (-0.1)
  2. Band stacking: B04 (Red) + B03 (Green) + B02 (Blue) to RGB array
  3. Percentile normalization: 2-98% stretch for optimal visual contrast
  4. Rendering: Matplotlib generates the PNG image
  5. Output: Saved to outputs/rgb_composite_XXXX.png
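Steps 1–3 above can be sketched in numpy. This is a minimal illustration with random data standing in for real band chunks, not GeoMind's actual processing code:

```python
import numpy as np

def to_reflectance(dn, scale=0.0001, offset=-0.1):
    """Step 1: digital numbers -> surface reflectance."""
    return dn.astype("float32") * scale + offset

def percentile_stretch(band, lo=2, hi=98):
    """Step 3: clip to the 2-98% range and rescale to 0..1 for display."""
    p_lo, p_hi = np.percentile(band, [lo, hi])
    return np.clip((band - p_lo) / (p_hi - p_lo), 0, 1)

# Step 2: stack B04 (Red), B03 (Green), B02 (Blue) into an H x W x 3 array.
# Random data stands in for streamed Zarr chunks here.
rng = np.random.default_rng(0)
b04, b03, b02 = (rng.integers(1000, 4000, (64, 64)) for _ in range(3))
rgb = np.dstack([percentile_stretch(to_reflectance(b)) for b in (b04, b03, b02)])

# Steps 4-5 would hand `rgb` to matplotlib, e.g.:
#   plt.imsave("outputs/rgb_composite.png", rgb)
```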

The Agent Loop

The core of GeoMind is the function-calling loop - the same pattern used by modern AI assistants.

flowchart TD
    A["1. User message added to conversation history"]
    B["2. LLM receives full context + available tools"]
    C{"3. LLM decides"}
    D["Call a tool"]
    E["Execute tool"]
    F["Add result to history"]
    G["4. Generate final response"]
    H["Display result to user"]

    A --> B --> C
    C -->|Tool needed| D --> E --> F --> B
    C -->|No more tools| G --> H

    style A fill:#1e3a5f,color:#fff
    style H fill:#1e3a5f,color:#fff

Max 10 Iterations

The agent loop has a safety limit of 10 iterations per query. In practice, most queries resolve in 2–3 iterations.
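The loop above can be sketched in a few lines. The names and message shapes here are illustrative, not GeoMind's actual API; the point is the shape of the control flow, including the iteration cap:

```python
MAX_ITERATIONS = 10  # safety limit from the loop above

def run_agent(llm, tools, query):
    """Minimal function-calling loop sketch. `llm(history)` returns either
    ("tool", name, args) or ("final", text); `tools` maps names to callables."""
    history = [{"role": "user", "content": query}]
    for _ in range(MAX_ITERATIONS):
        kind, *payload = llm(history)
        if kind == "final":
            return payload[0]                  # no more tools needed
        name, args = payload
        result = tools[name](**args)           # execute the chosen tool
        history.append({"role": "tool", "name": name, "content": result})
    return "Stopped: iteration limit reached"

# Stub LLM: call the geocode tool once, then answer.
def stub_llm(history):
    if any(m["role"] == "tool" for m in history):
        return ("final", "London is at 51.51, -0.13")
    return ("tool", "geocode", {"place": "London"})

tools = {"geocode": lambda place: "51.51, -0.13"}
print(run_agent(stub_llm, tools, "Where is London?"))
```

Each tool result is appended to the history, so the LLM sees everything it has learned so far when deciding the next step.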


Technology Stack

Layer           Technology                                     Purpose
AI Layer        OpenRouter API (OpenAI Python Client)          LLM function calling and natural language understanding
                nvidia/nemotron-3-nano-30b-a3b:free (default)  Free, capable model for geospatial reasoning
Geocoding       geopy (Nominatim/OSM)                          Convert place names to coordinates
STAC Catalog    pystac-client                                  Query Sentinel-2 imagery catalog
Data Access     zarr + fsspec + aiohttp + s3fs                 Cloud-native HTTP range requests to stream chunked data
Processing      numpy                                          Band math, reflectance calculation, array operations
Visualization   matplotlib                                     Image rendering and NDVI colormaps
Data Framework  xarray + dask                                  Lazy loading and chunked computation
Configuration   python-dotenv                                  Environment variable management

Installed Dependencies

Running pip install geomind-ai installs all required packages:

Package                  Purpose
openai                   LLM function-calling via OpenRouter
pystac-client            STAC catalog search for Sentinel-2
xarray + zarr + dask     Cloud-native chunked data access
geopy                    Geocoding (place name to coordinates)
fsspec + s3fs + aiohttp  HTTP range requests for streaming
matplotlib + numpy       Image processing and rendering
python-dotenv            Environment variable management

External APIs Used

API                   Endpoint                      Purpose                                    Cost
OpenRouter            https://openrouter.ai/api/v1  LLM inference (function calling)           Free (default model)
Nominatim             OpenStreetMap                 Geocoding (place name to coordinates)      Free (rate-limited)
Sentinel-2 STAC API   -                             Sentinel-2 catalog search                  Free
Cloud Object Storage  -                             Zarr data streaming (HTTP range requests)  Free

100% Free to Use

GeoMind uses entirely free services by default. The default LLM model, Nominatim geocoding, STAC API, and EODC data access are all free of charge.


Configuration Defaults

Setting             Default Value                        Purpose
Default model       nvidia/nemotron-3-nano-30b-a3b:free  LLM for natural language processing
STAC Collection     sentinel-2-l2a                       Level-2A surface reflectance
Max cloud cover     50%                                  Default cloud filter threshold
Buffer distance     15 km                                Bounding box buffer around geocoded points
Max search results  20                                   Maximum items returned per search
Search lookback     14 days                              Default time window for recent imagery
Output directory    ./outputs/                           Where generated images are saved
Reflectance scale   0.0001                               DN to reflectance conversion factor
Reflectance offset  -0.1                                 DN to reflectance offset
Config directory    ~/.geomind/                          Where the API key is stored
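Defaults like these are commonly exposed with environment-variable overrides. The sketch below shows one such pattern; the `GEOMIND_*` variable names are hypothetical, not GeoMind's documented configuration keys:

```python
import os
from pathlib import Path

# Defaults mirroring the table above. The env-var naming scheme below
# (GEOMIND_<SETTING>) is an illustrative assumption.
DEFAULTS = {
    "model": "nvidia/nemotron-3-nano-30b-a3b:free",
    "collection": "sentinel-2-l2a",
    "max_cloud_cover": 50,
    "buffer_km": 15,
    "max_results": 20,
    "lookback_days": 14,
    "output_dir": Path("./outputs"),
    "config_dir": Path.home() / ".geomind",
    "reflectance_scale": 0.0001,
    "reflectance_offset": -0.1,
}

def setting(name, cast=str):
    """Read a default, allowing an environment override (hypothetical scheme)."""
    env = os.environ.get(f"GEOMIND_{name.upper()}")
    return cast(env) if env is not None else DEFAULTS[name]
```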