Architecture

This page explains how GeoMind works under the hood - without exposing internal code. Understanding the architecture helps you write better queries and troubleshoot issues.


Why GeoMind?

Traditional satellite imagery workflows are slow, expensive, and complex:

Full Scene Download  ->  Local Storage  ->  Manual Processing  ->  Result
     ~720 MB              Disk I/O           Complex Code
    10-30 min          Heavy Storage     Expert Knowledge Needed

GeoMind uses cloud-native data access to stream only the bytes you need:

HTTP Range Request  ->  Stream Chunks  ->  Process in Memory  ->  Result
     ~1-5 MB             No disk             Fast
    10-30 sec         Zero Storage      Plain English Query

Key Benefits

  • Approximately 100x less data downloaded - stream only the chunks you need via Zarr
  • No local storage required - everything is processed in memory
  • Natural language interface - no coding or GIS expertise needed
  • Free AI model - default model (nvidia/nemotron-3-nano-30b-a3b:free) costs nothing
  • Automatic geocoding - just say "Iceland" or "Central Park", no coordinates needed
  • Multi-step reasoning - one query can trigger multiple tools automatically

How GeoMind Thinks

When you type a natural language query, GeoMind follows a structured decision loop:

flowchart TD
    A["You type a query"] --> B["LLM analyzes your intent"]
    B --> C{"Needs a tool call?"}
    C -->|Yes| D["Select appropriate tool"]
    D --> E["Execute tool with parameters"]
    E --> F["Process tool results"]
    F --> C
    C -->|No| G["Generate natural language response"]
    G --> H["Display result and open images"]

    style A fill:#009DD1,color:#fff
    style G fill:#007a8a,color:#fff
    style H fill:#0c4a6e,color:#fff

The key insight: the LLM can make multiple tool calls in sequence. A single query like "get me a recent image of Scotland and its NDVI" triggers three tool calls automatically.


System Architecture

graph TD
    UserQuery["User Query (natural language)"]:::user
    UserQuery --> Agent

    subgraph GeoMind_Agent ["GeoMind Agent"]
        Agent["Function Calling Loop (max 10 iterations)"]
        LLM["OpenRouter API\nnvidia/nemotron-3-nano-30b-a3b:free\nUnderstands NL, Plans actions, Calls functions"]
        Agent <--> LLM
    end

    Agent --> |Tool Calls| Tools

    subgraph Tools_Layer ["Tools Layer"]
        Geocoding["Geocoding\ngeocoding_location\nget_bbox_from_location"]
        STAC["STAC Search\nsearch_imagery\nlist_recent_imagery\nget_item_details"]
        Processing["Processing\ncreate_rgb_composite\ncalculate_ndvi\nget_band_statistics"]
    end

    Geocoding --> OSM["Nominatim (OSM API)\nPlace name to coordinates"]
    STAC --> STAC_API["Sentinel-2 STAC API\nSearch by bbox, datetime, cloud cover"]
    Processing --> Zarr["Zarr Data (Cloud)\nvia HTTP/fsspec\nChunked access (~1-5 MB/band)"]

    Zarr --> Output["outputs/\nrgb_composite.png\nndvi.png"]

Data Flow - Step by Step

Here is what happens when you ask: "Create RGB for recent image of London"

Step 1: Geocoding

The agent converts "London" to geographic coordinates:

Input      Output
"London"   lat: 51.5074, lon: -0.1278

Then a bounding box is computed with a ~15 km buffer:

bbox: [-0.23, 51.41, -0.02, 51.60]
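The buffer step can be sketched with plain geometry. This is an illustrative approximation, not GeoMind's internal code, so the numbers will not exactly match the bbox above (the exact geocoded point and buffer math are internal):

```python
import math

def bbox_from_point(lat, lon, buffer_km=15.0):
    """Compute a [min_lon, min_lat, max_lon, max_lat] bounding box around a
    point, using a simple equirectangular approximation (illustrative only)."""
    dlat = buffer_km / 111.0                                   # ~111 km per degree of latitude
    dlon = buffer_km / (111.0 * math.cos(math.radians(lat)))   # degrees of longitude shrink with latitude
    return [round(lon - dlon, 2), round(lat - dlat, 2),
            round(lon + dlon, 2), round(lat + dlat, 2)]

# London: lat 51.5074, lon -0.1278
print(bbox_from_point(51.5074, -0.1278))
```

Note that longitude degrees get shorter toward the poles, which is why the east-west buffer is divided by the cosine of the latitude.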

Step 2: STAC Search

The Sentinel-2 STAC catalog is queried for matching scenes:

{
    "collection": "sentinel-2-l2a",
    "bbox": [-0.23, 51.41, -0.02, 51.60],
    "datetime": "2026-02-14/2026-02-28",
    "query": {"eo:cloud_cover": {"lt": 50}}
}

Response: a list of matching STAC items, each containing Zarr asset URLs pointing to cloud-hosted data.
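A request like the one above could be assembled as follows. The helper function is a hypothetical sketch; with pystac-client, the resulting dict maps onto `Client.search(**params)` (the catalog URL itself is not shown here):

```python
def build_search_params(bbox, start, end, max_cloud=50):
    """Assemble STAC API search parameters mirroring the JSON request above."""
    return {
        "collections": ["sentinel-2-l2a"],
        "bbox": bbox,
        "datetime": f"{start}/{end}",          # ISO interval, e.g. "2026-02-14/2026-02-28"
        "query": {"eo:cloud_cover": {"lt": max_cloud}},
    }

params = build_search_params([-0.23, 51.41, -0.02, 51.60],
                             "2026-02-14", "2026-02-28")
# With pystac-client, this dict could then be passed as:
#   Client.open(STAC_URL).search(**params)   # STAC_URL is an assumption
```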

Step 3: Cloud-Native Data Access (Zarr)

Instead of downloading the full satellite scene (~720 MB), GeoMind uses HTTP range requests to stream only the specific chunks needed:

Traditional full-scene download:
  Data transferred    ~720 MB per scene
  Wait time           10–30 minutes
  Storage required    Full scene on disk

GeoMind cloud-native streaming:
  Data transferred    ~1–5 MB (only needed chunks)
  Wait time           10–30 seconds
  Storage required    None - processed in memory

Only the specific Zarr chunks for the requested bands are fetched via HTTP range requests - for example, a subset of /B04/, /B03/, /B02/ tiles rather than the entire scene.

Data is read directly from EODC object storage using fsspec + zarr. Nothing is written to your local disk.
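A rough illustration of why chunked access is so cheap: only the tiles that intersect the requested window are fetched. The 1024 x 1024 chunk size below is an assumption for illustration (actual Sentinel-2 Zarr chunking may differ); at uint16, one such chunk is ~2 MB uncompressed:

```python
def chunks_for_window(row0, row1, col0, col1, chunk=1024):
    """Return the set of (chunk_row, chunk_col) tiles a pixel window touches
    in a 2-D array stored as chunk x chunk tiles."""
    return {(r, c)
            for r in range(row0 // chunk, (row1 - 1) // chunk + 1)
            for c in range(col0 // chunk, (col1 - 1) // chunk + 1)}

# A 10980 x 10980 Sentinel-2 band in 1024 x 1024 tiles has 121 chunks;
# a 2000 x 2000 pixel window touches only 9 of them.
touched = chunks_for_window(4000, 6000, 4000, 6000)
print(len(touched))  # number of chunks actually fetched
```

Each untouched chunk is never requested at all, which is what keeps the transfer in the ~1–5 MB range instead of ~720 MB.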

Step 4: Image Processing

Once band data is streamed, GeoMind processes it:

  1. Scale/Offset correction: Reflectance = (DN x 0.0001) + (-0.1)
  2. Band stacking: B04 (Red) + B03 (Green) + B02 (Blue) to RGB array
  3. Percentile normalization: 2-98% stretch for optimal visual contrast
  4. Rendering: Matplotlib generates the PNG image
  5. Output: Saved to outputs/rgb_composite_XXXX.png
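Steps 1–3 above can be sketched in numpy. This is a minimal illustration with random data standing in for real band chunks, not GeoMind's actual processing code:

```python
import numpy as np

def to_reflectance(dn, scale=0.0001, offset=-0.1):
    """Step 1: digital numbers -> surface reflectance."""
    return dn.astype("float32") * scale + offset

def percentile_stretch(band, lo=2, hi=98):
    """Step 3: clip to the 2-98% range and rescale to 0..1 for display."""
    p_lo, p_hi = np.percentile(band, [lo, hi])
    return np.clip((band - p_lo) / (p_hi - p_lo), 0, 1)

# Step 2: stack B04 (Red), B03 (Green), B02 (Blue) into an H x W x 3 array.
# Random data stands in for streamed Zarr chunks here.
rng = np.random.default_rng(0)
b04, b03, b02 = (rng.integers(1000, 4000, (64, 64)) for _ in range(3))
rgb = np.dstack([percentile_stretch(to_reflectance(b)) for b in (b04, b03, b02)])

# Steps 4-5 would hand `rgb` to matplotlib, e.g.:
#   plt.imsave("outputs/rgb_composite.png", rgb)
```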

The Agent Loop

The core of GeoMind is the function-calling loop - the same pattern used by modern AI assistants.

flowchart TD
    A["1. User message added to conversation history"]
    B["2. LLM receives full context + available tools"]
    C{"3. LLM decides"}
    D["Call a tool"]
    E["Execute tool"]
    F["Add result to history"]
    G["4. Generate final response"]
    H["Display result to user"]

    A --> B --> C
    C -->|Tool needed| D --> E --> F --> B
    C -->|No more tools| G --> H

    style A fill:#1e3a5f,color:#fff
    style H fill:#1e3a5f,color:#fff

Max 10 Iterations

The agent loop has a safety limit of 10 iterations per query. In practice, most queries resolve in 2–3 iterations.
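The loop above can be sketched in a few lines. The names and message shapes here are illustrative, not GeoMind's actual API; the point is the shape of the control flow, including the iteration cap:

```python
MAX_ITERATIONS = 10  # safety limit from the loop above

def run_agent(llm, tools, query):
    """Minimal function-calling loop sketch. `llm(history)` returns either
    ("tool", name, args) or ("final", text); `tools` maps names to callables."""
    history = [{"role": "user", "content": query}]
    for _ in range(MAX_ITERATIONS):
        kind, *payload = llm(history)
        if kind == "final":
            return payload[0]                  # no more tools needed
        name, args = payload
        result = tools[name](**args)           # execute the chosen tool
        history.append({"role": "tool", "name": name, "content": result})
    return "Stopped: iteration limit reached"

# Stub LLM: call the geocode tool once, then answer.
def stub_llm(history):
    if any(m["role"] == "tool" for m in history):
        return ("final", "London is at 51.51, -0.13")
    return ("tool", "geocode", {"place": "London"})

tools = {"geocode": lambda place: "51.51, -0.13"}
print(run_agent(stub_llm, tools, "Where is London?"))
```

Each tool result is appended to the history, so the LLM sees everything it has learned so far when deciding the next step.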


Technology Stack

Layer           Technology                                     Purpose
AI Layer        OpenRouter API (OpenAI Python Client)          LLM function calling and natural language understanding
                nvidia/nemotron-3-nano-30b-a3b:free (default)  Free, capable model for geospatial reasoning
Geocoding       geopy (Nominatim/OSM)                          Convert place names to coordinates
STAC Catalog    pystac-client                                  Query Sentinel-2 imagery catalog
Data Access     zarr + fsspec + aiohttp + s3fs                 Cloud-native HTTP range requests to stream chunked data
Processing      numpy                                          Band math, reflectance calculation, array operations
Visualization   matplotlib                                     Image rendering and NDVI colormaps
Data Framework  xarray + dask                                  Lazy loading and chunked computation
Configuration   python-dotenv                                  Environment variable management

Installed Dependencies

Running pip install geomind-ai installs all required packages:

Package                  Purpose
openai                   LLM function-calling via OpenRouter
pystac-client            STAC catalog search for Sentinel-2
xarray + zarr + dask     Cloud-native chunked data access
geopy                    Geocoding (place name to coordinates)
fsspec + s3fs + aiohttp  HTTP range requests for streaming
matplotlib + numpy       Image processing and rendering
python-dotenv            Environment variable management

External APIs Used

API                   Endpoint                      Purpose                                    Cost
OpenRouter            https://openrouter.ai/api/v1  LLM inference (function calling)           Free (default model)
Nominatim             OpenStreetMap                 Geocoding (place name to coordinates)      Free (rate-limited)
Sentinel-2 STAC API   -                             Sentinel-2 catalog search                  Free
Cloud Object Storage  -                             Zarr data streaming (HTTP range requests)  Free

100% Free to Use

GeoMind uses entirely free services by default. The default LLM model, Nominatim geocoding, STAC API, and EODC data access are all free of charge.


Configuration Defaults

Setting             Default Value                        Purpose
Default model       nvidia/nemotron-3-nano-30b-a3b:free  LLM for natural language processing
STAC Collection     sentinel-2-l2a                       Level-2A surface reflectance
Max cloud cover     50%                                  Default cloud filter threshold
Buffer distance     15 km                                Bounding box buffer around geocoded points
Max search results  20                                   Maximum items returned per search
Search lookback     14 days                              Default time window for recent imagery
Output directory    ./outputs/                           Where generated images are saved
Reflectance scale   0.0001                               DN to reflectance conversion factor
Reflectance offset  -0.1                                 DN to reflectance offset
Config directory    ~/.geomind/                          Where the API key is stored
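Defaults like these are commonly exposed with environment-variable overrides. The sketch below shows one such pattern; the `GEOMIND_*` variable names are hypothetical, not GeoMind's documented configuration keys:

```python
import os
from pathlib import Path

# Defaults mirroring the table above. The env-var naming scheme below
# (GEOMIND_<SETTING>) is an illustrative assumption.
DEFAULTS = {
    "model": "nvidia/nemotron-3-nano-30b-a3b:free",
    "collection": "sentinel-2-l2a",
    "max_cloud_cover": 50,
    "buffer_km": 15,
    "max_results": 20,
    "lookback_days": 14,
    "output_dir": Path("./outputs"),
    "config_dir": Path.home() / ".geomind",
    "reflectance_scale": 0.0001,
    "reflectance_offset": -0.1,
}

def setting(name, cast=str):
    """Read a default, allowing an environment override (hypothetical scheme)."""
    env = os.environ.get(f"GEOMIND_{name.upper()}")
    return cast(env) if env is not None else DEFAULTS[name]
```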