Building Terra ML: From a 2008 Thesis to Open Source

How a fuzzy logic classifier born in a Venezuelan university became a satellite image classification platform with 6 ML models, ensemble methods, and 484 unit tests.

By Ivan R. Labrador Gonzalez · April 2026 · github.com/ivanrlg/TerraML

In November 2008, I defended a thesis at Universidad de Los Andes in Mérida, Venezuela. The title was “Desarrollo de un Clasificador de Imágenes Satelitales Basado en Lógica Difusa” (Development of a Satellite Image Classifier Based on Fuzzy Logic). The idea was straightforward: take multispectral satellite images, where every pixel carries reflectance values across multiple bands of light, and classify land cover automatically. Not with rigid thresholds, but with degrees of membership. Fuzzy logic.

The result was 81.87% Overall Accuracy with a Kappa coefficient of 0.7637, outperforming three classical methods of that era. But everything ran on MATLAB and IDRISI, software that costs thousands of dollars per license. The thesis was defended, the diploma handed out, and 105 pages of algorithms were archived. Nobody could replicate the work without buying the same expensive software.

18 years later, the world had changed: free Sentinel-2 satellite imagery from the Copernicus programme, open-source libraries like GDAL and ML.NET, and GitHub to share code with the world. That thesis was the spark. What exists today is something much larger.

484 unit tests · 6 ML classifiers · 3 classification modes · MIT license

What Terra ML Is Today

Terra ML is an open-source satellite image classification platform built in C# 13 and .NET 10. It is not a port of a thesis. It is a new project, professionally engineered, inspired by that thesis but going far beyond it.

The platform offers three classification modes: pure fuzzy logic (the original method, fully explainable), hybrid mode (fuzzy logic as an intelligent preprocessor for machine learning), and pure ML (raw spectral values fed directly into classifiers, for baseline comparison). On top of that, it includes 6 production ML classifiers, 2 ensemble methods, k-fold cross-validation, a complete Blazor web application, a CLI, and Docker support.

The Blazor web app lets you choose between 16 classification methods across all three modes: Pure Fuzzy, 6 Hybrid ML classifiers, 6 Pure ML classifiers, and 2 Ensemble methods.

Figure 1: Terra ML classification pipeline. A GDAL raster reader feeds one of three modes: Pure Fuzzy (fuzzy inference), Hybrid (fuzzy features into an ML classifier), or Pure ML (raw values into one of 6 models), each producing a classified map. Three paths, one platform.

How Fuzzy Classification Works

A satellite image is not a photograph. It is a stack of layers, each capturing a different wavelength of light: visible, near-infrared, shortwave infrared. Every pixel has a numerical value per layer. Each material on the ground has its own spectral fingerprint: water absorbs most light and appears dark across all bands; forests explode with reflectance in near-infrared; urban areas have a distinctive signature in shortwave infrared.

Classical approaches draw rigid lines: “if infrared is above 100, it’s forest.” Fuzzy logic works differently. It says: “this pixel looks 97% like urban, 2% like forest, and 0.3% like water.” Degrees of membership. This is more realistic because in nature, transitions between land cover types are gradual, not sharp lines on a map.

The classification process follows four steps.

1. Training: an expert selects sample pixels → mean and standard deviation per class and band.
2. Fuzzification: Gaussian curves evaluate each pixel → membership degrees (0 to 1).
3. Inference: fuzzy AND (minimum) across all bands → firing strength per class.
4. Defuzzification: the class with the highest firing strength wins.

Figure 2: The four-stage fuzzy classification pipeline.

The Explore & Train page: select training areas directly on the satellite image, assign classes (Water, Agriculture, Forest, Urban, Cloud), and see sample counts in real time.

Training begins when an expert identifies pixels with known land cover (water, forest, urban). Terra ML computes the mean and standard deviation of each class across each spectral band. These statistics become Gaussian membership functions: bell curves defined by μ(x) = exp(−½((x − mean) / σ)²).
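In code, the membership function is a one-liner. Here is a minimal Python sketch (Terra ML itself is C#; the function name and the sample statistics below are illustrative, not the project's API):

```python
import math

def gaussian_membership(x: float, mean: float, std: float) -> float:
    """Degree of membership (0..1) of band value x in a class whose
    training pixels have the given per-band mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

# Illustrative numbers (not from the thesis): suppose training pixels
# for "Water" in one band have mean 40 and std 10.
print(gaussian_membership(40, 40, 10))  # at the mean -> 1.0
print(gaussian_membership(50, 40, 10))  # one sigma away -> exp(-0.5) ~ 0.6065
```

A pixel exactly at the class mean gets full membership; membership decays smoothly, never abruptly, as the value moves away.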

When a new pixel arrives, fuzzification evaluates it against every curve, producing a membership degree between 0 and 1 for each class in each band. Then, inference applies the fuzzy AND operator (minimum) across all bands per class. The logic is intuitive: you are only as “Urban” as your weakest band says you are. The weakest link breaks the chain.

Finally, defuzzification assigns the class with the highest firing strength. If Urban scores 0.97 and Forest scores 0.02, the pixel is classified as Urban. And crucially: if someone asks “why was this pixel classified as forest?” you can show exactly which curve on which band gave which membership degree. Total explainability.
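The three steps fit in a few lines. A minimal Python sketch of the fuzzify → min-AND → argmax pipeline described above, with invented class statistics (Terra ML's real engine is C# and also supports other operators and defuzzifiers):

```python
import math

def classify_pixel(pixel, class_stats):
    """Fuzzy classification sketch.
    pixel: band values for one pixel.
    class_stats: {class name: [(mean, std) per band]} from training.
    All names and numbers here are illustrative."""
    firing = {}
    for cls, stats in class_stats.items():
        # Fuzzification: membership per band, then fuzzy AND = minimum
        memberships = [math.exp(-0.5 * ((v - m) / s) ** 2)
                       for v, (m, s) in zip(pixel, stats)]
        firing[cls] = min(memberships)
    # Defuzzification: class with the highest firing strength wins
    return max(firing, key=firing.get), firing

stats = {"Water":  [(40, 10), (30, 8)],
         "Forest": [(60, 12), (150, 20)]}
label, strengths = classify_pixel([58, 145], stats)
print(label)  # Forest: high membership in both bands; Water fails band 2
```

Note how the minimum enforces the "weakest link" rule: a pixel that looks like Water in band 1 but not in band 2 gets a near-zero firing strength for Water overall.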

Three Ways to Classify: The Evolution

Terra ML does not stop at fuzzy logic. It offers three paths, and the reason for each reflects an honest evolution of the project.

Path 1: Pure Fuzzy Logic

The original method from the thesis. Works with very little training data, is completely explainable, and produces reasonable results. But it has a fundamental limitation: its decision rule is fixed. Take the minimum membership across bands, pick the maximum. Elegant, but it cannot learn complex inter-class patterns. When two classes have very similar spectral signatures, the firing strengths may differ by only 0.02. At that margin, sensor noise can flip the classification.

Path 2: Hybrid Mode (Fuzzy + ML)

This is where fuzzy logic finds its best version: not as the final classifier, but as an intelligent preprocessor. A pixel with 4 spectral bands gives an ML model 4 numbers without context. But if that pixel first passes through the fuzzy engine, you get 39 numbers with semantic context: the 4 original raw values, 28 membership degrees (how much the pixel resembles each of the 7 classes in each of the 4 bands), and 7 firing strengths (the fuzzy inference result per class).

The doctor analogy. It is the difference between giving a doctor just the numbers from a blood test versus giving the numbers plus an expert interpretation of each value. The ML model no longer has to discover from scratch that a value of 130 in VNIR1 means “typical Urban.” The fuzzy logic already tells it: “this pixel has a 0.99 membership degree for Urban in VNIR1.”
Figure 3: Pure ML input (Band 1 · Band 2 · Band 3 · Band 4 = 4 features) vs. hybrid, fuzzy-enriched input (4 raw values + 28 membership degrees from 7 classes × 4 bands + 7 firing strengths = 39 features). 4 numbers without context vs. 39 numbers with semantic meaning.
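Building the enriched vector is mechanical once the fuzzy engine exists. A hedged Python sketch of the 39-feature layout (4 raw values + 28 memberships + 7 firing strengths); the class names and statistics are invented for illustration and the real feature extractor lives in C#:

```python
import math

def hybrid_features(pixel, class_stats):
    """Assemble the hybrid feature vector for one pixel.
    class_stats maps each of 7 class names to 4 (mean, std) pairs,
    one per spectral band."""
    features = list(pixel)                      # 4 raw band values
    firing = []
    for stats in class_stats.values():
        m = [math.exp(-0.5 * ((v - mu) / s) ** 2)
             for v, (mu, s) in zip(pixel, stats)]
        features.extend(m)                      # 4 memberships per class
        firing.append(min(m))                   # fuzzy AND over bands
    features.extend(firing)                     # 7 firing strengths
    return features

classes = ["Water", "Forest", "Urban", "Agriculture", "Cloud", "Soil", "Shadow"]
stats = {c: [(50 + 10 * i, 15)] * 4 for i, c in enumerate(classes)}
vec = hybrid_features([60, 70, 80, 90], stats)
print(len(vec))  # 4 + 28 + 7 = 39
```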

Path 3: Pure ML

Raw spectral values go directly into a machine learning classifier, bypassing fuzzy logic entirely. This serves as a baseline for comparison, or for datasets where fuzzy preprocessing does not add value. Terra ML does not force you into one approach. It gives you the tools to compare and decide based on your own data.

Original satellite image (left) vs. classified output (right). Each color represents a land cover class (Water, Agriculture, Forest, Urban, Cloud) assigned by the MLP Neural Network classifier.

Six Classifiers, Two Ensemble Methods

Terra ML integrates six production-ready classifiers, each with its own strengths.

| Classifier | Engine | Best For |
|---|---|---|
| Random Forest | ML.NET | General-purpose, robust with noisy data |
| SDCA MaxEntropy | ML.NET | Large datasets, fast convergence |
| LightGBM | ML.NET | High accuracy, gradient boosting |
| SVM (One-vs-All) | ML.NET | High-dimensional feature spaces |
| Logistic Regression | ML.NET | Calibrated probabilities, interpretable |
| MLP Neural Network | TorchSharp | Complex non-linear patterns |

Beyond individual classifiers, Terra ML offers two ensemble methods. The VotingClassifier combines predictions from multiple models through majority or weighted voting. The StackingClassifier goes further: it trains base classifiers on k-fold subsets, then trains a meta-learner (Logistic Regression) on out-of-fold predictions, discovering which classifier is most reliable in which situations; using only out-of-fold predictions prevents data leakage, and stratified splitting preserves class balance across folds.
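As a rough illustration of the voting side (a Python sketch, not Terra ML's actual C# API), hard and weighted voting over per-model predictions look like this:

```python
from collections import Counter

def majority_vote(predictions):
    """Hard majority voting: the most common predicted label wins.
    predictions: one predicted label per base model for the same pixel."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Weighted voting: each model's vote counts by its weight,
    e.g. its cross-validated accuracy."""
    score = {}
    for label, w in zip(predictions, weights):
        score[label] = score.get(label, 0.0) + w
    return max(score, key=score.get)

print(majority_vote(["Urban", "Urban", "Forest"]))                     # Urban
print(weighted_vote(["Urban", "Forest", "Forest"], [0.9, 0.6, 0.5]))   # Forest
```

In the weighted example, two weaker models outvote one strong one (0.6 + 0.5 > 0.9); choosing sensible weights is exactly what cross-validated accuracy is for.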

The ModelComparisonEngine ties everything together: it runs k-fold cross-validation for all classifiers in both hybrid and pure ML modes, producing a ranking by Overall Accuracy and Kappa coefficient. Concrete, data-driven evidence of what works best for your specific dataset.

Architecture

The codebase follows a clean layered architecture with strict separation of concerns.

Figure 4: Layered architecture. The presentation layer (Blazor web app built with Radzen, 7 pages; CLI built with System.CommandLine + Spectre) sits on top of FuzzySat.Core (fuzzy logic, ML module, training, validation, raster I/O, ensembles), which has zero dependency on UI or infrastructure. Below sits the infrastructure layer: Docker, native GDAL, ML.NET + TorchSharp.

FuzzySat.Core contains the entire classification engine: membership functions (Gaussian, Triangular, Trapezoidal, Generalized Bell), the inference engine with configurable AND operators (minimum or algebraic product), two defuzzifiers (MaxWeight and WeightedAverage), all six ML classifiers, feature extractors, ensemble methods, validation metrics (confusion matrix, Kappa, per-class accuracy), and GDAL-based raster I/O with spectral index calculators (NDVI, NDWI, NDBI). Pure logic, no UI dependencies.
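The spectral indices follow their standard definitions. NDVI, for example, is (NIR − Red) / (NIR + Red); a Python sketch of the formula (Terra ML computes these in C# over GDAL rasters, and the epsilon guard here is my own addition to avoid division by zero):

```python
def ndvi(nir: float, red: float, eps: float = 1e-9) -> float:
    """Normalized Difference Vegetation Index.
    Values near +1 indicate dense vegetation, values near 0 bare soil
    or urban surfaces, and negative values typically water."""
    return (nir - red) / (nir + red + eps)

print(round(ndvi(0.45, 0.05), 2))  # healthy vegetation -> ~0.8
print(round(ndvi(0.02, 0.06), 2))  # water -> negative
```

NDWI and NDBI have the same normalized-difference shape, just over different band pairs.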

On top of that sit two interfaces: a Blazor Server web application with 7 pages that guide users from project setup through model comparison (built with Radzen components), and a CLI with 5 commands (classify, train, validate, info, visualize) for scripting and automation.

Deployment is flexible: run directly with dotnet run, or spin up Docker with docker compose up and open the web UI at localhost:8080. The Docker route requires no local SDK or GDAL installation.

The 2008 Benchmark

The original thesis compared the fuzzy logic classifier against three traditional methods available in IDRISI, all evaluated on the same ASTER imagery of Merida, Venezuela, with 7 land cover classes and a ground-truth sample of 171 validation points.

| Classifier | Overall Accuracy | Kappa |
|---|---|---|
| Fuzzy Logic (thesis) | 81.87% | 0.7637 |
| Maximum Likelihood | 74.27% | 0.6650 |
| Decision Tree (CART) | 63.74% | 0.5312 |
| Minimum Distance | 56.14% | 0.4233 |

A Kappa of 0.76 is considered “substantial agreement” in the remote sensing literature. The fuzzy classifier outperformed all three classical methods by significant margins, validating the core thesis that degrees of membership better reflect the gradual transitions between land cover types found in nature.
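For readers who want to recompute such figures, Cohen's Kappa follows directly from the confusion matrix: (p_o − p_e) / (1 − p_e), observed agreement corrected for chance agreement. An illustrative Python sketch with a made-up 2-class matrix (not the thesis data):

```python
def kappa(confusion):
    """Cohen's Kappa from a square confusion matrix
    (rows = reference classes, columns = predicted classes)."""
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal
    p_o = sum(confusion[i][i] for i in range(len(confusion))) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2
    p_e = sum(sum(row) * sum(col)
              for row, col in zip(confusion, zip(*confusion))) / (n * n)
    return (p_o - p_e) / (1 - p_e)

m = [[40, 10],
     [5, 45]]
print(round(kappa(m), 3))  # 0.7: accuracy 0.85 corrected for 0.5 chance
```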

2008 vs. 2026

| Aspect | Original Thesis (2008) | Terra ML (2026) |
|---|---|---|
| Platform | MATLAB + IDRISI (proprietary) | C# / .NET 10 + GDAL (open source) |
| Imagery | ASTER (4 bands, 15 m) | Any GeoTIFF, incl. Sentinel-2 (13 bands, 10 m) |
| MF Types | Gaussian only | Gaussian, Triangular, Trapezoidal, Gen. Bell |
| ML Integration | None | 6 classifiers + 2 ensemble methods |
| Modes | Fuzzy only | Fuzzy, Hybrid, Pure ML |
| Interface | MATLAB GUI | Blazor Web + CLI + Docker |
| Validation | Single confusion matrix | k-fold cross-validation + model comparison |
| Tests | None | 484 (math correctness verified) |
| License | Academic only | MIT (open source) |

484 Tests That Verify Math, Not Just Compilation

The test suite (349 Core + 119 Web + 16 CLI) goes beyond “does it compile.” One test confirms the Gaussian function at exactly one sigma distance produces exp(−0.5) ≈ 0.6065. Another builds a 7-class confusion matrix with 171 samples and verifies 81.87% accuracy, matching the original thesis result. Another trains a Random Forest on synthetic data and confirms correct classification. Others create real GeoTIFF files with GDAL, write them, read them back, and verify pixel values.
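The one-sigma check mentioned above is easy to restate. A Python sketch of the property being verified (the real suite is xUnit/FluentAssertions in C#; this is just the math):

```python
import math

def gaussian(x, mean, std):
    """Gaussian membership function, as in the classification engine."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

# A Gaussian membership function evaluated exactly one sigma from the
# mean must equal exp(-0.5), regardless of the mean and sigma chosen.
for mean, std in [(0, 1), (128, 20), (40, 10)]:
    assert abs(gaussian(mean + std, mean, std) - math.exp(-0.5)) < 1e-12

print(round(math.exp(-0.5), 4))  # 0.6065
```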

These are not ceremonial tests. They are the mathematical contract between the thesis and the code.

Classification results: pixel distribution across the 7 land cover classes (pie and bar charts), with Agriculture (57.2%) and Urban (22.0%) as the dominant classes.

Where It Stands (Honestly)

I want to be upfront about something: Terra ML is a growing project, not a finished product. The classification engine works, the 484 tests pass, the architecture is solid. But the full validation pipeline against real-world Sentinel-2 imagery has not been executed yet. The 81.87% accuracy figure comes from the original 2008 thesis, on a specific ASTER dataset. Whether those numbers hold, improve, or change with modern imagery and the hybrid approach is something I have not yet measured.

I am building this in my spare time, and spare time is not something I have a lot of. Some weekends I push a dozen commits; other weeks, nothing. That is the reality of a solo open-source project built with passion but constrained by life.

Pure fuzzy logic classification worked well in 2008. An 81.87% accuracy is no small achievement, especially beating all three alternative methods. But image classification has advanced enormously since then. What I have observed during development is that machine learning approaches, especially when fed features enriched by fuzzy logic, tend to produce better results. There are more tests to run and more conclusions to draw, but the trend is promising.

Each mode has its place. Pure fuzzy logic offers total explainability and requires very little training data, making it excellent for education and for understanding the fundamentals of spectral classification. Hybrid mode combines physical understanding with learning capability. Pure ML serves as a baseline. And ensemble methods combine multiple perspectives for robustness.

Terra ML does not compete with ArcGIS, IDRISI, or Google Earth Engine. Those are tools with decades of development and enormous teams. But Terra ML is a serious, open-source, free platform that puts satellite classification tools within reach of students, researchers, and the geospatial community. It is not perfect. It is alive, growing, and getting better with each commit.

What Comes Next (and an Invitation)

The next step is real-world validation with Sentinel-2 imagery from Copernicus: define training areas, classify, measure accuracy, compare all methods side by side, and publish the results. Whatever they may be.

The vision is for Terra ML to serve as a bridge. For remote sensing students who want to understand classification from the mathematical foundations, with code they can read and modify. For researchers who need explainable models where every decision can be traced back to specific membership degrees. For the open-source community working with satellite imagery that wants a free alternative to proprietary solutions. And for anyone who wants to experiment with satellite classification without barriers: no licenses, no costs, free imagery, and Docker to spin everything up in a single command.

And if any of that resonates with you: the door is open. Terra ML is not “my” project in the possessive sense. It is open source because I believe it should grow beyond what one person can build in their spare time. If you are a remote sensing researcher who wants to add a new classifier, a .NET developer who sees something that could be better, a student who wants to learn by contributing, or someone with a completely different perspective on how this should evolve, you are welcome. The CONTRIBUTING.md explains how to get started. Issues labeled good-first-issue and help-wanted are there for a reason.

The code is on GitHub. The license is MIT. Contributions are welcome.

github.com/ivanrlg/TerraML

From a 2008 thesis to a modern platform. The thesis opened a door. Terra ML walked through it.

Tech Stack

Figure 5: Terra ML technology ecosystem. Open-source core: C# 13 / .NET 10 with GDAL, ML.NET, and TorchSharp. Interfaces: Blazor Server + Radzen, System.CommandLine + Spectre.Console. DevOps: GitHub Actions CI/CD, Docker, xUnit, FluentAssertions. Satellite data: Sentinel-2, Landsat 8/9, ASTER, or any GeoTIFF.
