
Apache Ignite and GridGain.
Working code.
Build, deploy, operate.

Tutorials and guides from practitioners who build distributed systems for a living. Start with a cluster on your laptop, load a real dataset, and build applications with working code that applies in production. Every example is tested, every code block runs.

Ignite 2 | Ignite 3 | GridGain 8 | GridGain 9
-- Customer, Invoice, and InvoiceLine share a colocation key.
-- This 3-table JOIN executes without shuffling data between nodes.

SELECT
    c.FirstName || ' ' || c.LastName AS customer,
    COUNT(DISTINCT i.InvoiceId) AS orders,
    CAST(SUM(il.UnitPrice * il.Quantity)
         AS DECIMAL(10,2)) AS revenue
FROM Customer c
JOIN Invoice i ON c.CustomerId = i.CustomerId
JOIN InvoiceLine il ON i.InvoiceId = il.InvoiceId
GROUP BY c.CustomerId, c.FirstName, c.LastName
ORDER BY revenue DESC
LIMIT 5;
A 3-table JOIN across 2,700+ records that executes on a single node. Data locality places each customer's invoices and line items on the same partition, so the query never crosses the network.
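
To make the idea of a shared colocation key concrete, here is a minimal sketch in plain Java. It is not the partitioning code used by Apache Ignite or GridGain. The class name, the partition count of 25, and the simple hash are assumptions chosen for illustration, and the sketch assumes every row carries the customer's id as its colocation key. The point it shows is that the partition is derived from the colocation key alone, so rows from different tables that share a key value land on the same partition.

```java
// Illustrative sketch only: a simplified stand-in for partition assignment,
// not the hashing that Apache Ignite or GridGain actually uses.
final class ColocationSketch {
    // Hypothetical partition count chosen for this example.
    static final int PARTITIONS = 25;

    /**
     * Picks a partition from the colocation key alone.
     * The row's own id (rowId) is deliberately ignored, which is what
     * keeps a customer's invoices and line items together.
     */
    static int partitionOf(int rowId, int customerId) {
        return Math.floorMod(Integer.hashCode(customerId), PARTITIONS);
    }

    public static void main(String[] args) {
        int customerRow = partitionOf(42, 42);      // Customer 42
        int invoiceRow = partitionOf(1001, 42);     // Invoice 1001 for customer 42
        int invoiceLineRow = partitionOf(5001, 42); // InvoiceLine 5001 for customer 42

        boolean colocated = customerRow == invoiceRow && invoiceRow == invoiceLineRow;
        System.out.println("All three rows share a partition: " + colocated);
    }
}
```

If the partition were derived from each row's own id instead, the three rows above could land on different partitions and the JOIN would need to move data between nodes.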

Getting Started

Everything starts with a running cluster. Three steps take you from a bare laptop to querying a distributed database with 15,000 records across 11 tables. No cloud account, no manual configuration. Apache Ignite runs out of the box; GridGain requires a free license.

1 Start Your Cluster

Docker Compose pulls the images, starts 3 nodes, and initializes the cluster. Two minutes, one command. You get a topology that mirrors production: multiple server nodes, partition replication, and a client connection endpoint on localhost:10800.

Start a Local Dev Cluster →

2 Load the Dataset

The Music Store dataset models a digital media business: artists, albums, tracks, customers, invoices. 11 tables across 2 distribution zones with colocation configured for realistic partition behavior. Load it once via SQL script. Every tutorial on the site uses this schema, so it becomes familiar context as you work through the content.

Included in the first tutorial →

3 Write Code Against a Real Cluster

With the cluster running and the dataset loaded, pick your entry point. Use SQL for exploration and ad-hoc queries. Use the Java Table API for typed access through RecordView and KeyValueView. Work through transactions for consistency guarantees across colocated tables. Or deploy compute jobs that run where the data lives.

Apache Ignite 2 and GridGain 8 share the cache-centric API (IgniteCache, SqlFieldsQuery). Apache Ignite 3 and GridGain 9 share the schema-driven API (RecordView, KeyValueView). Tutorials provide a tab for each API, so you see the version that matches your project.

Start the first tutorial →
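import java.util.List;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

// Connect and query the cache via the thin client
try (IgniteClient client = Ignition.startClient(
        new ClientConfiguration()
            .setAddresses("localhost:10800"))) {

    ClientCache<?, ?> invoices = client.cache("Invoice");

    // Fetch all invoices for customer 42.
    // Affinity colocation: every record is on one node.
    SqlFieldsQuery query = new SqlFieldsQuery(
        "SELECT id, total FROM Invoice " +
        "WHERE customerId = ?")
        .setArgs(42);

    List<List<?>> rows = invoices.query(query).getAll();
}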

Structured Learning Paths

Learning paths matched to where your architecture is today. Each path connects tutorials into a progression where each piece builds on the last. Not sure which path fits?

Cache-Centric Foundations

From a bare laptop to a working cache-aside implementation backed by real benchmark data. You start a single-node Docker cluster, learn the cache API, connect Java clients, then add a MariaDB database and see the cache offload read pressure under a sustained, production-like workload. The path closes with transactions and a grounded mental model of how entries are distributed across the cluster. This path covers the transition from Single System to Specialized Persistence in the Data Architecture Maturity Model.
6 tutorials, ~5.25 hours
 
Start the path
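
The cache-aside pattern that the Cache-Centric Foundations path builds toward can be sketched in a few lines of plain Java. This is a minimal illustration, not the tutorial's implementation: a ConcurrentHashMap stands in for the distributed cache, a Function stands in for the MariaDB lookup, and the class and method names are hypothetical. The read counter makes the offloaded read pressure visible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative sketch only: in-memory stand-ins replace the real cache and database.
final class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for the cluster cache
    private final Function<K, V> database;                     // stand-in for a database query
    private final AtomicInteger databaseReads = new AtomicInteger();

    CacheAsideSketch(Function<K, V> database) {
        this.database = database;
    }

    /** Reads through the cache; only a miss reaches the database. */
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        V loaded = database.apply(key);
        databaseReads.incrementAndGet();
        if (loaded != null) {
            cache.put(key, loaded); // populate the cache for later readers
        }
        return loaded;
    }

    /** Drops a cached entry, for example after the database row changes. */
    void invalidate(K key) {
        cache.remove(key);
    }

    /** Number of lookups that actually reached the database. */
    int databaseReads() {
        return databaseReads.get();
    }
}
```

Calling get twice for the same key reaches the stand-in database once. After invalidate, the next get loads from the database again.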

Beyond Key-Value: When Caching Isn't Enough

For developers whose caching layer has created new challenges. Each module maps to an architectural transition from Specialized Persistence through Managed Consistency in the Data Architecture Maturity Model. Covers SQL-on-cache queries, cross-key ACID transactions, colocated compute, cache-store consistency patterns, distributed data structures, and real-time event processing.
6 tutorials, ~6 hours
 
Start the path

Design for Data Locality

Affinity and data locality are the defining design concerns on Apache Ignite 2 and GridGain 8. This path teaches the subject as a design discipline for senior developers building real systems. You design keys for colocation, verify colocation at runtime, write affinity-aware compute that scales, and learn when to deliberately break colocation for reference data.
4 tutorials + 2 guides, ~5 hours
 
Start the path

Distributed SQL Foundations

From a bare laptop to working code against a running cluster. This path covers cluster setup, SQL fundamentals with a real dataset, connecting the Java thin client, typed data access through RecordView and KeyValueView, and multi-table ACID transactions. This path serves teams at the Managed Consistency or Acceleration Database stage of the Data Architecture Maturity Model.
5 tutorials, ~5 hours
 
Start the path

Schema Design for Distributed SQL

Distributed SQL behaves differently from a single-node RDBMS. This path teaches you to analyze access patterns, build colocation chains, choose distribution zones, translate those decisions into Java annotations, pick the right data access pattern for each use case, and evolve schemas on a running cluster.
3 tutorials, ~4.5 hours
 
Start the path

Data Pipeline APIs

High-throughput data ingestion with the Data Streamer API and distributed compute that runs where data lives. This path covers the two major data movement APIs: streaming data into the cluster and processing it on the nodes that hold it. Future content will add GridGain CDC for change data replication.
1 tutorial, ~2.5 hours
 
Start the path
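
One general idea behind high-throughput ingestion is to buffer records by destination partition and send them in batches rather than one at a time. The sketch below illustrates that idea in plain Java. It is not the Data Streamer API: the class name, the hash-based partition choice, and the sender callback are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Illustrative sketch only: shows per-partition batching, not the real Data Streamer.
final class BatchingSketch {
    private final int partitions;
    private final int batchSize;
    private final BiConsumer<Integer, List<String>> sender; // hypothetical "send batch to partition"
    private final Map<Integer, List<String>> buffers = new HashMap<>();

    BatchingSketch(int partitions, int batchSize, BiConsumer<Integer, List<String>> sender) {
        this.partitions = partitions;
        this.batchSize = batchSize;
        this.sender = sender;
    }

    /** Buffers a record and sends the partition's batch once it is full. */
    void add(int key, String record) {
        int partition = Math.floorMod(Integer.hashCode(key), partitions);
        List<String> buffer = buffers.computeIfAbsent(partition, p -> new ArrayList<>());
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            buffers.remove(partition);
            sender.accept(partition, buffer);
        }
    }

    /** Sends whatever is still buffered, for example at the end of a load. */
    void flush() {
        buffers.forEach(sender);
        buffers.clear();
    }
}
```

With 4 partitions and a batch size of 2, keys 0 and 4 map to the same partition and go out together as one batch, while a lone record for another partition waits until flush is called.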

Caching and Key-Value Access

Use the distributed platform as a high-performance caching layer. This path starts with the KeyValueView and RecordView APIs, builds up to data structure patterns that replace traditional cache systems, and finishes with the cache-aside integration pattern for existing applications. GridGain tabs add client-side near-cache.
1 tutorial, ~3 hours
 
Start the path

Apache Ignite and GridGain

Apache Ignite is an open-source distributed database for applications that need both speed and consistency. It handles colocation, partitioning, and replication behind a SQL engine, Table API, compute framework, ACID transactions, and cluster management. GridGain extends Ignite with enterprise capabilities: role-based access control, LDAP authentication, transparent data encryption, cross-cluster disaster recovery, point-in-time backups, and commercial support.

Code you write against org.apache.ignite runs on both products without changes. The Java packages, SQL dialect, and thin client protocol are identical. Switching between products is a build-file change, not a code change. Where Docker images, Maven coordinates, or license configuration differ, tutorials use tabbed sections to show both setups. Enterprise-only features (security, disaster recovery, columnar storage) are labeled with the required GridGain edition. See what each edition includes →