Throughout this course, we've explored several caching solutions, from the lightweight and blazing-fast Memcached to the versatile Redis. Now, we turn our attention to Apache Ignite, a platform that significantly expands the definition of a caching server. While Redis and Memcached are primarily in-memory key-value stores, Ignite is best described as a distributed In-Memory Data Grid (IMDG) and, more broadly, an in-memory computing platform. This distinction is critical. Ignite is not just a place to store data temporarily; it's a system designed to process that data at in-memory speeds, directly where it resides (Jainandunsing, 2025).
The term "heavyweight" is often used to describe Ignite in comparison to its peers, and for good reason. It runs on the Java Virtual Machine (JVM), which introduces a layer of abstraction and resource overhead. This JVM dependency means Ignite is inherently "hungry" for memory and CPU cycles (Jainandunsing, 2025). However, this overhead buys you an incredible amount of power: distributed SQL queries, transactional consistency (ACID), compute grid capabilities, machine learning libraries, and robust fault tolerance. For a simple use case like caching sessions for 20 users, Ignite might seem like overkill, but its true value emerges when you plan for future scale and functionality.
To truly grasp Ignite, we must deconstruct its core architectural principles. An IMDG pools the RAM of multiple computers into a single, cohesive data fabric. This allows applications to access and process vast datasets with the low latency of memory access, sidestepping the performance bottlenecks of traditional disk-based databases.
An Ignite cluster is composed of multiple interconnected processes called nodes. Each node is a separate JVM process that communicates with others to form the grid. These nodes can play different roles: server nodes store data and execute computations, while client nodes are lightweight participants that connect to the cluster to read, write, and submit work without hosting any data themselves.
The magic of Ignite's scalability and resilience lies in how it manages data across these server nodes. This is primarily achieved through two mechanisms: data partitioning and replication.
When you create a cache in Ignite, you must decide on its cache mode: `PARTITIONED`, where the dataset is split into partitions spread across the server nodes, or `REPLICATED`, where every server node holds a full copy of the data. This choice has profound implications for performance, scalability, and memory usage.
One of Ignite's most powerful, and often misunderstood, features is data affinity. This is the logic that determines which partition a particular key belongs to. By default, Ignite uses a consistent hash of the key to map it to a partition. However, you can control this mapping by designating an "affinity key."
Why is this important? Imagine a financial application with `Trade` and `Trader` objects. It is highly likely that queries will involve joining trades with the traders who made them. In a traditional database, this would require fetching data from different tables, potentially from different disks or even different machines. With Ignite, you can define the `TraderID` as the affinity key for both `Trade` and `Trader` objects. This ensures that all trades for a specific trader are stored on the same physical node as the trader's own record. This is known as data collocation.
When you then run a distributed SQL query or a compute job, Ignite is smart enough to route the computation to the node where the data resides. The query `SELECT * FROM Trade t JOIN Trader tr ON t.traderId = tr.id WHERE tr.name = 'John Doe'` can be executed entirely on a single node without any data being shuffled across the network. This principle of "shipping the computation to the data" is a fundamental paradigm shift that enables massive performance gains for complex data processing.
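To make collocation concrete, here is a minimal sketch of how an affinity key is typically declared in Java using the `@AffinityKeyMapped` annotation; the `TradeKey` class and its fields are hypothetical names used only for illustration.

import org.apache.ignite.cache.affinity.AffinityKeyMapped;

// Hypothetical composite key for a Trade entry. All keys that share the same
// traderId map to the same partition, and therefore live on the same node as
// the Trader record keyed by that traderId.
public class TradeKey {
    // Unique identifier of the trade itself.
    private long tradeId;

    // Affinity key: Ignite partitions by this field rather than the whole key,
    // collocating all of a trader's trades with the trader's own record.
    @AffinityKeyMapped
    private long traderId;

    public TradeKey(long tradeId, long traderId) {
        this.tradeId = tradeId;
        this.traderId = traderId;
    }
}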
As noted, Ignite's capabilities demand more substantial hardware than simpler caching solutions (Jainandunsing, 2025), even for a minimal single-node setup serving a small workload.
The JVM is the primary reason for these requirements. The JVM itself consumes a baseline of memory, and its garbage collection (GC) processes require CPU cycles. When configuring an Ignite node, one of the most critical parameters is the JVM heap size, set using the `-Xms` (initial size) and `-Xmx` (maximum size) flags. A poorly tuned heap can lead to long GC pauses, which freeze the node and can cause it to be dropped from the cluster. Therefore, allocating sufficient RAM is not just about holding data, but also about ensuring the smooth operation of the JVM. For large-scale deployments, provisioning nodes with tens or even hundreds of gigabytes of RAM is common practice.
Below is a snippet from a typical Ignite XML configuration file (`default-config.xml`), demonstrating how to define both a `PARTITIONED` and a `REPLICATED` cache.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Other configurations like discoverySpi go here -->
        <property name="cacheConfiguration">
            <list>
                <!-- Definition for a partitioned cache (for scalable data) -->
                <bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="userSessionCache"/>
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="backups" value="1"/> <!-- Store one backup copy of each partition -->
                </bean>

                <!-- Definition for a replicated cache (for small, read-heavy data) -->
                <bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="configurationCache"/>
                    <property name="cacheMode" value="REPLICATED"/>
                </bean>
            </list>
        </property>
    </bean>
</beans>
This configuration defines two caches. The `userSessionCache` is partitioned, which is ideal for storing large amounts of user session data, and it includes one backup copy for fault tolerance. The `configurationCache` is replicated, making it suitable for storing small amounts of configuration data that needs to be accessed quickly from any node.
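For comparison, the same two caches can be declared programmatically through Ignite's Java API instead of XML. The following is a sketch using the same cache names, modes, and backup count as the XML above.

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CacheConfigExample {
    public static void main(String[] args) {
        // Partitioned cache: data is split across nodes, one backup per partition.
        CacheConfiguration<String, String> sessions =
            new CacheConfiguration<String, String>("userSessionCache")
                .setCacheMode(CacheMode.PARTITIONED)
                .setBackups(1);

        // Replicated cache: every node keeps a full copy, ideal for small, read-heavy data.
        CacheConfiguration<String, String> config =
            new CacheConfiguration<String, String>("configurationCache")
                .setCacheMode(CacheMode.REPLICATED);

        // Register both caches and start a node with this configuration.
        IgniteConfiguration cfg = new IgniteConfiguration()
            .setCacheConfiguration(sessions, config);
        Ignite ignite = Ignition.start(cfg);
    }
}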
Apache Ignite began its life as GridGain, a commercial product developed by GridGain Systems. In 2014, GridGain contributed the core source code to the Apache Software Foundation (ASF), where it was accepted into the Apache Incubator program. It graduated to a Top-Level Project in 2015. GridGain continues to offer a commercially supported enterprise edition built on top of the open-source Apache Ignite project, adding features like advanced security, data center replication, and enterprise management tools.
A distributed platform like Apache Ignite is only as useful as its interfaces. Ignite provides a rich set of APIs to cater to different languages, platforms, and performance requirements. The primary methods of interaction include thick clients (full-featured nodes started in client mode), thin clients that speak a lightweight binary protocol (available for Java, .NET, C++, Python, and other languages), standard JDBC/ODBC drivers for SQL access, and an HTTP REST API.
In this section, we will focus exclusively on the REST API, detailing its configuration, usage, security, and ideal use cases.
By default, the REST API in Apache Ignite is disabled for security reasons. Enabling it is a straightforward process that involves modifying the main Ignite configuration file. The REST functionality is handled by a module known as the `ClientConnector`. To enable it, you must add a `clientConnectorConfiguration` bean to your `IgniteConfiguration` (Jainandunsing, 2025).
Here is the essential XML snippet to be placed within your `IgniteConfiguration` bean:
<property name="clientConnectorConfiguration">
    <bean class="org.apache.ignite.configuration.ClientConnectorConfiguration">
        <property name="port" value="10800"/>
        <property name="host" value="127.0.0.1"/>
        <property name="threadPoolSize" value="8"/>
    </bean>
</property>
Let's break down these properties: `port` is the TCP port the connector listens on (10800 is the default), `host` is the network interface to bind to (binding to `127.0.0.1` restricts access to the local machine), and `threadPoolSize` controls how many threads are dedicated to processing client requests.
Once this configuration is added and the Ignite node is restarted, it will begin listening on port 10800 for REST commands.
The Ignite REST API follows a command-based pattern, where all operations are sent as parameters in the URL's query string. The general format is `http://{host}:{port}/ignite?cmd={command}&{param1}={value1}&...`
Let's explore the most fundamental cache operations using `curl`, a versatile command-line tool for making HTTP requests.
curl "http://localhost:10800/ignite?cmd=getorcreate&cacheName=myRestCache"curl -X POST "http://localhost:10800/ignite?cmd=put&cacheName=myRestCache&key=user123&val=session_data_string"curl "http://localhost:10800/ignite?cmd=get&cacheName=myRestCache&key=user123"
curl -X POST "http://localhost:10800/ignite?cmd=rmv&cacheName=myRestCache&key=user123"curl "http://localhost:10800/ignite?cmd=size&cacheName=myRestCache"
Convenience often comes at the cost of security, and the REST API is no exception. Exposing an unauthenticated, unencrypted administrative endpoint to your data grid is extremely dangerous. As stated in security guidance, you should always bind to `127.0.0.1` during development and testing (Jainandunsing, 2025). For production environments where remote access is necessary, you must implement layers of security: TLS/SSL to encrypt traffic, authentication so that only trusted clients can issue commands, firewall rules or network segmentation to restrict which hosts can reach the port, and ideally a reverse proxy or API gateway in front of the endpoint.
The REST API is an excellent tool for certain scenarios, such as quick administrative checks and debugging, lightweight monitoring and health checks, and occasional access from languages or environments that lack an Ignite client library.
However, it has significant limitations for high-performance applications. Every request involves the overhead of HTTP connection setup, header parsing, and string-to-object serialization/deserialization. For latency-sensitive workloads or high-throughput data ingestion, the binary thin client protocol is vastly superior as it uses a persistent connection and a more efficient data format.
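For contrast, here is a minimal sketch of the same put/get interaction over the Java thin client, which speaks the binary protocol on the client connector port (10800 in the configuration above); the cache and key names are reused from the earlier curl examples.

import org.apache.ignite.Ignition;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class ThinClientExample {
    public static void main(String[] args) {
        // Point the thin client at the node's client connector endpoint.
        ClientConfiguration cfg = new ClientConfiguration()
            .setAddresses("127.0.0.1:10800");

        // The connection is persistent and uses a compact binary format,
        // avoiding per-request HTTP overhead.
        try (IgniteClient client = Ignition.startClient(cfg)) {
            ClientCache<String, String> cache = client.getOrCreateCache("myRestCache");
            cache.put("user123", "session_data_string");
            System.out.println(cache.get("user123"));
        }
    }
}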
This sequence of commands demonstrates a complete lifecycle of a cache entry managed entirely via the REST API.
# 1. Create a cache named 'web_sessions'
# The command returns true if the cache was created, or false if it already existed.
curl "http://localhost:10800/ignite?cmd=getorcreate&cacheName=web_sessions"
# 2. Add a session for user 'alice'
# Note the use of -X POST, which is good practice for operations that change state.
curl -X POST "http://localhost:10800/ignite?cmd=put&cacheName=web_sessions&key=alice&val=active_session_token_123"
# 3. Retrieve alice's session to verify it was stored
curl "http://localhost:10800/ignite?cmd=get&cacheName=web_sessions&key=alice"
# Expected output snippet: "response":"active_session_token_123"
# 4. Check the total number of sessions in the cache
curl "http://localhost:10800/ignite?cmd=size&cacheName=web_sessions"
# Expected output snippet: "response":1
# 5. Remove alice's session
curl -X POST "http://localhost:10800/ignite?cmd=rmv&cacheName=web_sessions&key=alice"
# 6. Verify removal by checking the size again
curl "http://localhost:10800/ignite?cmd=size&cacheName=web_sessions"
# Expected output snippet: "response":0
While the REST API is convenient for basic key-value operations, it only scratches the surface of Ignite's capabilities. Many advanced features, like executing distributed computations (sending a Java `Runnable` or `Callable` to be executed on a remote node) or performing complex, multi-step transactions, are not exposed via the REST API. For these advanced use cases, using a thick or thin client is mandatory.
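As an illustration of what only a full client can do, here is a sketch of a thick client node broadcasting a small computation to every server in the grid; discovery settings are omitted and assumed to come from the same configuration used elsewhere in this section.

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ComputeExample {
    public static void main(String[] args) {
        // Start a thick client: it joins the cluster topology but stores no data.
        IgniteConfiguration cfg = new IgniteConfiguration().setClientMode(true);

        try (Ignite ignite = Ignition.start(cfg)) {
            // Ship a closure to every server node; this kind of compute task
            // has no equivalent in the REST API.
            ignite.compute().broadcast(() ->
                System.out.println("Running on node: "
                    + Ignition.localIgnite().cluster().localNode().id()));
        }
    }
}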
The primary reason to choose a distributed system like Apache Ignite is its ability to scale. As an application's user base grows, so does the volume of data and the number of requests per second. A single-node caching solution will eventually hit a ceiling, limited by the CPU, RAM, and network I/O of its host machine. Distributed systems are designed to overcome this by pooling the resources of many machines. Understanding how to manage this scaling process is fundamental to operating Ignite effectively in a production environment.
Scaling strategies can be broadly categorized into two types: vertical scaling (scaling up), which adds more CPU, RAM, or faster storage to an existing machine, and horizontal scaling (scaling out), which adds more machines so the load and data are spread across the cluster.
While vertical scaling has its place, the rest of our discussion will focus on horizontal scaling, as it is the key to building resilient, large-scale systems with Ignite.
For a collection of individual Ignite processes to become a cohesive cluster, they must first find each other. This process is called discovery and is managed by a pluggable component known as the Discovery Service Provider Interface (SPI). Ignite provides several implementations of this SPI to suit different environments.
The default and most widely used implementation is the `TcpDiscoverySpi`. It uses a TCP/IP-based ring to manage cluster membership. When a node starts, it attempts to connect to one of the other nodes already in the cluster. Once connected, it joins the ring, and all nodes update their view of the cluster topology. The key part of this SPI is the `IpFinder`, which tells a starting node where to look for existing members.
In a dynamic distributed system, nodes may join and leave—sometimes intentionally (scaling out, maintenance) and sometimes unintentionally (network failures, crashes). A `PARTITIONED` cache reacts to these topology changes by initiating data rebalancing. For example, when a new node joins, some data partitions are moved to it from existing nodes. When a node leaves, its data partitions (and any backups it held) must be re-created on the remaining nodes.
This rebalancing process is resource-intensive, consuming CPU and network bandwidth. If a node disconnects for a few seconds due to a transient network glitch and then immediately rejoins, you wouldn't want the entire cluster to start a massive, unnecessary rebalancing effort. This is where the Baseline Topology (BLT) comes in.
The BLT is a predefined set of server nodes that are considered the "official" members of the cluster. The cluster will only trigger rebalancing when a node joins or permanently leaves this baseline. By default, the BLT is automatically adjusted after a short delay when the topology changes. However, for production stability, it is often manually controlled. For example, before taking a node down for maintenance, an administrator would remove it from the BLT. After the node is brought back online, it is manually added back. This gives administrators fine-grained control over when data rebalancing occurs, preventing it during temporary disruptions.
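As a sketch of the manual control described above, the baseline can be reset to the current server topology through the Java API (`IgniteCluster.setBaselineTopology`); in practice operators often use the bundled command-line tooling instead, and the code below assumes the cluster is already active.

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCluster;
import org.apache.ignite.Ignition;

public class BaselineExample {
    public static void main(String[] args) {
        // Join the (already active) cluster and obtain the cluster facade.
        try (Ignite ignite = Ignition.start()) {
            IgniteCluster cluster = ignite.cluster();

            // Record the current set of server nodes as the new baseline.
            // Rebalancing is triggered only when this registered set changes,
            // not on every transient join or departure.
            cluster.setBaselineTopology(cluster.topologyVersion());
        }
    }
}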
Combining these concepts, we can outline the process for horizontally scaling an Ignite cluster without any service interruption: first, provision a new machine with the same Ignite version and configuration, including the discovery address list; next, start the Ignite process so the node discovers its peers and joins the topology; then add the new node to the Baseline Topology and allow the cluster to rebalance partitions onto it, monitoring CPU and network usage while it does so; finally, verify that data distribution and backups are healthy before repeating the process for any further nodes.
This graceful, controlled process is what enables Ignite clusters to grow from a few nodes to hundreds without ever needing to be taken offline.
Here is an example of the `discoverySpi` configuration within the `IgniteConfiguration` bean for a three-node cluster using the recommended `TcpDiscoveryVmIpFinder`.
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <!-- Set the local address for this specific node -->
        <property name="localAddress" value="10.0.1.1"/>
        <property name="ipFinder">
            <!-- Use the static IP finder for production -->
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                <property name="addresses">
                    <list>
                        <!-- List of all potential server nodes in the cluster -->
                        <value>10.0.1.1:47500..47509</value>
                        <value>10.0.1.2:47500..47509</value>
                        <value>10.0.1.3:47500..47509</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>
</property>
This same configuration would be deployed on all three nodes, with the `localAddress` property adjusted to each node's own IP address (or omitted entirely, in which case Ignite binds to the local interfaces automatically). When any node starts, it will try to connect to the IPs in the list on the default discovery port range (47500-47509) to find its peers and form the cluster.
For modern, containerized deployments, managing static IP lists can be cumbersome. To address this, Apache Ignite provides specialized discovery SPIs for cloud-native environments. The `TcpDiscoveryKubernetesIpFinder` integrates with the Kubernetes API to automatically discover the IP addresses of other Ignite pods within the same service. Similarly, the `TcpDiscoveryS3IpFinder` for AWS allows nodes to register their IP addresses in a shared S3 bucket, enabling dynamic discovery in an EC2 environment without relying on multicast or static lists.
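As a rough sketch of a Kubernetes-aware setup (assuming the classic setter-based API of `TcpDiscoveryKubernetesIpFinder`; recent releases move these settings into a separate connection-configuration object), the discovery SPI might be configured as follows. The namespace and service names are hypothetical placeholders.

import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder;

public class KubernetesDiscoveryExample {
    public static void main(String[] args) {
        // IP finder that asks the Kubernetes API for the addresses of the
        // pods backing a given service (names below are placeholders).
        TcpDiscoveryKubernetesIpFinder ipFinder = new TcpDiscoveryKubernetesIpFinder();
        ipFinder.setNamespace("ignite");            // hypothetical namespace
        ipFinder.setServiceName("ignite-service");  // hypothetical service name

        TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi().setIpFinder(ipFinder);

        // This configuration can then be passed to Ignition.start(cfg) inside the pod.
        IgniteConfiguration cfg = new IgniteConfiguration().setDiscoverySpi(discoverySpi);
    }
}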
Apache Software Foundation. (n.d.). Apache Ignite documentation. Apache Ignite. Retrieved from https://ignite.apache.org/docs/latest/
Jainandunsing, K. (2025). Caching servers hardware requirements & software configurations.
Malov, V. (2018). High-performance in-memory computing with Apache Ignite. Packt Publishing.
Oracle Corporation. (2024). Java virtual machine guide. Oracle Help Center. Retrieved from https://docs.oracle.com/en/java/javase/11/gctuning/
Tanenbaum, A. S., & Van Steen, M. (2017). Distributed systems: Principles and paradigms (3rd ed.). Pearson Education.