
Clustering Internals

Contributor-facing documentation for FrogDB’s clustering architecture: slot routing internals, orchestrator protocol, Raft consensus, and failover algorithm.

For operator-facing setup and configuration, see Operations: Clustering.


FrogDB supports both single-node and cluster operation. The architecture uses:

  • Orchestrated control plane rather than gossip
  • 16384 hash slots for Redis Cluster client compatibility
  • Full dataset replication — replicas copy all data from primary
  • RocksDB WAL streaming for incremental replication
| Term | Definition |
|------|------------|
| Node | A FrogDB server instance |
| Internal Shard | Thread-per-core partition within a node (N per node) |
| Slot | Hash slot 0-16383 (cluster distribution unit) |
| Primary | Node owning slots for writes |
| Replica | Node replicating from a primary |
| Orchestrator | External service managing cluster topology |
```
+----------------------------------------------------------+
|                      Cluster Level                       |
|  +-------------+   +-------------+   +-------------+     |
|  |   Node 1    |   |   Node 2    |   |   Node 3    |     |
|  | Slots 0-5460|   | Slots 5461- |   | Slots 10923-|     |
|  |             |   |    10922    |   |    16383    |     |
|  +------+------+   +------+------+   +------+------+     |
+---------+-----------------+-----------------+------------+
          |                 |                 |
+---------v-----------------v-----------------v------------+
|               Node Level (Internal Shards)               |
|  +--------+   +--------+   +--------+   +--------+       |
|  |Shard 0 |   |Shard 1 |   |Shard 2 |   |Shard N |       |
|  +--------+   +--------+   +--------+   +--------+       |
|                                                          |
|     internal_shard = hash(key) % num_internal_shards     |
+----------------------------------------------------------+
```

Routing:

  1. Cluster routing: slot = CRC16(key) % 16384 -> determines which node
  2. Internal routing: internal_shard = hash(key) % N -> determines which thread within node

Hash tags guarantee colocation at both levels:

```rust
fn route_key(key: &[u8]) -> (SlotId, InternalShardId) {
    // Hash tags ({...}) make related keys share one hash input.
    let hash_input = extract_hash_tag(key).unwrap_or(key);
    let slot = crc16(hash_input) % 16384;           // cluster-level routing
    let shard = xxhash64(hash_input) % num_shards;  // node-level routing
    (slot, shard)
}
```
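The routing function above relies on `extract_hash_tag`, whose body isn't shown here. A minimal sketch, assuming Redis Cluster hash-tag semantics (the first `{...}` with a non-empty body selects the hashed substring), might look like this; it is an illustration, not FrogDB's actual implementation:

```rust
/// Hypothetical helper: return the hash-tag substring if the key contains
/// a non-empty `{...}` section, following Redis Cluster's hash-tag rules.
fn extract_hash_tag(key: &[u8]) -> Option<&[u8]> {
    let open = key.iter().position(|&b| b == b'{')?;
    // First '}' after the first '{'.
    let close = key[open + 1..].iter().position(|&b| b == b'}')?;
    if close == 0 {
        return None; // "{}" is empty: hash the whole key instead
    }
    Some(&key[open + 1..open + 1 + close])
}
```

Under this rule, `{user:1}:profile` and `{user:1}:sessions` both hash `user:1`, so they land on the same slot and the same internal shard.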

| Aspect | Orchestrated (FrogDB) | Gossip (Redis) |
|--------|-----------------------|----------------|
| Topology source | External orchestrator | Node consensus |
| Node discovery | Orchestrator tells nodes | Nodes discover each other |
| Failure detection | Orchestrator monitors | Nodes vote on failures |
| Topology changes | Deterministic, immediate | Eventually consistent |
| Debugging | Explicit state | Derived from gossip |

Benefits: Deterministic (no convergence delays), simpler (no gossip protocol), debuggable (topology is explicit), container-friendly (stateless nodes).

The orchestrator pushes topology to all nodes via admin API:

```json
{
  "cluster_id": "frogdb-prod-1",
  "epoch": 42,
  "nodes": [
    {
      "id": "node-abc123",
      "address": "10.0.0.1:6379",
      "admin_address": "10.0.0.1:6380",
      "role": "primary",
      "slots": [{"start": 0, "end": 5460}]
    }
  ]
}
```

The epoch is a monotonic counter incremented on any topology change. Nodes detect staleness by comparing their local epoch against incoming topology updates.

| Detection Method | Description |
|------------------|-------------|
| Orchestrator push | Receives topology with higher epoch |
| Client redirect | Receives `-MOVED` from another node |
| Replica sync | Primary sends epoch in replication handshake |
| Periodic poll | Node polls orchestrator every `topology-refresh-interval-ms` |
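The epoch rule can be sketched as follows (a minimal illustration with our own types, not FrogDB's code): a node accepts a topology only when the incoming epoch is strictly greater than its local one, so duplicate and out-of-order pushes are ignored.

```rust
/// Illustrative sketch of epoch-gated topology acceptance.
struct Node {
    epoch: u64,
}

impl Node {
    /// Apply an incoming topology push; returns whether it was accepted.
    fn apply_topology(&mut self, incoming_epoch: u64) -> bool {
        if incoming_epoch > self.epoch {
            self.epoch = incoming_epoch; // strictly newer: accept
            true
        } else {
            false // stale or duplicate: keep current topology
        }
    }
}
```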

Before executing a command, the node validates slot ownership:

```rust
enum SlotValidationResult {
    NoKeys,     // Keyless command
    Owned,      // Execute locally
    CrossSlot,  // Multi-slot error
    Redirect { slot: SlotId, owner: NodeId, owner_addr: SocketAddr },
    Migrating { slot: SlotId, target: NodeId }, // We're the source
    Importing { slot: SlotId, source: NodeId }, // We're the target
}
```
| Result | Command Action |
|--------|----------------|
| NoKeys | Execute locally |
| Owned | Execute locally |
| CrossSlot | Return `-CROSSSLOT` |
| Redirect | Return `-MOVED <slot> <host>:<port>` |
| Migrating | Execute if key exists locally, else `-ASK <slot> <target>` |
| Importing | Execute only if ASKING flag set, else `-MOVED` |

Validation timing in pipeline: Parse -> Lookup -> Arity -> Extract keys -> CROSSSLOT check -> Slot ownership -> Execute
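The ownership step of that pipeline can be sketched as below. This illustrates the decision table above with hypothetical names (`Check`, `check_slots`), not FrogDB's actual code, and it ignores the MIGRATING/IMPORTING cases for brevity:

```rust
use std::ops::RangeInclusive;

type SlotId = u16;

/// Simplified outcome set (see SlotValidationResult for the full version).
enum Check {
    NoKeys,
    Owned,
    CrossSlot,
    Redirect(SlotId),
}

/// Classify a command given the slot ranges this node owns and the
/// slots of the keys the command touches.
fn check_slots(owned: &[RangeInclusive<SlotId>], key_slots: &[SlotId]) -> Check {
    let Some((&first, rest)) = key_slots.split_first() else {
        return Check::NoKeys; // keyless command executes locally
    };
    if rest.iter().any(|&s| s != first) {
        return Check::CrossSlot; // keys span slots: -CROSSSLOT
    }
    if owned.iter().any(|r| r.contains(&first)) {
        Check::Owned
    } else {
        Check::Redirect(first) // caller replies -MOVED <slot> <host>:<port>
    }
}
```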


Node join:

  1. Node starts standalone and registers with the orchestrator
  2. Orchestrator pushes the initial topology
  3. Node may receive empty slots (immediately owned), importing slots (enter IMPORTING), or a replica role (begin PSYNC)

Node removal:

  1. Orchestrator updates the topology, removing the node's slots
  2. Node enters MIGRATING state for its owned slots and transfers keys
  3. Final topology is pushed; the node stops serving cluster traffic
On a role change (for example, demotion after failover), a node reconciles its replication state with the new topology:

```rust
fn handle_demotion(&mut self, old_role: Role, new_topology: &Topology) -> Result<(), Error> {
    match old_role {
        Role::Primary => {
            // Demoted: stop accepting writes before handing off.
            self.read_only = true;
            self.wal.flush_sync()?;
            let new_primary = new_topology.find_primary_for(self.old_slots)?;
            self.start_replication(new_primary);
        }
        Role::Replica => {
            // Reassigned: follow a different primary if topology moved us.
            let assigned_primary = new_topology.find_my_primary(self.id)?;
            if assigned_primary != self.current_primary {
                self.stop_replication();
                self.start_replication(assigned_primary);
            }
        }
    }
    Ok(())
}
```

Nodes do NOT gossip with each other. They only connect directly for:

| Purpose | Direction | Protocol |
|---------|-----------|----------|
| Replication | Replica -> Primary | PSYNC + WAL stream |
| Slot Migration | Source -> Target | MIGRATE protocol |

All topology knowledge comes from the orchestrator.


Each node exposes an admin API on a separate port:

| Endpoint | Method | Description |
|----------|--------|-------------|
| /admin/cluster | POST | Receive topology update |
| /admin/cluster | GET | Return current topology |
| /admin/health | GET | Health check |
| /admin/replication | GET | Replication status |
| /admin/ready | GET | Node readiness for traffic |
| Check | Healthy | Unhealthy |
|-------|---------|-----------|
| Acceptor | Accepting connections | Not listening |
| Shard Workers | All responding within 100ms | Any unresponsive |
| Memory | Below critical threshold (95%) | Above critical |
| Persistence | Queue depth < high watermark | Blocked > 30s |
| Cluster State | Has valid topology | No/stale topology |

All nodes in a cluster should run with the same num-shards configuration. Internal routing is computed per-node as xxhash64(tag) % num_shards, so different shard counts change which internal shard a given key lands on (hash-tag colocation is still preserved within each node).

During migration between nodes with different shard counts, the source communicates its shard count and the target redistributes keys to its own internal shards.
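This caveat can be seen numerically. The toy computation below (not FrogDB code) shows the same hash landing on different internal shards under different shard counts, which is why a migration target must re-run routing for incoming keys rather than copy the source's shard assignments:

```rust
/// Per-node internal routing, as described above.
fn shard_for(hash: u64, num_shards: u64) -> u64 {
    hash % num_shards
}

/// A target with a different shard count re-routes every incoming key.
fn reroute_on_target(hashes: &[u64], target_shards: u64) -> Vec<u64> {
    hashes.iter().map(|&h| shard_for(h, target_shards)).collect()
}
```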