Neo4j Enterprise Clustering¶
This document describes how to configure and manage Neo4j Enterprise clusters using the Neo4j Kubernetes Operator.
Overview¶
The Neo4j Kubernetes Operator supports Neo4j 5.26 LTS (the final SemVer release) and any CalVer release (2025.x, 2026.x, and onward) for Enterprise clustering, with multiple discovery mechanisms and advanced features like read replicas and multi-zone deployments.
Cluster Architecture¶
A Neo4j Enterprise cluster consists of:
- Servers: Neo4j server instances that self-organize into primary and secondary roles automatically
- Discovery service: Enables cluster members to find each other
- Routing service: Routes client connections to appropriate cluster members
Server Self-Organization: In the server-based architecture, you deploy a number of servers and Neo4j automatically assigns primary and secondary roles based on database requirements and cluster state.
Discovery Methods¶
The operator automatically uses LIST discovery with static pod FQDNs — the recommended approach for Kubernetes deployments. See the Neo4j Operations Manual for background.
How Cluster Discovery Works¶
The operator uses the LIST resolver with pre-computed pod FQDNs from the StatefulSet headless service. Each pod's DNS name is known upfront ({cluster}-server-{n}.{cluster}-headless.{ns}.svc.cluster.local), so the operator injects a fixed peer list at startup — no Kubernetes API calls required for discovery.
Note: Do not confuse this with the Neo4j `K8S` resolver type, which queries the Kubernetes API directly. This operator always uses the `LIST` resolver.
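For a hypothetical 3-server cluster named `mydb` in namespace `prod`, the pre-computed peer list injected into every pod would contain (joined with commas in the actual setting):

```
mydb-server-0.mydb-headless.prod.svc.cluster.local:6000
mydb-server-1.mydb-headless.prod.svc.cluster.local:6000
mydb-server-2.mydb-headless.prod.svc.cluster.local:6000
```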
Discovery ports:
- Port 6000 (tcp-tx): V2 cluster communication — used by this operator for discovery endpoints
- Port 5000 (tcp-discovery): V1 discovery — deprecated, never used by this operator
Kubernetes Services Created¶
The operator automatically creates:
- {cluster-name}-headless — StatefulSet headless service (pod FQDNs)
- {cluster-name}-internals — cluster-internal routing
- {cluster-name}-client — external Bolt/HTTP access
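To inspect them for a hypothetical cluster named `my-cluster` (assuming the services carry the same `app.kubernetes.io/instance` label as the pods):

```bash
# List the services the operator created for this cluster
kubectl get services -l app.kubernetes.io/instance=my-cluster
```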
Discovery Configuration (Injected Automatically)¶
The operator injects version-specific discovery settings into every pod's startup script. Do not set these in spec.config — the operator manages them.
Neo4j 5.26.x (SemVer):
```properties
dbms.cluster.discovery.resolver_type=LIST
dbms.cluster.discovery.version=V2_ONLY
dbms.cluster.discovery.v2.endpoints=<cluster>-server-0.<cluster>-headless.<ns>.svc.cluster.local:6000,...
internal.dbms.cluster.discovery.system_bootstrapping_strategy=me  # server-0 only
```
Neo4j 2025.x+ / 2026.x+ (CalVer):
```properties
dbms.cluster.discovery.resolver_type=LIST
dbms.cluster.endpoints=<cluster>-server-0.<cluster>-headless.<ns>.svc.cluster.local:6000,...
# No dbms.cluster.discovery.version — V2 is the only supported protocol
```
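To confirm which variant was injected for your version, inspect the generated configuration in the `{cluster-name}-config` ConfigMap (also used in Troubleshooting below):

```bash
# Show the injected discovery settings for a cluster named "my-cluster"
kubectl get configmap my-cluster-config -o yaml | grep -E 'resolver_type|endpoints'
```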
Ref: 5.26.x discovery docs · 2025.x+ discovery docs
Cluster Formation¶
The operator uses a ME/OTHER bootstrap strategy with Parallel pod management for fast, split-brain-free cluster formation.
Key Configuration¶
- Bootstrap strategy: server-0 uses `me` (preferred bootstrapper); all other servers use `other` (join when ready)
- Minimum primaries: Set to `TOTAL_SERVERS` on initial formation — all servers must mutually discover each other before RAFT elects a leader, preventing premature solo bootstrap
- On restart (data already exists): the minimum primaries check is skipped so servers rejoin immediately without blocking StatefulSet rolling updates
- Pod management: `Parallel` — all pods start simultaneously
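As a sketch of how these settings combine for a 3-server cluster (illustrative values, not literal file contents):

```properties
# All pods, first formation only: every server must be visible before RAFT elects a leader
dbms.cluster.minimum_initial_system_primaries_count=3  # TOTAL_SERVERS

# server-0 (preferred bootstrapper):
internal.dbms.cluster.discovery.system_bootstrapping_strategy=me

# server-1 and server-2 (join when ready):
internal.dbms.cluster.discovery.system_bootstrapping_strategy=other
```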
How It Works¶
- All server pods start in parallel — Single StatefulSet with `Parallel` pod management
- Servers discover each other — Via static pod FQDNs in the LIST endpoint list (port 6000)
- RAFT coordination — server-0's `me` hint makes it the preferred bootstrapper; the others wait with the `other` hint
- All N servers must see each other — `dbms.cluster.minimum_initial_system_primaries_count=N` prevents any single node from forming a solo cluster (split-brain)
- Cluster forms once quorum is reached — RAFT elects server-0 as bootstrap leader; the others join
- Servers self-organize — Neo4j automatically assigns primary and secondary roles per database
Benefits¶
- Split-brain prevention — All servers must be mutually visible before formation completes
- Reliable formation — Deterministic peer addresses (one FQDN per pod) unlike K8S ClusterIP which returns a single VIP
- Fast restarts — Minimum primaries check skipped on pod restarts so rolling updates aren't blocked
TLS-Enabled Clusters¶
TLS-enabled clusters use the same parallel formation approach with additional optimizations:
- Automatic trust configuration: The operator sets `dbms.ssl.policy.cluster.trust_all=true` for intra-cluster communication
- Parallel startup maintained: TLS doesn't change the pod startup behavior
- Reliable formation: With proper configuration, TLS clusters form as reliably as non-TLS clusters
For detailed TLS configuration, see the TLS Configuration Guide.
Cluster Formation Strategy¶
The operator uses a unified clustering approach for all deployments:
Unified Cluster Formation¶
- All deployments use Neo4j's clustering infrastructure (even single-node)
- Automatic handling of discovery configuration based on Neo4j version
- Coordinated startup ensures data consistency
Parallel Pod Management¶
- Uses parallel pod startup for faster cluster formation
- All pods start simultaneously and coordinate during bootstrap
- Prevents split-brain scenarios through proper coordination
Formation Requirements¶
All primaries must be present for initial cluster formation:
| Cluster Size | Formation Requirement | Rationale |
|---|---|---|
| 2 servers | 2 servers required | Minimum cluster size |
| 3 servers | 3 servers required | Odd number for optimal fault tolerance |
| 4+ servers | All servers required | Ensures consistent initial state |
This approach ensures that clusters form with a complete and consistent initial membership.
Cluster Formation Process¶
- Resource Creation: The operator creates all Kubernetes resources (StatefulSets, Services, RBAC)
- Parallel Pod Startup: All pods start simultaneously (not sequentially)
- Discovery Phase: Pods discover each other via Kubernetes service discovery
- Coordination Phase: All pods wait for complete membership before forming cluster
- Service Ready: Cluster accepts connections after successful formation
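You can observe this sequence from outside the cluster: pods reach Running almost immediately, then flip to Ready together once formation completes.

```bash
# Watch all pods start in parallel and become Ready as the cluster forms
kubectl get pods -l app.kubernetes.io/instance=my-cluster -w
```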
Important Considerations¶
- Complete Membership: All configured server nodes must be available for initial cluster formation
- Startup Time: Cluster formation typically completes within 2-3 minutes
- Pod Readiness: Pods are marked ready only after successful cluster formation
- Scaling: After initial formation, clusters can be scaled following Neo4j's online scaling procedures
Basic Cluster Configuration¶
Simple 3-Node Cluster¶
```yaml
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jEnterpriseCluster
metadata:
  name: simple-cluster
spec:
  image:
    repo: neo4j
    tag: 5.26-enterprise
  topology:
    servers: 3  # 3 servers will self-organize into appropriate roles
  storage:
    className: standard
    size: 10Gi
  # Kubernetes discovery is automatically configured by the operator
  # No manual discovery configuration needed!
```
Note: The operator automatically handles all clustering configuration, including:
- Kubernetes discovery setup
- RBAC resource creation
- Service creation for cluster communication
- Neo4j configuration for optimal clustering
Using kubectl Commands¶
```bash
# Check cluster status
kubectl get neo4jenterprisecluster my-cluster -o yaml

# View cluster details
kubectl describe neo4jenterprisecluster my-cluster

# Check pod status
kubectl get pods -l app.kubernetes.io/instance=my-cluster
```
Default Database Topology¶
When a Neo4j cluster starts for the first time, it automatically creates a default database called neo4j. This database is created with Neo4j's built-in defaults: 1 primary and 0 secondaries, regardless of how many servers are in your cluster.
This means that on a 3-server cluster, the default neo4j database will only be hosted on a single server — which may be surprising if you expect it to span the entire cluster.
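You can confirm this on a fresh cluster: `SHOW DATABASE neo4j` returns one row per hosting server, so a default deployment shows a single primary row.

```bash
# Inspect where the default database is hosted (replace <password>)
kubectl exec <cluster>-server-0 -c neo4j -- cypher-shell -u neo4j -p <password> \
  "SHOW DATABASE neo4j"
```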
Configuring Default Topology at Bootstrap¶
You can control the default topology for all newly created databases (including the auto-created `neo4j` database) using `initial.*` settings in `spec.config`.
Important: These are bootstrap-only settings. They only take effect when the cluster is created for the first time. Adding or changing them on an existing cluster has no effect.
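For example, a minimal sketch assuming Neo4j's `initial.dbms.default_primaries_count` and `initial.dbms.default_secondaries_count` settings (verify against your Neo4j version's configuration reference):

```yaml
spec:
  config:
    # Bootstrap-only: read once, at first cluster formation
    initial.dbms.default_primaries_count: "3"
    initial.dbms.default_secondaries_count: "0"
```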
Changing Database Topology After Bootstrap¶
To change the topology of an existing database, use the ALTER DATABASE Cypher command:
```bash
kubectl exec <cluster>-server-0 -c neo4j -- cypher-shell -u neo4j -p <password> \
  "ALTER DATABASE neo4j SET TOPOLOGY 3 PRIMARIES 1 SECONDARY"
```
Alternatively, you can manage database topology declaratively using the Neo4jDatabase CRD. See the Neo4jDatabase API Reference for details.
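As a purely hypothetical sketch (the field names below are assumptions; consult the Neo4jDatabase API Reference for the actual schema):

```yaml
# Hypothetical field names - see the Neo4jDatabase API Reference for the real schema
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jDatabase
metadata:
  name: neo4j-default
spec:
  clusterRef:
    name: my-cluster  # assumed reference to the owning cluster
  topology:
    primaries: 3      # assumed field
    secondaries: 1    # assumed field
```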
Cannot Skip Default Database Creation¶
Neo4j does not provide a way to prevent the default neo4j database from being created. It is always created at cluster bootstrap. The only post-bootstrap options are:
- Change its topology using `ALTER DATABASE`
- Rename the default using the `dbms.setDefaultDatabase()` procedure (do not use the deprecated `dbms.default_database` config setting). This must be executed as a database operation after bootstrap, not via `spec.config`:

  ```bash
  kubectl exec <cluster>-server-0 -c neo4j -- cypher-shell -u neo4j -p <password> \
    "CALL dbms.setDefaultDatabase('mydb');"
  ```

- Manage it declaratively via a `Neo4jDatabase` CRD (the operator will warn that you are shadowing the default)
Advanced Configuration¶
Multi-Zone Deployment¶
Configure topology spread and anti-affinity for high availability:
```yaml
spec:
  topology:
    servers: 5  # 5 servers will self-organize into appropriate roles
  placement:
    antiAffinity:
      enabled: true
      topologyKey: topology.kubernetes.io/zone
      type: preferredDuringSchedulingIgnoredDuringExecution
    topologySpread:
      enabled: true
      topologyKey: topology.kubernetes.io/zone
      maxSkew: 1
      whenUnsatisfiable: DoNotSchedule
```
📖 For comprehensive topology placement options, including zone distribution strategies, anti-affinity configurations, and troubleshooting tips, see the Topology Placement Guide.
TLS Configuration¶
Enable TLS encryption for cluster communication:
```yaml
spec:
  tls:
    mode: cert-manager
    issuerRef:
      name: neo4j-cluster-issuer
      kind: ClusterIssuer
    duration: 8760h    # 1 year
    renewBefore: 720h  # 30 days
```
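With `mode: cert-manager`, issuance is delegated to cert-manager, so certificate health can be checked through the standard cert-manager resources (assuming cert-manager creates a Certificate for the cluster in the same namespace):

```bash
# Check that the cluster certificate was issued and is Ready
kubectl get certificates -n <namespace>
```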
Port Configuration¶
The operator uses the following default ports:
| Port | Name | Purpose |
|---|---|---|
| 7687 | bolt | Client Bolt connections |
| 7474 | http | HTTP API |
| 7473 | https | HTTPS API |
| 6000 | tcp-tx | V2 cluster traffic + discovery endpoints (always used) |
| 5000 | tcp-discovery | V1 discovery — deprecated, not used by this operator |
| 7688 | routing | Routing service |
| 7000 | raft | RAFT consensus |
These listen addresses can be customized via `spec.config`:

```yaml
spec:
  config:
    server.cluster.listen_address: 0.0.0.0:6000  # V2 cluster + discovery
    server.routing.listen_address: 0.0.0.0:7688
    server.cluster.raft.listen_address: 0.0.0.0:7000
```
Health Monitoring¶
The operator provides comprehensive health monitoring:
```bash
# Check cluster health
kubectl get neo4jenterprisecluster my-cluster -o jsonpath='{.status.phase}'

# Get cluster status
kubectl describe neo4jenterprisecluster my-cluster

# View cluster logs
kubectl logs -l app.kubernetes.io/instance=my-cluster
```
Scaling Operations¶
Scale Up/Down¶
```bash
# Scale servers by editing the resource
kubectl patch neo4jenterprisecluster my-cluster --type='merge' -p='{"spec":{"topology":{"servers":7}}}'

# Or edit the resource directly
kubectl edit neo4jenterprisecluster my-cluster
```
Rolling Upgrades¶
```yaml
spec:
  upgradeStrategy:
    strategy: RollingUpgrade
    preUpgradeHealthCheck: true
    postUpgradeHealthCheck: true
    maxUnavailableDuringUpgrade: 1
    upgradeTimeout: 30m
    healthCheckTimeout: 5m
    stabilizationTimeout: 3m
    autoPauseOnFailure: true
```
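During an upgrade, progress can be tracked with the same commands used for health monitoring:

```bash
# Watch pods roll one at a time (maxUnavailableDuringUpgrade: 1)
kubectl get pods -l app.kubernetes.io/instance=my-cluster -w

# Check the overall phase reported by the operator
kubectl get neo4jenterprisecluster my-cluster -o jsonpath='{.status.phase}'
```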
Troubleshooting¶
Common Issues¶
- Cluster Formation Issues (Rare with Current Configuration)
  - The parallel startup strategy and topology-aligned minimum primaries configuration eliminate most formation issues
  - Check pod logs for startup errors: `kubectl logs {cluster-name}-server-0`
  - Verify all pods can resolve DNS: `kubectl exec {cluster-name}-server-0 -- nslookup {cluster-name}-headless`
- Discovery (LIST resolver via headless service)
  - The operator uses LIST discovery against the StatefulSet's `-headless` service (the legacy `-discovery` ClusterIP service exists for backward compatibility but is NOT used by the V2 discovery path)
  - Verify the headless service exists: `kubectl get service {cluster-name}-headless`
  - Check that endpoints include all pods: `kubectl describe endpoints {cluster-name}-headless`
  - Check that pod FQDNs resolve: `kubectl exec {cluster-name}-server-0 -- nslookup {cluster-name}-server-0.{cluster-name}-headless.{ns}.svc.cluster.local`
  - Inspect the startup script for correct LIST endpoints: `kubectl get configmap {cluster-name}-config -o yaml | grep -A2 resolver_type`
  - Verify the pod has the correct ServiceAccount: `kubectl get pod {cluster-name}-server-0 -o jsonpath='{.spec.serviceAccountName}'`
- Quorum Loss
  - Check primary node health
  - Verify minimum cluster size configuration
Debug Commands¶
```bash
# Check cluster member status
kubectl exec -it my-cluster-server-0 -- cypher-shell -u neo4j -p password "SHOW SERVERS"

# Check database allocation
kubectl exec -it my-cluster-server-0 -- cypher-shell -u neo4j -p password "SHOW DATABASES"

# View cluster logs
kubectl logs my-cluster-server-0 -c neo4j
```
Best Practices¶
- Use odd numbers of servers (3, 5, 7) for optimal fault tolerance. Even numbers are allowed but may have less optimal quorum behavior.
- Configure proper resource limits based on workload
- Use multi-zone deployment for high availability
- Enable TLS for secure cluster communication
- Monitor cluster health regularly
- Use rolling upgrades for zero-downtime updates
- Trust automatic discovery - the operator handles all Kubernetes discovery configuration
Migration from Neo4j 4.x¶
When migrating from Neo4j 4.x clustering to 5.x:
- Remove all manual discovery configuration - the operator handles discovery automatically
- Update from `causal_clustering.*` to `dbms.cluster.*` settings - but discovery settings are managed by the operator
- Update port configurations for the new routing service if customized
- Remove static endpoint lists - no longer needed with Kubernetes discovery
- Test cluster formation in staging environment first
Important: The operator automatically handles all cluster discovery. Remove any manual discovery configuration from your cluster specifications.
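As an illustration of the first two migration steps, a typical 4.x discovery block looks like this; delete it rather than translating it (real 4.x setting names, example values):

```properties
# Neo4j 4.x - remove entirely; do NOT port these to dbms.cluster.* discovery keys
causal_clustering.discovery_type=K8S
causal_clustering.initial_discovery_members=server-0:5000,server-1:5000,server-2:5000
causal_clustering.minimum_core_cluster_size_at_formation=3
```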