Testing Guide¶
This guide explains the comprehensive testing strategy for the Neo4j Enterprise Operator, covering unit tests, integration tests, and end-to-end testing practices.
Testing Strategy Overview¶
The operator uses a multi-layered testing approach:
- Unit Tests: Fast tests for individual functions and components
- Integration Tests: Full workflow testing with Kubernetes API server
- End-to-End Tests: Real cluster testing with Kind clusters
- Performance Tests: Reconciliation efficiency and resource usage validation
Test Infrastructure¶
Testing Framework¶
- Ginkgo/Gomega: BDD-style testing framework for integration tests
- Envtest: Kubernetes API server for integration testing
- Kind: Kubernetes in Docker for real cluster testing
- Go Testing: Standard Go testing for unit tests
Test Environments¶
- Development: neo4j-operator-dev Kind cluster
- Testing: neo4j-operator-test Kind cluster
- CI/CD: Automated testing in GitHub Actions
Unit Tests¶
Unit tests are fast, require no Kubernetes cluster, and test individual functions and components.
Running Unit Tests¶
# Run all unit tests (no cluster required)
make test-unit
# Run specific package tests
go test ./internal/controller -v
go test ./internal/validation -v
go test ./api/v1beta1 -v
# Run specific test functions
go test ./internal/controller -run TestGetStatefulSetName -v
go test ./internal/validation -run TestTopologyValidator -v
Unit Test Structure¶
Unit tests are located alongside the code they test:
internal/controller/
├── neo4jenterprisecluster_controller.go
├── neo4jenterprisecluster_controller_test.go
├── plugin_controller.go
├── plugin_controller_unit_test.go # Unexported method tests
└── plugin_controller_test.go # Integration-style tests
Writing Unit Tests¶
func TestGetStatefulSetName(t *testing.T) {
    r := &Neo4jPluginReconciler{}

    tests := []struct {
        name       string
        deployment *DeploymentInfo
        expected   string
    }{
        {
            name: "cluster deployment",
            deployment: &DeploymentInfo{
                Type: "cluster",
                Name: "my-cluster",
            },
            expected: "my-cluster-server",
        },
        // Add more test cases...
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            result := r.getStatefulSetName(tt.deployment)
            assert.Equal(t, tt.expected, result)
        })
    }
}
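The same table-driven pattern applies to the validation package (e.g. `TestTopologyValidator` above). A minimal, self-contained sketch, assuming a hypothetical `validateServerCount` rule for illustration — the operator's real rules live in `internal/validation`:

```go
package main

import (
    "errors"
    "fmt"
)

// validateServerCount is a hypothetical stand-in for the operator's
// topology validation logic: it only checks that at least one server
// is requested. The real validator enforces more rules.
func validateServerCount(servers int) error {
    if servers < 1 {
        return errors.New("topology must request at least one server")
    }
    return nil
}

func main() {
    // Table-driven style, mirroring the unit tests shown above.
    tests := []struct {
        name    string
        servers int
        wantErr bool
    }{
        {"valid single server", 1, false},
        {"valid cluster", 3, false},
        {"invalid zero servers", 0, true},
    }
    for _, tt := range tests {
        err := validateServerCount(tt.servers)
        if (err != nil) != tt.wantErr {
            fmt.Printf("%s: unexpected result: %v\n", tt.name, err)
        } else {
            fmt.Printf("%s: ok\n", tt.name)
        }
    }
}
```

In a real `_unit_test.go` file the loop body would call `t.Run` with `assert.Equal`, exactly as in the example above.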
Integration Tests¶
Integration tests use envtest to provide a real Kubernetes API server without requiring a full cluster.
Test Cluster Management¶
# Create test cluster (includes cert-manager for TLS tests)
make test-cluster
# Clean operator resources (keep cluster running)
make test-cluster-clean
# Reset cluster (delete and recreate)
make test-cluster-reset
# Delete test cluster entirely
make test-cluster-delete
# Complete test environment cleanup
make test-destroy
Running Integration Tests¶
# Full integration test suite (automatically creates cluster and deploys operator)
make test-integration
# Alternative: step-by-step approach
make test-cluster # Create test cluster
make test-integration # Run tests (uses existing cluster)
make test-cluster-delete # Clean up cluster
# Run specific test suites
ginkgo run -focus "Neo4jEnterpriseCluster" ./test/integration
ginkgo run -focus "should create backup" ./test/integration
ginkgo run -focus "Plugin Installation" ./test/integration
# CI-optimized test commands (for advanced use)
make test-integration-ci # Assumes cluster and operator already deployed
make test-integration-ci-full # Full suite in CI environment
Integration Test Structure¶
Integration tests are located in test/integration/ and follow consistent patterns:
var _ = Describe("Neo4jPlugin Integration Tests", func() {
    const (
        timeout  = time.Second * 300 // 5-minute timeout for CI
        interval = time.Second * 5
    )

    Context("Plugin Installation on Cluster", func() {
        It("Should install APOC plugin on Neo4jEnterpriseCluster", func() {
            ctx := context.Background()
            namespace := createUniqueNamespace()

            By("Creating namespace")
            Expect(k8sClient.Create(ctx, namespace)).Should(Succeed())

            By("Creating admin secret")
            // Create required secrets...

            By("Creating Neo4jEnterpriseCluster")
            cluster := &neo4jv1beta1.Neo4jEnterpriseCluster{
                ObjectMeta: metav1.ObjectMeta{
                    Name:      "plugin-test-cluster",
                    Namespace: namespace.Name,
                },
                Spec: neo4jv1beta1.Neo4jEnterpriseClusterSpec{
                    Image: neo4jv1beta1.ImageSpec{
                        Repo: "neo4j",
                        Tag:  "5.26.0-enterprise",
                    },
                    Topology: neo4jv1beta1.TopologyConfiguration{
                        Servers: 2,
                    },
                    // Resource constraints for CI compatibility
                    Resources: &corev1.ResourceRequirements{
                        Requests: corev1.ResourceList{
                            corev1.ResourceCPU:    resource.MustParse("100m"),
                            corev1.ResourceMemory: resource.MustParse("1.5Gi"),
                        },
                        Limits: corev1.ResourceList{
                            corev1.ResourceCPU:    resource.MustParse("500m"),
                            corev1.ResourceMemory: resource.MustParse("1.5Gi"),
                        },
                    },
                    Storage: neo4jv1beta1.StorageSpec{
                        Size:      "1Gi",
                        ClassName: "standard",
                    },
                },
            }
            Expect(k8sClient.Create(ctx, cluster)).Should(Succeed())

            By("Waiting for cluster to be ready")
            Eventually(func() string {
                currentCluster := &neo4jv1beta1.Neo4jEnterpriseCluster{}
                err := k8sClient.Get(ctx, types.NamespacedName{
                    Name:      "plugin-test-cluster",
                    Namespace: namespace.Name,
                }, currentCluster)
                if err != nil {
                    return ""
                }
                return currentCluster.Status.Phase
            }, timeout, interval).Should(Equal("Ready"))

            // Continue with plugin testing...
        })
    })
})
Current Architecture Testing (August 2025)¶
Server-Based Architecture Tests¶
Tests verify the new server-based architecture:
By("Verifying server StatefulSet exists with correct name")
serverSts := &appsv1.StatefulSet{}
Eventually(func() error {
    return k8sClient.Get(ctx, types.NamespacedName{
        Name:      clusterName + "-server", // Server-based naming
        Namespace: namespace.Name,
    }, serverSts)
}, timeout, interval).Should(Succeed())
Expect(*serverSts.Spec.Replicas).To(Equal(int32(2)))
Centralized Backup Testing¶
Tests verify centralized backup architecture:
By("Verifying centralized backup StatefulSet")
backupSts := &appsv1.StatefulSet{}
Eventually(func() error {
    return k8sClient.Get(ctx, types.NamespacedName{
        Name:      clusterName + "-backup", // Centralized backup
        Namespace: namespace.Name,
    }, backupSts)
}, timeout, interval).Should(Succeed())
Expect(*backupSts.Spec.Replicas).To(Equal(int32(1))) // Single backup pod
Dual Deployment Support Testing¶
Tests verify both cluster and standalone support:
Context("Plugin Installation on Standalone", func() {
    It("Should install GDS plugin on Neo4jEnterpriseStandalone", func() {
        // Test standalone deployment with plugin installation
        standalone := &neo4jv1beta1.Neo4jEnterpriseStandalone{
            ObjectMeta: metav1.ObjectMeta{
                Name:      standaloneName,
                Namespace: namespace.Name,
            },
            Spec: neo4jv1beta1.Neo4jEnterpriseStandaloneSpec{
                Image: neo4jv1beta1.ImageSpec{
                    Repo: "neo4j",
                    Tag:  "5.26.0-enterprise",
                },
                // Standalone-specific configuration...
            },
        }
        // Test plugin installation on standalone...
    })
})
Test Configuration Guidelines¶
Resource Requirements for CI¶
All integration tests use minimal resources to avoid CI scheduling issues:
resources:
  requests:
    cpu: "100m"      # Minimal CPU for CI compatibility
    memory: "1.5Gi"  # Required for Neo4j Enterprise database operations
  limits:
    cpu: "500m"      # Reasonable limit for testing
    memory: "1.5Gi"  # Neo4j Enterprise minimum for database operations
Storage Configuration¶
storage:
  size: "1Gi"            # Minimal size for testing
  className: "standard"  # Default storage class in Kind
Timeout Configuration¶
const (
    timeout  = time.Second * 300 // 5-minute timeout for CI environments
    interval = time.Second * 5   // Check every 5 seconds
)
Resource Cleanup Patterns¶
Critical Cleanup Requirements¶
Proper resource cleanup is critical to prevent CI failures and resource exhaustion:
1. MANDATORY AfterEach Pattern¶
All integration tests MUST include AfterEach blocks to prevent resource leaks:
AfterEach(func() {
    // Critical: Clean up resources immediately to prevent CI resource exhaustion
    if cluster != nil {
        By("Cleaning up cluster resource")

        // Remove finalizers first
        if len(cluster.GetFinalizers()) > 0 {
            cluster.SetFinalizers([]string{})
            _ = k8sClient.Update(ctx, cluster)
        }

        // Delete the resource
        _ = k8sClient.Delete(ctx, cluster)
        cluster = nil
    }

    // Clean up any remaining resources in namespace
    if testNamespace != "" {
        cleanupCustomResourcesInNamespace(testNamespace)
    }
})
Why This Pattern Is Critical:
- Prevents resource accumulation that causes "Insufficient memory" errors
- Ensures cleanup even if tests fail (inline cleanup won't run on failure)
- Removes finalizers to prevent resources stuck in Terminating state
- Cleans namespace resources that might not have owner references
2. Common Mistakes to Avoid¶
- ❌ No AfterEach block - Causes resource leaks if tests fail
- ❌ Inline cleanup only - Won't execute if test panics or fails
- ❌ Missing namespace cleanup - Leaves behind ConfigMaps, Services, etc.
- ❌ Not removing finalizers - Resources stay in Terminating state
- ❌ Relying on test suite cleanup - Not sufficient for resource-intensive tests
3. Handle All Resource Types¶
Clean up all resources that might have finalizers:
// Neo4j resources
Expect(k8sClient.Delete(ctx, cluster)).Should(Succeed())
Expect(k8sClient.Delete(ctx, standalone)).Should(Succeed())
Expect(k8sClient.Delete(ctx, plugin)).Should(Succeed())
Expect(k8sClient.Delete(ctx, database)).Should(Succeed())
Expect(k8sClient.Delete(ctx, backup)).Should(Succeed())
// Kubernetes resources (usually auto-cleaned by owner references)
// PVCs, Services, StatefulSets are cleaned automatically
4. Use Helper Functions¶
// Helper function to create unique namespace
func createUniqueNamespace() *corev1.Namespace {
    return &corev1.Namespace{
        ObjectMeta: metav1.ObjectMeta{
            Name: fmt.Sprintf("test-%d", time.Now().UnixNano()),
        },
    }
}
Test Suite Cleanup Helpers¶
The integration test suite provides cleanup utilities:
// Clean up all custom resources in namespace
cleanupCustomResourcesInNamespace(namespace)
// Force remove finalizers if needed
forceRemoveFinalizers(resource)
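The finalizer-stripping step these helpers perform can be isolated behind a small interface. A sketch assuming a minimal `finalizable` interface (Kubernetes objects expose these two methods via `ObjectMeta`); the real `forceRemoveFinalizers` would additionally issue an `Update` call against the API server:

```go
package main

import "fmt"

// finalizable captures the two accessor methods the cleanup helpers need.
type finalizable interface {
    GetFinalizers() []string
    SetFinalizers([]string)
}

// fakeObject is a stand-in for a Kubernetes object in this sketch.
type fakeObject struct{ finalizers []string }

func (f *fakeObject) GetFinalizers() []string   { return f.finalizers }
func (f *fakeObject) SetFinalizers(fs []string) { f.finalizers = fs }

// stripFinalizers clears finalizers so deletion is not blocked; it reports
// whether anything was removed (i.e. whether an Update would be needed).
func stripFinalizers(obj finalizable) bool {
    if len(obj.GetFinalizers()) == 0 {
        return false
    }
    obj.SetFinalizers(nil)
    return true
}

func main() {
    obj := &fakeObject{finalizers: []string{"neo4j.com/cleanup"}}
    fmt.Println(stripFinalizers(obj), len(obj.GetFinalizers())) // prints: true 0
}
```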
Testing Best Practices¶
Resource Naming Patterns¶
Test resources should use predictable naming:
// Cluster naming
clusterName := "test-cluster-" + uniqueSuffix

// Expected StatefulSet names (server-based architecture)
expectedServerSts := clusterName + "-server"
expectedBackupSts := clusterName + "-backup"

// Standalone naming
standaloneName := "test-standalone-" + uniqueSuffix
expectedStandaloneSts := standaloneName // No suffix for standalone
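These conventions can be captured in tiny helpers so tests never hand-build names. A self-contained sketch: the `-server`/`-backup` suffixes follow the server-based architecture described above, while the timestamp-based suffix is one assumed way to get collision-free test names:

```go
package main

import (
    "fmt"
    "time"
)

// uniqueSuffix returns a collision-free suffix for test resource names
// (nanosecond timestamps are one common choice; any unique token works).
func uniqueSuffix() string {
    return fmt.Sprintf("%d", time.Now().UnixNano())
}

// serverStatefulSetName follows the server-based architecture naming.
func serverStatefulSetName(clusterName string) string { return clusterName + "-server" }

// backupStatefulSetName follows the centralized-backup naming.
func backupStatefulSetName(clusterName string) string { return clusterName + "-backup" }

func main() {
    clusterName := "test-cluster-" + uniqueSuffix()
    fmt.Println(serverStatefulSetName(clusterName))
    fmt.Println(backupStatefulSetName(clusterName))
}
```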
Memory Requirements¶
Critical for Neo4j Enterprise: Tests must allocate sufficient memory:
Resources: &corev1.ResourceRequirements{
    Requests: corev1.ResourceList{
        corev1.ResourceMemory: resource.MustParse("1.5Gi"), // MINIMUM for Enterprise
    },
    Limits: corev1.ResourceList{
        corev1.ResourceMemory: resource.MustParse("1.5Gi"), // Prevent OOMKill
    },
},
Why 1.5Gi is required:
- Neo4j Enterprise needs minimum memory for database operations
- Lower values cause Out of Memory kills (exit code 137)
- Database creation and topology operations fail with insufficient memory
Waiting Patterns¶
Cluster Readiness (Condition-Based)¶
Eventually(func() string {
    cluster := &neo4jv1beta1.Neo4jEnterpriseCluster{}
    err := k8sClient.Get(ctx, clusterKey, cluster)
    if err != nil {
        return ""
    }
    return cluster.Status.Phase
}, timeout, interval).Should(Equal("Ready"))
Standalone Readiness (Boolean-Based)¶
Eventually(func() bool {
    standalone := &neo4jv1beta1.Neo4jEnterpriseStandalone{}
    err := k8sClient.Get(ctx, standaloneKey, standalone)
    if err != nil {
        return false
    }
    return standalone.Status.Ready
}, timeout, interval).Should(BeTrue())
Neo4j Cluster Formation Verification¶
By("Verifying Neo4j cluster formation")
Eventually(func() error {
// Connect to first server and check cluster status
return exec.Command("kubectl", "exec",
clusterName+"-server-0", "--",
"cypher-shell", "-u", "neo4j", "-p", password,
"SHOW SERVERS").Run()
}, timeout, interval).Should(Succeed())
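Checking only the exit code of `cypher-shell` can pass before every server has joined; parsing the `SHOW SERVERS` output is stricter. A sketch of such a check — the assumption that each healthy server row contains the word "Enabled" (the state column) is about cypher-shell's tabular output and is not a guaranteed contract:

```go
package main

import (
    "fmt"
    "strings"
)

// countEnabledServers counts rows of SHOW SERVERS output whose state column
// reads "Enabled". ASSUMPTION: one server per line, state rendered verbatim.
func countEnabledServers(output string) int {
    count := 0
    for _, line := range strings.Split(output, "\n") {
        if strings.Contains(line, "Enabled") {
            count++
        }
    }
    return count
}

func main() {
    // Hypothetical two-server output.
    sample := `name, address, state, health
"server-0", "host:7687", "Enabled", "Available"
"server-1", "host:7687", "Enabled", "Available"`
    fmt.Println(countEnabledServers(sample)) // prints: 2
}
```

Inside the `Eventually` block above, the test would capture the command's combined output and assert `countEnabledServers(output) == expectedServers` instead of only calling `.Run()`.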
Performance Testing¶
Reconciliation Efficiency Tests¶
It("Should maintain efficient reconciliation rates", func() {
    // Monitor reconciliation frequency
    // Verify <100 reconciliations per minute under normal conditions
})
Resource Usage Tests¶
It("Should use optimal resource patterns", func() {
    // Verify centralized backup uses <30% resources of sidecar approach
    // Check server-based StatefulSet efficiency
})
CI/CD Testing¶
GitHub Actions Integration¶
Tests run automatically in CI with:
- Parallel Execution: Multiple test suites run concurrently
- Resource Constraints: CI-optimized resource limits
- Timeout Handling: Extended timeouts for image pull delays
- Cleanup Automation: Automatic test environment cleanup
CI-Specific Configuration¶
# Environment variables for CI
export CI=true
export KUBEBUILDER_ASSETS="$(pwd)/bin/k8s/1.31.0-linux-amd64"
export KUBECONFIG=~/.kube/config
Troubleshooting Test Failures¶
Common Test Issues¶
1. Namespace Stuck in Terminating¶
Symptoms: Test namespaces remain in "Terminating" state indefinitely
Diagnosis:
# Check for resources with finalizers
kubectl get all,neo4jenterpriseclusters,neo4jenterprisestandalones,neo4jplugins -n <namespace> -o yaml | grep -A5 finalizers
# Check for PVCs
kubectl get pvc -n <namespace>
Solutions:
# Force cleanup test resources
make test-cluster-clean
# Reset test cluster entirely
make test-cluster-reset
# Manual finalizer removal (if needed)
kubectl patch neo4jenterprisecluster <name> -n <namespace> \
-p '{"metadata":{"finalizers":[]}}' --type=merge
2. Out of Memory (OOMKilled) Failures¶
Symptoms: Pods exit with code 137, "OOMKilled" in pod status
Diagnosis:
# Check pod status
kubectl describe pod <pod-name> | grep -E "(OOMKilled|Memory|Exit.*137)"
# Monitor memory usage
kubectl top pod <pod-name> --containers
Solutions:
- Increase memory limits to a minimum of 1.5Gi for Neo4j Enterprise
- Reduce concurrent test execution
- Use minimal storage and CPU allocations
3. Test Timeouts¶
Symptoms: Tests fail with "Timed out after 300s"
Common Causes:
- Image pull delays in CI environments
- Insufficient resources for cluster formation
- Missing RBAC permissions
Solutions:
# Check operator status
kubectl get pods -n neo4j-operator
# Check operator logs
kubectl logs -n neo4j-operator deployment/neo4j-operator-controller-manager
# Verify cert-manager (required for TLS tests)
kubectl get pods -n cert-manager
# Check cluster formation
kubectl get events --sort-by='.firstTimestamp'
4. Ginkgo Test Suite Conflicts¶
Symptoms: "Ginkgo does not support rerunning suites" error
Cause: Multiple RunSpecs() calls in same package
Solution: Ensure only one test suite per package:
// Correct: One RunSpecs per package
func TestControllers(t *testing.T) {
    RegisterFailHandler(Fail)
    RunSpecs(t, "Controller Suite")
}
// Include all tests in the same suite via Describe blocks
Test Coverage and Quality¶
Coverage Targets¶
# Generate coverage report
make test-coverage
# View coverage in browser
go tool cover -html=coverage.out
Coverage Goals¶
- Unit Tests: >80% coverage for controller logic
- Integration Tests: All major workflows covered
- E2E Tests: Critical user journeys verified
Quality Checks¶
Integration tests should verify:
- Resource Creation: All expected Kubernetes resources created
- Status Updates: Proper status conditions and phase transitions
- Error Handling: Graceful handling of failure scenarios
- Resource Cleanup: Proper finalizer handling and cleanup
- Performance: Efficient reconciliation and resource usage
CI Workflow Emulation for Troubleshooting (Added 2025-08-22)¶
When encountering CI failures or testing memory-constrained environments, use the comprehensive CI workflow emulation:
Quick Start¶
# Run the full CI workflow emulation locally
make test-ci-local
What It Does¶
The test-ci-local target provides a complete emulation of the GitHub Actions CI workflow:
1. Environment Setup
   - Sets CI=true and GITHUB_ACTIONS=true environment variables
   - Creates logs/ directory for comprehensive debug output
   - Cleans up any previous test environment
2. Unit Test Phase
   - Runs unit tests with CI environment variables
   - Logs Go version, kubectl version, and environment details
   - Saves output to logs/ci-local-unit.log
3. Integration Test Phase
   - Creates test cluster with CI-appropriate resource constraints
   - Deploys Neo4j operator
   - Runs integration tests with 512Mi memory limits (same as CI)
   - Saves output to logs/ci-local-integration.log
4. Cleanup Phase
   - Complete environment destruction
   - Saves cleanup output to logs/ci-local-cleanup.log
Key Differences from Local Testing¶
| Aspect | Local Development | CI Environment | CI Emulation |
|---|---|---|---|
| Memory Limits | 1.5Gi | 512Mi | 512Mi ✅ |
| Environment Variables | Local defaults | CI=true, GITHUB_ACTIONS=true | CI=true, GITHUB_ACTIONS=true ✅ |
| Resource Constraints | Generous | Limited (~7GB total) | Limited ✅ |
| Debug Logging | Console only | Limited | Comprehensive files ✅ |
| Troubleshooting | Manual | Minimal | Auto-provided commands ✅ |
Debug Output Files¶
Generated debug files provide comprehensive troubleshooting information:
1. logs/ci-local-unit.log
   - Unit test output with environment information
   - Go version and tool versions
   - Complete test execution logs
2. logs/ci-local-integration.log
   - Test cluster creation and operator deployment
   - Integration test execution with CI constraints
   - Resource allocation and memory limit information
3. logs/ci-local-cleanup.log
   - Environment cleanup operations
   - Resource removal confirmation
Automatic Troubleshooting Commands¶
If integration tests fail, the target automatically provides troubleshooting commands:
# Check operator logs
kubectl logs -n neo4j-operator-system deployment/neo4j-operator-controller-manager
# Check pod status
kubectl get pods --all-namespaces
# Check events
kubectl get events --all-namespaces --sort-by='.lastTimestamp'
Usage Scenarios¶
1. Debugging CI Failures
# CI failed with memory issues? Reproduce locally:
make test-ci-local
# Check specific integration logs
cat logs/ci-local-integration.log | grep -E "(OOMKilled|Memory|Insufficient)"
2. Testing Resource Constraints
# Test with CI memory limits before pushing
make test-ci-local
# Verify resource requirements are appropriate
grep -A5 "memory" logs/ci-local-integration.log
3. Validating CI Fixes
# After fixing CI issues, validate locally
make test-ci-local
# Confirm tests pass with CI constraints
echo "Exit code: $?"
Performance Analysis¶
The CI emulation includes performance timing information:
# View test execution timeline
grep "Started at\|Finished at" logs/ci-local-*.log
# Analyze test duration by phase
grep -E "PHASE|✅|❌" logs/ci-local-integration.log
Best Practices¶
- Use Before CI Push: Run make test-ci-local before pushing changes that affect tests
- Review All Logs: Check all three log files for complete understanding
- Memory Optimization: Use findings to optimize resource requirements
- Document Issues: Add findings to troubleshooting guides
Writing New Tests¶
Adding Unit Tests¶
- Create test file alongside source code
- Follow naming conventions: *_test.go for integration, *_unit_test.go for unit tests
- Test unexported methods from within package
- Use table-driven tests for multiple scenarios
Adding Integration Tests¶
- Add to test/integration/ directory
- Use Ginkgo BDD style for readability
- Include proper cleanup with finalizer removal
- Set appropriate timeouts (5 minutes for CI)
- Use minimal resources for CI compatibility
- Test both success and failure scenarios
Test Documentation¶
Document test scenarios:
- Purpose: What functionality is being tested
- Setup: Required resources and configuration
- Expected Results: What should happen in success case
- Cleanup: How resources are cleaned up
- CI Considerations: Any special requirements for CI
This comprehensive testing strategy ensures the Neo4j Enterprise Operator works reliably across different environments and deployment scenarios.