Backup Disk Space Management Guide

This guide covers disk space management for Neo4j backups in Kubernetes environments.

Overview

Neo4j backups can consume significant disk space, especially in production environments with:
- Large databases
- Frequent backup schedules
- Multiple backup types (FULL, DIFF, AUTO)
- Long retention policies

Note: Commands referencing backup-sidecar apply to standalone deployments. For clusters, use the centralized {cluster}-backup-0 pod (container backup) and the /backups mount.

Automatic Cleanup

Backup Sidecar Retention

Standalone deployments use a backup sidecar that manages disk space with configurable retention policies. Cluster backups run in a centralized {cluster}-backup pod and keep the most recent backups by default.

apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jEnterpriseStandalone
metadata:
  name: production-standalone
spec:
  # ... other configuration ...
  env:
    - name: BACKUP_RETENTION_DAYS
      value: "14"  # Keep backups for 14 days
    - name: BACKUP_RETENTION_COUNT
      value: "20"  # Keep maximum 20 backups

Default retention settings:
- BACKUP_RETENTION_DAYS: 7 days
- BACKUP_RETENTION_COUNT: 10 backups

The sidecar automatically:
1. Removes backups older than BACKUP_RETENTION_DAYS
2. Keeps only the most recent BACKUP_RETENTION_COUNT backups
3. Runs cleanup before and after each backup
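The two retention rules can be approximated in a few lines of shell. This is a minimal sketch, not the sidecar's actual implementation: prune_backups is a hypothetical helper, and it assumes each backup is a single top-level entry in the directory whose modification time reflects when the backup was taken.

```shell
# prune_backups DIR DAYS COUNT  (hypothetical helper, for illustration only)
prune_backups() {
  local dir="$1" days="$2" count="$3"

  # Rule 1: remove top-level entries older than DAYS days.
  # -mindepth 1 keeps DIR itself out of the match set.
  find "$dir" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf {} +

  # Rule 2: of what remains, keep only the COUNT newest entries (by mtime).
  # Assumes backup names contain no newlines.
  ls -1t "$dir" | tail -n +"$((count + 1))" | while IFS= read -r name; do
    rm -rf -- "${dir:?}/$name"
  done
}
```

Under these assumptions, prune_backups /data/backups 7 10 would mirror the default settings.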

Manual Cleanup

Using the Cleanup Script

For test environments or emergency cleanup:

# Run the cleanup script
./hack/cleanup-test-resources.sh

# What it does:
# - Removes completed jobs older than 1 hour
# - Deletes failed and evicted pods
# - Identifies orphaned PVCs
# - Shows disk usage by namespace
# - Cleans Docker system (for Kind clusters)

Manual Commands

Check disk usage:

# List PVs with their provisioned capacity and bound claim
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,CLAIM:.spec.claimRef.name

# Check node resource allocation, including ephemeral-storage
kubectl describe nodes | grep -A8 "Allocated resources:"

# Check specific PVC usage
kubectl exec <neo4j-pod> -- df -h /data

Clean up old backups manually:

# Delete backups older than 7 days
# (-mindepth 1 prevents find from matching /data/backups itself)
kubectl exec <neo4j-pod> -c backup-sidecar -- \
  find /data/backups -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;

# Keep only 5 most recent backups
kubectl exec <neo4j-pod> -c backup-sidecar -- bash -c \
  'cd /data/backups && ls -t | tail -n +6 | xargs -r rm -rf'

Best Practices

1. Storage Sizing

Calculate required storage:

Required Storage = Database Size × Backup Compression Ratio × Number of Retained Backups × Safety Factor

Example:
- Database Size: 100GB
- Compression Ratio: 0.3 (70% compression)
- Retained Backups: 10
- Safety Factor: 1.5
- Required: 100GB × 0.3 × 10 × 1.5 = 450GB
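The sizing formula is easy to script when comparing retention settings. A minimal sketch, assuming awk is available; required_storage_gb is a hypothetical helper name, and the result is rounded to the nearest whole gigabyte.

```shell
# required_storage_gb DB_SIZE_GB COMPRESSION_RATIO RETAINED_BACKUPS SAFETY_FACTOR
# (hypothetical helper) Applies the sizing formula and prints the result in GB.
required_storage_gb() {
  awk -v size="$1" -v ratio="$2" -v n="$3" -v safety="$4" \
    'BEGIN { printf "%.0f\n", size * ratio * n * safety }'
}

# The worked example above: 100GB database, 0.3 ratio, 10 backups, 1.5 safety factor
required_storage_gb 100 0.3 10 1.5   # prints 450
```

Like the formula itself, this treats every retained backup as a full backup, so it gives an upper bound when differential backups are in the mix.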

2. Backup Strategy

Optimize backup types:

# Daily full backups with short retention
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
  name: daily-full
spec:
  schedule: "0 2 * * *"  # 2 AM daily
  options:
    backupType: FULL
    compress: true
  retention:
    maxAge: "3d"
    maxCount: 3

---
# Hourly differential backups
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
  name: hourly-diff
spec:
  schedule: "0 * * * *"  # Every hour
  options:
    backupType: DIFF
    compress: true
  retention:
    maxAge: "1d"
    maxCount: 24

3. Monitoring

Set up alerts for disk usage:

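# Prometheus alert example (node_exporter metrics)
# Adjust mountpoint to the path where the backup volume appears on the node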
groups:
- name: neo4j-backups
  rules:
  - alert: BackupDiskSpaceHigh
    expr: |
      (1 - (node_filesystem_avail_bytes{mountpoint="/data"} /
      node_filesystem_size_bytes{mountpoint="/data"})) > 0.8
    for: 10m
    annotations:
      summary: "Backup disk usage above 80%"

4. External Storage

For production, consider external storage:

S3 Storage

apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
  name: s3-backup
spec:
  storage:
    type: s3
    bucket: my-neo4j-backups
    path: production/cluster-1
  retention:
    maxAge: "30d"  # S3 lifecycle policies handle cleanup

PVC with StorageClass

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd  # Use appropriate storage class
  resources:
    requests:
      storage: 500Gi

Troubleshooting

Disk Full Errors

Symptoms:

java.io.IOException: No space left on device

Quick fixes:
1. Run the cleanup script: ./hack/cleanup-test-resources.sh
2. Delete old backups (cluster): kubectl exec <cluster>-backup-0 -c backup -- sh -c 'rm -rf /backups/old-*'
3. Delete old backups (standalone): kubectl exec <standalone-pod> -c backup-sidecar -- sh -c 'rm -rf /data/backups/old-*'
4. Increase the PVC size (if the storage class supports expansion)

The sh -c wrapper in steps 2 and 3 makes the old-* glob expand inside the container rather than in your local shell.
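Increasing the PVC size only works when the StorageClass has allowVolumeExpansion set to true. The sketch below builds the patch body for a resize; pvc_resize_patch is a hypothetical helper, and the names fast-ssd and backup-pvc are taken from the PVC example earlier in this guide, so substitute your own.

```shell
# pvc_resize_patch SIZE  (hypothetical helper)
# Prints the JSON patch body that requests a new size for a PVC.
pvc_resize_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}\n' "$1"
}

# Usage against a live cluster (shown as comments so the sketch runs offline):
#   kubectl get storageclass fast-ssd -o jsonpath='{.allowVolumeExpansion}'
#   kubectl patch pvc backup-pvc -p "$(pvc_resize_patch 750Gi)"
```

Kubernetes does not support shrinking a PVC, so choose the new size with the sizing formula in mind.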

Prevention

  1. Set appropriate retention policies

    env:
    - name: BACKUP_RETENTION_DAYS
      value: "3"  # Shorter for test environments
    - name: BACKUP_RETENTION_COUNT
      value: "5"  # Fewer backups for test
    

  2. Use compressed backups

    options:
      compress: true  # Reduces backup size by 60-80%
    

  3. Monitor disk usage proactively

    # Add to monitoring scripts
    kubectl exec <pod> -- df -h /data | awk '$5+0 > 80 {print "WARNING: " $0}'
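For cron jobs or CI checks, it can help if the same test also sets an exit status. A minimal sketch, assuming POSIX-format df output (df -P) on stdin; check_usage is a hypothetical helper name.

```shell
# check_usage THRESHOLD  (hypothetical helper)
# Reads `df -P` output on stdin, prints a warning for every filesystem whose
# usage percentage exceeds THRESHOLD, and returns 1 if any did.
check_usage() {
  awk -v limit="$1" '
    NR > 1 && $5 + 0 > limit { print "WARNING: " $0; over = 1 }
    END { exit over + 0 }
  '
}

# Usage (comment only; requires a running pod):
#   kubectl exec <pod> -- df -P /data | check_usage 80 || echo "disk space alert"
```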
    

Summary

Effective disk space management requires:
- Automatic cleanup via sidecar retention policies
- Regular monitoring of disk usage
- Appropriate backup strategies (FULL vs DIFF)
- External storage for production environments
- Proactive cleanup in test environments

The backup sidecar's built-in cleanup functionality handles most scenarios automatically, but manual intervention may be needed for test environments or exceptional situations.