Backup and Restore¶
This comprehensive guide explains how to use the Neo4j Kubernetes Operator to back up and restore your Neo4j Enterprise clusters and standalone instances. The operator provides advanced backup and restore capabilities through Neo4jBackup and Neo4jRestore Custom Resources, supporting multiple storage backends, scheduled backups, point-in-time recovery, and more.
Quick Start (5 minutes)¶
New to backup and restore? Start here for an immediate working backup solution.
Step 1: Create Your First Backup¶
# 1. Create admin credentials (if not already done)
kubectl create secret generic neo4j-admin-secret \
--from-literal=username=neo4j --from-literal=password=admin123
# 2. Apply a simple backup to local PVC storage
kubectl apply -f examples/backup-restore/backup-pvc-simple.yaml
Step 2: Monitor Progress¶
# Watch backup status
kubectl get neo4jbackups simple-backup -w
# Check backup job logs
kubectl logs job/simple-backup-backup
Step 3: What You Just Created¶
- Backup Resource: Backs up your `single-node-cluster` to local PVC storage
- Compression: Automatically compresses backup data
- Verification: Validates backup integrity after creation
- Retention: Keeps the 5 most recent backups
Success Indicator: Status should show Completed with BackupSuccessful condition.
Next Steps by User Type¶
- Teams/Production: Continue to Cloud Storage Authentication → Scheduled Backups
- Developers: Try Database-Specific Backups → Restore Testing
- Enterprise: Jump to Point-in-Time Recovery → Advanced Configuration
Prerequisites¶
- Neo4j Enterprise cluster or standalone running version 5.26.0+ (semver) or 2025.01.0+ (CalVer)
- Kubernetes cluster with the Neo4j Operator installed
- Appropriate storage backend configured (S3, GCS, Azure, or PVC)
- Admin credentials for the Neo4j instance
Supported Deployment Types¶
Both Neo4jEnterpriseCluster and Neo4jEnterpriseStandalone are fully supported as backup and restore targets. The clusterRef field (on restore) and target.name / target.clusterRef fields (on backup) can reference either type — the operator detects the deployment type automatically.
Neo4j Version Requirements¶
Backup and restore require Neo4j Enterprise 5.26.0 or later, or CalVer 2025.01.0+.
Supported Versions:
- Semver: 5.26.0, 5.26.1 (5.26.x is the last semver LTS — no 5.27+ exists)
- CalVer: 2025.01.0, 2025.04.0, 2026.01.0, etc.
- Enterprise tags required: neo4j:5.26.0-enterprise, neo4j:2025.01.0-enterprise
Backup Architecture¶
How Backups Work¶
The operator runs a Kubernetes Job that executes neo4j-admin database backup directly inside the same Neo4j Enterprise image as your cluster. No sidecar containers, no separate tooling images.
End-to-end flow:
1. You create a `Neo4jBackup` resource.
2. The operator creates a Kubernetes Job using the same Neo4j Enterprise image as your cluster.
3. The Job runs `neo4j-admin database backup --from=<server-pod-fqdn>:6362 --to-path=<destination>` against each server pod.
4. For cloud storage destinations (`s3://`, `gs://`, `azb://`), `neo4j-admin` streams data directly to the cloud, so no intermediate local copy is required. For large databases, configure `tempStorage` to provision a PVC for staging (see Temporary Storage for Cloud Operations).
5. The operator updates the `Neo4jBackup` status as the Job progresses.
Backup listen address: The operator automatically configures server.backup.listen_address=0.0.0.0:6362 in each server pod's neo4j.conf, so backup Jobs can reach any server pod without additional manual configuration.
Pod naming: Backup Jobs connect to {cluster-name}-server-0, {cluster-name}-server-1, etc., using their full Kubernetes DNS FQDNs on port 6362.
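As a concrete sketch, the per-pod address and command look roughly like this. The headless-service segment (`my-cluster-internals`) and the `/backups` destination are assumptions for illustration, not the operator's actual names; the `--from=<fqdn>:6362` shape matches the flow described above.

```shell
# Sketch only: build the address a backup Job would dial for server 0.
# The "-internals" service segment is hypothetical; check `kubectl get svc`
# in your namespace for the real headless Service name.
CLUSTER=my-cluster
NAMESPACE=default
FQDN="${CLUSTER}-server-0.${CLUSTER}-internals.${NAMESPACE}.svc.cluster.local"

# Port 6362 comes from server.backup.listen_address, which the operator
# configures automatically in each server pod's neo4j.conf.
CMD="neo4j-admin database backup --from=${FQDN}:6362 --to-path=/backups"
echo "$CMD"
```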
RBAC¶
The operator automatically creates a neo4j-backup-sa ServiceAccount in the same namespace as your backup resource. Backup Jobs run as this service account.
No Role or RoleBinding is created — backup Jobs invoke neo4j-admin directly against the backup port; no Kubernetes API access is needed by the backup process itself.
If you are using Workload Identity (AWS IRSA, GKE Workload Identity, or Azure Workload Identity), you attach annotations to the neo4j-backup-sa ServiceAccount via the cloud.identity.autoCreate.annotations field (see Cloud Storage Authentication).
Backup Types¶
| Type | Description | backupType value |
|---|---|---|
| Full | Complete snapshot of all database files | FULL |
| Differential | Only pages changed since the last full backup | DIFF |
Differential backups are significantly smaller and faster for large databases. The operator selects the correct parent backup automatically (most recent full by default; or most recent differential if preferDiffAsParent: true is set — requires CalVer 2025.04+).
Storage Backends¶
| Backend | Type | Best For |
|---|---|---|
| PVC | `pvc` | Development, testing, air-gapped environments |
| AWS S3 | `s3` | Production on AWS |
| MinIO / S3-compatible | `s3` + `endpointURL` | On-premises, air-gapped, or self-hosted object storage |
| Google Cloud Storage | `gcs` | Production on GCP |
| Azure Blob Storage | `azure` | Production on Azure |
Cloud Storage Authentication¶
Cloud backup Jobs need permission to write to your bucket. The operator supports two authentication paths.
Choosing Your Authentication Method¶
| Scenario | Recommended Method |
|---|---|
| AWS EKS with IRSA configured | Workload Identity (IRSA) |
| GKE with Workload Identity enabled | Workload Identity (GKE WI) |
| AKS with Workload Identity enabled | Workload Identity (Azure WI) |
| Any cloud — quick setup, non-production | Explicit credentials via credentialsSecretRef |
| On-prem Kubernetes (no cloud IAM) | Explicit credentials via credentialsSecretRef |
| Compliance requires no long-lived keys | Workload Identity |
Security recommendation: Prefer Workload Identity in production. Explicit credentials work everywhere and are simpler to set up initially, but rotate them regularly and store them only in Kubernetes Secrets.
AWS S3 Authentication¶
Path 1: Explicit Credentials¶
Create a Kubernetes Secret with your AWS credentials:
kubectl create secret generic aws-backup-creds \
--from-literal=AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE \
--from-literal=AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
--from-literal=AWS_REGION=us-east-1
Reference the Secret in your Neo4jBackup:
storage:
type: s3
bucket: my-neo4j-backups
path: cluster-backups
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
The operator mounts all keys from the Secret as environment variables in the backup Job pod, which neo4j-admin picks up automatically.
Path 2: AWS IRSA (IAM Roles for Service Accounts)¶
Annotate the automatically-created neo4j-backup-sa ServiceAccount with your IAM role ARN:
storage:
type: s3
bucket: my-neo4j-backups
path: cluster-backups
cloud:
provider: aws
identity:
autoCreate:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/neo4j-backup-role
The IAM role must allow these actions on your bucket:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:ListBucket",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::my-neo4j-backups",
"arn:aws:s3:::my-neo4j-backups/*"
]
}
]
}
The IAM role must also have a trust relationship with the EKS OIDC provider for the neo4j-backup-sa ServiceAccount in your namespace.
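The trust policy on that role typically looks like the following. The account ID, OIDC provider URL, and namespace are placeholders; substitute your cluster's values (the OIDC provider ID comes from `aws eks describe-cluster`).

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEOIDCID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEOIDCID:sub": "system:serviceaccount:my-namespace:neo4j-backup-sa"
        }
      }
    }
  ]
}
```

The `sub` condition pins the role to the `neo4j-backup-sa` ServiceAccount in your namespace, so other workloads in the cluster cannot assume it.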
MinIO and S3-Compatible Storage¶
MinIO is a high-performance, S3-compatible object store popular for on-premises and air-gapped Kubernetes environments. The operator supports MinIO (and other S3-compatible stores such as Ceph RGW and Cloudflare R2) using two additional fields on CloudBlock:
| Field | Purpose |
|---|---|
| `endpointURL` | Custom S3 API endpoint (injected as `AWS_ENDPOINT_URL_S3`) |
| `forcePathStyle` | Path-style addressing required by MinIO (injects `-Daws.s3.forcePathStyle=true` into the JVM) |
Step 1: Create the credentials Secret¶
MinIO uses the same secret key names as AWS. The AWS_REGION value is required by the SDK but ignored by MinIO — any value works.
kubectl create secret generic minio-backup-credentials \
--from-literal=AWS_ACCESS_KEY_ID=minioadmin \
--from-literal=AWS_SECRET_ACCESS_KEY=minioadmin \
--from-literal=AWS_REGION=us-east-1
Step 2: Create the backup resource¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: minio-backup
spec:
target:
kind: Cluster
name: my-cluster
storage:
type: s3
bucket: neo4j-backups # bucket must already exist in MinIO
path: cluster/full
cloud:
provider: aws
credentialsSecretRef: minio-backup-credentials
endpointURL: http://minio.minio.svc:9000 # adjust namespace/port
forcePathStyle: true # required for MinIO
options:
backupType: FULL
compress: true
tempStorage:
size: "20Gi"
External MinIO over TLS: Change `endpointURL` to `https://minio.example.com`. Ensure your MinIO TLS certificate is trusted by the container (or use a properly signed cert). Self-signed certs require mounting the CA into the pod; use `additionalArgs` to pass `--ssl-certificate-authorities` if needed.
Verify the backup reached MinIO¶
kubectl run minio-client --rm -it --restart=Never \
--image=minio/mc -- /bin/sh -c "
mc alias set local http://minio.minio.svc:9000 minioadmin minioadmin
mc ls local/neo4j-backups/cluster/"
Troubleshooting MinIO¶
| Symptom | Likely cause | Fix |
|---|---|---|
| `NoSuchBucket` | Bucket not created | `mc mb local/neo4j-backups` |
| `connection refused` | Wrong endpoint URL | Verify `endpointURL` and MinIO pod readiness |
| `SignatureDoesNotMatch` | Wrong credentials | Check secret key values |
| Path-style not working | `forcePathStyle` missing | Confirm `forcePathStyle: true` in spec |
| `SSL handshake failed` | TLS mismatch | Use `http://` for in-cluster; mount CA for self-signed certs |
Full examples with scheduled incremental backups: examples/backup-restore/backup-minio.yaml.
Google Cloud Storage Authentication¶
Path 1: Explicit Credentials (Service Account Key)¶
The key inside the Secret must be named GOOGLE_APPLICATION_CREDENTIALS_JSON:
kubectl create secret generic gcs-backup-creds \
--from-literal=GOOGLE_APPLICATION_CREDENTIALS_JSON="$(cat /path/to/service-account-key.json)"
Reference the Secret in your Neo4jBackup:
storage:
type: gcs
bucket: my-neo4j-backups
path: cluster-backups
cloud:
provider: gcp
credentialsSecretRef: gcs-backup-creds
The operator writes the JSON value to a file inside the backup Job pod and sets GOOGLE_APPLICATION_CREDENTIALS to point to it. neo4j-admin authenticates using the Application Default Credentials chain.
Path 2: GKE Workload Identity¶
Annotate the neo4j-backup-sa ServiceAccount with the GCP service account to impersonate:
storage:
type: gcs
bucket: my-neo4j-backups
path: cluster-backups
cloud:
provider: gcp
identity:
autoCreate:
annotations:
iam.gke.io/gcp-service-account: neo4j-backup@my-project.iam.gserviceaccount.com
You must also bind the Kubernetes ServiceAccount to the GCP service account and grant the GCP SA storage access:
# Allow the Kubernetes SA to impersonate the GCP SA
gcloud iam service-accounts add-iam-policy-binding \
neo4j-backup@my-project.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:my-project.svc.id.goog[my-namespace/neo4j-backup-sa]"
# Grant GCS objectAdmin to the GCP SA
gsutil iam ch \
serviceAccount:neo4j-backup@my-project.iam.gserviceaccount.com:objectAdmin \
gs://my-neo4j-backups
Replace my-namespace with the Kubernetes namespace where your Neo4jBackup resource lives.
Azure Blob Storage Authentication¶
Path 1: Explicit Credentials (Storage Account Key)¶
kubectl create secret generic azure-backup-creds \
--from-literal=AZURE_STORAGE_ACCOUNT=mystorageaccount \
--from-literal=AZURE_STORAGE_KEY=<your-storage-account-key>
Reference the Secret in your Neo4jBackup:
storage:
type: azure
bucket: neo4j-backups # This is the Azure container name
path: cluster-backups
cloud:
provider: azure
credentialsSecretRef: azure-backup-creds
Path 2: Azure Workload Identity¶
Annotate the neo4j-backup-sa ServiceAccount with your Azure client ID:
storage:
type: azure
bucket: neo4j-backups
path: cluster-backups
cloud:
provider: azure
identity:
autoCreate:
annotations:
azure.workload.identity/client-id: <AZURE_CLIENT_ID>
The Azure AD application / managed identity identified by AZURE_CLIENT_ID must have the Storage Blob Data Contributor role on the storage container (or account). Your AKS cluster must have the Azure Workload Identity webhook installed and the federated credential configured for the neo4j-backup-sa ServiceAccount.
Backup Operations¶
One-Time Backup Examples¶
Backup to PVC (Local Storage)¶
The simplest backup option — no cloud credentials needed:
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: simple-backup
spec:
target:
kind: Cluster
name: single-node-cluster
storage:
type: pvc
pvc:
name: backup-storage
size: 50Gi
storageClassName: standard
options:
compress: true
verify: true
retention:
maxCount: 5
Best for: Development, testing, getting started, air-gapped environments.
Cluster Backup to S3 with Explicit Credentials¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: cluster-backup-s3
spec:
target:
kind: Cluster
name: my-neo4j-cluster
storage:
type: s3
bucket: my-backup-bucket
path: neo4j-backups/cluster
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
retention:
maxAge: "30d"
maxCount: 10
Best for: Production AWS environments with static credentials. Swap credentialsSecretRef for identity.autoCreate.annotations to use IRSA instead.
Cluster Backup to S3 with IRSA¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: cluster-backup-s3-irsa
spec:
target:
kind: Cluster
name: my-neo4j-cluster
storage:
type: s3
bucket: my-backup-bucket
path: neo4j-backups/cluster
cloud:
provider: aws
identity:
autoCreate:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/neo4j-backup-role
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
retention:
maxAge: "30d"
maxCount: 10
Best for: Production EKS environments — no long-lived credentials stored in Secrets.
Cluster Backup to GCS with Explicit Credentials¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: cluster-backup-gcs
spec:
target:
kind: Cluster
name: my-neo4j-cluster
storage:
type: gcs
bucket: my-gcs-backup-bucket
path: neo4j-backups/cluster
cloud:
provider: gcp
credentialsSecretRef: gcs-backup-creds
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
retention:
maxAge: "30d"
maxCount: 10
Cluster Backup to Azure with Explicit Credentials¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: cluster-backup-azure
spec:
target:
kind: Cluster
name: production-cluster
storage:
type: azure
bucket: neo4j-backups # Azure container name
path: cluster/production
cloud:
provider: azure
credentialsSecretRef: azure-backup-creds
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
Database Backup Examples¶
To back up a specific database rather than the whole cluster, use kind: Database. clusterRef is required when targeting a specific database — it identifies which cluster or standalone instance owns the database.
Database Backup to GCS¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: database-backup-gcs
spec:
target:
kind: Database
name: myapp-db # The database name
clusterRef: my-cluster # Required: the cluster that owns this database
storage:
type: gcs
bucket: my-gcs-backup-bucket
path: neo4j-backups/myapp
cloud:
provider: gcp
credentialsSecretRef: gcs-backup-creds
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
encryption:
enabled: true
keySecret: backup-encryption-key
algorithm: AES256
Database Backup to S3¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: database-backup-s3
spec:
target:
kind: Database
name: production-db
clusterRef: production-cluster
storage:
type: s3
bucket: my-backup-bucket
path: databases/production-db
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
Differential Backups¶
Differential backups capture only the pages changed since the last full backup, making them faster and smaller:
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: differential-backup
spec:
target:
kind: Cluster
name: my-neo4j-cluster
storage:
type: s3
bucket: my-backup-bucket
path: neo4j-backups/differential
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
options:
backupType: DIFF
compress: true
verify: true
tempStorage:
size: "50Gi"
preferDiffAsParent (CalVer 2025.04+ only)¶
By default, differential backups use the most recent full backup as their parent. On CalVer 2025.04 and later, you can instruct the operator to use the most recent differential backup as the parent instead, creating a chain of incrementally smaller backups:
options:
backupType: DIFF
preferDiffAsParent: true # CalVer 2025.04+ only
tempStorage:
size: "50Gi"
This option is ignored on Neo4j 5.26.x (semver) and CalVer versions before 2025.04.
The tempPath Option¶
For cloud storage destinations, neo4j-admin may use local disk space during streaming. Set tempPath to a dedicated temporary directory to avoid filling the pod's working filesystem.
Strongly recommended for all cloud storage backups. The directory is created automatically if it does not exist.
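A minimal sketch of setting `tempPath` (the mount path is an example; unlike `tempStorage`, you are responsible for making sure a suitably sized volume is mounted at that path in the Job pod):

```yaml
options:
  # Hypothetical path; mount a large-enough volume here yourself,
  # or use tempStorage to have the operator provision one.
  tempPath: /mnt/backup-staging
```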
Scheduled Backup Examples¶
Daily Scheduled Backup to S3¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: daily-backup
spec:
target:
kind: Cluster
name: production-cluster
schedule: "0 2 * * *" # Daily at 2 AM UTC
storage:
type: s3
bucket: production-backups
path: daily
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
retention:
maxAge: "168h"
maxCount: 7
deletePolicy: Delete
Schedule: Daily at 2 AM UTC — adjust as needed. Retention keeps 7 days' worth of backups.
Cloud storage retention note: The `retention` settings above control how many backup records the operator tracks and attempts to clean up in PVC storage. For cloud storage (S3, GCS, Azure), retention cleanup is handled by your bucket's lifecycle rules, not the operator. The operator logs a notice when PVC retention cleanup is skipped for cloud-backed backups. Configure lifecycle rules directly in your cloud provider:
- S3: S3 Lifecycle Rules
- GCS: Object Lifecycle Management
- Azure: Blob Lifecycle Management
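For example, an S3 lifecycle rule that expires objects under the `daily/` prefix after 30 days (apply with `aws s3api put-bucket-lifecycle-configuration`; the prefix and day count are illustrative and should match your backup path and retention policy):

```json
{
  "Rules": [
    {
      "ID": "expire-daily-neo4j-backups",
      "Status": "Enabled",
      "Filter": { "Prefix": "daily/" },
      "Expiration": { "Days": 30 }
    }
  ]
}
```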
Weekly Backup with Long Retention¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: weekly-backup
spec:
target:
kind: Cluster
name: production-cluster
schedule: "0 1 * * 0" # Weekly on Sunday at 1 AM UTC
storage:
type: gcs
bucket: long-term-backups
path: weekly
cloud:
provider: gcp
identity:
autoCreate:
annotations:
iam.gke.io/gcp-service-account: neo4j-backup@my-project.iam.gserviceaccount.com
options:
compress: true
verify: true
tempStorage:
size: "50Gi"
encryption:
enabled: true
keySecret: backup-encryption-key
retention:
maxAge: "90d"
maxCount: 12
deletePolicy: Archive
Best for: Enterprise compliance, long-term archival. Configure GCS Lifecycle Management to expire objects after 90 days.
Suspended Backups¶
Temporarily pause a scheduled backup without deleting the resource:
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jBackup
metadata:
name: maintenance-backup
spec:
target:
kind: Cluster
name: my-cluster
schedule: "0 3 * * *"
suspend: true # Suspends the backup schedule
storage:
type: s3
bucket: backups
path: maintenance
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
Restore Operations¶
How Restore Works¶
When you create a Neo4jRestore resource:
1. The operator creates a Kubernetes Job that runs `neo4j-admin database restore` using the same Neo4j Enterprise image.
2. The Job restores the database files from the specified source location.
3. After the restore Job completes successfully, the operator automatically creates or starts the database via Bolt:
   - If the database does not exist: runs `CREATE DATABASE <dbname>` automatically.
   - If the database exists but is stopped: runs `START DATABASE <dbname>` automatically.
   - No manual post-restore Cypher is required to bring the database online.
Simple Restore Examples¶
Restore from a Backup Reference¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: restore-from-backup
spec:
clusterRef: my-neo4j-cluster
databaseName: neo4j
source:
type: backup
backupRef: daily-backup
options:
verifyBackup: true
replaceExisting: true
force: false
stopCluster: true
After the Job completes, the operator automatically runs START DATABASE neo4j (or CREATE DATABASE neo4j if it was a new database). You do not need to run any Cypher manually.
Best for: Quick recovery from an existing Neo4jBackup resource.
Restore from a Storage Location¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: restore-from-s3
spec:
clusterRef: recovery-cluster
databaseName: myapp-db
source:
type: storage
storage:
type: s3
bucket: backup-bucket
path: neo4j-backups/cluster
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
backupPath: backup-20250104-120000.backup
options:
verifyBackup: true
replaceExisting: true
tempStorage:
size: "50Gi"
force: true
stopCluster: true
Best for: Cross-cluster recovery, disaster recovery from a known backup path.
Restore to a Standalone Instance¶
clusterRef can reference a Neo4jEnterpriseStandalone as well as a Neo4jEnterpriseCluster:
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: restore-to-standalone
spec:
clusterRef: my-standalone # References Neo4jEnterpriseStandalone
databaseName: neo4j
source:
type: backup
backupRef: standalone-backup
options:
verifyBackup: true
replaceExisting: true
stopCluster: true
Point-in-Time Recovery (PITR)¶
PITR restores your database to a specific point in time using a base backup combined with transaction logs.
PITR Configuration¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: pitr-restore
spec:
clusterRef: recovery-cluster
databaseName: production-db
source:
type: pitr
pointInTime: "2025-01-04T12:30:00Z"
pitr:
baseBackup:
type: backup
backupRef: daily-backup
logStorage:
type: s3
bucket: transaction-logs
path: neo4j-logs/production
cloud:
provider: aws
credentialsSecretRef: aws-backup-creds
logRetention: "168h"
recoveryPointObjective: "5m"
validateLogIntegrity: true
compression:
enabled: true
algorithm: gzip
level: 6
encryption:
enabled: true
keySecret: log-encryption-key
algorithm: AES256
options:
verifyBackup: true
replaceExisting: true
force: true
stopCluster: true
timeout: "2h"
Best for: Compliance requirements, precise recovery to a moment before a bad event.
PITR with Storage-Based Base Backup¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: pitr-storage-restore
spec:
clusterRef: disaster-recovery
databaseName: critical-app
source:
type: pitr
pointInTime: "2025-01-04T14:45:30Z"
pitr:
baseBackup:
type: storage
storage:
type: gcs
bucket: base-backups
path: production/base-backup-20250104
cloud:
provider: gcp
credentialsSecretRef: gcs-backup-creds
backupPath: /backup/base-backup-20250104
logStorage:
type: gcs
bucket: transaction-logs
path: production/logs
cloud:
provider: gcp
credentialsSecretRef: gcs-backup-creds
validateLogIntegrity: true
options:
verifyBackup: true
force: true
stopCluster: true
Restore with Hooks¶
Pre- and post-restore hooks execute custom operations at key points in the restore lifecycle.
Restore with Cypher Hooks¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: restore-with-hooks
spec:
clusterRef: my-cluster
databaseName: myapp
source:
type: backup
backupRef: production-backup
options:
verifyBackup: true
replaceExisting: true
preRestore:
cypherStatements:
- "CALL db.checkpoint()"
postRestore:
cypherStatements:
- "CALL db.awaitIndexes()"
- "CALL dbms.security.clearAuthCache()"
force: false
stopCluster: true
Note: The operator still automatically runs CREATE DATABASE or START DATABASE after the restore Job and post-restore hooks complete.
Restore with Job Hooks¶
apiVersion: neo4j.neo4j.com/v1beta1
kind: Neo4jRestore
metadata:
name: restore-with-job-hooks
spec:
clusterRef: staging-cluster
databaseName: app-data
source:
type: backup
backupRef: staging-backup
options:
verifyBackup: true
preRestore:
job:
template:
container:
image: my-registry/data-prep:latest
command: ["/bin/sh"]
args: ["-c", "/scripts/pre-restore.sh"]
env:
- name: CLUSTER_NAME
value: staging-cluster
- name: DATABASE_NAME
value: app-data
timeout: "10m"
postRestore:
job:
template:
container:
image: my-registry/data-validator:latest
command: ["/bin/sh"]
args: ["-c", "/scripts/validate-restore.sh"]
env:
- name: NEO4J_URI
value: "neo4j://staging-cluster:7687"
- name: NEO4J_PASSWORD
valueFrom:
secretKeyRef:
name: staging-admin-secret
key: password
timeout: "15m"
stopCluster: true
Decision Guide: Choose Your Backup Strategy¶
Quick Decision Tree¶
Are you just getting started?
├── YES → PVC backup (beginner)
└── NO ↓
Do you need production-grade cloud durability?
├── YES → Cloud storage (S3 / GCS / Azure)
└── NO → PVC backup is sufficient
Are you on a managed Kubernetes service (EKS, GKE, AKS)?
├── YES → Use Workload Identity (no long-lived credentials)
└── NO → Use explicit credentialsSecretRef
Do you need compliance / precise recovery points?
├── YES → Point-in-Time Recovery (PITR)
└── NO → Regular backup / restore is sufficient
Do you need smaller, faster incremental backups?
├── YES → Differential backups (DIFF)
└── NO → Full backups suffice
Storage Backend Comparison¶
| Factor | PVC | S3 | GCS | Azure |
|---|---|---|---|---|
| Setup Complexity | Simple | Medium | Medium | Medium |
| Cost | Low | Medium | Medium | Medium |
| Durability | Cluster-dependent | 99.999999999% | 99.999999999% | 99.999999999% |
| Multi-region | No | Yes | Yes | Yes |
| Encryption at rest | Optional | Built-in | Built-in | Built-in |
| Best For | Dev/Test | AWS prod | GCP prod | Azure prod |
Backup Frequency Recommendations¶
| Environment | Frequency | Retention | Storage |
|---|---|---|---|
| Development | Manual | 3–5 backups | PVC |
| Staging | Daily | 7 days | Cloud |
| Production | Daily + Weekly | 30d + 90d | Cloud |
| Critical Systems | Daily + PITR | 90d + compliance | Multi-region cloud |
Monitoring Backup and Restore Operations¶
Checking Backup Status¶
# List all backups
kubectl get neo4jbackups
# Get detailed backup status
kubectl describe neo4jbackup daily-backup
# View backup history
kubectl get neo4jbackup daily-backup -o jsonpath='{.status.history}'
# Check backup job logs
kubectl logs job/daily-backup-backup
Checking Restore Status¶
# List all restores
kubectl get neo4jrestores
# Get detailed restore status
kubectl describe neo4jrestore restore-operation
# Check restore job logs
kubectl logs job/restore-operation-restore
# Monitor restore progress
kubectl get neo4jrestore restore-operation -w
Backup and Restore Events¶
# View events for backup operations
kubectl get events --field-selector involvedObject.name=daily-backup
# View events for restore operations
kubectl get events --field-selector involvedObject.name=restore-operation
Best Practices¶
Backup Best Practices¶
- Regular Testing: Regularly run restore procedures in a test environment to verify your backups are usable.
- Multiple Retention Tiers: Use different retention policies for daily vs. weekly backups.
- Encryption: Enable encryption for sensitive data, especially in multi-tenant environments.
- Verification: Always enable `verify: true` to catch corrupted backups early.
- Cross-Region: Store backups in a different region from your cluster for disaster recovery.
- tempPath for cloud: Always set `tempPath` for cloud-destination backups to avoid filling the pod filesystem.
- Lifecycle Rules: Configure bucket lifecycle rules for cloud storage; the operator does not manage cloud object expiry directly.
- Monitoring: Set up alerting on `Neo4jBackup` status conditions.
Restore Best Practices¶
- Stop the cluster: Use `stopCluster: true` to ensure consistency during restore.
- Verify before restoring: Enable `verifyBackup: true`.
- Test in non-production first: Always validate the restore procedure on a non-production cluster before relying on it in an incident.
- Document procedures: Keep a written runbook for common recovery scenarios.
- Automatic database creation: The operator handles `CREATE DATABASE` and `START DATABASE` after restore; no manual Cypher is needed.
Security Best Practices¶
- Prefer Workload Identity: Use IAM roles / Workload Identity in managed Kubernetes environments instead of long-lived static credentials.
- Rotate credentials: Rotate `credentialsSecretRef` Secrets regularly if you use explicit credentials.
- Least-privilege IAM: Grant only the bucket-level permissions actually needed (`PutObject`, `GetObject`, `ListBucket`, `DeleteObject`).
- Network Policies: Restrict egress from backup Job pods to the backup port (6362) and your cloud storage endpoints.
- Encrypt backups: Use backup encryption for sensitive databases, especially in regulated industries.
Advanced Configuration¶
Cloud Storage Authentication (Full Reference)¶
The cloud block inside storage supports the following fields:
storage:
type: s3 | gcs | azure
bucket: <bucket-or-container-name>
path: <prefix-within-bucket>
cloud:
provider: aws | gcp | azure
# Path 1: Explicit credentials
credentialsSecretRef: <secret-name>
# Secret keys by provider:
# AWS: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
# GCP: GOOGLE_APPLICATION_CREDENTIALS_JSON (must be this exact key)
# Azure: AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_KEY
# Path 2: Workload Identity annotations on neo4j-backup-sa
identity:
autoCreate:
annotations:
<annotation-key>: <annotation-value>
# Examples:
# AWS IRSA: eks.amazonaws.com/role-arn: arn:aws:iam::...
# GKE WI: iam.gke.io/gcp-service-account: sa@project.iam.gserviceaccount.com
# Azure WI: azure.workload.identity/client-id: <client-id>
Only one of credentialsSecretRef or identity should be specified at a time.
Temporary Storage for Cloud Operations¶
Cloud backups and restores may need local staging space for neo4j-admin. Without explicit temp storage, staging uses the container's ephemeral disk, which may be too small for large databases.
The tempStorage field tells the operator to create a PVC for staging automatically:
spec:
options:
tempStorage:
size: "50Gi" # should be >= expected backup size
storageClassName: gp3 # optional, uses cluster default if omitted
The operator:
1. Creates a PVC named {backup-name}-temp-staging (or {restore-name}-temp-staging)
2. Sets the CR as owner — the PVC is garbage-collected when the backup/restore CR is deleted
3. Mounts the PVC at /tmp/neo4j-staging in the Job pod
4. Passes --temp-path=/tmp/neo4j-staging to neo4j-admin
If you prefer to manage the PVC yourself, use tempPath instead (points to any path you've mounted via additionalArgs or other means).
Backup Options Reference¶
options:
# Backup type: FULL (default) or DIFF
backupType: FULL
# For DIFF backups on CalVer 2025.04+: use latest diff instead of latest full as parent
preferDiffAsParent: false
# Compress backup data (recommended)
compress: true
# Verify backup integrity after creation
verify: true
# Operator-managed staging PVC for cloud operations (recommended for large databases)
tempStorage:
size: "50Gi"
storageClassName: gp3 # optional
# Manual temp path (alternative to tempStorage — you must mount the volume yourself)
# tempPath: /my/mounted/volume
# Include users/roles metadata in backup (Neo4j 5.26+)
# Values: all (default), none, users, roles
includeMetadata: all
# Multi-threaded transaction application during backup
parallelRecovery: false
# Preserve failed backup artifacts for debugging
keepFailed: false
# Encryption at rest
encryption:
enabled: true
keySecret: backup-encryption-key
algorithm: AES256
# Pass additional flags directly to neo4j-admin database backup
additionalArgs:
- "--verbose"
Custom Backup Arguments¶
Pass flags directly to neo4j-admin database backup using the additionalArgs option.
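A minimal sketch using `additionalArgs` (the flag shown is the one used elsewhere in this guide; any flag accepted by your Neo4j version's `neo4j-admin database backup` can be passed, and values are forwarded verbatim):

```yaml
options:
  additionalArgs:
    - "--verbose"   # passed through unchanged to neo4j-admin database backup
```

Check the neo4j-admin documentation for your exact Neo4j version before adding flags, since available options differ between 5.26.x and CalVer releases.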
Cross-Namespace Operations¶
# Backup a cluster in a different namespace
spec:
target:
kind: Cluster
name: production-cluster
namespace: production
Troubleshooting Quick Reference¶
Quick Fixes¶
| Problem | Quick Check | Solution |
|---|---|---|
| Backup Failed | `kubectl describe neo4jbackup <name>` | Check events and conditions |
| Permission Denied on cloud storage | `kubectl logs job/<backup-name>-backup` | Verify `credentialsSecretRef` or Workload Identity setup |
| Version Error | Check cluster Neo4j version | Ensure 5.26.0+ or 2025.01.0+ |
| Pod filesystem full | Check `df -h` in backup pod | Set `tempPath` to a larger volume or use a PVC |
| Backup job fails with `path does not exist` | Check `tempPath` | Set a valid `tempPath` or ensure the path is auto-created |
| `preferDiffAsParent` has no effect | Check Neo4j version | Requires CalVer 2025.04+ |
| Database not online after restore | Check restore status | Should be automatic; check operator logs for Bolt errors |
Detailed Troubleshooting¶
For comprehensive troubleshooting, diagnostics, and advanced problem-solving, see the Complete Troubleshooting Guide.
Additional Resources¶
API Documentation¶
- Neo4jBackup API Reference — Complete field specifications and options
- Neo4jRestore API Reference — Detailed restore configuration reference
Examples and Templates¶
- Working Examples — Copy-paste ready YAML files
- Getting Started Guide — Deploy your first cluster
- Installation Guide — Install the operator
Advanced Topics¶
- Troubleshooting Guide — Comprehensive problem-solving
- Security Best Practices — Secure your backup operations
- Performance Tuning — Optimize backup/restore performance
Community and Support¶
- GitHub Issues — Report bugs and request features
- Neo4j Community — Get help from the community
- Neo4j Documentation — Official Neo4j documentation