Synology CSI Driver

The Synology CSI (Container Storage Interface) driver lets Kubernetes dynamically provision persistent volumes on Synology NAS devices over the iSCSI protocol.

Overview

  • Namespace: synology-csi
  • Protocol: iSCSI
  • NAS Device: Synology DS925+ at 10.0.1.204
  • Deployment: Managed by ArgoCD
  • Sync Wave: -30 (deploys after networking, before applications)

Purpose

The Synology CSI driver provides:

  • Dynamic volume provisioning
  • Persistent storage for stateful applications
  • Volume expansion capabilities
  • Snapshot support
  • Integration with Synology DSM storage management

Storage Classes

The cluster provides four storage classes with different characteristics:

synology-iscsi-retain (Default)

Use Case: Production data that should be preserved

storageClassName: synology-iscsi-retain

Configuration:

  • Location: /volume2 (HDD storage pool)
  • Filesystem: btrfs
  • Reclaim Policy: Retain (PV kept after PVC deletion)
  • Volume Expansion: Enabled
  • Default: ✅ Yes

When to Use:

  • Database persistent volumes
  • Application state that must be preserved
  • Any critical data requiring manual cleanup

synology-iscsi-delete

Use Case: Temporary or development data

storageClassName: synology-iscsi-delete

Configuration:

  • Location: /volume2 (HDD storage pool)
  • Filesystem: btrfs
  • Reclaim Policy: Delete (PV auto-deleted with PVC)
  • Volume Expansion: Enabled
  • Default: ❌ No

When to Use:

  • Development environments
  • Cache storage
  • Temporary data that can be recreated
  • Testing and experimentation

synology-iscsi-retain-ssd

Use Case: High-performance production storage

storageClassName: synology-iscsi-retain-ssd

Configuration:

  • Location: /volume4 (SSD storage pool)
  • Filesystem: btrfs
  • Reclaim Policy: Retain
  • Volume Expansion: Enabled
  • Default: ❌ No

When to Use:

  • High-IOPS database workloads
  • Performance-critical applications
  • Frequently accessed data
  • Low-latency requirements

synology-iscsi-delete-ssd

Use Case: High-performance temporary storage

storageClassName: synology-iscsi-delete-ssd

Configuration:

  • Location: /volume4 (SSD storage pool)
  • Filesystem: btrfs
  • Reclaim Policy: Delete
  • Volume Expansion: Enabled
  • Default: ❌ No

When to Use:

  • High-performance cache layers
  • Temporary high-speed storage
  • Performance testing
  • Build caches
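Behind each of these classes is an ordinary StorageClass object. A minimal sketch of what the default class might look like, assuming the `location`/`fsType` parameter names and the `csi.san.synology.com` provisioner name used by the upstream synology-csi project (verify against manifests/base/synology-csi/storage-class.yml):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: synology-iscsi-retain
  annotations:
    # Marks this class as the cluster default
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.san.synology.com
parameters:
  location: /volume2   # HDD storage pool on the NAS
  fsType: btrfs
reclaimPolicy: Retain        # PV kept after PVC deletion
allowVolumeExpansion: true   # permits growing PVCs in place
```

The other three classes differ only in `location` (/volume4 for SSD) and `reclaimPolicy` (Delete).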

Architecture

Components

CSI Controller:

  • Handles volume provisioning and deletion
  • Manages snapshots and cloning
  • Communicates with Synology DSM API
  • Runs as a Deployment (1 replica)
  • Containers: csi-provisioner, csi-attacher, csi-resizer, csi-plugin (all with resource limits)

CSI Node Driver:

  • Runs on every Kubernetes node (DaemonSet)
  • Mounts iSCSI volumes to pods
  • Handles volume attach/detach operations
  • Manages local mount points
  • Containers: csi-driver-registrar, csi-plugin (both with resource limits)

Snapshotter:

  • Enables volume snapshots via snapshot-controller
  • Creates point-in-time copies
  • Supports snapshot-based backups
  • Containers: csi-snapshotter, csi-plugin, snapshot-controller (all with resource limits)

Resource Limits (2026-02-14, updated 2026-02-27)

Resource limits were added to all containers across the controller, node, and snapshotter components (PR #451) to comply with Gatekeeper's require-resource-limits policy. The snapshot-controller uses a Kustomize strategic merge patch since it's sourced from a remote GitHub reference.

Important: The patch must use namespace: kube-system (the upstream resource's original namespace), not the kustomization's namespace: synology-csi override. Kustomize resolves patches before applying namespace transformation (PR #478).
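A sketch of how such a patch might be wired into the kustomization (file names and the remote reference are illustrative, not the repository's actual contents):

```yaml
# kustomization.yaml (excerpt, illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: synology-csi          # namespace override applied after patching
resources:
  - https://github.com/kubernetes-csi/external-snapshotter//deploy  # remote ref (illustrative path)
patches:
  - path: snapshot-controller-limits.yaml   # illustrative file name
    target:
      kind: Deployment
      name: snapshot-controller
      namespace: kube-system     # must match the UPSTREAM namespace, not the override
```

Because Kustomize matches patch targets against resources as they arrive from the remote source, targeting `synology-csi` here would silently match nothing.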

Version History

Current Versions (2026-01-12)

Component                   Version   Notes
synology-csi                v1.2.1    Requires iscsiadm-path configuration
csi-attacher                v4.10.0   Upgraded 2026-01-07
csi-node-driver-registrar   v2.15.0   Upgraded 2026-01-07
csi-provisioner             v6.1.0    Latest stable
csi-resizer                 v2.0.0    Latest stable
csi-snapshotter             v8.4.0    Upgraded 2026-01-11
snapshot-controller         v8.2.1    Upgraded 2026-01-11

Upgrade Notes

v1.2.1 Node Plugin Configuration (2026-01-12)

Synology CSI v1.2.1 changed how it locates the iscsiadm binary for Talos Linux compatibility. Without explicit configuration, new PVC mounts fail.

Required Configuration:

The node plugin must include these arguments:

args:
  - --nodeid=$(KUBE_NODE_NAME)
  - --endpoint=$(CSI_ENDPOINT)
  - --client-info
  - /etc/synology/client-info.yml
  - --log-level=info
  - --chroot-dir=/host                   # Required for v1.2.1+
  - --iscsiadm-path=/usr/sbin/iscsiadm   # Path on host filesystem

Snapshot-Controller v8.x Upgrade (2026-01-11)

The snapshot-controller was upgraded from v6.3.1 to v8.2.1. This requires updated RBAC permissions.

RBAC Requirements for v8.x:

The CSI snapshotter ClusterRole must include patch verb for volumesnapshotcontents:

rules:
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents"]
    verbs: ["get", "list", "watch", "update", "patch"]   # patch added
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents/status"]
    verbs: ["update", "patch"]                           # patch added
  - apiGroups: ["groupsnapshot.storage.k8s.io"]          # New API group
    resources: ["volumegroupsnapshotcontents", "volumegroupsnapshotclasses"]
    verbs: ["get", "list", "watch", "update", "patch"]

Network Configuration

iSCSI Connection

  • NAS IP: 10.0.1.204
  • Protocol: iSCSI (TCP port 3260)
  • Authentication: CHAP (credentials in secret)
  • Network: Direct connection via cluster network
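The driver reads its DSM connection details from client-info.yml (delivered in this cluster via SealedSecret). The upstream file format looks roughly like the following; all values here are placeholders:

```yaml
# client-info.yml (placeholder values, format per the upstream synology-csi project)
clients:
  - host: 10.0.1.204     # DSM address
    port: 5000           # DSM management port (5001 when https: true)
    https: false
    username: <dsm-user>      # DSM account with storage admin rights
    password: <dsm-password>
```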

Requirements

Kubernetes Nodes:

  • open-iscsi package installed
  • iscsid service running
  • iSCSI initiator configured

Verify on nodes:

# Check iscsid service
sudo systemctl status iscsid

# Verify iSCSI tools installed
which iscsiadm

# List iSCSI sessions
sudo iscsiadm -m session

Deployment Configuration

Application Manifest

Location: manifests/applications/synology-csi.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: synology-csi
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "-30"
spec:
  project: infrastructure
  source:
    path: manifests/base/synology-csi
    repoURL: git@github.com:imcbeth/homelab.git
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: synology-csi
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Base Manifests

Location: manifests/base/synology-csi/

Files:

  • namespace.yml - Namespace definition
  • controller.yml - CSI controller deployment
  • node.yml - CSI node DaemonSet
  • csi-driver.yml - CSIDriver resource
  • storage-class.yml - Four storage class definitions
  • configs/ - ConfigMaps and Secrets
  • snapshotter/ - Snapshot controller (optional)
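Tying the files together, the kustomization for this directory might look like the following (a sketch; the actual file in the repository may differ):

```yaml
# manifests/base/synology-csi/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: synology-csi
resources:
  - namespace.yml
  - controller.yml
  - node.yml
  - csi-driver.yml
  - storage-class.yml
```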

Using Persistent Volumes

Creating a PVC

Example with default storage class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # storageClassName not specified = uses default (synology-iscsi-retain)

Example with specific storage class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: synology-iscsi-retain-ssd   # Use SSD storage
  resources:
    requests:
      storage: 50Gi

Using PVC in Pod

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-app-data

Access Modes

ReadWriteOnce (RWO):

  • Volume can be mounted read-write by a single node
  • Most common for databases and stateful apps
  • Supported by iSCSI

ReadOnlyMany (ROX):

  • Volume can be mounted read-only by many nodes
  • Supported by iSCSI

ReadWriteMany (RWX):

  • Volume can be mounted read-write by many nodes
  • NOT supported by iSCSI (use NFS for RWX)
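As an illustration, a claim like the following would stay Pending indefinitely, because no iSCSI-backed class in this cluster can satisfy ReadWriteMany (the claim name is hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data   # hypothetical
spec:
  accessModes:
    - ReadWriteMany   # not supported by iSCSI; provisioning will fail
  storageClassName: synology-iscsi-delete
  resources:
    requests:
      storage: 5Gi
```

Workloads that genuinely need shared read-write access should use an NFS-backed class instead.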

Volume Operations

Expanding a Volume

Volumes can be expanded by editing the PVC:

# Edit PVC to increase size
kubectl edit pvc my-app-data

# Change storage request:
spec:
  resources:
    requests:
      storage: 20Gi   # Increased from 10Gi

Notes:

  • Volume can only be expanded, not shrunk
  • Pod may need restart to recognize new size
  • Filesystem will be automatically resized

Viewing Volumes

# List PVCs
kubectl get pvc -A

# List PVs
kubectl get pv

# Describe PVC
kubectl describe pvc my-app-data

# Check volume details
kubectl get pv <pv-name> -o yaml

Deleting Volumes

With Retain policy:

# Delete PVC (PV remains)
kubectl delete pvc my-app-data

# PV status changes to Released
kubectl get pv

# Manually delete PV when ready
kubectl delete pv <pv-name>

# Clean up iSCSI LUN on Synology DSM

With Delete policy:

# Delete PVC (PV and iSCSI LUN auto-deleted)
kubectl delete pvc my-app-data

Volume Snapshots

Creating a Snapshot

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-app-snapshot
  namespace: default
spec:
  volumeSnapshotClassName: synology-snapshot-class
  source:
    persistentVolumeClaimName: my-app-data
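The volumeSnapshotClassName above must refer to an existing VolumeSnapshotClass. A minimal sketch of one, assuming the `csi.san.synology.com` driver name registered by the upstream synology-csi project (it should match the CSIDriver resource in manifests/base/synology-csi/csi-driver.yml):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: synology-snapshot-class
driver: csi.san.synology.com
deletionPolicy: Delete   # snapshot content removed on NAS when the VolumeSnapshot is deleted
```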

Restoring from Snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data-restored
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: synology-iscsi-retain
  dataSource:
    name: my-app-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  resources:
    requests:
      storage: 10Gi

Monitoring Storage

Check CSI Driver Status

# CSI controller pod
kubectl get pods -n synology-csi | grep controller

# CSI node pods (should be one per node)
kubectl get pods -n synology-csi | grep node

# View controller logs
kubectl logs -n synology-csi deployment/synology-csi-controller -c csi-provisioner

# View node driver logs
kubectl logs -n synology-csi daemonset/synology-csi-node -c csi-plugin

Storage Capacity

# Total PV capacity
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,STORAGECLASS:.spec.storageClassName

# PVC usage (kubectl top does not support PVCs; use kubelet volume stats
# scraped by Prometheus, e.g. kubelet_volume_stats_used_bytes)

Synology DSM

Web UI: https://10.0.1.204:5001

Storage Manager:

  • View iSCSI LUNs
  • Monitor storage pool capacity
  • Check disk health
  • Review snapshot usage

Troubleshooting

PVC Stuck in Pending

Check PVC events:

kubectl describe pvc <pvc-name>

Common causes:

  • CSI controller not running
  • Synology NAS unreachable
  • Storage pool out of space
  • Authentication failure
  • Invalid storage class

Verify CSI pods:

kubectl get pods -n synology-csi

Volume Mount Failures

Check pod events:

kubectl describe pod <pod-name>

Common causes:

  • iSCSI initiator not running on node
  • Network connectivity to NAS
  • Volume already attached to another node
  • Filesystem corruption

Verify iSCSI sessions on node:

# SSH to the node
sudo iscsiadm -m session

fsGroup Race Condition with Transient Files

Error message:

MountVolume.SetUp failed for volume "pvc-xxx" : applyFSGroup failed for vol xxx:
lstat /var/lib/kubelet/pods/.../grafana.db-journal: no such file or directory

Cause: Kubernetes applies fsGroup ownership recursively to all files in a mounted volume. If a transient file (like SQLite's .db-journal) is deleted between directory listing and the lstat call, the mount fails.

Solution: Add fsGroupChangePolicy: OnRootMismatch to the pod's securityContext:

spec:
  securityContext:
    fsGroup: 472
    fsGroupChangePolicy: OnRootMismatch   # Only apply fsGroup at root

This tells Kubernetes to skip recursive ownership changes unless the root directory ownership is incorrect, avoiding race conditions with transient files.

Affected workloads:

  • Grafana (SQLite database with journal files)
  • Any application using SQLite or similar databases with transient files
  • Applications that create/delete files during startup

iscsiadm "No such file or directory" Error (v1.2.1)

Error message:

MountVolume.SetUp failed: env: can't execute 'iscsiadm': No such file or directory (exit status 127)

Cause: Synology CSI v1.2.1 changed how it locates the iscsiadm binary. Without explicit configuration, the container cannot find the host's iscsiadm.

Solution: Add these arguments to the CSI node plugin container:

args:
  - --chroot-dir=/host
  - --iscsiadm-path=/usr/sbin/iscsiadm

Verify fix:

# Check node plugin args
kubectl get daemonset synology-csi-node -n synology-csi \
-o jsonpath='{.spec.template.spec.containers[?(@.name=="csi-plugin")].args}'

# Test PVC mount by restarting a pod
kubectl delete pod <pod-with-pvc>

Reference: See Version History for full configuration details.

VolumeSnapshot Stuck with Finalizers

Symptoms: VolumeSnapshot shows READYTOUSE: false and cannot be deleted.

Solution:

# Remove finalizers to allow deletion
kubectl patch volumesnapshot -n <namespace> <snapshot-name> \
-p '{"metadata":{"finalizers":null}}' --type=merge

# Also patch the VolumeSnapshotContent if needed
kubectl patch volumesnapshotcontent <content-name> \
-p '{"metadata":{"finalizers":null}}' --type=merge

Volume Not Expanding

Check PVC status:

kubectl describe pvc <pvc-name>

Steps:

  1. Verify allowVolumeExpansion: true in storage class
  2. Check CSI controller logs
  3. Restart pod using the PVC
  4. Verify filesystem resized inside container

CSI Driver Not Working

Check node prerequisites:

# On each Kubernetes node
sudo systemctl status iscsid
sudo systemctl status open-iscsi
which iscsiadm

Restart CSI pods:

kubectl rollout restart deployment/synology-csi-controller -n synology-csi
kubectl rollout restart daemonset/synology-csi-node -n synology-csi

Current Usage

Critical Volumes

Prometheus Metrics Storage:

  • PVC: prometheus-kube-prometheus-stack-prometheus-db-...
  • Size: 50Gi
  • Class: synology-iscsi-retain
  • Purpose: Long-term metrics retention
  • Status: CRITICAL - do not delete

Viewing All PVCs

kubectl get pvc -A --sort-by=.spec.resources.requests.storage

Performance Considerations

Raspberry Pi Cluster

Network:

  • Gigabit Ethernet on all 5 Pi nodes
  • iSCSI over standard network
  • Typical throughput: 100-300 MB/s
  • Latency: 1-5ms

Storage Pools:

  • /volume2 (HDD): Higher capacity, lower IOPS
  • /volume4 (SSD): Lower capacity, higher IOPS

Optimization:

  • Use SSD storage class for databases
  • Use HDD storage class for bulk data
  • Enable btrfs compression for better efficiency
  • Monitor NAS network utilization

Security

Authentication

  • CHAP authentication for iSCSI
  • Credentials managed via SealedSecret (manifests/base/synology-csi/client-info-sealed.yaml)
  • SealedSecrets safely stored in Git, decrypted at runtime by Sealed Secrets controller

See Secrets Management for details on managing SealedSecrets.

Network Security

  • iSCSI traffic on trusted cluster network
  • No exposure to external networks
  • NAS firewall restricts access to cluster nodes

Access Control

  • CSI driver has specific RBAC permissions
  • Service accounts scoped to synology-csi namespace
  • No privileged access outside of storage operations

Backup Strategy

Volume Snapshots

  • Use VolumeSnapshot CRD for point-in-time copies
  • Snapshots stored on Synology NAS
  • Minimal space usage with btrfs CoW

Synology Snapshots

  • Additional snapshot layer in DSM
  • Scheduled snapshots via Snapshot Replication
  • Protects against accidental deletion

Offsite Backup

  • Synology Hyper Backup for offsite replication
  • Critical PVs should be backed up regularly
  • Test restore procedures periodically
