Synology CSI Driver
The Synology CSI (Container Storage Interface) driver lets Kubernetes dynamically provision persistent volumes on Synology NAS devices over the iSCSI protocol.
Overview
- Namespace: synology-csi
- Protocol: iSCSI
- NAS Device: Synology DS925+ at 10.0.1.204
- Deployment: Managed by ArgoCD
- Sync Wave: -30 (deploys after networking, before applications)
Purpose
The Synology CSI driver provides:
- Dynamic volume provisioning
- Persistent storage for stateful applications
- Volume expansion capabilities
- Snapshot support
- Integration with Synology DSM storage management
Storage Classes
The cluster provides four storage classes with different characteristics:
synology-iscsi-retain (Default)
Use Case: Production data that should be preserved
storageClassName: synology-iscsi-retain
Configuration:
- Location: /volume2 (HDD storage pool)
- Filesystem: btrfs
- Reclaim Policy: Retain (PV kept after PVC deletion)
- Volume Expansion: Enabled
- Default: ✅ Yes
When to Use:
- Database persistent volumes
- Application state that must be preserved
- Any critical data requiring manual cleanup
synology-iscsi-delete
Use Case: Temporary or development data
storageClassName: synology-iscsi-delete
Configuration:
- Location: /volume2 (HDD storage pool)
- Filesystem: btrfs
- Reclaim Policy: Delete (PV auto-deleted with PVC)
- Volume Expansion: Enabled
- Default: ❌ No
When to Use:
- Development environments
- Cache storage
- Temporary data that can be recreated
- Testing and experimentation
synology-iscsi-retain-ssd
Use Case: High-performance production storage
storageClassName: synology-iscsi-retain-ssd
Configuration:
- Location: /volume4 (SSD storage pool)
- Filesystem: btrfs
- Reclaim Policy: Retain
- Volume Expansion: Enabled
- Default: ❌ No
When to Use:
- High-IOPS database workloads
- Performance-critical applications
- Frequently accessed data
- Low-latency requirements
synology-iscsi-delete-ssd
Use Case: High-performance temporary storage
storageClassName: synology-iscsi-delete-ssd
Configuration:
- Location: /volume4 (SSD storage pool)
- Filesystem: btrfs
- Reclaim Policy: Delete
- Volume Expansion: Enabled
- Default: ❌ No
When to Use:
- High-performance cache layers
- Temporary high-speed storage
- Performance testing
- Build caches
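As a concrete sketch, a class like the default one above would be defined roughly as follows. The provisioner name csi.san.synology.com matches the upstream Synology CSI driver, but the parameter names and values here are assumptions — verify them against the manifests actually deployed in this cluster:

```yaml
# Hypothetical sketch of the default storage class; parameter names are
# assumptions based on the upstream synology-csi driver's conventions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: synology-iscsi-retain
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # cluster default
provisioner: csi.san.synology.com
parameters:
  fsType: btrfs        # matches the btrfs filesystem noted above
  location: /volume2   # HDD storage pool
reclaimPolicy: Retain
allowVolumeExpansion: true
```

The other three classes differ only in `location`, `reclaimPolicy`, and the default-class annotation.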
Architecture
Components
CSI Controller:
- Handles volume provisioning and deletion
- Manages snapshots and cloning
- Communicates with Synology DSM API
- Runs as a Deployment (1 replica)
- Containers: csi-provisioner, csi-attacher, csi-resizer, csi-plugin (all with resource limits)
CSI Node Driver:
- Runs on every Kubernetes node (DaemonSet)
- Mounts iSCSI volumes to pods
- Handles volume attach/detach operations
- Manages local mount points
- Containers: csi-driver-registrar, csi-plugin (all with resource limits)
Snapshotter:
- Enables volume snapshots via snapshot-controller
- Creates point-in-time copies
- Supports snapshot-based backups
- Containers: csi-snapshotter, csi-plugin, snapshot-controller (all with resource limits)
Resource limits were added to all containers across the controller, node, and snapshotter components (PR #451) to comply with Gatekeeper's require-resource-limits policy. The snapshot-controller uses a Kustomize strategic merge patch since it's sourced from a remote GitHub reference.
Important: The patch must use namespace: kube-system (the upstream resource's original namespace), not the kustomization's namespace: synology-csi override. Kustomize resolves patches before applying namespace transformation (PR #478).
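A strategic merge patch for the remote snapshot-controller might look like the following sketch. The container name follows the upstream deployment; the limit values are illustrative, not the actual values from PR #451:

```yaml
# kustomization.yaml (excerpt, sketch) — the patch targets the upstream
# resource's original namespace (kube-system), because Kustomize resolves
# patches before applying the kustomization's namespace: synology-csi override.
patches:
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: snapshot-controller
        namespace: kube-system   # must match the upstream namespace
      spec:
        template:
          spec:
            containers:
              - name: snapshot-controller
                resources:
                  limits:
                    cpu: 100m      # illustrative value
                    memory: 128Mi  # illustrative value
```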
Version History
Current Versions (2026-01-12)
| Component | Version | Notes |
|---|---|---|
| synology-csi | v1.2.1 | Requires iscsiadm-path configuration |
| csi-attacher | v4.10.0 | Upgraded 2026-01-07 |
| csi-node-driver-registrar | v2.15.0 | Upgraded 2026-01-07 |
| csi-provisioner | v6.1.0 | Latest stable |
| csi-resizer | v2.0.0 | Latest stable |
| csi-snapshotter | v8.4.0 | Upgraded 2026-01-11 |
| snapshot-controller | v8.2.1 | Upgraded 2026-01-11 |
Upgrade Notes
v1.2.1 Node Plugin Configuration (2026-01-12)
Synology CSI v1.2.1 changed how it locates the iscsiadm binary for Talos Linux compatibility. Without explicit configuration, new PVC mounts fail.
Required Configuration:
The node plugin must include these arguments:
args:
- --nodeid=$(KUBE_NODE_NAME)
- --endpoint=$(CSI_ENDPOINT)
- --client-info
- /etc/synology/client-info.yml
- --log-level=info
- --chroot-dir=/host # Required for v1.2.1+
- --iscsiadm-path=/usr/sbin/iscsiadm # Path on host filesystem
References:
- GitHub Issue #111 - Mount failure report
- GitHub Issue #89 - Configuration fix
Snapshot-Controller v8.x Upgrade (2026-01-11)
The snapshot-controller was upgraded from v6.3.1 to v8.2.1. This requires updated RBAC permissions.
RBAC Requirements for v8.x:
The CSI snapshotter ClusterRole must include patch verb for volumesnapshotcontents:
rules:
- apiGroups: ["snapshot.storage.k8s.io"]
resources: ["volumesnapshotcontents"]
verbs: ["get", "list", "watch", "update", "patch"] # patch added
- apiGroups: ["snapshot.storage.k8s.io"]
resources: ["volumesnapshotcontents/status"]
verbs: ["update", "patch"] # patch added
- apiGroups: ["groupsnapshot.storage.k8s.io"] # New API group
resources: ["volumegroupsnapshotcontents", "volumegroupsnapshotclasses"]
verbs: ["get", "list", "watch", "update", "patch"]
Network Configuration
iSCSI Connection
- NAS IP: 10.0.1.204
- Protocol: iSCSI (TCP port 3260)
- Authentication: CHAP (credentials in secret)
- Network: Direct connection via cluster network
Requirements
Kubernetes Nodes:
- open-iscsi package installed
- iscsid service running
- iSCSI initiator configured
Verify on nodes:
# Check iscsid service
sudo systemctl status iscsid
# Verify iSCSI tools installed
which iscsiadm
# List iSCSI sessions
sudo iscsiadm -m session
Deployment Configuration
Application Manifest
Location: manifests/applications/synology-csi.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: synology-csi
namespace: argocd
annotations:
argocd.argoproj.io/sync-wave: "-30"
spec:
project: infrastructure
source:
path: manifests/base/synology-csi
repoURL: git@github.com:imcbeth/homelab.git
targetRevision: HEAD
destination:
server: https://kubernetes.default.svc
namespace: synology-csi
syncPolicy:
automated:
prune: true
selfHeal: true
Base Manifests
Location: manifests/base/synology-csi/
Files:
- namespace.yml - Namespace definition
- controller.yml - CSI controller deployment
- node.yml - CSI node DaemonSet
- csi-driver.yml - CSIDriver resource
- storage-class.yml - Four storage class definitions
- configs/ - ConfigMaps and Secrets
- snapshotter/ - Snapshot controller (optional)
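These files would typically be wired together by a kustomization at the directory root. This is a sketch, not the actual file — the resource list is taken from the directory contents above, and the namespace override matches the behavior described in Architecture:

```yaml
# manifests/base/synology-csi/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: synology-csi   # override applied after patches are resolved
resources:
  - namespace.yml
  - controller.yml
  - node.yml
  - csi-driver.yml
  - storage-class.yml
```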
Using Persistent Volumes
Creating a PVC
Example with default storage class:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-app-data
namespace: default
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
# storageClassName not specified = uses default (synology-iscsi-retain)
Example with specific storage class:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-data
namespace: default
spec:
accessModes:
- ReadWriteOnce
storageClassName: synology-iscsi-retain-ssd # Use SSD storage
resources:
requests:
storage: 50Gi
Using PVC in Pod
apiVersion: v1
kind: Pod
metadata:
name: my-app
spec:
containers:
- name: app
image: my-app:latest
volumeMounts:
- name: data
mountPath: /data
volumes:
- name: data
persistentVolumeClaim:
claimName: my-app-data
Access Modes
ReadWriteOnce (RWO):
- Volume can be mounted read-write by a single node
- Most common for databases and stateful apps
- Supported by iSCSI
ReadOnlyMany (ROX):
- Volume can be mounted read-only by many nodes
- Supported by iSCSI
ReadWriteMany (RWX):
- Volume can be mounted read-write by many nodes
- NOT supported by iSCSI (use NFS for RWX)
Volume Operations
Expanding a Volume
Volumes can be expanded by editing the PVC:
# Edit PVC to increase size
kubectl edit pvc my-app-data
# Change storage request:
spec:
resources:
requests:
storage: 20Gi # Increased from 10Gi
Notes:
- Volume can only be expanded, not shrunk
- Pod may need restart to recognize new size
- Filesystem will be automatically resized
Viewing Volumes
# List PVCs
kubectl get pvc -A
# List PVs
kubectl get pv
# Describe PVC
kubectl describe pvc my-app-data
# Check volume details
kubectl get pv <pv-name> -o yaml
Deleting Volumes
With Retain policy:
# Delete PVC (PV remains)
kubectl delete pvc my-app-data
# PV status changes to Released
kubectl get pv
# Manually delete PV when ready
kubectl delete pv <pv-name>
# Clean up iSCSI LUN on Synology DSM
With Delete policy:
# Delete PVC (PV and iSCSI LUN auto-deleted)
kubectl delete pvc my-app-data
Volume Snapshots
Creating a Snapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
name: my-app-snapshot
namespace: default
spec:
volumeSnapshotClassName: synology-snapshot-class
source:
persistentVolumeClaimName: my-app-data
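The synology-snapshot-class referenced above would be defined roughly like this. The VolumeSnapshotClass schema is standard; the driver name csi.san.synology.com is the upstream default and should be treated as an assumption for this cluster:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: synology-snapshot-class
driver: csi.san.synology.com
deletionPolicy: Delete   # snapshot content removed when the VolumeSnapshot is deleted
```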
Restoring from Snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-app-data-restored
spec:
accessModes:
- ReadWriteOnce
storageClassName: synology-iscsi-retain
dataSource:
name: my-app-snapshot
kind: VolumeSnapshot
apiGroup: snapshot.storage.k8s.io
resources:
requests:
storage: 10Gi
Monitoring Storage
Check CSI Driver Status
# CSI controller pod
kubectl get pods -n synology-csi | grep controller
# CSI node pods (should be one per node)
kubectl get pods -n synology-csi | grep node
# View controller logs
kubectl logs -n synology-csi deployment/synology-csi-controller -c csi-provisioner
# View node driver logs
kubectl logs -n synology-csi daemonset/synology-csi-node -c csi-plugin
Storage Capacity
# Total PV capacity
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,STORAGECLASS:.spec.storageClassName
# PVC usage (kubectl top does not support PVCs; check from inside a pod,
# or use the kubelet_volume_stats_* metrics in Prometheus)
kubectl exec <pod-name> -- df -h /data
Synology DSM
Web UI: https://10.0.1.204:5001
Storage Manager:
- View iSCSI LUNs
- Monitor storage pool capacity
- Check disk health
- Review snapshot usage
Troubleshooting
PVC Stuck in Pending
Check PVC events:
kubectl describe pvc <pvc-name>
Common causes:
- CSI controller not running
- Synology NAS unreachable
- Storage pool out of space
- Authentication failure
- Invalid storage class
Verify CSI pods:
kubectl get pods -n synology-csi
Volume Mount Failures
Check pod events:
kubectl describe pod <pod-name>
Common causes:
- iSCSI initiator not running on node
- Network connectivity to NAS
- Volume already attached to another node
- Filesystem corruption
Verify iSCSI sessions on node:
# SSH to the node
sudo iscsiadm -m session
fsGroup Race Condition with Transient Files
Error message:
MountVolume.SetUp failed for volume "pvc-xxx" : applyFSGroup failed for vol xxx:
lstat /var/lib/kubelet/pods/.../grafana.db-journal: no such file or directory
Cause: Kubernetes applies fsGroup ownership recursively to all files in a mounted volume. If a transient file (like SQLite's .db-journal) is deleted between directory listing and the lstat call, the mount fails.
Solution: Add fsGroupChangePolicy: OnRootMismatch to the pod's securityContext:
spec:
securityContext:
fsGroup: 472
fsGroupChangePolicy: OnRootMismatch # Only apply fsGroup at root
This tells Kubernetes to skip recursive ownership changes unless the root directory ownership is incorrect, avoiding race conditions with transient files.
Affected workloads:
- Grafana (SQLite database with journal files)
- Any application using SQLite or similar databases with transient files
- Applications that create/delete files during startup
iscsiadm "No such file or directory" Error (v1.2.1)
Error message:
MountVolume.SetUp failed: env: can't execute 'iscsiadm': No such file or directory (exit status 127)
Cause: Synology CSI v1.2.1 changed how it locates the iscsiadm binary. Without explicit configuration, the container cannot find the host's iscsiadm.
Solution: Add these arguments to the CSI node plugin container:
args:
- --chroot-dir=/host
- --iscsiadm-path=/usr/sbin/iscsiadm
Verify fix:
# Check node plugin args
kubectl get daemonset synology-csi-node -n synology-csi \
-o jsonpath='{.spec.template.spec.containers[?(@.name=="csi-plugin")].args}'
# Test PVC mount by restarting a pod
kubectl delete pod <pod-with-pvc>
Reference: See Version History for full configuration details.
VolumeSnapshot Stuck with Finalizers
Symptoms: VolumeSnapshot shows READYTOUSE: false and cannot be deleted.
Solution:
# Remove finalizers to allow deletion
kubectl patch volumesnapshot -n <namespace> <snapshot-name> \
-p '{"metadata":{"finalizers":null}}' --type=merge
# Also patch the VolumeSnapshotContent if needed
kubectl patch volumesnapshotcontent <content-name> \
-p '{"metadata":{"finalizers":null}}' --type=merge
Volume Not Expanding
Check PVC status:
kubectl describe pvc <pvc-name>
Steps:
- Verify allowVolumeExpansion: true in storage class
- Check CSI controller logs
- Restart pod using the PVC
- Verify filesystem resized inside container
CSI Driver Not Working
Check node prerequisites:
# On each Kubernetes node
sudo systemctl status iscsid
sudo systemctl status open-iscsi
which iscsiadm
Restart CSI pods:
kubectl rollout restart deployment/synology-csi-controller -n synology-csi
kubectl rollout restart daemonset/synology-csi-node -n synology-csi
Current Usage
Critical Volumes
Prometheus Metrics Storage:
- PVC: prometheus-kube-prometheus-stack-prometheus-db-...
- Size: 50Gi
- Class: synology-iscsi-retain
- Purpose: Long-term metrics retention
- Status: CRITICAL - do not delete
Viewing All PVCs
kubectl get pvc -A --sort-by=.spec.resources.requests.storage
Performance Considerations
Raspberry Pi Cluster
Network:
- Gigabit Ethernet on all 5 Pi nodes
- iSCSI over standard network
- Typical throughput: 100-300 MB/s
- Latency: 1-5ms
Storage Pools:
- /volume2 (HDD): Higher capacity, lower IOPS
- /volume4 (SSD): Lower capacity, higher IOPS
Optimization:
- Use SSD storage class for databases
- Use HDD storage class for bulk data
- Enable btrfs compression for better efficiency
- Monitor NAS network utilization
Security
Authentication
- CHAP authentication for iSCSI
- Credentials managed via SealedSecret (manifests/base/synology-csi/client-info-sealed.yaml)
- SealedSecrets are safe to store in Git and are decrypted at runtime by the Sealed Secrets controller
See Secrets Management for details on managing SealedSecrets.
Network Security
- iSCSI traffic on trusted cluster network
- No exposure to external networks
- NAS firewall restricts access to cluster nodes
Access Control
- CSI driver has specific RBAC permissions
- Service accounts scoped to synology-csi namespace
- No privileged access outside of storage operations
Backup Strategy
Volume Snapshots
- Use VolumeSnapshot CRD for point-in-time copies
- Snapshots stored on Synology NAS
- Minimal space usage with btrfs CoW
Synology Snapshots
- Additional snapshot layer in DSM
- Scheduled snapshots via Snapshot Replication
- Protects against accidental deletion
Offsite Backup
- Synology Hyper Backup for offsite replication
- Critical PVs should be backed up regularly
- Test restore procedures periodically