Longhorn Deployment Guide

This guide describes how to deploy and configure Longhorn on K3s, where it will be used as the storage manager for the CIM Solution.

Installation Requirements

Each node in the Kubernetes cluster where Longhorn is installed must fulfill the following requirements:

- A container runtime compatible with Kubernetes (Docker v1.13+, containerd v1.3.7+, etc.)
- Kubernetes >= v1.21
- open-iscsi is installed, and the iscsid daemon is running on all nodes. This is necessary because Longhorn relies on iscsiadm on the host to provide persistent volumes to Kubernetes.
- RWX support requires that each node has an NFSv4 client installed.
  - To install an NFSv4 client, see the Install NFSv4 client section below.
- The host filesystem supports the file extents feature to store the data. Currently supported filesystems:
  - ext4
  - XFS
- bash, curl, findmnt, grep, awk, blkid, and lsblk must be installed.
- Mount propagation must be enabled.

The Longhorn workloads must be able to run as root in order for Longhorn to be deployed and operated properly.

Install Longhorn Dependencies

Install Open-ISCSI

Make sure the steps below are executed on all nodes in the cluster. The iscsid service must be running before Longhorn is installed.

For Ubuntu 20.04 and above

CODE

apt-get install open-iscsi -y

For RHEL 8.4

CODE

yum --setopt=tsflags=noscripts install iscsi-initiator-utils -y
echo "InitiatorName=$(/sbin/iscsi-iname)" > /etc/iscsi/initiatorname.iscsi
systemctl enable iscsid
systemctl start iscsid
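
To confirm this prerequisite on each node, a check along the following lines may help. It is a minimal sketch that assumes a systemd-based host; the function name iscsid_status is illustrative and not part of Longhorn or open-iscsi.

```shell
# Report whether the iscsid service is active on this node.
# iscsid_status is a hypothetical helper name; it assumes systemd is in use.
iscsid_status() {
  if systemctl is-active --quiet iscsid; then
    echo "iscsid is running"
  else
    echo "iscsid is NOT running"
    return 1
  fi
}

# Usage: run on every node before installing Longhorn.
# iscsid_status || systemctl enable --now iscsid
```

The function returns a non-zero status when the service is inactive, so it can be chained with a remediation command as in the usage comment.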

Install NFSv4 client

In Longhorn, the backup feature requires NFSv4, v4.1, or v4.2, and the ReadWriteMany (RWX) volume feature requires NFSv4.1. Before installing the NFSv4 client userspace daemon and utilities, make sure that client kernel support is enabled on each Longhorn node.

Check that NFSv4.1 support is enabled in the kernel

CODE

grep CONFIG_NFS_V4_1 "/boot/config-$(uname -r)"

Check that NFSv4.2 support is enabled in the kernel

CODE

grep CONFIG_NFS_V4_2 "/boot/config-$(uname -r)"
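
The two checks above print every matching line, including related options whose names merely start with the same prefix. As a rough sketch, the helper below matches the option name exactly and reports whether it is built in (y) or built as a module (m). The name nfs_option_enabled is hypothetical, and the default config path simply mirrors the commands above.

```shell
# Print "enabled" if the given option is set to y or m in a kernel config file.
# nfs_option_enabled is a hypothetical helper name.
nfs_option_enabled() {
  local option="$1"
  # Default to the running kernel's config file, as in the commands above.
  local config="${2:-/boot/config-$(uname -r)}"
  if grep -Eq "^${option}=(y|m)$" "$config"; then
    echo "enabled"
  else
    echo "disabled"
    return 1
  fi
}

# Usage:
# nfs_option_enabled CONFIG_NFS_V4_1
# nfs_option_enabled CONFIG_NFS_V4_2
```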

The command used to install an NFSv4 client differs depending on the Linux distribution.

For Debian and Ubuntu, use this command:

CODE

apt-get install nfs-common -y

For RHEL, use this command:

CODE

yum install nfs-utils -y

Checking the Kubernetes Version

Use the following command to check your Kubernetes server version.

CODE

kubectl version

Result:

CODE

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0+k3s1", GitCommit:"2705431d9645d128441c578309574cd262285ae6", GitTreeState:"clean", BuildDate:"2021-04-26T21:45:52Z", GoVersion:"go1.16.2", Compiler:"gc", Platform:"linux/amd64"}

The Server Version should be >= v1.21.
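
The comparison can also be scripted. The sketch below parses a version string such as v1.21.0+k3s1 and compares it with a minimum major.minor. The helper name meets_min_version is hypothetical, and the usage comment assumes that kubectl and jq are available.

```shell
# Return success if a version string such as "v1.21.0+k3s1" is at least
# the given major.minor. meets_min_version is a hypothetical helper name.
meets_min_version() {
  local version="${1#v}"          # drop the leading "v"
  local min_major="$2" min_minor="$3"
  local major="${version%%.*}"    # text before the first dot
  local rest="${version#*.}"      # text after the first dot
  local minor="${rest%%.*}"       # text before the next dot
  if [ "$major" -gt "$min_major" ]; then
    return 0
  fi
  [ "$major" -eq "$min_major" ] && [ "$minor" -ge "$min_minor" ]
}

# Usage (requires kubectl and jq):
# server="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
# meets_min_version "$server" 1 21 && echo "Kubernetes version OK"
```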

Validate the Longhorn Environment (Only on the Master Node)

The Longhorn environment check script can be used to check the Longhorn environment for potential issues.

Install the jq utility, which the script requires to function properly.

RHEL-Based Systems

CODE

yum install jq -y

Ubuntu/Debian-Based Systems

CODE

apt-get install jq -y

Run the validation script for Longhorn

CODE

curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.4.0/scripts/environment_check.sh | bash

The output of a successful run should look like this:

CODE

[INFO]  Required dependencies 'kubectl jq mktemp' are installed.
[INFO]  Hostname uniqueness check is passed.
[INFO]  Waiting for longhorn-environment-check pods to become ready (0/5)...
[INFO]  Waiting for longhorn-environment-check pods to become ready (0/5)...
[INFO]  Waiting for longhorn-environment-check pods to become ready (0/5)...
[INFO]  Waiting for longhorn-environment-check pods to become ready (3/5)...
[INFO]  Waiting for longhorn-environment-check pods to become ready (3/5)...
[INFO]  All longhorn-environment-check pods are ready (5/5).
[INFO]  Required packages are installed.
[INFO]  Cleaning up longhorn-environment-check pods...
[INFO]  Cleanup completed.
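
On clusters with many nodes the output can be long, so it may help to save it and count the problem lines. The sketch below assumes that the script marks problems with [WARN] or [ERROR] prefixes, by analogy with the [INFO] prefix shown above; that prefix format is an assumption, and count_problem_lines is a hypothetical helper name.

```shell
# Count lines that start with [WARN] or [ERROR] in a saved log file.
# count_problem_lines is a hypothetical helper name, and the [WARN]/[ERROR]
# prefixes are an assumption modeled on the [INFO] prefix in the sample output.
count_problem_lines() {
  # grep -c exits non-zero when nothing matches but still prints 0.
  grep -Ec '^\[(WARN|ERROR)\]' "$1" || true
}

# Usage: capture both stdout and stderr, then count.
# curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.4.0/scripts/environment_check.sh | bash 2>&1 | tee env-check.log
# count_problem_lines env-check.log
```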

For the minimum recommended hardware, refer to the Best Practices section below.

If required, the following manifests can be used to install the Longhorn dependencies.

For iSCSI dependencies

CODE

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.4.0/deploy/prerequisite/longhorn-iscsi-installation.yaml

For NFS dependencies

CODE

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.4.0/deploy/prerequisite/longhorn-nfs-installation.yaml

The jq utility must be installed manually.

Additional utilities for Longhorn

Make sure the following utilities are installed on all nodes in the cluster.

- bash
- curl
- findmnt
- grep
- awk
- blkid
- lsblk
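
A quick way to verify this on a node is to look each utility up on the PATH. The following is a minimal sketch; missing_tools is a hypothetical helper name, not part of Longhorn.

```shell
# Print the name of every listed command that is not found on this node.
# missing_tools is a hypothetical helper name.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage: an empty result means every utility is present.
# missing_tools bash curl findmnt grep awk blkid lsblk
```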

On RHEL 8, if these tools are not already installed on all systems, you can also enable the optional EPEL repository (epel-release) to obtain them:

CODE

subscription-manager repos --enable codeready-builder-for-rhel-8-$(arch)-rpms
dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
/usr/bin/crb enable

Best Practices

We recommend the following setup for deploying Longhorn in production.

Minimum Recommended Hardware

- 3 nodes
- 4 vCPUs per node
- 4 GiB of memory per node
- SSD/NVMe or similar performance block device on the node for storage (recommended)
- HDD/Spinning Disk or similar performance block device on the node for storage (verified)
  - 500/250 max IOPS per volume (1 MiB I/O)
  - 500/250 max throughput per volume (MiB/s)

Operating System

The following Linux distributions and versions were verified during v1.4.0 release testing, but this does not mean that Longhorn supports only these. Longhorn should work well on any certified Kubernetes cluster running on Linux nodes with a general-purpose operating system, such as the examples below.

No.	OS	Versions
1.	Ubuntu	20.04, 22.04
2.	RHEL	8.6

Unsupported Operating System

Non-general-purpose and container-optimized operating systems are not supported, because they lack a package manager or have an immutable system.

Node and Disk Setup

We recommend the following setup for nodes and disks.

Use a Dedicated Disk

It’s recommended to dedicate a disk for Longhorn storage for production, instead of using the root disk.

Minimal Available Storage and Over-provisioning

If you need to use the root disk, keep the default minimal available storage percentage of 25%, and set the over-provisioning percentage to 200% to minimize the chance of DiskPressure.

If you’re using a dedicated disk for Longhorn, you can lower the setting minimal available storage percentage to 10%.

The right over-provisioning percentage depends on how much space your volumes use on average. For example, if your workload only uses half of the available volume size, you can set the over-provisioning percentage to 200, which means Longhorn treats the schedulable size of the disk as twice its full size minus the reserved space.
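
The arithmetic described above can be sketched as follows. This is an illustration of the rule as stated in this guide, not Longhorn's scheduler code; the function name schedulable_gib and the example sizes are hypothetical.

```shell
# schedulable_gib TOTAL_GIB RESERVED_GIB OVERPROVISION_PERCENT
# Illustrative only: (full size - reserved space) scaled by the percentage.
schedulable_gib() {
  local total="$1" reserved="$2" percent="$3"
  echo $(( (total - reserved) * percent / 100 ))
}

# Example: a 500 GiB disk with 125 GiB reserved and 200% over-provisioning.
schedulable_gib 500 125 200   # prints 750
```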

Disk Space Management

Since Longhorn doesn’t currently support sharding between the different disks, we recommend using LVM to aggregate all the disks for Longhorn into a single partition, so it can be easily extended in the future.

Setting up Extra Disks

Any extra disks must be listed in the /etc/fstab file so that they are mounted automatically after the machine reboots.

Don’t use a symbolic link for the extra disks. Use mount --bind instead of ln -s and make sure it’s in the fstab file. For details, see the section about multiple disk support.
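
As an illustration, a bind mount can be made persistent with an fstab entry of the following shape. The paths in the usage comment are hypothetical, and fstab_bind_entry is a helper name invented for this sketch.

```shell
# fstab_bind_entry SOURCE TARGET
# Print an /etc/fstab line that bind-mounts SOURCE onto TARGET at boot.
# fstab_bind_entry is a hypothetical helper name.
fstab_bind_entry() {
  printf '%s %s none bind 0 0\n' "$1" "$2"
}

# Usage with hypothetical paths (run as root):
# mkdir -p /var/lib/longhorn-disk2
# mount --bind /mnt/disk2 /var/lib/longhorn-disk2
# fstab_bind_entry /mnt/disk2 /var/lib/longhorn-disk2 >> /etc/fstab
```

Review the generated line before appending it, since a malformed /etc/fstab entry can prevent a node from booting cleanly.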

Configuring Default Disks Before and After Installation

To use a directory other than the default /var/lib/longhorn for storage, the Default Data Path setting can be changed before installing the system. For details on changing pre-installation settings, refer to this section.

The Default node/disk configuration feature can be used to customize the default disk after installation. Customizing the default configurations for disks and nodes is useful for scaling the cluster because it eliminates the need to configure Longhorn manually for each new node if the node contains more than one disk, or if the disk configuration is different for new nodes. Remember to enable Create default disk only on labeled node if applicable.

Deploying Workloads

If you’re using ext4 as the filesystem of the volume, we recommend adding a liveness check to workloads to help automatically recover from a network-caused interruption, a node reboot, or a Docker restart. See this section for details.

Volume Maintenance

We highly recommend using the built-in backup feature of Longhorn.

For each volume, schedule at least one recurring backup. If you must run Longhorn in production without a backupstore, then schedule at least one recurring snapshot for each volume.

Longhorn creates snapshots automatically when rebuilding a replica. Recurring snapshots or backups can also automatically clean up these system-generated snapshots.

Guaranteed Instance Manager CPU

We recommend allowing Longhorn to have CPU requests set for engine/replica manager pods.

To be precise, you can set the percentage of a node's total allocatable CPU that is reserved for all engine/replica manager pods by modifying the settings Guaranteed Engine Manager CPU and Guaranteed Replica Manager CPU.
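
To get a feel for what a percentage means in absolute terms, the conversion to milli CPU can be sketched as below. The function name guaranteed_millicpu is hypothetical, the value 12 is used purely as an example, and Longhorn's own rounding may differ.

```shell
# guaranteed_millicpu ALLOCATABLE_MILLICPU PERCENT
# Illustrative only: the share of a node's allocatable CPU, in milli CPU.
guaranteed_millicpu() {
  echo $(( $1 * $2 / 100 ))
}

# Example: 12% of a node with 4 allocatable vCPUs (4000m).
guaranteed_millicpu 4000 12   # prints 480
```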

If you want to set a concrete value (milli CPU amount) for engine/replica manager pods on a specific node, you can update the fields Engine Manager CPU Request or Replica Manager CPU Request of the node. Note that these two fields override the above settings for that node.

The setting Guarantee Engine CPU is deprecated. For systems upgraded from older versions, Longhorn v1.1.1 automatically sets the node fields mentioned above to the same value as the deprecated setting and then cleans up the setting.

For details, refer to the settings references Guaranteed Engine Manager CPU and Guaranteed Replica Manager CPU.

StorageClass

We don’t recommend modifying the default StorageClass named longhorn, since the change of parameters might cause issues during an upgrade later. If you want to change the parameters set in the StorageClass, you can create a new StorageClass by referring to the StorageClass examples.

Scheduling Settings

Replica Node Level Soft Anti-Affinity

Recommended: false

This setting should be set to false in production environments to ensure the best availability of the volume. Otherwise, a single node failure may bring down more than one replica of a volume.

Allow Volume Creation with Degraded Availability

Recommended: false

This setting should be set to false in production environments to ensure that every volume has the best availability when created. With the setting set to true, volume creation does not fail even if there is only enough room to schedule one replica, so there is a risk that the cluster is running out of space without the user being made aware immediately.
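
Once Longhorn is installed, the current values can be read back with kubectl. The sketch below assumes that the settings are exposed as settings.longhorn.io resources named replica-soft-anti-affinity and allow-volume-creation-with-degraded-availability with a top-level value field; verify those names against your Longhorn version. The helper name setting_matches is hypothetical.

```shell
# setting_matches NAME EXPECTED
# Compare a Longhorn setting with the expected value.
# setting_matches is a hypothetical helper; the resource and setting names
# used with it are assumptions to verify against your Longhorn version.
setting_matches() {
  local actual
  actual="$(kubectl -n longhorn-system get settings.longhorn.io "$1" -o jsonpath='{.value}')"
  if [ "$actual" = "$2" ]; then
    echo "$1: OK ($actual)"
  else
    echo "$1: expected $2, found '$actual'"
    return 1
  fi
}

# Usage:
# setting_matches replica-soft-anti-affinity false
# setting_matches allow-volume-creation-with-degraded-availability false
```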

Installing Longhorn

Install Longhorn on any Kubernetes cluster using this command:
```
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.4.0/deploy/longhorn.yaml
```
One way to monitor the progress of the installation is to watch pods being created in the longhorn-system namespace:
```
kubectl get pods \
--namespace longhorn-system \
--watch
```

Check that the deployment was successful:

CODE

$ kubectl -n longhorn-system get pod
NAME                                           READY   STATUS    RESTARTS   AGE
longhorn-ui-b7c844b49-w25g5                    1/1     Running   0          2m41s
longhorn-conversion-webhook-5dc58756b6-9d5w7   1/1     Running   0          2m41s
longhorn-conversion-webhook-5dc58756b6-jp5fw   1/1     Running   0          2m41s
longhorn-admission-webhook-8b7f74576-rbvft     1/1     Running   0          2m41s
longhorn-admission-webhook-8b7f74576-pbxsv     1/1     Running   0          2m41s
longhorn-manager-pzgsp                         1/1     Running   0          2m41s
longhorn-driver-deployer-6bd59c9f76-lqczw      1/1     Running   0          2m41s
longhorn-csi-plugin-mbwqz                      2/2     Running   0          100s
csi-snapshotter-588457fcdf-22bqp               1/1     Running   0          100s
csi-snapshotter-588457fcdf-2wd6g               1/1     Running   0          100s
csi-provisioner-869bdc4b79-mzrwf               1/1     Running   0          101s
csi-provisioner-869bdc4b79-klgfm               1/1     Running   0          101s
csi-resizer-6d8cf5f99f-fd2ck                   1/1     Running   0          101s
csi-provisioner-869bdc4b79-j46rx               1/1     Running   0          101s
csi-snapshotter-588457fcdf-bvjdt               1/1     Running   0          100s
csi-resizer-6d8cf5f99f-68cw7                   1/1     Running   0          101s
csi-attacher-7bf4b7f996-df8v6                  1/1     Running   0          101s
csi-attacher-7bf4b7f996-g9cwc                  1/1     Running   0          101s
csi-attacher-7bf4b7f996-8l9sw                  1/1     Running   0          101s
csi-resizer-6d8cf5f99f-smdjw                   1/1     Running   0          101s
instance-manager-r-371b1b2e                    1/1     Running   0          114s
instance-manager-e-7c5ac28d                    1/1     Running   0          114s
engine-image-ei-df38d2e5-cv6nc                 1/1     Running   0          114s
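
Rather than scanning the list by eye, the STATUS column can be counted. The following is a small sketch; count_not_running is a hypothetical helper that reads the output of kubectl get pod --no-headers on standard input.

```shell
# Count pods whose STATUS column (third field) is not "Running".
# Reads `kubectl get pod --no-headers` output on stdin.
# count_not_running is a hypothetical helper name.
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

# Usage: a result of 0 means every pod reports Running.
# kubectl -n longhorn-system get pod --no-headers | count_not_running
```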

Note: For Kubernetes < v1.25, if your cluster still enables the Pod Security Policy admission controller, you need to apply the podsecuritypolicy.yaml manifest in addition to the longhorn.yaml manifest.