High Availability with DNS

The purpose of this document is to describe steps to deploy an RKE2 Kubernetes distribution in high availability with DNS.

Prerequisites

| Component | Nodes Required | vCPU | vRAM | vDisk (GiB) | Comments |
|---|---|---|---|---|---|
| RKE2 | 3 Control Plane nodes | 2 | 4 | 50 | See RKE2 installation requirements for hardware sizing, the underlying operating system, and the networking requirements. |
| CX-Core | 2 Worker nodes | 2 | 4 | 250 | If Cloud Native Storage is not available, 2 worker nodes are required on both site-A and site-B. If Cloud Native Storage is accessible from both sites, 1 worker node on each site can sustain the workload. |
| Superset | 1 Worker node | 2 | 8 | 250 | For reporting |

Preparing for Deployment

All control-plane nodes must be prepared as described in the Environment Preparation section: https://expertflow-docs.atlassian.net/wiki/spaces/CX/pages/155222298/%284.5%29+RKE2+Control+plane+Deployment#Environment-Preparation.

Installation and Configuration Steps

Step 1. Set Up DNS Configuration

For DNS-based load balancing, you need to set up a virtual FQDN that resolves to all control-plane nodes. Contact your network administrator to do this.

The DNS server should perform health checks on each control-plane node on ports 6443, 9345, 80, and 443 so that traffic is routed only to available nodes. Without health checks, routing to the control-plane nodes has to be managed manually.
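
Once the FQDN exists, it can be useful to confirm from one of the nodes that it resolves and that the ports listed above answer. The following is a minimal sketch rather than part of the official procedure: `resolve_fqdn`, `port_open`, and `check_rke2_ports` are hypothetical helper names, and the sketch assumes `getent`, `timeout`, and `bash` are available on the node. The ports only report as open after the control-plane nodes have been deployed.

```shell
# Hypothetical helpers for checking the virtual FQDN; the names are illustrative only.

# Print every address the given name resolves to, one per line.
resolve_fqdn() {
  getent ahosts "$1" | awk '{print $1}' | sort -u
}

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection, so it is run through an explicit bash.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report the state of the four ports named in this step for one host.
check_rke2_ports() {
  host=$1
  for port in 6443 9345 80 443; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port closed"
    fi
  done
}
```

For example, `resolve_fqdn rke2.example.com` should list the addresses of all control-plane nodes, and `check_rke2_ports rke2.example.com` prints one line per port. Here `rke2.example.com` is a placeholder for your own virtual FQDN.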

Step 2. Create the First Control Plane Node

Follow RKE2 Control plane Deployment to create the first control-plane node.

Get the server node token from the first control-plane node. It is required to add the remaining control-plane nodes and the worker nodes.

Bash

cat /var/lib/rancher/rke2/server/node-token
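
Because the token is reused in later steps, it can help to read it through a small wrapper that fails loudly when the file is missing or empty. This is a sketch only: `read_node_token` is a hypothetical helper name, and the default path is the one shown above.

```shell
# Hypothetical helper: print the node token, or fail with a message if it is missing.
# The default path is the one used in this step; pass another path to override it.
read_node_token() {
  token_file=${1:-/var/lib/rancher/rke2/server/node-token}
  if [ ! -s "$token_file" ]; then
    echo "node token not found or empty: $token_file" >&2
    return 1
  fi
  # Strip the trailing newline so the value can be pasted into config.yaml as-is.
  tr -d '\n' < "$token_file"
}
```

On the first control-plane node, `RKE2_TOKEN=$(read_node_token)` stores the value in a variable (the variable name is illustrative) that can then be copied to the other nodes.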

Step 3. Adding Remaining Control Plane Nodes

Before proceeding, make sure the environment of each control-plane node to be added is ready, following https://expertflow-docs.atlassian.net/wiki/spaces/CX/pages/155222298/%284.5%29+RKE2+Control+plane+Deployment#Environment-Preparation

Create the directories listed below on each control-plane node to be added.
Bash
```
mkdir -p /etc/rancher/rke2/
mkdir -p /var/lib/rancher/rke2/server/manifests/
```

Create a configuration file called config.yaml and replace <FQDN> with the FQDN/IP of the first control plane.

Bash

cat <<EOF | tee /etc/rancher/rke2/config.yaml
server: https://<FQDN>:9345
token: [token from /var/lib/rancher/rke2/server/node-token on server node 1]
write-kubeconfig-mode: "0644"
tls-san:
  - <FQDN>
etcd-expose-metrics: true
# Take an etcd snapshot every 6 hours
etcd-snapshot-schedule-cron: "0 */6 * * *"
# Keep 56 etcd snapshots (equal to 2 weeks at 4 snapshots a day)
etcd-snapshot-retention: 56
cni:
  - canal
EOF
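
A common mistake at this point is to start RKE2 with a template placeholder still in the file. The check below is a minimal sketch under that assumption: `check_placeholders` is a hypothetical helper name, and the pattern only looks for angle-bracket placeholders such as `<FQDN>` and the bracketed `[token from ...]` text used in the template above.

```shell
# Hypothetical helper: fail if a config file still contains template placeholders
# such as <FQDN> or the bracketed "[token from ...]" text.
check_placeholders() {
  # grep prints the offending lines with their line numbers and returns 0 on a match.
  if grep -nE '<[A-Za-z-]+>|\[token from' "$1"; then
    echo "unreplaced placeholders found in $1" >&2
    return 1
  fi
  echo "no placeholders left in $1"
}
```

Running `check_placeholders /etc/rancher/rke2/config.yaml` before starting the service prints any line that still needs editing.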

Ingress-Nginx config for RKE2: by default, the RKE2 ingress controller does not allow snippet annotations in ingress manifests. Create this config before starting the RKE2 deployment.

Bash

cat<<EOF| tee /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx-config.yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      metrics:
        service:
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "10254"
      config:
        use-forwarded-headers: "true"
      allowSnippetAnnotations: "true"
EOF

Step 4. Install RKE2 HA with DNS

Begin the RKE2 deployment by running the installer.
Bash
```
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE=server sh -
```
Start the RKE2 service. Starting the service takes approximately 10-15 minutes, depending on the network connection.
Bash
```
systemctl start rke2-server
```
Enable the RKE2 service so that it starts on boot
Bash
```
systemctl enable rke2-server
```
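
Since the service can take several minutes to come up, a small retry loop can replace manual polling. The sketch below is a generic helper and not something RKE2 provides: `wait_until` is a hypothetical name, and the timeout and interval values in the usage note are assumptions to adjust for your environment.

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "timed out after ${elapsed}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}
```

For example, `wait_until 900 15 /var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes` retries every 15 seconds for up to 15 minutes until the API server answers.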
By default, RKE2 deploys all binaries in /var/lib/rancher/rke2/bin. Add this path to the system's PATH so that the kubectl utility can be found, and point KUBECONFIG at the kubeconfig file written by RKE2.
Bash
```
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
```

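Append these lines to the current user's .bashrc file. The single quotes keep $PATH from being expanded at the moment the line is written, so it is evaluated each time the shell starts.

Bash

echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> $HOME/.bashrc
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> $HOME/.bashrc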
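
Running the append commands more than once adds duplicate lines to .bashrc. If the step is likely to be repeated, for example from a provisioning script, an idempotent variant can be sketched as follows. `append_once` is a hypothetical helper name, not an existing utility.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
append_once() {
  line=$1
  file=$2
  touch "$file"
  # -x matches whole lines, -F treats the text literally (so $PATH is not a pattern).
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (commented out so that sourcing this file changes nothing):
# append_once 'export PATH=$PATH:/var/lib/rancher/rke2/bin' "$HOME/.bashrc"
# append_once 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' "$HOME/.bashrc"
```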

Step 5. Deploy Worker Nodes

Follow the Deployment Prerequisites from RKE2 Control plane Deployment on each worker node before deployment, for example, disable the firewall on all worker nodes.

On each worker node, run the following command to install the RKE2 agent.

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -

Enable the rke2-agent service by using the following command.
```
systemctl enable rke2-agent.service
```
Create the configuration directory by running the following command.
```
mkdir -p /etc/rancher/rke2/
```
Create or edit /etc/rancher/rke2/config.yaml with the content shown below, replacing the following placeholders.
1. <Control-Plane-IP> This is the IP address of the first control-plane node.
2. <Control-Plane-TOKEN> This is the token from Step 2.

  server: https://<Control-Plane-IP>:9345
  token: <Control-Plane-TOKEN>

Start the service by using the following command.
```
systemctl start rke2-agent.service
```
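
After the agents have started, the new workers should appear in the node list on a control-plane node. As a rough check, the Ready nodes can be counted and compared with the number of nodes you deployed. The helper below is a sketch: `count_ready_nodes` is a hypothetical name, and it assumes the default column layout of `kubectl get nodes --no-headers`, where the second column is STATUS.

```shell
# Hypothetical helper: count nodes whose STATUS column is exactly "Ready".
# Expects the output of `kubectl get nodes --no-headers` on standard input.
# Nodes shown as "NotReady" or "Ready,SchedulingDisabled" are not counted.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

On a control-plane node, `kubectl get nodes --no-headers | count_ready_nodes` prints the number of nodes that are currently Ready.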

Next Steps

Choose storage - See Storage Solution - Getting Started
CX-Core deployment on Kubernetes