
Kubernetes in Hetzner Cloud with Rancher Part 1 - Custom Nodes Setup




I have written several posts in the recent past about Rancher and Hetzner Cloud (referral link, we both receive credits). It's a great combination for super affordable Kubernetes deployments without sacrificing performance, reliability, or ease of use. I decided to write a new, updated two-part series on two different ways you can deploy Kubernetes to Hetzner Cloud with Rancher, with pros and cons for each. The two methods are 1) the custom nodes setup, and 2) the setup using an unofficial node driver that allows Rancher to manage virtual servers in Hetzner Cloud directly.

In both cases Rancher uses its own Kubernetes distribution called RKE (Rancher Kubernetes Engine). Rancher also has another, more lightweight distribution of Kubernetes optimized for the edge, called K3s, so you may want to check that out too. However, at this stage I still prefer RKE because it's a bit easier to set up in HA mode and back up. RKE uses etcd as the data store (it's vanilla Kubernetes after all, with the only difference that every Kubernetes component runs in Docker containers), and in the simplest configurations the control plane nodes can also be etcd nodes; you can also easily back up the state of the cluster with etcd snapshots (see Rancher's documentation on this). At the time of this writing, the recommended configuration for K3s clusters in HA mode involves an external data store such as MySQL or Postgres, so it's one more thing to maintain (there is support for embedded etcd, but that's still experimental at the moment).
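On the snapshots point: once an RKE cluster is up, the RKE CLI can take a one-off etcd snapshot with a single command (a minimal sketch; cluster.yml is the configuration file we'll create later in this post, and the snapshot name is just an example):

# Save a named etcd snapshot of the cluster described by cluster.yml
rke etcd snapshot-save --config cluster.yml --name manual-snapshot-1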

With the "custom nodes" configuration, Rancher assumes that you already have some nodes somewhere, with just Docker installed as the only requirement, and that you want Rancher to deploy Kubernetes to these nodes. So in this case we'll create servers in Hetzner Cloud manually first, and then proceed with the Kubernetes deployment using Rancher. There are a few manual steps involved both when deploying Rancher itself and when preparing the nodes for downstream Kubernetes clusters, but the advantage is that you have more control on how the nodes are configured besides Kubernetes and can keep Rancher and each cluster in separate Hetzner Cloud projects. 

The second method, with the node driver, requires no manual steps to create and prepare the nodes for the downstream clusters, because Rancher will create the servers for us (using Docker Machine under the hood) and then deploy Kubernetes to them automatically. While the downside of the custom nodes setup is the added manual steps, the downside of the node driver setup is that we need to keep both Rancher and all the downstream clusters in the same Hetzner Cloud project, because the downstream clusters need to share the same private network with Rancher (resources in different Hetzner Cloud projects cannot share the same private network). In this case you lose the ability to separate Rancher and each cluster into their own projects, but like I said, creating downstream clusters this way is more automatic and a bit easier/quicker. With the node driver, it is possible to scale nodes with just one click or replace/add node pools easily, like you could with a managed Kubernetes service. The choice is yours. Until very recently I preferred the custom nodes setup because I wanted to organize everything in separate Hetzner Cloud projects, but I changed my mind and I am now using the node driver because it makes several things easier.

In this first part, we'll see how to deploy Kubernetes to Hetzner Cloud with Rancher using the custom nodes setup.

Requirements:

  • of course, you need to be familiar with Kubernetes as this is not an introduction to it.
  • you need an account with Hetzner Cloud.

Setting up Rancher

One downside of Rancher is that it requires its own Kubernetes cluster if you want to install it in HA mode. You could also install it in non-HA mode using a simple Docker container, but that's not recommended for production. The latest version, 2.5.x, now allows installing Rancher in any Kubernetes cluster (even managed ones), but if you are on-prem or want to install Rancher from the stable repo (recommended for production), you need to install it in an RKE or K3s cluster. For Rancher's cluster I deploy Kubernetes using the RKE CLI. I usually use a single node for Rancher, but I install it in HA mode so that I can add more nodes and make it actually HA later if I want or need to.

Creating the server

First things first, head to Hetzner Cloud's console and create a new project named e.g. "Rancher" and enter the project. From the left side menu/toolbar, click on Security > SSH Keys and add your SSH key, so that you can log into the Rancher node:

Hetzner Cloud - Add SSH Key
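If you prefer the command line to the console, the hcloud CLI can do the same (a sketch assuming you have the CLI installed and configured with a token for this project; the key name is arbitrary):

# Upload your public SSH key to the current Hetzner Cloud project
hcloud ssh-key create --name my-key --public-key-from-file ~/.ssh/id_rsa.pub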

Next, click on Networks and add a private network called "default" with the default IP range:

Hetzner Cloud - Create private network

Next, click on Servers, and add a server with the following settings:

  • Location: Nuremberg has better latency for US users, but the choice is yours;
  • OS Image: the default Ubuntu 20.04 is fine;
  • Type: for a small number of clusters and nodes I recommend a cloud server with 8 GB of RAM. The two options are the CX31, which has 2 cores and costs €8.90 + VAT per month, and the CPX31, which has 4 cores and costs €12.40 + VAT. The CPX series has second-generation AMD EPYC CPUs, so they are faster than the Intel CPUs (Skylake, if I remember correctly) in the CX series. The other difference is that CX servers are also available with Ceph storage instead of local NVMe storage, so if something happens to the underlying hardware your cloud server is automatically restarted on another physical host without any data loss. I prefer the NVMe storage for performance, but I choose the CX31 type anyway because 2 cores are enough and I can save a few euros; for Rancher, it doesn't really matter that the CX CPUs are slower than the CPX CPUs;
  • Network: ensure you select the private network we created in the previous step. We'll configure the RKE cluster for Rancher so that the traffic between the nodes (if we add more nodes in the future) always goes through the private network;
  • SSH Key: select the SSH key you added earlier;
  • Name: I use the convention <cluster name>-<node role><id>. So for a single node for a cluster named Rancher that acts as controlplane/etcd/worker (which is fine for a simple Rancher setup), I name the node rancher-master1.

Click on Create & Buy Now. The server will be up and running pretty quickly, usually within 30 seconds:

Hetzner Cloud - Server just created

Preparing the node

Next, SSH into the server using the root user. The only requirement to deploy Kubernetes with RKE is Docker, but besides Docker I also configure SSH, install fail2ban and configure the firewall (ufw).

First, I disable password authentication for SSH and restrict root to key-based logins only:

sed -i 's/[#]*PermitRootLogin yes/PermitRootLogin prohibit-password/g' /etc/ssh/sshd_config
sed -i 's/[#]*PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config

systemctl restart sshd
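Optionally, before restarting sshd you can validate the edited config, so a typo doesn't lock you out (just a safety net, not strictly required):

# Test mode: prints errors and exits non-zero if sshd_config is invalid
sshd -t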

Then I install fail2ban to ban the IPs behind brute-force attempts:

apt update
apt install fail2ban

Then I configure the firewall:

ufw allow proto tcp from any to any port 22,80,443

ufw allow from 10.43.0.0/16
ufw allow from 10.42.0.0/16
ufw allow from 10.0.0.0/16

ufw allow from <your IP>

ufw -f default deny incoming
ufw -f default allow outgoing

ufw enable

10.43.0.0/16 is the service IP range and 10.42.0.0/16 is the pod IP range, as configured by RKE. 10.0.0.0/16 instead is the range for the private network in Hetzner Cloud to allow communication between the nodes. 

The last thing we need to do to prepare the node is install Docker:

curl -s https://get.docker.com | sh

The node is now ready so we can proceed with deploying Kubernetes with RKE.

Take note of the name of the interface for the private network with ifconfig. If you picked CX31 as the server type, it should be ens10.
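If ifconfig is not available (on a minimal Ubuntu 20.04 install it requires the net-tools package), the ip command works just as well; the private network interface is the one with a 10.0.0.x address:

# List all interfaces with their addresses in a compact format
ip -brief addr show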

Deploying Kubernetes for Rancher

See https://rancher.com/docs/rke/latest/en/installation/ for the installation of RKE.

Once the RKE CLI is installed, either run `rke config` and answer the questions or just paste the following in a cluster.yml file somewhere and configure as required:

nodes:
- address: <public IP of the node>
  port: "22"
  internal_address: <private IP of the node>
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: rancher-master1
  user: root
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config:
      interval_hours: 6
      retention: 28
      s3backupconfig:
        access_key: ...
        secret_key: ...
        bucket_name: ...
        region: ...
        folder: ...
        endpoint: ...
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  canal_network_provider:
    iface: <private network interface>
  options:
    flannel_backend_type: vxlan
  plugin: canal
  mtu: 0
  node_selector: {}
  update_strategy: null
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include: []
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: null
kubernetes_version: "v1.19.3-rancher1-1"
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
  update_strategy: null
cluster_name: "rancher"
cloud_provider:
  name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
  node_selector: {}
  update_strategy: null
  replicas: null
restore:
  restore: false
  snapshot_name: ""
dns: null

Basically you just need to set the public and private IP addresses, the private network interface and - I recommend - the S3 settings for the recurring etcd snapshots, so you can back up and restore the cluster easily. Note that you may need to change the network interface for Canal if you choose another instance model when you create the node. Check on your instance with `ifconfig`.

To see which is the latest supported Kubernetes version run `rke config --list-version --all` and change the relevant setting if needed.

Finally, run `rke up` to provision Kubernetes on the node. Once it's done, RKE will have created two files:

  • cluster.rkestate - the state of the cluster so that when you run `rke up` again later RKE knows the current state
  • kube_config_cluster.yml - the kubeconfig file that you need to use to connect to the cluster with kubectl

Do not commit these two files into a Git repository or similar, because they contain secrets.
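One simple way to make sure they don't slip into a commit by accident (just a suggestion):

# Keep the RKE state and kubeconfig out of version control
echo "cluster.rkestate" >> .gitignore
echo "kube_config_cluster.yml" >> .gitignore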

To use the kubeconfig either run

export KUBECONFIG=./kube_config_cluster.yml

or copy the file to ~/.kube/config.

Then check that you can connect to the cluster and that the node is ready with `kubectl get nodes`.

cert-manager

Before installing Rancher we need to install cert-manager so that it can provision a TLS certificate for Rancher. This is easy with the official Helm chart:

kubectl create namespace cert-manager

helm repo add jetstack https://charts.jetstack.io
helm repo update

helm upgrade --install \
  --namespace cert-manager \
  --set installCRDs=true \
  --version v0.15.0 \
  cert-manager \
  jetstack/cert-manager

watch kubectl get pods --namespace cert-manager

The latest version of cert-manager is 1.0.x but 0.15.0 is the version reported in Rancher's documentation, so I use that for now. Not sure if it makes a difference.

Rancher

Configure DNS for the domain you want to use to access Rancher, then install Rancher with Helm:

kubectl create namespace cattle-system

helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo update

helm upgrade --install \
  --namespace cattle-system \
  --set hostname=<your domain> \
  --set ingress.tls.source=letsEncrypt \
  --set letsEncrypt.email=<your email address> \
  rancher \
  rancher-stable/rancher

kubectl -n cattle-system rollout status deploy/rancher

I recommend you use the stable repo for production environments. For testing etc. you can also use the "latest" repo, which has the newest versions.
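For reference, the latest repo is added the same way:

helm repo add rancher-latest https://releases.rancher.com/server-charts/latest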

The installation will take a minute or so. Once the Rancher pods are up and running, give it a minute for the TLS certificate to be provisioned with Let's Encrypt, then open the chosen domain in the browser. You will be required to change the admin password and confirm the server URL. This is the URL that the nodes of downstream clusters deployed with Rancher will use to connect to Rancher itself.

I recommend you go to the security settings and 1) create a new administrator user, 2) deactivate the default admin user. I discovered and reported a vulnerability some months ago where the default admin user would be recreated with the default credentials in some circumstances. This has since been fixed, but it's a good idea to deactivate the default admin user anyway.

Creating a downstream cluster with Rancher

Now that Rancher is ready, we can create downstream clusters. Like I explained in the beginning of the post, we are going to use "custom nodes". That is, we'll create and prepare the nodes in Hetzner Cloud manually, and then configure Rancher to use them to deploy Kubernetes using RKE.

Open the Hetzner Cloud console again, and create a new project named after the cluster you want to create, for simplicity. I create one project per cluster and I use the letters of the Greek alphabet to name my clusters. You may borrow that convention or use some other naming scheme.

Once you have created the project for the downstream cluster, go to Security > SSH keys and add your public key so that you can SSH into the nodes. Also go to API Tokens and create a read/write token which you will need to configure the cloud controller manager and the CSI driver. Take note of the token somewhere safe because you will only see it once.

Hetzner Cloud - Create API token

Next go to "Networks" and create a network named "default" and leave the IP range set to 10.0.0.0/16. The name of the network is important, make sure you use the same name with commands I will show later.

Now go to "Servers" and create three nodes (we are going to create an HA cluster with all the nodes as controlplane/etcd/worker; you can of course change the configuration of your cluster if you wish). Again I recommend Nuremberg for the location and CPX31 as the instance model this time so we can benefit from better CPU performance for our applications. Select Ubuntu 20.04 as the OS, the network and the SSH key we created previously. Give the first node a name like "test-master1" or something similar if you follow my convention, then click on the "how many servers?" plus button so to create three servers at once. Adjust the names of the additional two nodes so they are incremental, then click on "Create and buy now".

Wait a few seconds for the servers to be up, then SSH into each of them using the root user and the IP addresses you can see from the list of servers.

Configure SSH/fail2ban/firewall like we did for the Rancher node, and additionally allow traffic from the 10.244.0.0/16 range. This is expected by the default configuration of the Hetzner Cloud controller manager, which we'll install later to be able to provision load balancers. You don't really need to open ports 80 and 443 for these nodes if you are going to use load balancers; port 22 alone is sufficient in this case, as in the sketch below.
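Concretely, the firewall setup for a downstream node would look something like this (a sketch assuming you use load balancers and therefore only open port 22):

ufw allow proto tcp from any to any port 22

ufw allow from 10.43.0.0/16    # service IP range
ufw allow from 10.244.0.0/16   # pod IP range for the downstream cluster (see cluster_cidr below)
ufw allow from 10.0.0.0/16     # Hetzner Cloud private network

ufw allow from <your IP>

ufw -f default deny incoming
ufw -f default allow outgoing

ufw enable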

Next go to Rancher's clusters list and click "Add Cluster", then choose "From existing nodes (Custom)". Give the cluster a name, e.g. "test", then under "Advanced Options" disable the Nginx ingress controller, otherwise Rancher will install Nginx with a configuration that doesn't use an external load balancer. In this section I also recommend configuring etcd snapshots to S3, like we did for Rancher.

Rancher - Custom nodes setup
Rancher - Disable Nginx ingress
Rancher - etcd settings

Next click on "Edit as YAML", and inside the "services" section add the following:

kubelet:
  extra_args:
    cloud-provider: "external"
kube-controller:
  cluster_cidr: 10.244.0.0/16

These settings are required for the setup of the cloud controller manager. Then configure the "network" section as follows:

network:
  plugin: "canal"
  canal_network_provider:
    iface: "enp7s0"

This is to ensure that all the traffic between the nodes goes through the private network we created earlier. At the time of this writing the private network interface for the CPX31 instances is "enp7s0", but double check or change it according to the instances you have created.

Click Next, and ensure that all the roles (etcd, Control Plane, and Worker) are selected, then click on "Show advanced options". For each node, enter the correct public and private IPs, then copy the Docker command that Rancher provides and run it on the node. Before actually running the commands, make sure the IPs are correct, or the installation of Kubernetes will fail.

Rancher - Custom nodes Docker command

A few seconds after running the Docker command, the Rancher UI will tell you that the three nodes have been registered. The installation should take a few minutes. Click on the name of the cluster, then on the "Nodes" tab, and wait until all the nodes are marked as ready and the provisioning messages at the top have disappeared.

Now click on "Cluster", then "Kubeconfig file", and save the kubeconfig to a file. You will need to use this to connect to the cluster with kubectl etc.

To install the Hetzner Cloud controller manager (so that we can use load balancers), run the following:

kubectl -n kube-system create secret generic hcloud --from-literal=token=<the Hetzner project token you created earlier> --from-literal=network=default

kubectl apply -f https://raw.githubusercontent.com/hetznercloud/hcloud-cloud-controller-manager/master/deploy/ccm-networks.yaml

Ensure that the token is correct. In a few seconds the cloud controller's pod will be up, and you will be able to create services of type LoadBalancer (for the ingress controller or anything else); these will provision load balancers in the Hetzner Cloud project automatically (load balancers were added to Hetzner Cloud recently).
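As a quick test, a Service of type LoadBalancer like the following should get a Hetzner load balancer provisioned automatically (an illustrative sketch: the name and selector are placeholders, and the location annotation is one of the options the cloud controller manager understands):

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: test-lb
  annotations:
    # Where to create the load balancer; use the location of your nodes
    load-balancer.hetzner.cloud/location: nbg1
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 80
EOF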

Another thing that we need to install in order to have a fully operational cluster is the Hetzner CSI driver so that we can create persistent volumes using Hetzner's block storage:

kubectl -n kube-system create secret generic hcloud-csi --from-literal=token=<the Hetzner project token you created earlier>

kubectl apply -f https://raw.githubusercontent.com/hetznercloud/csi-driver/v1.5.1/deploy/kubernetes/hcloud-csi.yml
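The CSI driver installs a storage class named hcloud-volumes. To verify that volume provisioning works, you can create a test persistent volume claim like this (a minimal sketch; 10Gi is the smallest size Hetzner block storage supports):

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: hcloud-volumes
EOF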

Conclusions

There are many ways to deploy Kubernetes (excluding managed services), but I love Rancher because it offers an amazing management interface that is very capable and easy to use. There are some additional steps to set up Rancher before creating your clusters, so it might seem a slower process than using kubeadm or something else. But I recommend Rancher because it makes life easier IMO. You need to set it up once, and then you can create as many clusters with it as you want very easily.

I also love Hetzner Cloud because of the excellent performance, the low prices and the overall quality of the service. I have tested many providers over the past several months but I found Hetzner to be the best one for my needs and budget.

I am sure that Hetzner will introduce a managed Kubernetes service at some point - they already have load balancers, block storage and the software required for integrating them with Kubernetes, so Kubernetes itself is the next natural step - but in the meantime I am very happy with Rancher. It's very reliable and easy to use and maintain. Let me know in the comments if you try this and run into issues. I also recommend you join the Rancher Slack community, as there are many users and Rancher staff ready to help.

In the second part of this series, we'll see how to deploy Kubernetes to Hetzner Cloud using the node driver for Rancher.

