My server part 4 — Kubernetes
Why the switch?
I was working on configuring rootless podman via ansible, but I had trouble because the tooling was incomplete. Ansible is a suboptimal way to manage containers, and rootless podman can’t manage its own services.
For the whole journey, see the previous post
So yeah. I’ve decided to switch to Kubernetes, because Kubernetes can manage its own services, and it can be rootless. The configuration-as-code landscape for Kubernetes is much, much better than podman’s, and I will get many more options. For example, I can use helm, which is somewhat like a package manager for kubernetes, to install apps. Both Caddy and Authentik offer helm charts. Using the offered packages is probably less work than converting a docker compose file into podman containers.
Rootless (?) Kubernetes
User Namespaces
Just like how Linux has “distributions”, or bundles of software that build on top of the Linux kernel, Kubernetes has distros. I want a rootless Kubernetes distro.
I started by looking at k3s, the distribution of kubernetes I have experience with. However, there appear to be some caveats with rootless mode… including not being able to run a multi-node cluster.
There is an article related to Kubernetes rootless containers: https://rootlesscontaine.rs/getting-started/kubernetes/
And what I see is… not promising. I don’t really want to run kind or minikube, because they are stripped down compared to more feature-full Kubernetes distros like k3s.
Minikube, from the list above, is promising, but it is designed for development/testing, and doesn’t support a multi-node setup spanning more than one machine.
Kind has a similar use case, and the same limitation. However easy rootless is on both of them, that makes them unsuitable for me.
Okay, I seem to have misunderstood what <rootlesscontaine.rs> wants, as compared to what I want. That site documents how to run all of the kubernetes components themselves rootless, inside a user namespace. What I want is to use user namespaces to isolate pods.
Kubernetes seems to support this natively: https://kubernetes.io/docs/tasks/configure-pod-container/user-namespaces/
To enable user namespaces, all you need to do is set hostUsers: false in your kubernetes yaml… except, how can I override this for an existing helm chart?
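For reference, here is a minimal sketch of what that looks like in a bare pod manifest (the names are placeholders, and the cluster needs a kernel, container runtime, and feature gate that support user namespaces):

apiVersion: v1
kind: Pod
metadata:
  name: userns-test            # hypothetical name
spec:
  hostUsers: false             # run this pod inside a user namespace
  containers:
    - name: shell
      image: docker.io/library/busybox:1.36
      command: ["sleep", "infinity"]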
Package Manager/Helm
Helm is very nice. However, from what I’ve heard, it is difficult to modify prepackaged charts. This is especially concerning to me, because I intend to run all of my containers within user namespaces, and many helm charts don’t expose an option for that. In order to run prepackaged apps in user namespaces, I need to modify existing helm charts.
Helm has docs on customizing a chart before installing it: https://helm.sh/docs/intro/using_helm/#customizing-the-chart-before-installing
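As a sketch of what that usually looks like (the chart and value names here are hypothetical, not something any particular chart is guaranteed to expose):

# See which values a chart lets you override
helm show values example-repo/example-chart > default-values.yaml

# Install with your own values file, or override individual keys inline
helm install my-release example-repo/example-chart -f my-values.yaml
helm install my-release example-repo/example-chart --set someKey=someValue

# If the chart doesn't expose the field you need (like hostUsers),
# helm can pipe the rendered manifests through a post-renderer, such as a kustomize wrapper script
helm install my-release example-repo/example-chart --post-renderer ./kustomize-wrapper.sh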
There exist some package repositories (similar to dockerhub) for helm:
- https://artifacthub.io/
- Packages helm charts
- https://operatorhub.io/
- Not helm, rather kubernetes yaml managed by a lifecycle manager
- Packages “operators”, like the rook/ceph operator
- Appeals to me less than artifacthub
Kubernetes Distro
Here is a list on the CNCF website
I see a handful of options available to me:
- Kubespray (ansible)
- K3s
- RKE2
- k0s
- kurl
- Custom kubernetes installer, including things like storage
- Supports rook, but I don’t know if it is rook + ceph
- Doesn’t seem to have anywhere near as much activity as I expect it to have… no CI/CD, and its Longhorn support is deprecated
- Kubernetes the Easier Way
- Comes with a plethora of features, including nginx + letsencrypt, helm, ceph, metallb, and prometheus/grafana
- Very appealing
I don’t really want to opt for manual installation, or anything that’s too complex for too little gain.
Kubespray appeals to me since it’s ansible, and it would be cool to manage the kubernetes cluster installation and the services from the same spot — but it got ruled out:
From the README
Supported Docker versions are 18.09, 19.03, 20.10, 23.0 and 24.0. The recommended Docker version is 24.0. Kubelet might break on docker’s non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don’t break your cluster look into e.g. the YUM versionlock plugin or apt pin
It seems that kubespray deploys kubernetes by installing docker, and then running the kubernetes components in docker. Although this is a neat decision, I cannot use docker at all, because I intend to deploy openstack using kolla-ansible, which also uses docker. From my testing, the openstack deployment completely destroys docker’s networking, so I probably can’t use both at once.
In addition to that, kubespray completely uninstalls podman, meaning I can’t use podman as a provider for kubespray either.
I found an ansible role to deploy RKE2, but it seems that it just deploys vanilla RKE2, with none of the goodies that I am searching for like a load balancer or external storage.
I also found something similar for k3s, an ansible role to deploy a “vanilla” kubernetes cluster. However, at the bottom, they mention some other ansible roles that deploy a more-than-vanilla k3s cluster.
I found a reddit post where someone open-sourced their own ansible playbook, only 22 days ago (as of the time of writing this), and it’s quite comprehensive. It comes with ArgoCD, Cilium, Longhorn, Prometheus, Cloudflare Let’s Encrypt certificates with cert-manager, and more.
It looks very appealing to me, despite the fact that it seems to be opinionated, and designed for personal use. In addition to that, they are simply using ansible’s helm modules to deploy everything: how would that be different from me doing the same thing, with my own deployment choices?
Distributed (?) Storage
Eventually, I do plan to scale up, and that requires a distributed storage solution. I see a few options:
- Ceph
- Longhorn (SUSE)
- SeaweedFS
- Kadalu (GlusterFS)
- Helm chart?
- local-path-provisioner (SUSE)
- An enhancement to Kubernetes’s builtin ability to handle local storage paths, by SUSE
- Maybe this is optimal, since I have just a one-node cluster?
Longhorn appeals to me, because if I choose to use other SUSE products like rancher, they will probably integrate well.
But now, I’ve chosen FluxCD to manage my cluster rather than SUSE’s tooling, so Longhorn loses that advantage, and I will probably opt for Ceph.
I see a few options to deploy Ceph:
- https://artifacthub.io/packages/helm/rook/rook-ceph
- Has a severe security vulnerability reported, but is it really that bad?
- CSI (link later)
I don’t understand what a CSI (Container Storage Interface) driver is, and how it compares to the Rook ceph operator.
However, it seems I’m running into another issue: it’s difficult to run rook-ceph on a single node. There are also other complaints about performance with ceph. The big complaint is that ceph uses up a lot of CPU relative to other distributed storage, but I wasn’t worrying about that. However, I’ve seen multiple claims that ceph requires high-end hardware, namely SSDs, which I don’t have. (Right now I have just one hard drive.)
Longhorn has a similar issue — at least it seems usable on only one node, but the recommendation is to have at least 3 nodes.
Local-path-provisioner is probably what I’m going to use, because I think it is built into k3s (and by extension, RKE2), by default.
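If I do go with local-path-provisioner, a persistent volume claim would look something like this minimal sketch (assuming the storage class is named local-path, which is what k3s ships by default; I would need to confirm whether RKE2 bundles it under the same name):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data           # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce            # local volumes are tied to a single node anyway
  storageClassName: local-path
  resources:
    requests:
      storage: 5Gi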
Gitops Software
Gitops is a principle of software deployment, where the deployment infrastructure, services, and configuration, are stored in git — hence the name, Git Operations.
There are several ways to do Gitops on Kubernetes, but the core challenge I am encountering is that the Gitops software must itself be deployed to Kubernetes in order to manage the cluster, and you cannot use that software to deploy itself.
What likely happens is that after you deploy the software to the cluster, it records itself and adds itself to the tracked state, but I have to ensure this works properly.
Or maybe the GitOps software stays outside the configuration, eternally untracked, but still self-updating?
I still haven’t selected a GitOps software, but I am looking at:
- ArgoCD
- FluxCD
- Simple enough to bootstrap
- Bootstraps itself from a GitHub repo
- Fleet (made by SUSE, just like k3s, RKE2, rancher, and longhorn)
After thinking about it, I can’t find a way to deploy a cluster and the CI/CD software at once, in such a way that it provisions itself. Many deployment methods simply abstract deploying the CI/CD software afterwards.
It’s probably best not to rely on abstractions, since this is my first time really deploying Kubernetes. Instead, I will just have to accept that the Kubernetes deployment itself will not be stored as code.
I found something interesting, mentioned in a Lemmy comment: it takes helm charts and, using Nix, converts them to a format that can be consumed by ArgoCD.
Okay, but after more research, I’ve settled on Flux. It seems very easy to bootstrap, and to use helm charts with it. I don’t really need a GUI or multitenancy like ArgoCD provides, or the integrations that Fleet (probably) provides.
Flux seems “lightweight”. It reads from a Git repo, and applies and reconciles state. In addition to that, it can bootstrap itself. Although, I think I will run into a funny catch-22 when I decide to move away from github to a self-hosted forgejo running on the kubernetes cluster… everything will be fine, probably.
Maybe I could have a separate git server that stores the Kubernetes state? Flux seems to support bootstrapping from any git repo.
A few recommendations on the internet seem to suggest that I should bootstrap flux from something external to the cluster, rather than from inside the cluster.
Misc Addons/Deployment
Monitoring: the kube-prometheus-stack helm chart
Secrets:
- https://github.com/getsops/sops
- Basically ansible vault, very appealing
- https://external-secrets.io/latest/
- Seems catch-22y, I need an existing external service to manage secrets… but I suppose it is called external secrets
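A rough sketch of the sops workflow I have in mind, using age keys (file names and the key are placeholders, and flux would still need to be given the private key separately so it can decrypt at apply time):

# Generate an age keypair; the public half goes into .sops.yaml, the private half stays out of git
age-keygen -o sops.agekey

# Encrypt only the data/stringData fields of a Kubernetes Secret manifest
sops --encrypt --age age1examplepublickey \
  --encrypted-regex '^(data|stringData)$' \
  secret.yaml > secret.enc.yaml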
Ingress
- Nginx ingress… but how do I get SSL with this setup?
- Traefik ingress (has automatic https)
- Caddy: https://github.com/caddyserver/ingress — WIP software…
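For the nginx option, SSL would presumably come from cert-manager (the same tool the ansible playbook above uses for Let’s Encrypt certificates); a rough sketch with hypothetical names, assuming a ClusterIssuer called letsencrypt already exists:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress                           # hypothetical
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt   # ask cert-manager to issue the certificate
spec:
  ingressClassName: nginx
  rules:
    - host: app.moonpiedumpl.ing                  # hypothetical subdomain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service
                port:
                  number: 80
  tls:
    - hosts:
        - app.moonpiedumpl.ing
      secretName: example-tls                     # cert-manager stores the issued cert here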
I need a simple git-over-ssh server to bootstrap flux from.
- https://github.com/chrisnharvey/simple-git-server
- Seems unmaintained
- Apparently you can just use ssh as a git server
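The “ssh as a git server” option really is just a bare repository plus ssh access; a minimal sketch (the paths match the flux bootstrap command I run later):

# On the server: create a bare repository
git init --bare /home/moonpie/fleet-charts

# From any machine with ssh access: clone and push over ssh
git clone ssh://moonpie@moonpiedumpl.ing/home/moonpie/fleet-charts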
Services
Authentik
Authentik provides documentation on a Kubernetes deployment, along with a Helm chart.
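From what I remember of their docs, installing it boils down to adding their chart repository and installing with a values file (double-check the repo URL; the values file contents are specific to each setup):

helm repo add authentik https://charts.goauthentik.io
helm repo update
helm upgrade --install authentik authentik/authentik -f values.yaml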
Forgejo
Forgejo has a helm chart: https://codeberg.org/forgejo-contrib/forgejo-helm
Unlike authentik’s, forgejo’s helm chart seems to have some support for rootless/user namespaces.
I also want static sites, and here are some of the options I’ve found:
Nextcloud
There is an existing helm chart for nextcloud: https://github.com/nextcloud/helm
However, the repo above says that it is community-maintained, and not truly official.
Going to the nextcloud official docs for larger-scale deployment recommendations… and it’s paywalled. It’s likely that Nextcloud maintains official helm charts, but only for paying customers.
Someone had an issue with nextcloud configuration on helm; I asked them about the problem and solution, and they replied to my post.
Networking
Yup. After spending the majority of my time setting up networking on my previous iteration of this plan, it’s time to do exactly that, again.
My original plan was to host some components of openstack on a VPS, allowing my server to give any virtual machines public ipv6 addresses despite being firewalled and behind the NAT of Cal State Northridge’s internet… except there is no NAT. Or firewall. In fact, if I set up bridging, I can give virtual machines more than one public ipv4 address without going through any of that hassle. So my plans have changed.
However, some questions come into play that I need to think about:
- Multi-node kubernetes: The dream is to have some form of high availability, where machines can fail and my setup stays up (basic high availability usually requires 3 machines; tolerating two simultaneous failures would actually need 5).
- Can I deploy openstack parallel to kubernetes?
- Should my router go in front of, or behind my server?
I would like to forgo the router entirely, as it reduces complexity. However, the router is useful because it can provide ethernet to my laptop, which is faster than the CSUN wifi, especially when it gets congested.
I am thinking of putting the router in front of my server, and configuring bridged networking, to allow my server to access the CSUN network directly through both of its ethernet ports. However, I do fear a speed bottleneck. When attempting to test CSUN’s ethernet speeds, I discovered that my laptop’s ethernet port only supported 100 Mbps, meaning that even though the cable supported higher speeds, and the port on the other end potentially supported higher speeds, there was a cap. I need to research this further.
Another potential setup is for the server to be in front, with the router connected to its secondary NIC/ethernet port. I could use the special bridging setup I have discovered, where the primary ethernet port is both a bridge and a normal network interface, and then add the secondary ethernet port as a virtual port to the bridge. I would then create a second bridge, attach it to the first, and openstack would use that second bridge for its virtual machines.
I’ve decided on the second option. Although, there is another bottleneck in place: the NIC on my server itself. Although my router has all 1000 Mb/s ports, both NICs on my server are capped at 100 Mb/s. I need to buy a PCI ethernet card (preferably with two ethernet ports).
Options:
- https://www.amazon.com/Dual-Port-Gigabit-Network-Express-Ethernet/dp/B09D3JL14S
The other thing I need to get working is dynamic dns. Since CSUN’s ethernet works via DHCP, I’m not guaranteed the same ip address between reboots or other network configuration changes. I am using porkbun as my DNS provider, and I am searching for some options that support it.
- https://hub.docker.com/r/qmcgaw/ddns-updater/
- Comes with kubernetes manifests, but they look too complex for now
- Minor issue with the root domain: unlike a subdomain, the host needs to be set to “@”
It works great, although I had trouble testing it because I was on my phone’s hotspot, which only gives me an ipv6 address.
After I got dynamic dns working, I decided to set up bridging, since my server will sit in front of my router (or of my laptop, when it is plugged in via ethernet).
I followed my documented steps on build server part 2 (the section about bridging and veth).
Except, I did not set up a veth just yet. I don’t need it.
Kubernetes Installation and Setup
The first thing is to configure NetworkManager to ignore CNI-managed interfaces; otherwise, issues will occur.
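The RKE2 docs describe this as dropping a small config file in place and reloading NetworkManager, roughly:

# /etc/NetworkManager/conf.d/rke2-canal.conf
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:flannel*

# then reload so the change takes effect
sudo systemctl reload NetworkManager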
I followed the quick start install guide. I started with the server; however, that only installs the server components of rke2, and not the agent components.
I also followed the agent guide, in an attempt to get a single-node install. That was a bad idea: I think the server is also capable of acting as an agent in a single-node install?
However, nothing starts properly. The RKE2 server service crashes, and I need to investigate why. I suspect it is because kubernetes’s virtual networking is unable to properly interact with my special bridged networking setup, and this results in crashes… but the logs don’t seem to say anything relevant.
After a reinstall, it seems to work? (I think I figured out what caused it: I forgot to edit the NetworkManager.conf file). And looking at this guide, it seems that the server components also come with the agent components, as shown by how they deploy rancher (a workload) on just the server.
I mostly ignored the steps, except for copying /etc/rancher/rke2/rke2.yaml to ~/.kube/config, and I also chowned the config file to my user. This way I can manage kubernetes as a regular user, rather than only as root.
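Concretely, that was just (a sketch; adjust the username):

mkdir -p ~/.kube
sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
sudo chown moonpie:moonpie ~/.kube/config
chmod 600 ~/.kube/config    # keep the cluster credentials readable only by me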
I tried installing kubectl to my user profile using nix at first, but kubectl complained that the client version was too far from the server version, which could cause issues. So instead, I installed kubectl from a specific nixpkgs revision:
moonpie@thoth:~$ nix profile install nixpkgs/10b813040df67c4039086db0f6eaf65c536886c6#kubectl
moonpie@thoth:~$ kubectl version
Client Version: v1.28.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.10+rke2r1
Now… Fluxcd deployment is next.
I installed fluxcd using nix.
nix profile install nixpkgs#fluxcd
Before I installed fluxcd, however, I decided to disable the nginx ingress controller that RKE2 ships. I did this because I intend to deploy traefik or something else as my ingress controller with flux.
It seems the RKE2 configuration file needs to be created manually. After creating that file, adding disable: rke2-ingress-nginx to it, and restarting the server service, the nginx controller is disabled.
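In other words, something like this (the file lives at /etc/rancher/rke2/config.yaml according to the RKE2 docs):

# /etc/rancher/rke2/config.yaml
disable:
  - rke2-ingress-nginx

# then restart the server service
sudo systemctl restart rke2-server.service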
Now, for the flux install. Since I don’t want to use github, I decided to just use ssh as a git server.
Following the guide for flux bootstrapping with git:
flux bootstrap git --url ssh://moonpie@moonpiedumpl.ing/home/moonpie/fleet-charts --private-key-file=/home/moonpie/.ssh/moonstack --path=cluster/my-cluster
And it starts working immediately… except when I clone the git repo from the server, it’s empty‽ At least it has a branch, “master”, but this branch has no commits or anything to start from. I attempt to follow the rest of the bootstrapping guide, so I create a directory, “cluster/my-cluster”, in the git repo and put the first chart in it. Nothing happens. I think I screwed up somewhere, so I decide to uninstall flux and start over again.
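Starting over is at least straightforward; flux ships an uninstall command that removes its own components and CRDs, and as far as I can tell it leaves already-deployed workloads alone:

flux uninstall --namespace=flux-system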
Presearch and Notes for Future Pieces
Openstack
Since I am using Kubernetes to deploy services, it is worth investigating whether I can deploy Openstack (or some other self-hosted cloud) on Kubernetes.
These look appealing, but very hard to deploy.
Kubernetes
Misc
Graphical network policy editor
Multi-Tenancy
In case I can’t obtain multi-tenancy by having openstack provision kubernetes, there are some alternative solutions I am looking at: