What is a Kubernetes Operator and Why Build One From Scratch?
Kubernetes Operators enable teams to automate the operation of Kubernetes resources and applications. For complex projects with predictable patterns of maintenance and operation, Operators can be a powerful tool for simplifying the management of applications running on Kubernetes.
Unlike standard Helm charts, Operators are not limited to configurations that can only be described using languages like YAML. Instead, you can use powerful programming languages like Golang and Java to automate a plethora of tasks within your clusters. This makes Operators a great option when the standard functionality provided by tools like Helm and Kustomize have been exhausted and a more robust solution is needed.
If you want to learn more about the underlying concepts behind popular tools like Rook and Crossplane, then following this tutorial will help you do just that by walking you through the process of building a Kubernetes Operator from scratch.
Overview
Tutorial Structure
The goal of this tutorial is to help learners get a custom Kubernetes Operator up and running as quickly as possible. To accomplish this, many details explaining the why of the Operator are left out. This approach might seem counterintuitive, but for some, myself included, getting something working first is an important step in learning a new technology.
Additionally, we will essentially speed-run the official Kubebuilder tutorial, which can be referenced at any time to gain a deeper understanding of what’s going on.
What We’re Building
The Operator we will build supports a custom CronJob resource that makes CronJobs more user friendly. The role of a CronJob in Kubernetes is to create scheduled, containerized Jobs that complete one-off tasks.
Prerequisites
Once you have installed the tools listed below, you're ready to get started. This tutorial assumes only a beginner-to-intermediate understanding of these tools.
- Golang: The programming language we will use to build the Operator
- kubebuilder: The framework we will use to simplify the development of a Kubernetes Operator
- kubectl: A command line tool used to communicate with Kubernetes clusters
- Docker: The container runtime our Kubernetes cluster will use
- kind: A Docker based tool used to spin up local Kubernetes clusters
Step 1: Boilerplate
The first step to building the operator is to initialize a new Kubebuilder project.
mkdir project
cd project
kubebuilder init --domain tutorial.kubebuilder.io --repo tutorial.kubebuilder.io/project
Step 2: Define Custom Resource APIs
Next, we will create a new API that represents our custom CronJob resource. When prompted, press y for “Create Resource” and “Create Controller”.
kubebuilder create api --group batch --version v1 --kind CronJob
Next, overwrite the contents of project/api/v1/cronjob_types.go with the code shared here: project/api/v1/cronjob_types.go @ github.com/kubernetes-sigs/kubebuilder
Key changes include describing what fields our custom resource will have (CronJobSpec) and what states the resource can be in (CronJobStatus).
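To make the spec/status split concrete, here is a rough, dependency-free sketch of the two types. The field names follow the tutorial's CronJob, but the shapes are simplified: the real file uses apimachinery types such as metav1.Time and batchv1.JobTemplateSpec, plus kubebuilder validation markers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CronJobSpec describes the desired state: what the user asks for.
// (Simplified sketch; the real type embeds batchv1.JobTemplateSpec.)
type CronJobSpec struct {
	Schedule                string `json:"schedule"`                          // cron expression, e.g. "*/1 * * * *"
	StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"` // how late a Job may still start
	ConcurrencyPolicy       string `json:"concurrencyPolicy,omitempty"`       // Allow, Forbid, or Replace
}

// CronJobStatus describes the observed state: what the controller has seen.
type CronJobStatus struct {
	Active           []string `json:"active,omitempty"`           // currently running Jobs
	LastScheduleTime string   `json:"lastScheduleTime,omitempty"` // when a Job was last created
}

// specJSON renders a spec the way it would appear in a manifest's JSON form.
func specJSON(s CronJobSpec) string {
	out, _ := json.Marshal(s)
	return string(out)
}

func main() {
	deadline := int64(60)
	spec := CronJobSpec{Schedule: "*/1 * * * *", StartingDeadlineSeconds: &deadline, ConcurrencyPolicy: "Allow"}
	fmt.Println(specJSON(spec))
}
```

Keeping spec (desired) and status (observed) as separate structs is the convention every Kubernetes resource follows; the controller only ever writes status, while users only ever write spec.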
Step 3: Implement a Controller for the Custom Resource
Right now, if you fill out a template (manifest) to deploy the custom CronJob resource, it won’t really do anything. That’s because custom resource definitions need a controller to tell them what to do (“control” them). This is similar to how an Ingress resource won’t work until a compatible Ingress Controller is deployed.
Like before, overwrite the contents of project/internal/controller/cronjob_controller.go with the code shared here: project/internal/controller/cronjob_controller.go @ github.com/kubernetes-sigs/kubebuilder
Key changes made include filling out the Reconcile and SetupWithManager functions.
The Reconcile function
The Reconcile function will read the current state of the Kubernetes cluster, then perform actions to match desired state described by the custom resources. In essence, it “reconciles” the current state of the cluster to the described target state. Throughout this function, the CronJobReconciler (variable r) is used to Get, List, Update, Delete, and Create resources to realize the target state.
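The real Reconcile has the controller-runtime signature Reconcile(ctx, req) (ctrl.Result, error) and operates on cluster objects via r.Get/r.List/r.Create/r.Delete. As a hedged, dependency-free illustration of the idea only (all names here are invented for the sketch), reconciliation boils down to diffing observed state against desired state and emitting converging actions:

```go
package main

import "fmt"

// reconcile compares the observed state to the desired state and returns
// the actions needed to converge them. This mirrors what a controller's
// Reconcile function does against the cluster, but over plain maps of
// name -> version instead of real Kubernetes objects.
func reconcile(desired, observed map[string]string) []string {
	var actions []string
	for name, want := range desired {
		got, exists := observed[name]
		switch {
		case !exists:
			actions = append(actions, "create "+name) // desired but missing
		case got != want:
			actions = append(actions, "update "+name) // present but drifted
		}
	}
	for name := range observed {
		if _, wanted := desired[name]; !wanted {
			actions = append(actions, "delete "+name) // present but not desired
		}
	}
	return actions
}

func main() {
	desired := map[string]string{"job-a": "v1", "job-b": "v2"}
	observed := map[string]string{"job-a": "v1", "job-c": "v1"}
	fmt.Println(reconcile(desired, observed))
}
```

Because reconciliation is a pure "compare and converge" step, the controller can safely re-run it at any time; repeated calls with the same inputs produce the same actions.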
The SetupWithManager function
Similar to how the kube-controller-manager manages controllers that are native to Kubernetes clusters, we need to register the newly created controller with its own controller manager. In general, a Manager is an executable that wraps one or more Controllers.
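In controller-runtime, this registration is a one-liner (ctrl.NewControllerManagedBy(mgr).For(&batchv1.CronJob{}).Complete(r)). As a hedged, dependency-free sketch of the wrapping relationship it sets up (types invented for illustration):

```go
package main

import "fmt"

// Controller is anything a Manager can start.
type Controller interface {
	Name() string
	Start() error
}

// Manager wraps one or more Controllers and starts them together,
// loosely mirroring controller-runtime's manager.Manager.
type Manager struct {
	controllers []Controller
}

// Register adds a controller to the manager, as SetupWithManager does.
func (m *Manager) Register(c Controller) { m.controllers = append(m.controllers, c) }

// Start runs every registered controller.
func (m *Manager) Start() error {
	for _, c := range m.controllers {
		fmt.Println("starting controller:", c.Name())
		if err := c.Start(); err != nil {
			return err
		}
	}
	return nil
}

type cronJobController struct{}

func (cronJobController) Name() string { return "cronjob" }
func (cronJobController) Start() error { return nil }

func main() {
	mgr := &Manager{}
	mgr.Register(cronJobController{})
	_ = mgr.Start()
}
```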
Step 4: Run and Deploy the Controller and Custom Resources
Create a Kind Kubernetes cluster
At this point we’re finally ready to get everything running on a Kubernetes cluster. To bootstrap a new local Kind cluster, create a file called cluster-config.yaml with the contents shown below.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: kind
nodes:
  - role: control-plane
    image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
  - role: worker
    image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
  - role: worker
    image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
Then run the command kind create cluster --config cluster-config.yaml.
Confirm you are able to communicate with the newly created cluster by running kubectl config current-context. This command should output kind-kind. If not, update the kubectl context by running kubectl config use-context kind-kind.
Generate and Install Custom Resource Manifests
Next we’ll generate and install the custom resource manifests; the project’s Makefile wraps the required commands. The make install target uses kubectl apply, which stores the full object in a last-applied-configuration annotation with a size limit that the large generated CRD can exceed. To avoid this, we will shrink the CRD by dropping field descriptions, replacing the following line in the Makefile.
Before
$(CONTROLLER_GEN) rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases
After
$(CONTROLLER_GEN) rbac:roleName=manager-role crd:maxDescLen=0 webhook paths="./..." output:crd:artifacts:config=config/crd/bases
After this, you can run these commands to generate and install the manifests:
go mod tidy
make manifests
make install
Test the Controller Locally
We will now test the controller locally before finally deploying it to the cluster. In another terminal, run the following commands within the project folder.
export ENABLE_WEBHOOKS=false
make run
We can then replace the contents of config/samples/batch_v1_cronjob.yaml with the following manifest:
apiVersion: batch.tutorial.kubebuilder.io/v1
kind: CronJob
metadata:
  labels:
    app.kubernetes.io/name: project
    app.kubernetes.io/managed-by: kustomize
  name: cronjob-sample
spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              args:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Then we’ll deploy the CronJob to the cluster and confirm it is created successfully.
kubectl create -f config/samples/batch_v1_cronjob.yaml
kubectl get cronjob.batch.tutorial.kubebuilder.io -o yaml
If you now check the terminal that is running make run, you will see many new logs indicating that the controller is working as expected. For example, after a few minutes you’ll see the value of "successful jobs" increase by 1 every minute, which aligns with the configuration of our CronJob. You will also see the successfully completed Jobs when you run kubectl get jobs.
Deploy the Controller to the Kind Cluster
Now that we have confirmed the controller is working as expected, we can deploy it to the cluster.
But first, let’s stop the locally running controller by pressing Ctrl + C in the terminal. You’ll notice that after doing so, you won’t see any newly completed jobs when running kubectl get job. This again goes to show that without the controller, our Custom Resources won’t do much.
Run these commands to deploy the controller to the kind cluster.
make docker-build IMG=kubebuilder-tutorial/cronjob-controller:0.0.1
kind load docker-image kubebuilder-tutorial/cronjob-controller:0.0.1
make deploy IMG=kubebuilder-tutorial/cronjob-controller:0.0.1
We can confirm the controller was deployed successfully by running:
kubectl get pods,services -n project-system
Additionally, you’ll see that new jobs are being created once again:
kubectl get jobs
Step 5: Testing the Operator
Write and Run Unit Tests
The functionality of the controller should be tested thoroughly to prevent unexpected behaviour. By default, two unit test files are included in project/internal/controller/. Run them with:
make test
Currently, the test will fail. However, if you replace project/internal/controller/cronjob_controller_test.go with project/internal/controller/cronjob_controller_test.go @ github.com/kubernetes-sigs/kubebuilder and project/internal/controller/suite_test.go with project/internal/controller/suite_test.go @ github.com/kubernetes-sigs/kubebuilder, they should pass.
Run End-to-End Tests
You can implement automated end-to-end tests that mimic real user behaviour using Kind. The default kubebuilder boilerplate includes a simple end-to-end test in the folder project/test/e2e/ that you can run. It requires that an existing Kind cluster named kind is running.
make test-e2e
Step 6 (optional): Implement WebHooks
Last but not least, we will cover how to create admission webhooks for the Operator. First, scaffold the new webhook by running:
kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation
The above command scaffolds both a “defaulting” and a “validating” webhook. The “Defaulter” fills in default values for the custom resource. The “Validator” enforces rules beyond what’s possible through declarative validation in the YAML manifests alone.
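The real Validator delegates schedule checking to a cron parser (the Kubebuilder tutorial uses the robfig/cron library). As a hedged, stdlib-only stand-in for the idea, a check that the schedule has the standard five whitespace-separated cron fields is enough to reject malformed schedules like the seven-field one used later in this step:

```go
package main

import (
	"fmt"
	"strings"
)

// validateSchedule is a simplified stand-in for the webhook's schedule
// validation (the real code uses a full cron parser). A standard cron
// expression has exactly five whitespace-separated fields.
func validateSchedule(schedule string) error {
	fields := strings.Fields(schedule)
	if len(fields) != 5 {
		return fmt.Errorf("invalid schedule %q: want 5 fields, got %d", schedule, len(fields))
	}
	return nil
}

func main() {
	fmt.Println(validateSchedule("*/1 * * * *"))     // <nil>
	fmt.Println(validateSchedule("*/1 * * * * * *")) // prints the "want 5 fields, got 7" error
}
```

Because the Validator runs inside the API server's admission flow, a rejected CronJob is never persisted at all; the user gets the error straight back from kubectl.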
After this, replace project/api/v1/cronjob_webhook.go with project/api/v1/cronjob_webhook.go @ github.com/kubernetes-sigs/kubebuilder.
The webhook server requires TLS-encrypted communication, so we will provision certificates using cert-manager:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.3/cert-manager.yaml
We will then need to rebuild the controller with a different image tag than before.
make docker-build IMG=kubebuilder-tutorial/cronjob-controller:0.0.2
kind load docker-image kubebuilder-tutorial/cronjob-controller:0.0.2
Then we need to uncomment all Kustomize paths, resources, and replacements labelled [WEBHOOK] or [CERTMANAGER] in project/config/crd/kustomization.yaml and project/config/default/kustomization.yaml.
Finally, we can re-deploy the controller with the new image:
make deploy IMG=kubebuilder-tutorial/cronjob-controller:0.0.2
Now if you try to create a second CronJob with an invalid schedule, the Validator should catch the issue and return an error:
apiVersion: batch.tutorial.kubebuilder.io/v1
kind: CronJob
metadata:
  labels:
    app.kubernetes.io/name: project
    app.kubernetes.io/managed-by: kustomize
  name: invalid-cronjob-sample
spec:
  schedule: "*/1 * * * * * *"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              args:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Conclusion
And with that, we have covered the entire Kubebuilder tutorial in a condensed manner. You can delete the cluster you created by running kind delete cluster --name kind. When you’re ready, I encourage you to go back to the official Kubebuilder tutorial and read the explanations more thoroughly to get a deeper understanding of building Operators with Kubebuilder. Thank you for reading!