0. Overview of Lab 11:
Welcome to the 11th lab. This week's topic is an introduction to Kubernetes. We will look into how to deploy an application on Kubernetes, what the different components are used for, and how the resources are linked together.
This lab is independent of the other labs, but it sometimes repeats things we did in the Containers and Webservers labs. In this lab there are steps where you will delete things you set up in earlier steps; this is done on purpose so that you gain a better understanding of how and why Kubernetes is built this way.
This week's tasks:
- Install Kubernetes and its tools
- Run a pod (a set of containers)
- Write a container template and deploy a running container
- Write a Service template and deploy a container that is visible to the outside
- Write an Ingress template and deploy a container available on a domain address
1. Intro to Kubernetes
What is Kubernetes?
Having done this in the Docker lab, we know that running a container is a relatively simple task. However, running containers across multiple hosts, scaling them, managing and deploying them, and publishing them to the outside network reliably, among other things, can be difficult without some extra tools.
Kubernetes was created to address these challenges with a powerful and extensible set of APIs. The API works by maintaining a set of resources, which define the state we want the Kubernetes cluster and the containers running inside it to be in, and Kubernetes will try its best to reach that state.
According to the kubernetes.io website, Kubernetes is "an open source system for automating deployment, scaling, and management of containerized applications".
Kubernetes was built on the basis of an internal Google project called Borg. The aim of that internal project was to manage large-scale infrastructure with a solution that could be fine-tuned to pack clusters efficiently. Because it was such a powerful tool, Google decided to make the project public.
Traditional vs Kubernetes approach
Let's look at the deployment in the traditional versus Kubernetes way.
In the traditional environment (remember lab 5, Configuring Apache web server?), a web server is a monolithic application sitting on a dedicated machine. As web traffic increases, the web server has to be tuned and moved to higher and higher performance machines. Over time, the number of customisations and adjustments grows abnormally large and therefore ineffective.
Kubernetes takes the opposite approach: rather than using one large server, Kubernetes manages many small servers (microservices). This approach expects the application, both server and client side, to be written with transient server deployment in mind. So, what does that mean? It means you can run several instances of the web server across multiple Kubernetes hosts. Kubernetes calls running the same service multiple times replication. With a bit of network magic that Kubernetes takes care of, instead of having that one big web server answer the queries, we now have many small ones. Because a Kubernetes cluster can contain N+1 nodes, it is much easier to scale the Kubernetes environment.
The transient microservice methodology replaces each aspect of a traditional application. In Kubernetes, we use a Service to do the network magic. A Service provides an IP address and a route to the deployed service. This IP address is automatically load balanced if the service has multiple replicas, and it automatically updates itself when the service it points to scales up or down, redeploys or changes.
In its bare-bones form, Kubernetes consists of control plane nodes (cp) and worker nodes. In production, Kubernetes should be run in at least a three-node configuration (a single cp and two worker nodes), but for this lab it is enough to run everything on a single node.
The cp runs an API server, a scheduler, various controllers and a storage system to keep the state of the cluster, container settings and the networking configuration.
The API server exposes the Kubernetes API, which you can communicate with using a local client (kubectl) or by writing your own client utilising curl commands. The kube-scheduler finds an appropriate worker node to run containers on and forwards the pod specs for the containers coming in through the API.
Each worker node runs a kubelet, which is a systemd process, and kube-proxy.
Terminology
We have been introduced to the Kubernetes way of thinking and its basic architecture. Let us go over the terminology we are going to use. As we have seen, Kubernetes is an orchestration system for deploying and managing containers in an efficient and scalable manner. Where we previously thought in terms of containers, Kubernetes usually talks about Pods: an object consisting of one or more containers that share the same IP address, access to storage and the same namespace. In a typical scenario, one container in a Pod runs the application while the other containers support the primary application.
What is a namespace?
Orchestration is managed through a series of watch-loops called controllers or operators. The API server gives each controller a desired object state, and the controller then keeps modifying the object until the current state matches the desired state. The controllers are grouped into the Controller Manager.
A Deployment defines the scale of an application by setting how Pods are replicated on the Nodes: it describes the number of identical Pod replicas to run. It can also be used to perform updates with different strategies. The Pods' health is monitored, and the Deployment will add, kill or restart Pods to bring the application to the desired state. A Deployment does not work with Pods directly. Instead, it manages ReplicaSets, a controller for creating and terminating Pods according to a spec. The spec is sent to the kubelet on a worker node, which in turn interacts with the container engine to download and make available the required resources, and then spawns or terminates containers until the status matches the spec.
The Service controller requests IP addresses and information from the endpoint operators and manages network connectivity. A Service enables communication between Pods and namespaces, and provides access from outside the Kubernetes cluster.
Cluster is all the components put together as a single unit.
Let us recap the following terms:
- Cluster: a set of Nodes that Kubernetes manages
- Node: a physical/virtual worker machine in Kubernetes
- Service: provides access from the outside world
Install k3s
There are quite a few tools for working with Kubernetes. In our lab, we will install and use K3s, a lightweight Kubernetes distribution packaged as a single binary. In K3s terminology:
- cp is the k3s server
- worker node is the k3s agent, which is normally supposed to run on another node
However, in this lab we will have only a k3s server; it will work as both cp and worker node.
- Before installing k3s, you have to create the directory path and the file /var/lib/rancher/k3s/server/manifests/traefik-config.yaml
- Add the following content to the file:
[root@lab67-test ~]# cat /var/lib/rancher/k3s/server/manifests/traefik-config.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        exposedPort: 8080
      websecure:
        exposedPort: 8443
- Please, install k3s on your VM. Make sure the k3s server is up and running under systemd.
curl -sfL https://get.k3s.io | sh -
Please also open port 6443/tcp. This is the port of the Kubernetes API server; it is needed for our monitoring, and you can also use it to connect to your Kubernetes cluster from other kubectl clients.
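For example (assuming firewalld, which the firewall commands later in this lab also use):

firewall-cmd --permanent --add-port=6443/tcp   # Kubernetes API server
firewall-cmd --reload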
If the installation went smoothly, your VM should have:
- a k3s systemd service
- the kubectl command (this command is installed to /usr/local/bin, which might not be in the default PATH)
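A quick way to verify the installation (a sketch; the exact output differs per machine):

systemctl status k3s                  # the k3s service should be active (running)
/usr/local/bin/kubectl get nodes      # your single node should eventually show up as Ready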
2. A simple pod
The main goal of Kubernetes is to orchestrate the lifecycle of a container. We do not interact with individual containers; instead, our smallest object is a Pod. The design principle is that a Pod follows the rule of one process per container. Containers in a Pod are started in parallel, so you cannot rely on a particular start-up order of the containers in the Pod.
A Pod will often consist of a main application container and one or more supporting containers in production. The main container may need logging, a proxy or another special adapter. The supporting containers manage these tasks in the same Pod.
There is only one IP address per Pod, so the containers in a Pod share that IP. Communication between the containers is done via IPC (inter-process communication), the loopback interface or a shared filesystem.
In the lab section, let us deploy the simplest Pod and view its behaviour to our actions. Look at a simple pod template in YAML format below.
- apiVersion - which version of the Kubernetes API you're using to create this object
- kind is the type of object you want to create
- metadata has to include at least the name of the Pod
- spec defines what containers to create and their parameters
apiVersion: v1
kind: Pod
metadata:
  name: myfirstpod
spec:
  containers:
  - image: nginx
    name: user
- Create this pod in k3s using kubectl create -f simple.yaml
Now you deployed a pod on Kubernetes!
Check the pod status with kubectl get pods
You can view the Pod's status change from ContainerCreating to Running.
2.1 Having a look into the pod
Once the Pod is in the Running state, we can have a small look around the object. Kubernetes automatically keeps quite a bit of information about the resource itself. To see this information, use the command kubectl describe pod/myfirstpod.
Name:         myfirstpod
Namespace:    default
Priority:     0
Node:         b123123.sa.cs.ut.ee/192.168.42.80
Start Time:   Mon, 25 Apr 2022 06:17:18 +0300
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.42.0.9
IPs:
  IP:  10.42.0.9
Containers:
  user:
    Container ID:   containerd://a2e59c7d2a323f3c14688185a192b2cba979f3f3ecf281a80ecd9b17c376ffd6
    Image:          nginx
    Image ID:       docker.io/library/nginx@sha256:859ab6768a6f26a79bc42b231664111317d095a4f04e4b6fe79ce37b3d199097
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 25 Apr 2022 06:17:32 +0300
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dml52 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-dml52:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  31m   default-scheduler  Successfully assigned default/myfirstpod to b123123.sa.cs.ut.ee
  Normal  Pulling    31m   kubelet            Pulling image "nginx"
  Normal  Pulled     31m   kubelet            Successfully pulled image "nginx" in 10.796364059s
  Normal  Created    31m   kubelet            Created container user
  Normal  Started    31m   kubelet            Started container user
This output contains a massive amount of information, starting with just how and when the pod was run. On top of that, it contains information about which volumes the pod has access to (in our case only the Kubernetes root CA certificate), which state it is in, what its IP address inside the Kubernetes cluster is, and finally the Events it has had.
The Events table is extremely important for debugging. It will let you know which states a resource has gone through, and point out any errors during the deployment.
Because we are also on the same machine where our Kubernetes runs, we can directly access the services on it. Try doing a curl <pod IP>; it should give you the default nginx webserver output.
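If you do not remember the Pod's IP from the describe output, a minimal way to find and query it (the IP shown in the comment is just the example from the output above):

kubectl get pod myfirstpod -o wide    # the IP column shows the Pod's cluster-internal address
curl http://<pod IP>                  # e.g. curl http://10.42.0.9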
3. Namespaces
The term namespace refers both to the kernel feature and to the segregation of API objects by Kubernetes. Both meanings boil down to keeping resources distinct.
Every API call (i.e. every kubectl command you run and other Kubernetes communication) includes a namespace. The default namespace is used if none is specified. Four namespaces are created with a cluster: default, kube-node-lease, kube-public and kube-system. We will not be concerned with them in this course, but it is good to be aware of them:
- default is the namespace where everything goes by default
- kube-node-lease is used to keep track of other K3s nodes
- kube-public is created automatically and is readable by all users (including those not authenticated)
- kube-system is automatically used by system-level services that no one else but Kubernetes admins should have access to
Let us look at commands to view, create and delete namespaces.
- Look what namespaces you got with the installation of K3s with kubectl get ns
- Now, let us create a namespace called test-ns with kubectl create ns <name of the namespace>
- Having created a namespace, you can reference it in a YAML template when creating an object. See the example below. Create your second simple Pod under your test namespace.
apiVersion: v1
kind: Pod
metadata:
  name: <pod-name>
  namespace: <namespace-name>
...
- Check if the pod is running and try to curl it. Helpful commands: kubectl get, kubectl describe (a combined sketch of these steps follows after this list).
  - You might have an issue now where kubectl get does not show you the container.
  - You need to specify a namespace with the kubectl commands to see resources in non-default namespaces. For example: kubectl get all -n lab11
- You can delete a namespace with kubectl delete ns/<namespace> if you want to keep things clean
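For reference, a combined sketch of the steps above (the file name simple-ns.yaml is hypothetical; use whatever file holds your namespaced Pod template):

kubectl create ns test-ns
kubectl create -f simple-ns.yaml            # Pod template with namespace: test-ns
kubectl get pods -n test-ns                 # the Pod only shows up when the namespace is given
kubectl describe pod/<pod-name> -n test-ns  # find the Pod IP here, then curl it
kubectl delete ns/test-ns                   # optional clean-up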
Now you know how to create a simple namespace and pods. Let us create a pod with a single container, similar to the webserver that we used in Docker Lab.
- Create a namespace lab11
- Create a pod hw1 with the image registry.hpc.ut.ee/sysadmincourse-dev/hello-world:1.0 and define port 5000 as the containerPort option.
apiVersion: v1
kind: Pod
metadata:
  name: <pod-name>
  namespace: <namespace-name>
spec:
  containers:
  - image: <image>
    name: <pod_name>
    ports:
    - containerPort: <port number>
The containerPort field specifies which port Kubernetes should expose from the container by default. This does not mean other ports cannot be exposed.
4. Running a deployment
Now let us step aside and think about Kubernetes' goal of orchestrating the container lifecycle in a scalable manner. Does the behaviour you have just observed fit this statement?
Hopefully, you can see it is not a scalable solution, as it requires a lot of manual work to manage a pod's state: to bring a recently killed container back up, you have to delete the pod and create it again yourself. And this is just with one container; imagine if you have pods in double figures. What do you do then?
In the Terminology section, we touched on the Deployment as a pod state operator that also manages ReplicaSets. The Deployment controller constantly monitors the health of pods and worker nodes; it will replace a failed pod or move a pod to another worker node if the current node has failed. This provides continuity in container availability.
Most typical use cases for deployments:
- A stateless web server with a fixed number of pods, for example nginx. The Deployment will maintain that number;
- A StatefulSet with a mounted persistent volume to ensure data integrity, for example a database instance;
- An automatically scalable number of replicas that responds to workload changes, automatically balancing incoming requests between the replicas, creating new replicas as demand grows and terminating them as demand drops.
ReplicationControllers (RC) are responsible for keeping a specified number of pods running at all times. An RC also provides the ability to do rolling updates.
Deployment object
apiVersion dictates which API to use to create the object. kind gives the type of object. metadata is information used for identifying and working with the object.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <deployment name>
  namespace: <assigned namespace>
The spec field is a little more complicated for a Deployment than for a Pod. It consists of three main fields: `replicas`, `selector` and `template`.
.spec.replicas indicates the number of identical Pods running on the cluster at the same time. The number can be managed manually, and we will see how easy it is to scale up the number of Pods.
.spec.selector defines how the Deployment finds which Pods to manage. In this case, you select a label that is defined in the Pod template (for example .metadata.labels: app: server-py, see below). However, it is possible to create more sophisticated selection rules; the only requirement is that the Pod template satisfies the rule.
.spec.template has the following sub-fields:
- the .metadata.labels field gives a label for the Pod specified below in the .spec field
- the .spec field contains the same settings as we used to run a pod in the namespaces task
spec:
  replicas: <number>
  selector:
    matchLabels:
      app: <Pod label name>
  template:
    metadata:
      labels:
        app: <Pod label name>
    spec:
      ... <Pod spec information>
- Write a Deployment template for the same hello-world container from the previous tasks (the above example will help you do it):
  - utilise the lab11 namespace for the Deployment
  - use a single Pod replica
  - give the Pod a label
  - use the image registry.hpc.ut.ee/sysadmincourse-dev/hello-world:1.0 with port 5000
  - add this key-value pair to the Pod spec: imagePullPolicy: Never, so Kubernetes won't try to fetch the image if it has a local copy
- Apply your Deployment template with kubectl apply (use the -f flag!) and view it with kubectl get deployment
- Try to delete the Pod with kubectl delete pod/<full pod name> and watch what happens to it. Is it still alive?
  - Use kubectl get -n <namespace> pods to find the Pod's full name
- Manually scale the deployment to 3 replicas and view it: kubectl scale (an example command follows after this list)
- What happened after you deleted the Pod? Have you observed the same behaviour as with a simple Pod?
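For example, assuming your Deployment is named lab11 (the name the scaling section later in this lab also uses; substitute your own):

kubectl scale deployment/lab11 --replicas=3 -n lab11
kubectl get pods -n lab11    # three similarly named Pods should appear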
5. Run a service
Coming back to the main Kubernetes concepts of decoupling and transiency, each part of Kubernetes is dedicated to solving a small goal. Keeping this in mind, we have automation for our containers; they are up and running. However, they are still lacking a crucial part: communication between pods within the cluster and connectivity from the outside world. This is where Services come into play. Their purpose is to connect Pods together or provide access from outside the cluster, taking into account that any Pod can be killed and replaced at any time. The replacement Pod is connected back using Labels, and the Service continues to provide the expected resource.
There are several Service Types:
- ClusterIP - by default provides access only internally (can be configured to create an external endpoint)
- NodePort - exposes the service on a static port on each node's IP address (the port has to be opened through the firewall)
- LoadBalancer - passes requests to a cloud provider's load balancer (AWS or OpenStack)
- ExternalName - returns an alias to an external service (doesn't utilise a proxy, but rather redirects at the DNS level)
The kubectl proxy command creates a local service to access a ClusterIP (good for troubleshooting/development work).
The different service types give great flexibility: exposing services externally or internally to the cluster, or connecting to internal and external resources, for example a database that sits outside of Kubernetes.
An agent called kube-proxy watches the Kubernetes API for new services and endpoints on each node. It opens random ports, listens for traffic on them and redirects the traffic to the service's endpoints. Services then use iptables to route traffic.
To add a Service to our existing Pod, it is possible to use kubectl expose; however, in this lab we will stick with writing templates. You can add another section to the Deployment template (use a --- separator) or you can create a new file for the Service. We are going to expose your Pod internally to the cluster via port 80; the Service will pass the traffic to/from the Pod's port 5000.
apiVersion: v1
kind: Service
metadata:
  name: <service name>
  namespace: <namespace name>
spec:
  ports:
  - port: <port accessible inside the cluster>
    targetPort: <port to forward to inside the pod>
    protocol: <UDP or TCP>
  selector:
    app: <Pod label name>
- Write a Service template for the same Pod from the previous Deployment task
  - utilise the lab11 namespace
  - keep in mind the .selector.app label you used before to link the Pod and the Service
- Apply your Service template with kubectl apply
- Use kubectl get services to view your service and curl the service IP and port; you should see the same output as in the previous task (see the sketch after this list).
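A minimal sketch of viewing and querying the Service (the cluster IP is assigned dynamically, so take it from your own output):

kubectl get services -n lab11          # note the CLUSTER-IP of your service
curl http://<service cluster IP>:80    # should return the same output as curling the Pod directly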
Service types
When checking the kubectl get services output, you can see there is a TYPE column.
There are four possible kinds of services:
- ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType.
- NodePort: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
- LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created. This can only be used if there's a cloud provider supporting this methodology (AWS, GCP, Azure), which we do not have.
- ExternalName: Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value. No proxying of any kind is set up. This expects you to have a proxy somewhere.
As you can see, the ClusterIP type is useful only inside the cluster. To publish an application to the outside world with a service, you need to use one of the other three. As we do not have the necessary integrations, let's use the NodePort type.
Edit the previous service's definition .yaml file, and set the .spec.type to "NodePort".
...
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 5000
    protocol: TCP
  selector:
    app: lab11
And apply the definition to the cluster again with kubectl apply -f <definition_file>.
Now when you check the service, you can see it still has a cluster IP, but it also has new information in the PORT(S) column, formatted like this:
<Container port>:<kubernetes host port>/<protocol>
Open the given host port in the firewall, and try querying it with <vm_name>.sa.cs.ut.ee:<kubernetes host port> this time. You should get the same result.
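A sketch of these steps, assuming firewalld and the port number reported by kubectl get services (NodePorts are usually in the 30000-32767 range):

firewall-cmd --permanent --add-port=<kubernetes host port>/tcp
firewall-cmd --reload
curl http://<vm_name>.sa.cs.ut.ee:<kubernetes host port>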
This means you have published a container to the network.
Utilizing automatic service load balancing
As we have a lightweight HTTP based web server deployment and a service set up, we can now see how the service automatically and dynamically configures itself, and load balances between the different endpoints.
First, let's scale up our lab11 namespace deployment. Your deployment might be named differently, so make sure to edit the commands as needed:
kubectl scale deployment/lab11 --replicas=3 -n lab11
This command tells the deployment to increase its replicas from 1 to 3. You can check with the kubectl get pods command how more pods are created, and how they are named similarly but with differing endings. This is also how we can differentiate which pod an answer comes from.
Try curling the service again, but do it at least 5 times. You can see how the "Hostname" of the response changes. This is the load balancing at work.
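For example, a small loop like this (substitute the address you have been curling, be it the cluster IP or the NodePort address):

for i in $(seq 5); do curl http://<service address>; done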
If you were now to increase or reduce the replica count, the service would also internally change itself to match, without any intervention from the operator's side.
6. Run an Ingress
Ingress exposes HTTP and HTTPS traffic from outside the cluster to internal Services, centralising traffic in one place. It is, of course, possible to expose a containerised application outside the cluster using Services alone. However, Ingress is more efficient: instead of lots of services, routing is based on the request host or path, allowing many services to be centralised behind a single point.
Here is a simple block example* of how traffic is managed by Ingress:
An Ingress is usually configured to give Services externally reachable URLs, load balance traffic, terminate SSL/TLS and provide virtual hosting. It is worth noting that an Ingress does not expose arbitrary ports or protocols; these are functions of Services.
In order for the Ingress to work, the cluster must have an ingress controller running. In contrast to other types of controllers, which are part of the kube-controller-manager binary, ingress controllers are not started automatically with a cluster. We have to select an ingress controller in our template before applying it.
There are several classes of ingress controllers (nginx, traefik, envoy, etc.); the tool has to provide a reverse proxy. The ingress controller allows traffic to flow from outside the cluster to an internal service.
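In K3s the bundled traefik controller is deployed into the kube-system namespace, so you can check that it is running before continuing (the exact pod name will vary):

kubectl get pods -n kube-system | grep traefik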
The Ingress resource
An Ingress template requires these fields: apiVersion, kind, metadata and spec. The annotations field configures which ingress controller is called. In this lab, we are going to use traefik.
The Ingress spec contains all the information needed to configure a proxy server or a load balancer. It can also have a list of rules matched against all incoming requests; however, we are not going to touch these.
When a defaultBackend is set, an Ingress sends all traffic to a single default backend, which then handles the request. We are going to pass the requests to the previously created Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <ingress name>
  namespace: <namespace name>
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  defaultBackend:
    service:
      name: <service name linked to the Pod>
      port:
        number: <port exposed internally to the cluster>
- Write an Ingress template with "traefik" as the ingress class and fill in the blanks.
  - utilise the lab11 namespace
- Apply your Ingress template with kubectl apply -f
- curl your machine, but first add the following to your firewall:
firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16 # pods
firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16 # services
firewall-cmd --reload
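After reloading the firewall, you should be able to reach your application through the Ingress. A sketch, assuming the traefik web entrypoint is exposed on port 8080 (as set in traefik-config.yaml during installation) and that 8080/tcp is open in your firewall:

curl http://<vm_name>.sa.cs.ut.ee:8080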
Assemble Deployment, Service and Ingress into a manifest
Everything we have done in this lab can be combined into a complete manifest that can be used to update and modify the way your application is deployed. Below you can see an example of a complete manifest, where each part is separated by ---.
The manifest structure fits CI/CD principles, as updating the current setup only requires changing a single thing and running kubectl apply to update the setup.
apiVersion: v1
kind: Namespace
metadata:
  name: testv0
  labels:
    name: testv0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: serverpy-deploy
  namespace: testv0
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server-py
  template:
    metadata:
      labels:
        app: server-py
    spec:
      containers:
      - name: docker-lab
        image: docker-lab:2.0.0
        imagePullPolicy: Never
        ports:
        - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: serverpy-deploy
  namespace: testv0
spec:
  ports:
  # port accessible inside the cluster
  - port: 80
    # port to forward to inside the pod
    targetPort: 5000
    name: tcp
  selector:
    app: server-py
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: default-backend-ingress
  namespace: testv0
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  defaultBackend:
    service:
      name: serverpy-deploy
      port:
        number: 80
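A usage sketch for such a manifest (the file name is hypothetical):

kubectl apply -f complete-manifest.yaml   # creates/updates the namespace, Deployment, Service and Ingress
kubectl get all -n testv0                 # check what was created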
Optional
If you are interested in a graphical visualisation of your cluster, you may want to look into Lens (https://k8slens.dev/), which will show namespaces, Pods and their logs, deployments and their status, and many other details.
You can get the kubeconfig for your Lens from /etc/rancher/k3s/k3s.yaml.
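One possible way to use that kubeconfig from your own computer (hypothetical paths; by default the file points at 127.0.0.1, so the server address has to be changed, and port 6443/tcp must be open as described earlier):

scp root@<vm_name>.sa.cs.ut.ee:/etc/rancher/k3s/k3s.yaml ./lab11-kubeconfig.yaml
# edit lab11-kubeconfig.yaml: replace 127.0.0.1 with <vm_name>.sa.cs.ut.ee, then import the file into Lens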