Overview
This week's topic is a small introduction to Docker and DevOps-style deployments. We will look into how to deploy Docker applications so that running them does not require changing configuration files every time.
Please make sure you did the previous lab, especially editing the /etc/docker/daemon.json
file. You can delete last week's containers, but not the Docker configuration, as we will utilize Docker in this and the following lab as well.
This lab is composed of the following topics:
- Debugging containers
- Linking containers
- Properly publishing a container using a dynamic proxy
Debugging containers
Often enough, things do not work the way you want them to: requests do not return what you expect, the container seems to be missing data, a dependency you did not account for is missing, and so on.
For these reasons, you must know how to find debug information from containers.
Logs
The Docker daemon does write logs, which you can read with:
journalctl -r -u docker
However, you will notice that these logs are only about the docker service itself. You usually care more about the container logs. These can be checked with:
docker logs <container ID|name>
So, for example, running docker logs <docker_lab_container_name>
should return something like this:
192.168.251.1 - - [11/May/2020:03:47:16 +0000] "GET / HTTP/1.1" 200 7234 "-" "curl/7.66.0" "-"
192.168.251.1 - - [11/May/2020:03:48:36 +0000] "GET / HTTP/1.1" 200 7234 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:75.0) Gecko/20100101 Firefox/75.0" "-"
These are in the same format as Apache access logs. All the container's errors end up here as well, if the software and Dockerfile are set up properly. You can also read the logs of stopped (but not deleted) containers, which is especially handy for finding out why they stopped.
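If you want to watch the logs live, or look at a container that has already stopped, the following commands may help (the container names here are placeholders):
docker logs -f --tail 20 <container ID|name>   # follow new log lines live, starting from the last 20 (Ctrl+C to stop)
docker ps -a                                   # list all containers, including stopped ones
docker logs <stopped_container_name>           # logs are still readable until the container is deleted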
Internals
Logs are not always helpful. Sometimes you want to debug something inside the container or test network connectivity.
For this reason, you can (sometimes, when the creator of the Dockerfile has left the appropriate binaries in the image) open a shell inside a container and use it as if it were your own VM.
Try doing this: docker exec -ti ecstatic_dewdney sh
- exec means to execute a command inside a container
- -t means to open a tty (shell)
- -i makes it interactive so that you can type there
- ecstatic_dewdney is the name of the container we built an image for. This name is auto-generated and you can find it with docker ps.
- sh is the command to be executed
If this works, it should drop you into a new shell. If you check the files and IP address, it is an entirely different machine.
You can check the configuration used to run the container (cat /etc/passwd), install software (apk update && apk add vim), and ping the network to test networking.
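For example, here are a few commands you could run inside that shell to convince yourself it really is a separate machine (assuming an Alpine-based image, as in the previous lab):
cat /etc/os-release           # the container's own distribution, not your VM's
ip addr                       # the container's own network interfaces and IP address
apk update && apk add vim     # installs software inside this container only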
Some things to remember:
- If you make changes to this container from the inside, they will be there only for the lifetime of this particular container. The moment you delete the container, they are gone. For this reason, it is never good practice to make changes inside the container; always change the Dockerfile instead.
- localhost now means the container itself.
  - If you do curl localhost:5000, you now see the same information on port 5000 as you did on port 5005 before.
  - You may need to first install curl (apk add curl).
  - Port 5005 gives an error because nothing is listening on port 5005 inside the container.
- If you want to access your VM host, not the container, you need to use an IP address. If you check your container's IP address, it should be something like 192.168.67.X. You can access your VM on 192.168.67.1 (e.g. curl 192.168.67.1:5000).
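One way to check a container's IP address from the host, without going inside it, is docker inspect with a format template; a sketch, using the example container name from above:
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' ecstatic_dewdney   # prints the container's IP on each network it is attached to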
Security scanning
One of the more significant issues with packaging software into containers is that it is challenging to keep track of what versions of software, operating system and tools you are running in each container.
It's even worse when trying to find container images. How can you make sure the image you want to use is safe?
Thankfully, nowadays we have tools called security scanners. There are quite a few of them, but we will be using Trivy, which does not need any other dependencies to run.
To install Trivy on your VM, run the required commands from here: https://aquasecurity.github.io/trivy/v0.26.0/getting-started/installation/
Trivy installs its binary to a folder that might not be in your PATH. This means you have to call it with the full path: /usr/local/bin/trivy.
Scanning with Trivy is relatively simple. All you need to do is give it the name of a local image, like:
/usr/local/bin/trivy image alpine
Or scan an image in a remote registry:
/usr/local/bin/trivy image registry.hpc.ut.ee/mirror/library/python:3.4-alpine
As you can see, the Alpine image has no outstanding security issues, which is excellent. The Python image, on the other hand, is old enough to have quite a few.
These issues can be solved by either recreating the entire image or updating it when using it as a base image for projects.
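If the full report is overwhelming, Trivy can filter by severity; a sketch using standard Trivy flags (the images are just examples):
/usr/local/bin/trivy image --severity HIGH,CRITICAL registry.hpc.ut.ee/mirror/library/python:3.4-alpine   # show only the most serious findings
/usr/local/bin/trivy image --exit-code 1 --severity CRITICAL alpine   # non-zero exit code if critical issues are found, handy for scripting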
Linking containers
Sometimes, you want to hide your running service from the outside world for security reasons, but still allow it to be accessible by some containers.
A good example is a database. A database does not need to be accessible from the outside world, as it contains important information that is very easy to access - you only need to know the right username and password.
This is why Docker uses networks internally, which is one of the most complicated aspects of Docker. We will make our own network, and make one container available only from inside another.
We will use the whoami container we set up in the previous lab. If you do not have it, set up a new container: https://github.com/containous/whoami
- As a reminder: docker run -d --name whoami registry.hpc.ut.ee/mirror/containous/whoami
- As you can notice, we did not set up a port for it. It only has a service listening on port 80 inside the container, nothing outside the container.
- Go try to find a way to access it. (Clue: there is a way, but it is annoying.)
- Let's also try to ping it from our previous container:
  - docker exec -ti ecstatic_dewdney sh (this is the self-built container, use your own container's name)
  - ping whoami
- What happens?
- Also, check how many network interfaces this container has.
So, as you can see, the two containers cannot talk to each other.
Let's now make a new network, and add our containers to it:
docker network create test --subnet 192.168.150.0/24
- NB! The subnet is mandatory. If you do not specify it, Docker may pick an address range that conflicts with your VM's network, and you lose access to your machine over the internet.
- You can check the networks by using docker network ls and docker network inspect test.
- docker network connect test whoami
- docker network connect test ecstatic_dewdney
Now, if you go back inside the ecstatic_dewdney container, recheck the network interfaces.
Let's try to ping the other container.
docker exec -ti ecstatic_dewdney sh
ping whoami
As you can see, the container can now ping the other container, and if you have specified a name for it, it can use that name!
NB! Pinging by name only works when you specify a --name <smth> parameter when running the container. If you do not, the container gets an auto-assigned name and IP address, and it is your responsibility to know what you need to connect to.
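For example, if you start a new container with both a name and a network right away, it is reachable by that name from the other containers on the same network (whoami2 is just an illustrative name):
docker run -d --name whoami2 --network test registry.hpc.ut.ee/mirror/containous/whoami
docker exec -ti ecstatic_dewdney ping -c 3 whoami2   # resolves via Docker's embedded DNS on the user-defined network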
There are tools to make this easier for you, for example docker-compose, but we find that you cannot use such tools properly unless you know what actually happens when you do use them.
Properly publishing a container
Even though publishing things to the internet with the ports method would technically work, there are huge problems with that approach:
- You need to remember which service is on which port.
- You cannot scale services, as you cannot put multiple of them listening on the same port.
- If you use a service provider, it is very common that only some ports are allowed in from the internet (e.g. on the public internet, only ports 80 and 443).
- Firewall and security configuration becomes complicated.
- You have no overview of how often or how your service is used, unless the software inside your container provides that information (it usually does not).
One of the solutions would be a localhost proxy like we did in the web lab or the last lab. The problem with this is that it would solve only points 2, 3 and 5. Thankfully, there are well-thought-out solutions out there capable of proxying without using any Docker ports.
One of these services is called Traefik. We will be setting up a Traefik proxy to a container without dabbling with the fancy buttons and dials Traefik has (automatic encryption, metrics, logging, complicated routing). Those are left as homework, if you are interested.
Make a directory at /etc/traefik and copy the following inside /etc/traefik/traefik.toml.
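A minimal sketch of the steps (assuming you have root privileges on the VM):
mkdir -p /etc/traefik
vi /etc/traefik/traefik.toml   # or any editor; paste the configuration below into this file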
[global]
  checkNewVersion = true
  sendAnonymousUsage = true

[entryPoints]
  [entryPoints.web]
    address = ":80"

[log]

[api]
  insecure = true
  dashboard = true

[ping]

[providers.docker]
This configuration opens up ports 80 and 8080 inside the container - port 80 for accessing websites/containers and port 8080 for a fancy dashboard.
Now run Traefik: docker run -d -p 50080:80 -p 58080:8080 -v /var/run/docker.sock:/var/run/docker.sock:ro -v /etc/traefik/traefik.toml:/traefik.toml:ro --restart=always traefik:v2.1.8
- We map container port 80 to host port 50080 because Apache is already listening on port 80.
- We map container port 8080 to host port 58080 to follow the same convention as before.
- Traefik needs access to the host's /var/run/docker.sock to find the containers we will start up. NB! Do this only with services you trust! One can read security information from this socket and use this socket to become root from inside the container!
- We also mount /etc/traefik/traefik.toml as the configuration file inside the container.
After running this and opening ports 50080 and 58080, you should be able to access them from the internet. Check port 58080.
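You can also quickly check from the VM itself that Traefik is alive; the paths below are standard Traefik v2 endpoints enabled by the [api] and [ping] sections above, but treat this as a sketch:
curl localhost:58080/ping          # should answer OK
curl localhost:58080/api/rawdata   # raw view of what Traefik currently knows about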
Now we have a working proxy, but we have not told it to proxy anything. Let's fix that problem.
Run a container like this:
docker run -d --name whoami-traefik --label traefik.enable=true --label 'traefik.http.routers.router.rule=Host(`<machine_name>.sa.cs.ut.ee`)' --label 'traefik.http.routers.router.entrypoints=web' registry.hpc.ut.ee/mirror/containous/whoami
Where you replace the appropriate bits:
- traefik.enable=true tells Traefik to proxy this container
- traefik.http.routers.router.rule=Host(`<machine_name>.sa.cs.ut.ee`) tells which name to route to this container
- traefik.http.routers.router.entrypoints=web says which entrypoint to use (we have only one)
- --name whoami-traefik sets a name for the container so that you cannot run multiple instances of the container with these settings; Traefik does not like that
If you go to the page <machine_name>.sa.cs.ut.ee:50080, you should see the output of the whoami container. You can check the logs to see how it works, and there should also be information about this container on the Traefik dashboard.
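If DNS for <machine_name>.sa.cs.ut.ee is not set up yet, you can simulate the same request from the VM by supplying the Host header yourself (replace the placeholder as above):
curl -H 'Host: <machine_name>.sa.cs.ut.ee' http://localhost:50080/   # Traefik routes on the Host header, so this hits the whoami container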