Lab 3: Ansible
Welcome to the Ansible lab. Here is a shortlist of topics we will cover during this lab:
- Configuration Management and IT automation
- Installing and testing Ansible
- Introduction to Ansible
- First playbook
- Solving Cloud lab tasks with Ansible
- Ansible modules
- Ansible roles
- Ansible variables
- Git
PS! Before continuing, please make sure you have completed the previous labs.
This lab is strongly recommended but not mandatory. It will help you think in a more transactional manner. The Ansible playbook can be used during an exam, either for using it as documentation or even solving the exam if you have written it well enough.
There are no scoring tests because it is impossible to check if you have solved it by hand or using Ansible.
Configuration Management and IT automation
In the field of Systems Administration, things are primarily done at scale. At least nowadays, especially when talking about the cloud (AWS, GCS, Azure).
"At scale" means that instead of working on a single or a few servers, system administrators usually work with tens, hundreds or thousands of servers. This is necessary for availability, integrity and performance reasons, but as you'll find out in this course, even working on one machine is highly time-consuming.
Here's where configuration management is of help. Instead of manually working on every machine/system/software, we can define a state where we want things to be, let intelligent tools determine the difference between the current state and the required state.
This state (software, configuration files, secrets, settings) difference is passed to IT automation, which applies necessary steps to bridge the gap between these states.
Think of it as code for servers or infrastructure. As you will soon see, it is in some cases very similar to writing code, with functions, variables and conditional sentences.
Utilizing well-developed, documented, and tested automation allows system administrators to automatically manage large systems clusters. This kind of methodology provides systems administrators with a way to automate their job, allowing them to focus on development and infrastructure changes instead of small everyday tasks by conforming to the DRY principle.
Using this kind of automation is deceptively simple. There is a tricky part, an annoying part and an easy part.
Learning the language, in our case, Ansible, is the tricky part. You need to know the tools you use to create an effective automation solution.
Understanding and learning the steps the automation should take is the annoying part. You cannot write automation for setting up a web server if you do not know the steps. Learning these steps is the main focus of the course and why we work on one server at a time.
The easy part is writing the automation for a singular service or server. Implementing this for the automation tool is trivial when you break the automation into well-defined steps beforehand.
Having the different parts of automation work together, allowing a single tool to manage various services, servers, clusters and either tightly or loosely connected infrastructure reliably, is the main difficulty of the whole system administration. Still, we will not focus on this part in this course because we only have 16 weeks. Please check out the DevOps and Cloud Computing courses as continuations to this course.
Installing and testing Ansible
We will use Ansible as the configuration management and IT automation tool during this course. Ansible is a prevalent configuration tool because it is relatively easy to start out with, has first-class support for CentOS, and allows operations across different infrastructure like VMs, servers, containers, network devices, etc.
There are two main ways to run Ansible.
- Installing Ansible inside our lab machine and installing against the same device, e.g.
localhost
- Installing Ansible on your personal machine (not lab machine) and running it against the lab machine in a remote fashion. (Recommended)
The second method is highly recommended and is the primary way Ansible is used in the industry, but we will describe the first method since Ansible might have problems running on Windows machines. You are welcome to adapt the guide to the second method.
First, let's install Ansible on our virtual machines:
yum install ansible
# Use yum to install Ansible
ansible --version
# Check whether Ansible binary works
You can be sure it works if you get an output with the version and different configuration option definitions.
We can also test whether we can connect to our VM using Ansible:
ansible -m ping localhost
# Use the Ansible ping
module to verify a connection to localhost
You should get a green SUCCESS message like this:
[centos@test ~]# ansible -m ping localhost localhost | SUCCESS => { "changed": false, "ping": "pong" }
Introduction to Ansible
Ansible expects a particular folder structure to work as intended.
As a starter, you need the following folder structure:
[centos@test ~]# tree ansible ansible ├── inventory # folder │ └── hosts # file └── playbook.yml # file
You also need to specify some information in the two files we created.
The hosts
file is an inventory of hosts you'll be managing with Ansible. For us, it's only our localhost
currently.
The format of the hosts
file looks like this:
[group1] <ip> <hostname> [group2] google.com 8.8.8.8
But as we only have one host, we should have it like this:
[localhost] localhost
And this will allow us to run the playbook with having our VM - localhost
- as the target because we're running Ansible from inside the VM itself.
Now we only have to give Ansible something to do. There are two ways to do that:
- Ad-hoc commands
- Playbooks
Ad-hoc commands are singular commands that do something on the server. For example, the ping
command we used was an ad-hoc command:
ansible -m ping localhost
ansible -m <module> <target>
But this methodology is mainly used to test out things or fix simple mistakes made by playbooks.
An Ansible playbook is a blueprint of automation tasks - which are actions executed with no human involvement. The tasks are run one after another, without the operator (you) having any way to influence the process after execution has started.
First playbook
As we want to manage our machine with Ansible completely, let's start by using playbooks.
The playbook file is called playbook.yml
, but you can name it anything you would like.
A playbook looks like this:
- hosts: all user: centos become: yes become_user: root tasks: - name: Testing this out debug: msg="Hello from server!"
As this is YAML, indentation is significant. If you get syntax errors, please make sure indentation is fine.
Running the above playbook is done like this: ansible-playbook playbook.yml -i inventory/hosts
Once you run this, it will process for a bit and then fail in red text. Because we told it to run as user centos
and against the host localhost
, it will now try to login into the machine, which fails, because we have not set up a key.
First, generate a key for your VM's centos
user with the ssh-keygen
command (it will walk you through key generation).
Afterwards, copy the generated public key to the authorized_keys
file.
cat /home/centos/.ssh/id_rsa.pub >> /home/centos/.ssh/authorized_keys
And then try re-running the playbook. This time, you should get green output.
Ansible modules
The most important part of the whole playbook is the tasks
, which defines which actions are to be taken by the playbook upon execution.
The tasks
declaration is processed top-down, meaning the first task is executed first.
Every task consists of a name, module and arguments, in the format of:
- name: something module: argument: "value"
or the short form of:
- name: something module: argument="value"
In our previous testing example, we used the debug
module, which you can use to debug parts of your playbook. Documentation here: https://docs.ansible.com/ansible/2.4/debug_module.html
Every action you want to do with Ansible needs to use an appropriate module. The list of modules is massive: https://docs.ansible.com/ansible/2.9/modules/list_of_all_modules.html.
We will be helping you out with which modules are, in our opinion, correct ones to use for particular tasks in future labs.
Solving Cloud lab tasks with Ansible
As an introduction to Ansible, we will be solving the tasks we had in the Cloud/ETAIS lab by writing a playbook.
We will be using the following modules for this:
- user
- file
- get_url
There are four different problems we need to solve:
- Making a
scoring
user. - Creating the
.ssh
directory for thescoring
user - Copying the
scoring.sa.cs.ut.ee
public certificate toscoring
user'sauthorized_keys
file. - Adding
scoring
as a sudoer's user.
Let's go task by task, adding parts to the Ansible playbook and running the playbook after every addition.
First, making the user is done using the user
module, like this:
- name: Add user scoring as Linux user user: name: scoring
After adding this part, the playbook should output a green "ok" task. The green output indicates that the system is already in the state defined in Ansible, which is because we already did the necessary change by hand in Lab 3.
An important concept to understand is "idempotency", which means that no new actions are to be made when a state is already reached on a machine. Idempotency allows you to run Ansible on several targets of different states and have them arrive at one particular state in the end. You can also keep re-running the Ansible, and it will only change what's needed.
The second part we need to add is making the .ssh
folder. For this, we use the file
module.
- name: Create .ssh directory file: path: /home/scoring/.ssh state: directory owner: scoring group: scoring mode: '0700'
We also make sure the owner, group and permissions (mode) are correct. The state: directory
implies making a directory instead of a file.
The third part is adding the scoring.sa.cs.ut.ee
public key to the authorized_keys
file. We can do a bit of magic here, where the key is downloaded automatically with the get_url
module:
- name: Download user scoring public key get_url: url: https://scoring.sa.cs.ut.ee/files/id_rsa.pub dest: /home/scoring/.ssh/authorized_keys owner: scoring group: scoring mode: '0600'
This module will download the file from the website defined as the url
parameter's value and put it into dest
. Again, we limit the file's user, group, and permissions.
The last part is making the sudoers
file modification. While we could use the lineinfile
module to make appropriate changes inside the sudoers file itself, we are better off using a solution provided by the Linux distribution authors - the configuration directories.
Some software and configuration files also include all the settings from a separate folder, just for situations where we want to modify the system utilising configuration management. Editing a file with configuration management is more complex than just adding the correct file into a folder to specify only the settings we want to change.
For achieving this, we will still use the lineinfile
module to create a file in /etc/sudoers.d/scoring
, but we won't have to write regex:
- name: Create the /etc/sudoers.d/scoring file lineinfile: dest: /etc/sudoers.d/scoring line: "scoring ALL=(ALL) NOPASSWD: ALL" state: present create: yes
create: yes
allows for creating the file if it does not exist.
After adding this part, the playbook should output a yellow "changed" task. The yellow output indicates that Ansible has made a change in the system. You will see the "changed" status only once.
Ansible roles
As our task is eventually having the whole ten labs written down as Ansible, it will become challenging to manage the hundreds of tasks later on.
Mitigating this issue requires us to use something called "Ansible roles". Think of these as functions in code - they are packaged subsets of Ansible playbooks, usually meant for solving one issue at a time. E.g. a role for setting up the webserver, a role for setting up the firewall, etc.
This kind of isolation allows you to select which roles you want to include when running Ansible, without going through a hundred tasks every time.
This requires setting up the roles folder appropriately, moving our tasks from the playbook into the role folder, and including the role in the playbook.
First, make an appropriate roles structure:
[centos@test ansible]# pwd /home/centos/ansible [centos@test ansible]# tree roles roles └── etais └── tasks └── main.yml
Then move the four tasks for solving the Cloud/ETAIS lab into the roles/etais/tasks/main.yml
file, so it looks like this:
- name: Add user scoring user: name: scoring - name: Create .ssh directory file: path: /home/scoring/.ssh state: directory owner: scoring group: scoring mode: '0700' - name: Download user scoring public key get_url: url: https://scoring.sa.cs.ut.ee/files/id_rsa.pub dest: /home/scoring/.ssh/authorized_keys owner: scoring group: scoring mode: '0600' - name: Create the /etc/sudoers.d/scoring file lineinfile: dest: /etc/sudoers.d/scoring line: "scoring ALL=(ALL) NOPASSWD: ALL" state: present create: yes
The main.yml
file is the default file that gets included when referring to a role. In a functional role, the main.yml
has to exist.
And now let's add the definition of the role to our main playbook, so it looks like this:
- hosts: all user: centos become: yes become_user: root roles: - etais tasks: - name: Testing this out debug: msg="Hello from server!"
Once you run this, all the tasks should still be green, but if you noticed, then the order of execution changed. That's because roles
are always processed before tasks
.
Now, when you start making another lab's playbook, you can just create a new role in the roles
folder, include it, and conveniently split tasks.
Roles definitions allow you also to use only parts of the written playbook during the exam.
Ansible variables
Ansible is only moderately useful if we expect the environments to be always the same. This problem can be solved by writing some intelligence into our playbooks.
In Ansible, there are two main ways to add intelligence - variables and conditionals. We will cover conditionals in a later lab when we have an excellent use case.
Variables, on the other hand, are helpful from the beginning.
When you run Ansible on your host, it first gathers information about your environment. Ansible documentation calls this information "facts".
You can check which facts Ansible has about your host by running: ansible -m setup localhost
. The output is a massive index of facts, including the machine's IP address, SSH keys used, hypervisors run on etc.
You can also define your variables using the hosts file, group_vars
and host_vars
folders, set_facts
module and more.
We will go more in-depth with variables once we need them, but just as an example, let's create a task that prints out some information with variables.
- name: Testing this out debug: msg="Hello from user {{ ansible_user_id }} on server {{ ansible_hostname }}!"
Using Git to store files
The Ansible playbook we wrote works as our documentation, backup plan, exam helping tool and a future example. Making sure it does not disappear when we accidentally delete our machine is an essential part.
Let's upload our playbook to some version management system to solve this problem. Git is a version control system you should already be familiar with from previous courses. While GitHub is a free public repository host, we will be using a private Gitlab instance created especially for the Computer Science Institute.
As the first step, log in to our GitLab at https://gitlab.cs.ut.ee. For the authentication method, choose LDAP instead of Standard.
- Now create a blank project
- Give it a name of your choosing or just your matrix number
- Project slug value is autogenerated, and you don't have to change that
- For the Visibility level, keep it Private
- And finally, skip Initialize repository with a README since we will be using our Ansible repo root as the initialiser.
Once the project has been made, we can upload our playbook.
First, initialise your global git configuration. You can also copy-paste these commands from the git repo setup instructions.
- git config --global user.name "<your username>"
- git config --global user.email <firstname.lastname>@ut.ee
Now we are ready to push our repository into the course GitLab. You should already be in the Ansible repository root directory, but if you are not, change your directory to that of the Ansible.
- cd <ansible repository>
- git init to initialise the directory as a git repo.
- git remote add origin https://gitlab.cs.ut.ee/<username>/<repository name>.git Make sure to replace the username and repository name values accordingly
- git add .
- git commit -m "First commit"
git push -u origin main
And you should be done. Refresh the page in gitlab to see all of your files nicely stored in the git repository. Also from here on now, don't forget to regularly commit your changes to the repository, atleast after each lab.