Overview
This week's topic is shared and distributed file systems. We will talk about how to share data and filesystems between servers and clients for easy access, and show you today's standards for data management.
This lab is composed of the following topics:
- Network File System (NFS)
- SAMBA
- Object Storage (s3)
Shared filesystems
While a single isolated server is often useful on its own, sometimes you need to share data with other servers or with clients.
One option would be to just copy the data over every time you need it, but this would be a headache to maintain, especially when multiple clients and/or servers access the files at the same time. Example use cases:
- Accounting file storage for a big corporation, with multiple people working on the files
- A highly available website that has multiple servers serving the same content
- Data archival and retrieval systems for big institutions
- Multi-site datacenters with their massive eventual-consistency database systems
On top of that, this type of filesystem also gives you an interface for backing up data, which is immensely important in the digital world.
Network File System (NFS)
NFS is often the easiest way to set up filesystem sharing in UNIX-like environments (BSD, Linux, macOS). As each student has only one machine to work with, your machine will be both the NFS server (the party providing the filesystem) and the NFS client (the party accessing the filesystem).
Important files:
- /etc/fstab # This file describes the filesystems that the OS should mount automatically on boot
- /etc/exports # This file describes the filesystems that you export over NFS
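The lab does not require an /etc/fstab entry, but as a rough sketch of what one could look like for the NFS mount used in this section, the comment at the end of the block shows a hypothetical line (the options defaults,_netdev are an assumption, not something the lab prescribes). The helper has_fstab_entry is also hypothetical; it only checks whether a given mount point is already listed.

```shell
#!/bin/sh
# Succeeds if the fstab file has a non-comment entry for the given mount point.
# $1 = mount point, $2 = fstab file (defaults to /etc/fstab)
has_fstab_entry() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/fstab}"
}

# Hypothetical entry for this lab's NFS mount (a single line in /etc/fstab):
#   <vm_name>.sa.cs.ut.ee:/shares/nfs  /mnt/nfs  nfs  defaults,_netdev  0  0
```

Calling has_fstab_entry /mnt/nfs before editing the file helps to avoid adding the same entry twice.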
- Install the nfs-utils package.
- Start the nfs-server service.
- Use the man pages or the internet to find out the format of the /etc/exports file.
- Make the directory that you will export - /shares/nfs
- Make the directory you will mount the exported filesystem to - /mnt/nfs
- Set up the export using the /etc/exports file.
  - You want to give your own machine read-write access to the /shares/nfs directory.
  - As you are eventually mounting from your own machine, you can give permissions to localhost.
- After changing /etc/exports, use the exportfs -a command to publish the configuration change.
- Mount the filesystem using the following command:
  mount -t nfs <vm_name>.sa.cs.ut.ee:/shares/nfs /mnt/nfs
  - This command should work without any output. If there is any output, something went wrong.
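As a small illustration of the export step above, the following sketch prints one possible /etc/exports entry. The helper name exports_entry is hypothetical, and rw is only the minimal option for read-write access; the exports man page remains the authoritative description of the format.

```shell
#!/bin/sh
# Print one /etc/exports entry in the form: <directory> <client>(<options>)
# Note: there is no space between the client and the opening parenthesis.
exports_entry() {
    printf '%s %s(%s)\n' "$1" "$2" "$3"
}

# Entry matching the steps above: /shares/nfs, read-write, for localhost.
exports_entry /shares/nfs localhost rw
```

As root, the printed line can be appended to /etc/exports and published with exportfs -a; exportfs -v then lists what is currently exported.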
If your mount command worked without any output, then your filesystem should be mounted. This should be visible in the outputs of the following commands:
mount
df -hT
Also, when creating a file in /mnt/nfs, it should appear in /shares/nfs and vice versa.
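The file check described above could be scripted roughly as follows. This is only a sketch: check_shared is a hypothetical helper that creates a marker file in one directory, looks for it in the other, and cleans up afterwards.

```shell
#!/bin/sh
# Create a marker file in one directory and check that it shows up in another.
# $1 = directory to write into, $2 = directory where the file should appear
check_shared() {
    marker="share-check-$$"
    touch "$1/$marker" || return 1
    if [ -e "$2/$marker" ]; then
        result=0
    else
        result=1
    fi
    rm -f "$1/$marker"
    return "$result"
}
```

For this lab it would be called as check_shared /mnt/nfs /shares/nfs and then again with the arguments swapped, to cover the "vice versa" part.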
SAMBA
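Although NFS is fairly easy to set up and manage, it does not solve all problems. Firstly, Windows machines do not play well with NFS. Secondly, with NFS you grant access to a whole client machine rather than to individual users, which is not desirable for end-user clients.
Windows networks use a different file sharing protocol for this, called SMB (an older dialect is known as CIFS). Samba is an open source implementation of SMB for UNIX-like systems, and it is what you will set up here.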
- Install the following packages: samba samba-common samba-client cifs-utils
- Make the directory that you will export - /shares/samba
- Create a samba group:
  groupadd samba_group
- Add some users to this group:
  - usermod -g samba_group scoring # Necessary for Nagios checks!
  - usermod -g samba_group centos # Used for your own testing
- Add samba passwords to these users:
  - smbpasswd -a scoring # Set the password to be the same as for Nagios, 2daysuperadmin
  - Repeat for other users you want to use
- Set appropriate permissions:
  chmod -R 0755 /shares/samba
  chown -R root:samba_group /shares/samba
  chcon -t samba_share_t /shares/samba
- Edit the samba configuration file /etc/samba/smb.conf, and add the following section:
  [smb]
  comment = Samba
  valid users = @samba_group
  path = /shares/samba
  browsable = yes
  writable = yes
- Start the smb service.
The command testparm shows whether the samba configuration is valid.
You can use smbclient -L //localhost -U <user> to list all the accessible shares in a system.
This command can be used with users that you gave a password to.
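The testparm check mentioned above can also be run non-interactively, for example before restarting the smb service. The following is a minimal sketch; check_samba_conf is a hypothetical wrapper, and it relies on testparm -s (suppress the prompt) returning a non-zero exit code when the configuration cannot be loaded.

```shell
#!/bin/sh
# Validate a Samba configuration file without prompting.
# $1 = config file to check (defaults to /etc/samba/smb.conf)
# Returns 0 if testparm accepts the file, 2 if testparm is not installed,
# and testparm's own non-zero exit code otherwise.
check_samba_conf() {
    conf="${1:-/etc/samba/smb.conf}"
    if ! command -v testparm >/dev/null 2>&1; then
        echo "testparm not found; is the samba package installed?" >&2
        return 2
    fi
    testparm -s "$conf" >/dev/null
}
```

A possible use is check_samba_conf && systemctl restart smb, so that the service is only restarted when the configuration loads cleanly.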
For mounting the filesystem, make sure the mount point /mnt/samba exists, then use the following command:
mount -t cifs -o username=scoring //localhost/smb /mnt/samba/
Object Storage
Even though NFS and Samba can solve most of these problems, you will end up in a bit of a configuration hell once you have machines in different networks or parts of the world and try to share filesystems between them.
For example, NFS requires at least ports 111 and 2049 to be open on the server side. Opening additional ports to the outside world is a security concern in itself.
This is why Amazon Web Services used the concept of Object Storage to create their own standardized service, called S3. Object Storage is a computer data storage architecture that manages data as objects, as opposed to other storage architectures such as file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. [1] The S3 service allows you to access storage as you would access a website, using the HTTP and HTTPS protocols.
Let's play around with object storage a bit. First of all, let's make an account. Now, your lab tutors have set up a system where you can add a user yourself. Usually, these users are made for you by the Object Storage service provider.
Inside your VM, there's a service called Consul running. This service is responsible for advertising the existence of your machine to our Scoring server. This is how we find out your machine's IP address and name automatically.
This service also has a subsystem, called a Key-Value store. You can write values into this store. We have utilized this to allow you to make users for yourself.
Run the following command:
/usr/local/bin/consul kv put object/<machine_name> <password>
<password> should be an alphanumeric string at least 8 characters long.
<machine_name> should be lowercase, otherwise you cannot make a bucket later.
Do not use a password that you use anywhere else or otherwise need to keep secret, as this value can be read by any other student by running /usr/local/bin/consul kv get object/<machine_name>.
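The constraints above can be checked before writing anything to the Key-Value store. The sketch below is only an illustration; valid_password and valid_machine_name are hypothetical helpers that encode the two rules as stated (alphanumeric and at least 8 characters, no uppercase letters).

```shell
#!/bin/sh
# Succeeds if $1 is alphanumeric and at least 8 characters long.
valid_password() {
    case "$1" in
        *[![:alnum:]]*) return 1 ;;
    esac
    [ "${#1}" -ge 8 ]
}

# Succeeds if $1 is non-empty and contains no uppercase letters.
valid_machine_name() {
    case "$1" in
        '' | *[[:upper:]]*) return 1 ;;
    esac
    return 0
}
```

If both checks pass for your values, the consul kv put command shown above can be run with them.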
Also, even though it's technically possible for you to change other people's passwords as well, if you do that, you will have automatically failed this course. (We will know, there's monitoring in place)
After having done that, you should now have access to the Object Storage. First things first, let's see if the Object Storage service answers and is accessible with your username and password.
Go to https://scoring.sa.cs.ut.ee:9000. Access Key is your machine name, Secret Key is your password.
When you sign in, it should look rather empty. If you cannot log in, make sure you chose an appropriate password, and try again.
This is a web interface for the object storage. Such interfaces are not always the same, as they depend on the service that provides them. Ours is called "Minio", and it is completely open source (you can download and run it yourself).
It looks empty because we have used policies to configure your user to only have access to a bucket with the same name as your username. Buckets are logical containers in which data is held in Object Storage. You can think of them like folders inside normal filesystems, but there are differences.
Here is the policy that has been applied to each student's user:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:CreateBucket",
        "s3:PutBucketPolicy",
        "s3:GetBucketPolicy",
        "s3:DeleteBucket",
        "s3:DeleteBucketPolicy",
        "s3:ListAllMyBuckets",
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::${aws:username}"
      ],
      "Sid": ""
    },
    {
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::${aws:username}/*"
      ],
      "Sid": ""
    }
  ]
}
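In this policy, ${aws:username} is a policy variable that is replaced with the name of the user making the request. As a rough illustration of the result (policy_resources is a hypothetical helper, not part of any tool), the following prints the two resources a given user ends up with.

```shell
#!/bin/sh
# Print the resources the policy grants to user $1 once ${aws:username}
# has been substituted.
policy_resources() {
    printf 'arn:aws:s3:::%s\n' "$1"      # the bucket itself (bucket-level actions)
    printf 'arn:aws:s3:::%s/*\n' "$1"    # every object inside it (object-level actions)
}

policy_resources "<machine_name>"
```

This is why only a bucket named exactly like your user can be created and used.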
So let's configure your VM to use object storage, and make your own bucket.
Inside your VM, download and set up the following tool: https://docs.min.io/docs/minio-client-complete-guide.html
We trust you to be able to download and install it yourself, but here are a few pointers:
- You want to use the "Binary Download (GNU/Linux)" download.
- Move the file to /usr/bin/, so you can use it without specifying the path.
- Make sure it is owned by root.
- Make sure it is executable.
mc --help should output no errors.
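The pointers above could be followed roughly like this. It is a sketch under assumptions: the download URL in MC_URL is assumed and should be checked against the guide linked above, and download_mc and install_mc are hypothetical helper names.

```shell
#!/bin/sh
# Assumed download location of the GNU/Linux (64-bit Intel) binary; check the
# guide linked above for the current one.
MC_URL="https://dl.min.io/client/mc/release/linux-amd64/mc"

# Download the binary to the path given as $1.
download_mc() {
    curl -fsSL -o "$1" "$MC_URL"
}

# Copy the binary $1 into directory $2 as an executable file named mc.
# When run as root, the installed file is owned by root.
install_mc() {
    install -m 0755 "$1" "$2/mc"
}
```

Run as root, download_mc /tmp/mc followed by install_mc /tmp/mc /usr/bin would place the client at /usr/bin/mc, after which mc --help can be used to check it.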
After having done that, we need to configure your machine to be able to talk to the Object Storage service.
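- Check all the configured hosts: mc config host list
- Add our own: mc config host add scoring https://scoring.sa.cs.ut.ee:9000 <machine_name> <password>
- Make sure it shows up in the output of the check command.
If your version of mc does not recognize mc config host, note that newer releases use mc alias list and mc alias set instead, with the same arguments.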
If everything went well, we should now be in a state where we can start using the storage. First things first, make a bucket.
mc mb scoring/<machine_name>
- You can also try making a bucket while substituting <machine_name> with something else; it should fail.
At this point, the bucket should show up in the web interface as well. You can try playing with the storage now.
The command structure is as follows:
mc [FLAGS] COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]
This means:
- List all files in a remote bucket: mc ls <host>/<bucket>, or in our case, mc ls scoring/<machine_name>
- Copy a local file to a remote bucket: mc cp /path/to/file <host>/<bucket>/<file>
- Cat a file in a remote bucket: mc cat <host>/<bucket>/<file>
All these files should show up in the web interface as well.
For the purposes of the Nagios check, please make a file called scoringtestfile, with only your machine name written in it, and add it to the top level of your bucket.
Nagios will do the following check:
mc cat scoring/<machine_name>/scoringtestfile
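One way to create and upload that file is sketched below. publish_scoring_file is a hypothetical helper; it assumes the host alias scoring has already been configured in mc, and that a trailing newline after the machine name does not bother the check.

```shell
#!/bin/sh
# Write the machine name into a temporary file and upload it as scoringtestfile.
# $1 = machine name (also the bucket name)
publish_scoring_file() {
    machine="$1"
    tmp="$(mktemp)" || return 1
    # Assumption: a trailing newline after the name is acceptable to the check.
    printf '%s\n' "$machine" > "$tmp"
    mc cp "$tmp" "scoring/$machine/scoringtestfile"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

After calling publish_scoring_file <machine_name>, the mc cat command that Nagios runs should print your machine name.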
As far as this tutorial is concerned, the lab is over. You are free to keep playing around with the object storage to learn more: Object Storage is widely used in Google Cloud Services and Amazon Web Services, and if you are ever going to work in a technical IT role, you cannot get around it.
Extra materials for self-teaching:
- mc command complete guide: https://docs.min.io/docs/minio-client-complete-guide.html
- restic command - a backup solution built on top of Object Storage: https://restic.readthedocs.io/en/stable/