0. Overview of Lab 8
Welcome to the 8th lab. Here is a short list of what we will do in this lab:
- Installing updates
- Getting familiar with local filesystems
- EXT4, BTRFS, XFS, FAT32 (for boot partitions)
- We will go into depth on EXT4 and BTRFS
- LVM partition or logical volume formatted with a specific type
- Getting familiar with network filesystems
- NFS v3/v4 - mounting/unmounting, serving
- SMB
- Object Storage
Before we can continue, figure out the answers to these two questions:
- How many disks are connected to your virtual machine?
- Think through (with the help of, for example, the `tree` command) which directories are important, so that you do not mount over or delete these folders accidentally.
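To help answer these, the following sketch surveys the top level of the directory tree. It assumes nothing beyond standard tools; `tree` may need to be installed first, so a plain `ls` fallback is included:

```shell
# Survey the top of the directory tree; tree gives a nicer view,
# but plain ls works everywhere if tree is not installed.
tree -L 1 -d / 2>/dev/null || ls /
```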
1. Local Filesystems
File systems (abbreviated as FS) are an integral part of any computer system. While a disk is the physical medium that data is stored on, a file system controls how data is written to, stored on and retrieved from that medium. As such, many different types exist, each with its own structure, logic and features. The file system layer acts as an intermediary between hardware and applications: whether the storage is a USB stick, a hard drive, an SSD or an NVMe drive, a file system mediates input/output (I/O) between the physical hardware and applications.
Local file systems usually run on a single, local machine with directly attached storage. Most local file systems are POSIX-compliant, meaning they support system calls such as read(), write(), and seek(). From a user's perspective, the main differences between local FS types are scalability, performance and ease of use. The main things to consider are how large the FS needs to be, what unique features it should have, how it performs under a specific workload (e.g. databases), and which services and (operating) systems the FS should be compatible with.
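As a small illustration of these POSIX calls, the sketch below uses `dd` to seek past the end of a brand-new file before writing, something any POSIX filesystem must support (the path and sizes are arbitrary examples):

```shell
# seek= makes dd lseek() 1 MiB into the (new) output file before
# writing 2 bytes; the apparent size becomes 1 MiB + 2 bytes, and
# most local filesystems store the skipped range as a sparse "hole".
printf 'hi' | dd of=/tmp/sparse-demo bs=1 seek=1048576 conv=notrunc status=none
stat -c '%s' /tmp/sparse-demo    # prints 1048578
```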
In this lab, we will talk about EXT4, BTRFS, XFS and fat32 local file systems, as well as some shared/network filesystems.
EXT4
Extended journaling file system 4 (ext4) is the 4th generation of the ext FS family and was introduced as the default file system in RHEL 6. It has read and write backward compatibility with ext2 and ext3. The ext family was the first standard filesystem created specifically for the Linux kernel, using a structure inspired by traditional Unix principles.
Traditionally, ext4 divides a disk into block groups, each with its own superblock copy, group descriptors, bitmaps, inode tables and, finally, data blocks. This block-based approach is usually aligned with the physical sector size of hard disks to help performance. To locate a file, ext4 first reads the superblock and inode tables to figure out where the data lives, and only then reads the data itself. It is worth noting that each block group carries its own metadata, meaning the metadata is interleaved with the actual data.
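You can inspect this layout yourself without touching a real disk. The sketch below, assuming the e2fsprogs tools `mkfs.ext4` and `dumpe2fs` are installed, builds a tiny ext4 image inside a regular file and dumps a few superblock fields:

```shell
# Create an 8 MiB file-backed ext4 image (no root needed for a file)
# and print some superblock fields that describe its layout.
truncate -s 8M /tmp/ext4-demo.img
mkfs.ext4 -F -q /tmp/ext4-demo.img    # -F: allow operating on a regular file
dumpe2fs -h /tmp/ext4-demo.img 2>/dev/null | grep -E '^(Block size|Block count|Inode count)'
```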
ext4 adds several new and improved features over the previous generation, such as:
- Supported maximum file system size up to 50 TiB
- Extent-based metadata
- Delayed allocation
- Journal checksumming
- Large storage support
The extent-based metadata and the delayed allocation features provide a more compact and efficient way to track utilized space in a file system. These features improve file system performance and reduce the space consumed by metadata. Delayed allocation allows the file system to postpone the selection of the permanent location for newly written user data until the data is flushed to disk. This enables higher performance since it can allow for larger, more contiguous allocations, allowing the file system to make decisions with much better information.
File system repair time using the fsck utility in ext4 is much faster than in ext2 and ext3. Some file system repairs have demonstrated up to a six-fold increase in performance. This is in large part due to the filesystem being "journaled": a circular log of changes not yet synced to the disk is kept, which, in case of a crash or power failure (commonly called a dirty shutdown), allows for quick recovery using the journal.
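If you want to see fsck in action without risking a real disk, a throwaway file-backed image works. This is a sketch assuming e2fsprogs is installed; the `-n` flag keeps the check strictly read-only:

```shell
# Build a scratch ext4 image and force a full read-only consistency
# check on it; a healthy filesystem passes all five phases.
truncate -s 8M /tmp/fsck-demo.img
mkfs.ext4 -F -q /tmp/fsck-demo.img
fsck.ext4 -f -n /tmp/fsck-demo.img    # -f: force check, -n: never modify
```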
XFS
XFS is a highly scalable, high-performance, robust, and mature 64-bit journaling file system that supports very large files and file systems on a single host. It is the default file system in Red Hat Enterprise Linux. XFS was originally developed in the early 1990s by SGI and has a long history of running on extremely large servers and storage arrays.
The features of XFS include:
- Reliability
- Metadata journaling, which ensures file system integrity after a system crash by keeping a record of file system operations that can be replayed when the system is restarted and the file system remounted
- Extensive run-time metadata consistency checking
- Scalable and fast repair utilities
- Quota journaling. This avoids the need for lengthy quota consistency checks after a crash.
- Scalability and performance
- Supported file system size up to 1024 TiB
- Ability to support a large number of concurrent operations
- B-tree indexing for scalability of free space management
- Sophisticated metadata read-ahead algorithms
- Optimizations for streaming video workloads
- Allocation schemes
- Extent-based allocation
- Stripe-aware allocation policies
- Delayed allocation
- Space pre-allocation
- Dynamically allocated inodes
- Other features
- Reflink-based file copies (new in Red Hat Enterprise Linux 8)
- Tightly integrated backup and restore utilities
- Online defragmentation
- Online file system growing
- Comprehensive diagnostics capabilities
- Extended attributes (xattr). This allows the system to associate several additional name/value pairs per file.
- Project or directory quotas. This allows quota restrictions over a directory tree.
- Subsecond timestamps
Performance characteristics of XFS
XFS performs well on large systems with enterprise workloads. A large system is one with a relatively high number of CPUs, multiple HBAs, and connections to external disk arrays. XFS also performs well on smaller systems that have a multi-threaded, parallel I/O workload.
XFS has a relatively low performance for single-threaded, metadata-intensive workloads: for example, a workload that creates or deletes large numbers of small files in a single thread.
BTRFS
Btrfs (B-Tree Filesystem) is a modern copy-on-write (CoW) filesystem for Linux. Btrfs aims to implement many advanced filesystem features while focusing on fault tolerance, repair, and easy administration. The btrfs filesystem is designed to support the requirement of high performance and large storage servers, especially across multiple disks.
Btrfs is intended to address the lack of pooling, snapshots, checksums, and integral multi-device spanning in Linux file systems. Its core data structure differs from ext4 and XFS in a major way: instead of block-based allocation, data is kept in a B-tree format, a balanced tree of pointers. This structure makes snapshots easy to handle, by simply making the snapshot tree point to a different file/directory. It also speeds up looking files up, since a tree can be traversed much more efficiently.
- Advanced features
- Copy-on-write (COW)
- Copy-on-write is an optimization: if multiple callers ask for resources that are initially indistinguishable, they are pointed to the same resource. In other words, if a unit of data is copied but not modified, the copy exists only as a reference to the original. Only when a new byte is written is a copy actually created.
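You can poke at CoW semantics from the shell with `cp --reflink`. This is a hedged sketch: on a CoW filesystem such as Btrfs the copy shares the original's blocks until one side is modified, while `--reflink=auto` quietly falls back to a normal copy elsewhere, so the file paths here are purely illustrative:

```shell
echo 'original data' > /tmp/cow-src
cp --reflink=auto /tmp/cow-src /tmp/cow-copy   # shares blocks on a CoW fs
echo 'changed' >> /tmp/cow-copy                # only now are new blocks allocated
cat /tmp/cow-src                               # the original is untouched
```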
- Checksums
A checksum is a sequence of characters used to check data for errors. If the checksum of an original file is known, you can confirm its copy is identical with a checksum utility. Data on drives can accumulate tiny errors, for example bit flips, caused by magnetic issues, solar flares or, most commonly, copying. Btrfs keeps checksums of metadata and data and can traverse all the files. You can therefore schedule the checksum verification to run regularly, and the FS will tell you if something has gone wrong and will try to find and fix the flipped bit.
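The same idea can be reproduced in userspace with an ordinary checksum tool, a rough analogue of what Btrfs does automatically (the file names here are arbitrary):

```shell
# Record a checksum of the data, then verify the data against it later;
# any flipped bit in the file would make the verification fail.
echo 'important data' > /tmp/cksum-demo
sha256sum /tmp/cksum-demo > /tmp/cksum-demo.sha256
sha256sum -c /tmp/cksum-demo.sha256
```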
Unfortunately, Red Hat has removed BTRFS support from RHEL 8 and consequently from CentOS 8. It did, however, become the default FS in Fedora 33.
FAT32
The FAT (File Allocation Table) file system was originally developed for use on floppy disks. While no longer the default file system for Microsoft Windows, it is still used on smaller devices and storage media, particularly since it is easy to implement and is supported by every major OS (Windows, Linux, macOS). The FAT format starts with a large table of all the file pointers of a partition, after which the actual data blocks follow. This is in stark contrast to ext-based filesystems, where metadata is kept inside each file system block group itself.
FAT has many limitations, the original version FAT8 having only 8-bit table entries, which allowed a maximum partition size of 8 MiB. With each revision of FAT the table entry size was increased, and the last version, FAT32, supported block sizes that could be chosen when formatting the partition; with standard Windows settings, however, the maximum file size is 4 GiB and the maximum partition size 2 TiB. FAT32 also offers little in the way of extra features, especially for redundancy and recoverability. All these factors, paired with limits on file name lengths, have led to FAT-type filesystems being used in relatively niche circumstances, mainly legacy hardware or applications.
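The 4 GiB figure is not arbitrary: FAT32 records a file's size in a 32-bit field, so the largest representable size is 2^32 - 1 bytes, as a quick shell calculation shows:

```shell
echo $(( (1 << 32) - 1 ))    # 4294967295 bytes, one byte short of 4 GiB
```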
Tasks for local filesystem
A partition organises a storage device into smaller segments: creating partitions allows you to use some percentage of your storage space for a specific purpose and leave the rest alone. A common use for a partition is to hold an ordinary filesystem.
1. Create the partition:
The `lsblk` command shows you all block devices attached to your system:
```
[root@labor6 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0   20G  0 disk
├─sda1   8:1    0  512M  0 part /boot
└─sda2   8:2    0 19.5G  0 part /
sdb      8:16   0    1G  0 disk
```
Here you will see how many block devices you have, in this case `sda` and `sdb`. You can see that `sda` has two partitions:

- `sda1` - your boot partition
- `sda2` - your root partition

`sdb` has zero partitions and is not mounted; however, it may have some data on it. To check this you can issue the `blkid` command:
```
[root@labor6 ~]# blkid
/dev/sda1: LABEL="boot" UUID="57a362af-5447-4e88-9161-ff32a2feb513" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="5ac37ea1-01"
/dev/sda2: LABEL="root" UUID="bfedb9fb-9860-47f2-8355-9b654605bbb4" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="5ac37ea1-02"
```
`blkid` determines the type of content (for example filesystem or swap) that a block device holds, and some attributes from the content metadata, for instance UUID, BLOCK_SIZE, TYPE.

Here you can see that `sdb` is not listed, meaning that `sdb` doesn't have a filesystem on it and is therefore an empty block device.
The first character, before the permissions, in the output of `ls -l` shows the type of file:

- `-` for a regular file
- `d` for a directory
- `l` for a link

In the directory `/dev`, block devices are shown as files and are indicated by the letter `b` in the `ls -l` output, which stands for block device.
```
[root@labor6 ~]# ls -l /dev/ | grep '^b'   # show all files in /dev, keep only lines starting with "b"
brw-rw----. 1 root disk 8,  0 Mar 22 08:23 sda
brw-rw----. 1 root disk 8,  1 Mar 22 08:23 sda1
brw-rw----. 1 root disk 8,  2 Mar 28 14:27 sda2
brw-rw----. 1 root disk 8, 16 Mar 22 08:23 sdb
```
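The file type can also be queried per file with `stat`; `/dev/null` is used below only because it exists on every system (it is a character device, type `c`, rather than a block device):

```shell
stat -c '%F' /dev/null    # prints: character special file
```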
Let's make use of the unused device `sdb`. We can, by default, create several partitions, from `sdb1` to `sdb4`. Partitions are created using the `fdisk` command. It gives a detailed overview of the block devices, partitions and logical volumes:
```
[root@labor6 ~]# fdisk -l
Disk /dev/sda: 20 GiB, 21474836480 bytes, 41943040 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x5ac37ea1

Device     Boot   Start      End  Sectors  Size Id Type
/dev/sda1  *       2048  1050623  1048576  512M 83 Linux
/dev/sda2       1050624 41943006 40892383 19.5G 83 Linux

Disk /dev/sdb: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
```
`fdisk` is an interactive tool to create/manage/delete partitions.

- Let's start by creating partitions on sdb: `fdisk /dev/sdb`
```
[root@labor6 ~]# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.32.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x05d607e2.
```

The message is expected, as it is an unused block device. Select 'm' to view all the available options:

```
Command (m for help): m

Help:

  DOS (MBR)
   a   toggle a bootable flag
   b   edit nested BSD disklabel
   c   toggle the dos compatibility flag

  Generic
   d   delete a partition
   F   list free unpartitioned space
   l   list known partition types
   n   add a new partition
   p   print the partition table
   t   change a partition type
   v   verify the partition table
   i   print information about a partition

  Misc
   m   print this menu
   u   change display/entry units
   x   extra functionality (experts only)

  Script
   I   load disk layout from sfdisk script file
   O   dump disk layout to sfdisk script file

  Save & Exit
   w   write table to disk and exit
   q   quit without saving changes

  Create a new label
   g   create a new empty GPT partition table
   G   create a new empty SGI (IRIX) partition table
   o   create a new empty DOS partition table
   s   create a new empty Sun partition table
```
Let's print the partition table by selecting 'p'. At the moment the table is empty:
```
Command (m for help): p
Disk /dev/sdb: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x05d607e2
```
- Now let’s select option ‘n’ to add a partition:
```
Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p):
```
Let us stop for a moment and talk about partition types: primary, extended and logical. You can only create 4 partitions on a given MBR block device. The limit comes from the MBR partition table: the MBR is a single 512-byte sector, and its partition table has room for only four entries. Hence, the concept of partitions was extended into primary, extended and logical.
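The arithmetic behind the limit is simple: the MBR's partition table consists of four 16-byte entries tucked into the single 512-byte boot sector (at byte offsets 446-509), so a fifth entry simply has nowhere to go:

```shell
echo $(( 4 * 16 ))    # 64 bytes of partition table inside the 512-byte MBR
```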
Primary partition - a conventional partition, created by default unless specified otherwise. If you want to make your partition bootable, it has to be a primary partition.
Extended partition - a special type of partition intended to hold multiple logical partitions. It can be thought of as a container for all logical partitions. An extended partition allows you to create many logical partitions, solving the problem of only having 4 primary partitions.
Logical partition - a partition created inside an extended partition. They can be used in the same way as a primary partition.
You should make sure to create an extended partition if, in the future, you may need or want more than 4 partitions.
Let’s resume our practical part and create an extended partition which is going to be 1GiB in size and then create two logical partitions there:
```
Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): e
Partition number (1-4, default 1):
First sector (2048-2097151, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-2097151, default 2097151):

Created a new partition 1 of type 'Extended' and of size 1023 MiB.
```
Why does the first sector start at 2048 rather than 0?
Print the partition table and have a look at the created partition; we already used the command for viewing the table above. Here is the output:
```
Disk /dev/sdb: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa60b1417

Device     Boot Start     End Sectors  Size Id Type
/dev/sdb1        2048 2097151 2095104 1023M  5 Extended
```
Note that this hasn't been applied yet; we have to enter 'w' to write the changes. Until then, the changes are just a preview.
Let’s go ahead and create two logical partitions to populate the container.
- Logical partition 1 should have first sector = 4096 and the last sector = 1048574
- Logical partition 2 should have the first sector = 1050624 and the last sector = 2097151.
The results should look roughly like this:
```
Command (m for help): p
Disk /dev/sdb: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa60b1417

Device     Boot Start     End Sectors  Size Id Type
/dev/sdb1        2048 2097151 2095104 1023M  5 Extended
/dev/sdb5        4096 1048574 1044479  510M 83 Linux
/dev/sdb6     1050624 2097151 1046528  511M 83 Linux
```
After ensuring that we have achieved the desired outcome, let's apply the changes by writing the new partition table, using the 'w' option:
```
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
```
Check for the created partitions with `lsblk` or `ls -l /dev | grep "sdb"`.
2. Format the partition, aka install a filesystem:
To do so you need to use the `mkfs` command:
```
[root@labor6 ~]# mkfs     # use double TAB to see the list
mkfs         mkfs.cramfs  mkfs.ext2    mkfs.ext3    mkfs.ext4
mkfs.fat     mkfs.minix   mkfs.msdos   mkfs.vfat    mkfs.xfs
```
Let's install an `ext4` filesystem on `sdb5` and `xfs` on `sdb6`, and call them `lab7FSext4` and `lab7FSxfs` accordingly. Make sure to use the `-L` flag and assign the given names:
```
[root@labor6 ~]# mkfs.ext4 -L "lab7FSext4" /dev/sdb5
mke2fs 1.45.6 (20-Mar-2020)
Discarding device blocks: done
Creating filesystem with 522236 1k blocks and 130560 inodes
Filesystem UUID: 16d13727-6e25-40a8-9459-39846f4c77f4
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
```
Repeat the step to create the `xfs` filesystem and name it properly, then run `blkid`:
```
[root@labor6 ~]# blkid
/dev/sda1: LABEL="boot" UUID="57a362af-5447-4e88-9161-ff32a2feb513" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="5ac37ea1-01"
/dev/sda2: LABEL="root" UUID="bfedb9fb-9860-47f2-8355-9b654605bbb4" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="5ac37ea1-02"
/dev/sr0: BLOCK_SIZE="2048" UUID="2021-03-21-11-55-42-00" LABEL="config-2" TYPE="iso9660"
/dev/sdb5: LABEL="lab7FSext4" UUID="16d13727-6e25-40a8-9459-39846f4c77f4" BLOCK_SIZE="1024" TYPE="ext4" PARTUUID="a60b1417-05"
/dev/sdb6: LABEL="lab7FSxfs" UUID="916d1f69-d90f-4077-b23f-360619a1b6bf" BLOCK_SIZE="512" TYPE="xfs" PARTUUID="a60b1417-06"
```
As soon as a partition is formatted, a UUID is assigned to that block device. Any unformatted block devices won't appear in the above list. All of this info is useful when adding entries to the `/etc/fstab` file.

You can analyse block devices with the `file` command:
```
[root@labor6 ~]# file -sL /dev/sdb5
/dev/sdb5: Linux rev 1.0 ext4 filesystem data, UUID=16d13727-6e25-40a8-9459-39846f4c77f4, volume name "lab7FSext4" (extents) (64bit) (large files) (huge files)
```
3. Mount the partition:
By running lsblk, you can see sdb5 and sdb6 are not mounted yet, so we need to mount them (make sure they are formatted by checking with the blkid command). You can manually mount and unmount partitions with the mount and umount commands. However, mount is not persistent across reboots, so we also need to deal with that.
Warning: it is highly important to test your /etc/fstab, because any error inside it will prevent your machine from booting.
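One way to test before rebooting is `findmnt --verify` from util-linux, which parses an fstab and reports problems without mounting anything. On your VM you would simply run `findmnt --verify` against the live `/etc/fstab`; the sketch below points it at a throwaway table via `--tab-file` (the sample entry is made up) so nothing real is involved:

```shell
printf 'UUID=0000 /data ext4 defaults 0 2\n' > /tmp/fstab.test
findmnt --verify --tab-file /tmp/fstab.test   # flags unreachable sources, unknown fs types, etc.
```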
Let's take a look at what is inside the `/etc/fstab` file:
```
[root@labor6 ~]# cat /etc/fstab

#
# /etc/fstab
# Created by anaconda on Sun Mar  1 19:48:45 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
UUID=bfedb9fb-9860-47f2-8355-9b654605bbb4 /     xfs defaults 0 0
UUID=57a362af-5447-4e88-9161-ff32a2feb513 /boot xfs defaults 0 0
```
From `man fstab`: "The file fstab contains descriptive information about the filesystems the system can mount. fstab is only read by programs, and not written; it is the duty of the system administrator to properly create and maintain this file. Each filesystem is described on a separate line. Fields on each line are separated by tabs or spaces. Lines starting with '#' are comments. Blank lines are ignored."
You need to add a new entry to this file to automount your FS. Each entry has 6 fields:

1. the block special device or remote filesystem to be mounted
2. the mount point (target) for the filesystem
3. the type of the filesystem
4. the mount options associated with the filesystem
5. used by dump (a back-up utility) to determine which filesystems need to be dumped; defaults to zero (don't dump) if not present
6. used by fsck(8) to determine the order in which filesystem checks are done at boot time
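To make the six fields concrete, here is a sample line pulled apart with awk (the entry itself is made up for illustration, not taken from a real machine):

```shell
printf 'UUID=1234 /data ext4 defaults 0 2\n' |
  awk '{printf "dev=%s mnt=%s type=%s opts=%s dump=%s pass=%s\n", $1, $2, $3, $4, $5, $6}'
# dev=UUID=1234 mnt=/data type=ext4 opts=defaults dump=0 pass=2
```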
- Create two directories where the partitions will be mounted, with the `mkdir -p` command, and name them `/lab7/fs-ext4` and `/lab7/fs-xfs` accordingly.
- Use your favourite text editor to open `/etc/fstab` and add two lines to describe your partitions.
  - Column 4 -> defaults
  - Column 5 -> 0
  - Column 6 -> 0
Here is an example of such lines:

```
UUID=V2VsbCBEb25lISBIT1QgVElQIGlzIGJsa2lk /lab7/fs-ext4 ext4 defaults 0 0
UUID=V2VsbCBEb25lISBIT1QgVElQIGlzIGJsa2lk /lab7/fs-xfs  xfs  defaults 0 0
```
Now execute the `mount -a` command to test your new configuration. It will mount all devices specified in `/etc/fstab`, unless they are already mounted.

Then run `systemctl daemon-reload` to update systemd, and `lsblk` to check out your new setup.
4. Snapshots; restoring
https://docs.ansible.com/ansible/latest/collections/ansible/posix/mount_module.html
Warning! Ansible has modules to create/delete/format filesystems; we don't recommend using them, as they might mess up your machine. However, it is fine to mount with Ansible.
2. Network Filesystems (NFS)
NFS is a distributed file system protocol originally developed by Sun Microsystems (Sun; now Oracle) in 1984. The protocol allows a user on a client computer to access files over a computer network much as if they were on local storage. It builds on the Open Network Computing Remote Procedure Call (ONC RPC) system. NFS is an open standard defined in a Request for Comments (RFC), allowing anyone to implement the protocol.
NFS tasks:
NFS is often the easiest way to set up filesystem sharing in UNIX environments (BSD, Linux, macOS). As students have one machine to work with, your machine will be both the NFS server (the party providing the filesystem) and the NFS client (the party accessing the filesystem).
Important files:

- `/etc/fstab` # describes filesystems that the OS should automount on boot
- `/etc/exports` # the file in which you describe your exported NFS filesystems
- Install the `nfs-utils` package.
- Start the `nfs-server` service.
- Use the man pages or the internet to find out the format of the `/etc/exports` file.
- Make the directory that you will export - `/shares/nfs`.
- Make the directory you will mount the exported filesystem to - `/mnt/nfs`.
- Set up the export using the `/etc/exports` file.
  - You want to give your own machine read-write access to the `/shares/nfs` directory.
  - As you are eventually mounting from your own machine, you can give the permissions to localhost.
- After changing `/etc/exports`, use the `exportfs -a` command to publish the configuration change.
- Mount the filesystem using the following command: `mount -t nfs <vm_name>.sa.cs.ut.ee:/shares/nfs /mnt/nfs`
  - This command should work without any output. If there is output, something went wrong.
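If you get stuck on the `/etc/exports` format, one possible shape of an entry, per exports(5), looks like the following. The host and the option list are illustrative; pick options that match the task's read-write requirement:

```
# /etc/exports: one exported directory per line, followed by
# host(options) pairs. This sample entry is an assumption, not the answer.
/shares/nfs  localhost(rw,sync)
```

After editing, `exportfs -a` publishes the change, and `exportfs -v` shows what is currently exported.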
If your mount command worked without any output, then your filesystem should be mounted. This should be visible in the outputs of the following commands:
- mount
- df -hT
Also, when creating a file in /mnt/nfs, it should appear in /shares/nfs and vice versa.
CIFS, SMB, SAMBA
The Server Message Block (SMB) protocol is another network file sharing protocol, providing shared access to files, printers and serial ports between nodes on a network. It allows applications on a computer to read and write files and to request services from server programs in a computer network, and it can be used on top of TCP/IP or other network protocols. Using SMB, an application (or the user of an application) can access files or other resources on a remote server: read, create, and update files, and communicate with any server program that is set up to receive an SMB client request. SMB is also used as a fabric protocol by software-defined data center (SDDC) computing technologies, such as Storage Spaces Direct and Storage Replica.
The first common implementation of SMB is called CIFS (Common Internet File System), which was originally developed by IBM and later merged under the SMB name. With the release of Windows 95 in the mid-1990s, Microsoft made considerable modifications to the most commonly used SMB version. Microsoft then merged the updated version of the SMB protocol (rebranded as CIFS) with the LAN Manager product, bringing both client and server support. With later revisions, Microsoft dropped the CIFS name in favor of SMB.
While the first revision of SMB was developed by IBM to run on NetBIOS, Unix-like operating systems don't support SMB communication by default, which is why Samba was created to accommodate communication between the two kinds of systems. Samba runs on Linux clients but implements a native Windows protocol, which means Samba is an open-source CIFS/SMB implementation. Samba is the standard Windows interoperability suite for file systems on Unix-like operating systems.
Recap:

- CIFS was developed by IBM
- Microsoft modified it and released CIFS/SMB 1.0
- Microsoft then dropped the CIFS name and continued with SMB 2.0
- SAMBA was then created as a tool to create and mount CIFS/SMB type shared file systems
- Install the following packages: samba samba-common samba-client cifs-utils
- Make the directory that you will export - /shares/samba
- Create a samba group:
- groupadd samba_group
- Add some users to this group:
- usermod -a -G samba_group scoring # Necessary for Nagios checks!
- usermod -a -G samba_group centos # Used for your own testing
- Add samba passwords to these users:
- smbpasswd -a scoring # Set the password to be same as Nagios, 2daysuperadmin
- Repeat for other users you want to use
- Set appropriate permissions:
- chmod -R 0755 /shares/samba
- chown -R root:samba_group /shares/samba
- chcon -t samba_share_t /shares/samba
- Edit the samba configuration file /etc/samba/smb.conf, and add following section:
```
[smb]
comment = Samba share for SA lab
valid users = @samba_group
path = /shares/samba
browsable = yes
writable = yes
```
- Start and enable the smb service.
We will also need to open the following ports
- 139 and 445 TCP
The `testparm` command shows whether the samba configuration is valid.

You can use `smbclient -L //localhost -U <user>` to list all the accessible shares on a system. This command can be used with the users that you gave a password to.
For mounting the filesystem, use the following command:
mount -t cifs -o username=scoring //localhost/smb /mnt/samba/
Object Storage
Object storage, often referred to as object-based storage, is a data storage architecture for handling large amounts of unstructured data. This is data that does not conform to, or cannot be organized easily into, a traditional relational database with rows and columns. Today’s Internet communications data is largely unstructured. This includes email, videos, photos, web pages, audio files, sensor data, and other types of media and web content (textual or non-textual). This content streams continuously from social media, search engines, mobile, and “smart” devices.
Objects are discrete units of data stored in a structurally flat data environment. There are no complex hierarchies like folders, symlinks, file permissions or file types as in a file-based system. Each object is a simple, self-contained repository that includes the data, metadata (descriptive information associated with the object), and a unique identifying ID number (instead of a file name). This information enables an application to locate and access the object. You can aggregate object storage devices into larger storage pools and distribute these pools across locations. This allows for unlimited scale, as well as improved data resiliency and disaster recovery. Object storage thereby removes the complexity and scalability challenges of a hierarchical file system. Objects can be stored locally, but most often reside on cloud servers, accessible from anywhere in the world.
You can store any number of static files on an object storage instance to be called by an API. Additional RESTful API standards are emerging that go beyond creating, retrieving, updating, and deleting objects. These allow applications to manage the object storage, its containers, accounts, multi-tenancy, security, billing, and more.
In reality, object storage is also used as a bit of a medium to accommodate normal workflows. Most object storage software still allows using folders to segregate objects, while permitting normal, filesystem-like access on top of API access.
This is why Amazon Web Services used the object storage concept to create their own standardized service, S3. Object storage is a computer data storage architecture that manages data as objects, as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. [1] The S3 service allows you to access storage as you would access a website, using the HTTP and HTTPS protocols.
Even though NFS and Samba can be used to solve most file sharing problems, you'll be in a bit of a configuration hell once you start having machines in different parts/networks of the world, with tera- or petabytes of data on them, and try to share filesystems between those machines. For example, NFS requires at least ports 111 and 2049 to be open on the server side, and opening unusual ports is a security issue in itself. On top of that, both Samba and NFS are very difficult to load balance. Object Storage, due to its HTTP-based workflow, uses the same load-balancing tools as, for example, websites.
Let's play around with object storage a bit. First of all, let's make an account. Your lab tutors have set up a system where you can add a user yourself. Usually, these users are made for you by the Object Storage service provider.
Inside your VM, there's a service called `Consul` running. This service is responsible for advertising the existence of your machine to our Scoring server. This is how we find out your machine's IP address and name automatically.
This service also has a subsystem, called a Key-Value store. You can write values into this store. We have utilized this to allow you to make users for yourself.
Run the following command:
/usr/local/bin/consul kv put object/<machine_name> <password>
`<password>` should be an alphanumeric string at least 8 characters long.

`<machine_name>` is usually your matrix number, unless you made a mistake in naming it. Do not use the full domain name. This value should be lowercase, otherwise you cannot make a bucket later.

Make sure not to use a password you use anywhere else, as this value can be read by any other student by doing `/usr/local/bin/consul kv get object/<machine_name>`.
Also, even though it's technically possible for you to change other people's passwords, if you do that, you will automatically fail this course. (We will know; there is monitoring in place.)
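Since the password must be an alphanumeric string of at least 8 characters, it can be worth sanity-checking it locally before storing it. A sketch, using the hypothetical values b12345 and Example123 (substitute your own machine name and password):

```shell
# Sketch with hypothetical values; substitute your own machine name and password.
password="Example123"
if echo "$password" | grep -Eq '^[A-Za-z0-9]{8,}$'; then
    echo "password ok"
    # Store it (run this on your VM):
    # /usr/local/bin/consul kv put object/b12345 "$password"
else
    echo "password too short or not alphanumeric"
fi
```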
After running the put command, you should now have access to the Object Storage.
First things first, let's see if the Object Storage service answers and is accessible with your username and account.
Go to https://scoring.sa.cs.ut.ee:9000. Access Key is your machine name, Secret Key is your password.
When you sign in, it should look rather empty. If you cannot log in, make sure you chose an appropriate password, and try again.
This interface is a web interface for object storage. It is not always the same, as this is dependent on the service that provides the interface. Our interface is called "Minio", and it is completely open source (you can download and run it yourself).
It looks empty because we have used policies to configure your user to only have access to buckets named after your username. Buckets are the logical containers in which data is held in Object Storage. You can think of them like folders in normal filesystems, but there are differences.
Here is the policy that has been applied to each student's user:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:CreateBucket",
        "s3:PutBucketPolicy",
        "s3:GetBucketPolicy",
        "s3:DeleteBucket",
        "s3:DeleteBucketPolicy",
        "s3:ListAllMyBuckets",
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [ "arn:aws:s3:::${aws:username}" ],
      "Sid": ""
    },
    {
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": [ "arn:aws:s3:::${aws:username}/*" ],
      "Sid": ""
    }
  ]
}
So let's configure your VM to use object storage, and make your own bucket.
Inside your VM, download and set up the following tool: https://docs.min.io/docs/minio-client-complete-guide.html We trust you to be able to download and install it yourself, but here are a few pointers:
- You want to use the "Binary Download (GNU/Linux)" download.
- Move the file to /usr/bin/, so you can use it without specifying the path.
- Make sure it is owned by root.
- Make sure it is executable.
mc --help
should output no errors.
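The pointers above can be sketched as the following sequence, run as root. The download URL is an assumption based on MinIO's standard binary distribution; verify it against the guide linked above if it has moved.

```shell
# Download the standalone mc binary (URL is an assumption; check the mc guide)
curl -L -o mc https://dl.min.io/client/mc/release/linux-amd64/mc
mv mc /usr/bin/mc             # usable without specifying a path
chown root:root /usr/bin/mc   # owned by root
chmod 755 /usr/bin/mc         # executable
mc --help                     # should print usage without errors
```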
After that, we need to configure your machine to be able to talk to the Object Storage service.
- Check all the configured hosts:
mc config host list
- Add our own:
mc config host add scoring https://scoring.sa.cs.ut.ee:9000 <machine_name> <password>
- Make sure it shows up in the check command above.
If everything went well, now we should be in a state where we can start using the storage. First things first, make a bucket.
mc mb scoring/<machine_name>
- You can also try making a bucket while substituting <machine_name> with something else; it should fail.
This is a point where a bucket should show up in the web interface as well. You can try playing with the storage now.
The command structure is as follows:
mc [FLAGS] COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]
This means:
- List all files in a remote bucket:
mc ls <host>/<bucket>
, or in our case, mc ls scoring/<machine_name>
- Copy a local file to remote bucket:
mc cp /path/to/file <host>/<bucket>/<file>
- Cat a file in remote bucket:
mc cat <host>/<bucket>/<file>
All these files should show up in the web interface as well.
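Putting the commands together, a hypothetical session, assuming machine name b12345 and the scoring host added earlier (substitute your own machine name):

```shell
echo "hello object storage" > testfile.txt      # create a local file
mc cp testfile.txt scoring/b12345/testfile.txt  # upload it into the bucket
mc ls scoring/b12345                            # testfile.txt should be listed
mc cat scoring/b12345/testfile.txt              # prints the file's contents
```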
For the Nagios check, please make a file called scoringtestfile, with only your machine name written in it, and add it to the uppermost path in your bucket.
Nagios will do the following check:
mc cat scoring/<machine_name>/scoringtestfile
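One way to create and upload that file, a sketch assuming machine name b12345 (substitute your own):

```shell
echo "b12345" > scoringtestfile                       # file contains only the machine name
mc cp scoringtestfile scoring/b12345/scoringtestfile  # put it at the top level of the bucket
mc cat scoring/b12345/scoringtestfile                 # the same check Nagios will run
```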
For the purposes of this tutorial, the lab is over. You are free to keep playing around with the object storage to learn more, as Object Storage is widely used in Google Cloud and Amazon Web Services, and if you are ever going to work as an IT technical person, you cannot do without it.
Extra materials for self-learning:
- mc command complete guide: https://docs.min.io/docs/minio-client-complete-guide.html
- restic command, a backup solution built on top of Object Storage: https://restic.readthedocs.io/en/stable/