How to Configure Block Storage on Linux

Understanding and managing storage in Linux is essential. There are several key concepts and commands for working with block storage.

If you are an absolute beginner in configuring storage on Linux, then this article is for you.

Here we will discuss:

  • Block storage device naming and the /dev directory
  • Disk partitioning
  • Creating and mounting filesystems
  • Logical Volume Manager (LVM)

    Prerequisites

    • A server running Linux, preferably CentOS 7 or higher
    • At least 2 free storage devices

    Introduction

    Every physical server or virtual machine has some directly attached block devices. These can be HDD, SSD, or USB Flash drives.

    Whenever a new block device is inserted into the system, the available capacity is not immediately accessible. In other words, files and directories cannot be directly created on the device.

    Some basic configuration is required to make the storage space available. It includes creating partitions, creating filesystems, and mounting.

    Before we start discussing the commands relative to these operations, let’s talk about the /dev directory and block device naming.

    Block storage device naming

    Linux handles storage devices as files. These files are stored under the /dev directory. It’s critical to understand this concept because most storage commands take files from this directory as their arguments. Many system programs also use these files as the interface for Input/Output or I/O operations with the block devices.

    Whenever a new block device is inserted or detected, the system automatically creates a new corresponding file under the directory /dev.

    The name of this file depends on several things. Most importantly, it depends on the device bus type and the order in which the device was inserted relative to the other devices.

    For example, if the bus type is SCSI or SATA, the first half of the file name is sd. The other half is a letter assigned in alphabetical order depending on the order in which the device was detected.

    The first, second, and third SCSI devices on a system are named sda, sdb, and sdc, respectively.

    Devices with other bus types follow slightly different naming rules. For NVMe, which is a modern bus type, the device name starts with nvme, followed by the controller number, then the letter n and the namespace number. For example, nvme0n1 is the first namespace on the first NVMe controller, and nvme0n2 is the second namespace on the same controller.

    We can list the /dev directory content to get an overview of all the block devices present on the system.

    ls /dev

    There are many files, including sda, sdb, sda1, sda2, nvme0n1, and nvme0n2.

    Note that /dev also includes other files that are not related to storage devices.

    sda and sdb are block devices. However, sda1 and sda2 don’t represent actual block devices. They represent partitions, which are discussed next.
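
    We can also confirm that these device files are block devices: in a long listing, the permissions field of a block device file starts with the letter b. For example, using the sda disk and sda1 partition listed above:

    ls -l /dev/sda /dev/sda1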

    Partitioning

    What is partitioning

    Partitioning is the operation of dividing the block device into logical block devices called partitions. It is the first operation to do before we can use the disk.

    To list all disks and partitions present on the system, use the command:

    lsblk

    lsblk shows a clear tree structure of disks and their partitions. The TYPE column of the output also indicates whether each line represents a disk, a partition, or an LVM volume. (LVM is an advanced storage scheme discussed at the end of this article.)

    There are mainly two partitioning schemes: the Master Boot Record (MBR) and the GUID Partition Table (GPT). MBR is a legacy scheme but is still in use; it is limited to a maximum disk size of 2 TiB. GPT is preferred because it removes the 2 TiB limit and introduces other enhancements.
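
    To check which scheme a disk already uses, one option is to print its partition table with fdisk and look at the disk label type line in the output, which reads dos for MBR or gpt for GPT. For example, on the system disk (assuming it is /dev/sda):

    fdisk -l /dev/sda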

    How to create a partition

    The fdisk command is used to create MBR partitions, while the gdisk command is used to create GPT partitions. Both are interactive commands and are very similar.

    These commands create a partition table on the first few sectors of a disk. (With GPT, the partition table is also backed up at the last few disk sectors).

    This table contains all the information about the partitions. For example, it includes the sector addresses where every partition begins and ends. (A sector is the minimum read/write block used by disks; it is usually 512 bytes.)

    Let’s see an example. We have a CentOS VM with a 20 GB disk named sdb. Note that on your machine the disk name could be different, so make sure to select the right disk. We will create a new GPT partition table and two partitions of 2 GB each.

    Storage-related commands are usually restricted to the root user. So log in as root and run the command below:

    gdisk /dev/sdb

    gdisk reports that the sdb disk doesn’t have any partition table yet. Press the question mark “?” to see all available commands.

    We use the “n” command to create a new partition. (Because the disk has no partition table yet, gdisk will also create a new GPT when the changes are written.) The command asks for parameters one by one:

    1. Partition number: because the disk doesn’t have any partitions yet, type 1 and press enter.
    2. First sector: the partition starts from this sector. Leave it empty and press enter. The default value, which is the first free sector, is selected automatically.
    3. Last Sector: it is where we indicate the required size of the partition. Type +2G, then press enter. Note that we can actually put the sector number, but this requires a further calculation to determine the exact sector that yields a partition of 2GB in size.
    4. Partition type: many type codes are available. Keep the default “Linux filesystem” type by leaving the prompt empty and pressing enter. Note that this is only a type hint; the actual filesystem is created later.

    Repeat the same process to create another 2GB partition. Then type “w” and then “y” to write everything, confirm, and quit.
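
    To print the new partition table without entering the interactive mode again, gdisk can be run with the -l option:

    gdisk -l /dev/sdb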

    To see the change on the sdb device, use the command below:

    lsblk /dev/sdb

    We can see that we have the sdb disk with 20 GB total size and two partitions of 2 GB. The sdb drive still has 16 GB of free space on which we can create more partitions later.

    Filesystems

    What is a Filesystem?

    A Filesystem is built on top of a partition. A partition without a file system can be seen as just a long array of sectors. The filesystem is a logical abstraction layer that enables the creation of a hierarchy of files and directories.

    There are many types of filesystems. The most widely used in Linux are ext4 and xfs. xfs is the default file system on modern Red Hat and CentOS distributions, while Ubuntu uses ext4.
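
    To see which filesystem, if any, each block device already contains, lsblk can also print filesystem information:

    lsblk -f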

    Creating a Filesystem

    Let’s continue with our example. We will create an xfs file system on the sdb1 partition that we have already created. The command to make a filesystem is mkfs. It uses the -t argument to specify the filesystem type.

    mkfs -t xfs /dev/sdb1
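
    Here mkfs acts as a front end that calls the filesystem-specific builder, so the same filesystem could be created by calling mkfs.xfs directly:

    mkfs.xfs /dev/sdb1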

    Mounting a Filesystem

    After creating the filesystem, we are one step away from using the partition’s storage space. That step is mounting.

    Mounting a filesystem is the operation of linking the filesystem created on top of a partition to the system’s file hierarchy. It includes choosing a directory on the file hierarchy to host the filesystem built on the partition. This directory is called a mount point.

    Let’s create a new mount point called /data and mount the sdb1 filesystem under it:

    mkdir /data
    mount /dev/sdb1 /data

    To confirm the success of the mount operation, we can list all the mounted filesystems with the command below:

    df -h

    The last line indicates that the filesystem has been mounted under the /data directory. Now it can be used to store files: all files created under the /data directory are stored in the sdb1 partition.

    Note that the xfs filesystem by default occupies some of the available space for metadata, which explains why df reports about 33 MB as already used.

    However, a mount created manually with the mount command does not survive a reboot. In most cases, we want the mount to persist across reboots. To mount a partition persistently, the mount parameters must be written to the file /etc/fstab.

    /etc/fstab by default contains all the persistent mounts required by the system, and the root user can add further mounts to it. Let’s see the content of this file:

    cat /etc/fstab

    Each line in this file that doesn’t start with the hash character “#” corresponds to a mount that is created automatically during the boot process. Each line contains 6 fields, illustrated by the annotated example after this list:

    1. Block device: device file such as /dev/sdb1. Note that other device identifiers such as UUID also exist.
    2. Mount point.
    3. Filesystem type: xfs, ext4, vfat … etc.
    4. Mount options: these options are handed over to the mount command at boot. The value “defaults” is usually selected unless some different mount options are required.
    5. Backup: this field takes the value 1 to enable filesystem backup at boot or 0 to disable it. The backup is done by the dump utility. Most systems do not come with the dump program installed by default, so this field is usually set to 0.
    6. Filesystem check: this field takes the value 1 to enable filesystem check at boot or 0 to disable it. The filesystem check is done by the fsck command, which reports any inconsistencies or errors in the filesystem structure.
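
    As an illustration, a hypothetical entry for our sdb1 partition, with the six fields labeled by the comment line, could look like this:

    # <device>  <mount point>  <type>  <options>  <dump>  <fsck>
    /dev/sdb1   /data          xfs     defaults   0       0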

    Let’s add such a line to persistently mount the /dev/sdb1 partition under the /data directory. You can edit /etc/fstab directly or just append the line to the end of the file with the command below:

    echo "/dev/sdb1 /data xfs defaults 0 0" >> /etc/fstab

    Use the cat command again to make sure the line was added successfully.

    cat /etc/fstab

    When a new line is added, the mount operation doesn’t occur unless the system is rebooted. To check that the mount configuration is correct without rebooting, use the mount command with the -a argument (and -v for verbose output):

    mount -av

    The output indicates a successful mount. (Ignore the SELinux message)

    The command mounts every entry in /etc/fstab that is not mounted yet. After editing /etc/fstab, it is always recommended to run mount -a: if a typo or error exists in the file, the command reports it immediately. An undetected typo or issue in this file could prevent the system from booting correctly on the next restart.
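
    On systems with a recent version of util-linux, findmnt can also check /etc/fstab for errors without mounting anything:

    findmnt --verify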

    In some cases, putting the device file such as /dev/sdb1 in the first field of /etc/fstab could be problematic. If block devices are removed then inserted in a different order, the names under /dev automatically change.

    For example, the block device sdb could become sdc and vice versa. Hence, partition sdb1 becomes sdc1. /etc/fstab is a static file; unaware of the naming change, the system would still mount whatever is now named sdb1 to /data, which results in the wrong filesystem being mounted.

    To prevent this issue, the filesystem’s Universally Unique Identifier (UUID) is used in the first field of /etc/fstab instead of the block device file.

    Every filesystem is assigned a UUID when it’s created.

    To determine the UUID of the filesystem residing on the /dev/sdb1 partition, use the command below:

    blkid /dev/sdb1

    The UUID field in the output of blkid is the filesystem’s UUID. It doesn’t change even if the device name changes. Therefore, using the filesystem’s UUID in /etc/fstab guarantees that the same filesystem is always mounted under the correct directory.

    Edit /etc/fstab and replace the /dev/sdb1 line with the one below, using the UUID reported by blkid on your system:

    UUID=cba91aba-318c-be9a-a280fd185afc /data xfs defaults 0 0

    Use cat command to make sure it has been updated.

    cat /etc/fstab

    After editing /etc/fstab, use the command mount -a to mount the filesystems. If device names change, the mounts don’t get mixed up: the system always mounts the filesystem with the matching UUID on the corresponding mount point, regardless of the device name.

    Logical Volume Manager (LVM)

    What is LVM

    LVM is a Linux storage technology that introduces an extra layer of abstraction on top of disks and partitions to provide advanced features such as logical volumes, thin provisioning, snapshots, and software RAID.

    Most modern distributions support LVM. It overcomes the limitations imposed by partitions. For example, expanding or reducing partition sizes is very delicate, error-prone, and sometimes impossible. With LVM, changing the size of a volume is straightforward.

    How is LVM structured

    LVM consists of three main layers.

    1. Block devices such as disks or partitions are referred to as Physical Volume (PV).
    2. LVM consolidates all PVs under one or more storage pools called Volume Group (VG).
    3. The available storage in a VG is used to create virtual block devices called Logical Volume (LV).

    The LV is used to store data. It is comparable to a regular block device. After creating an LV, a filesystem must be built on top of it, and then it is mounted. The same steps used for creating and mounting a filesystem on a regular partition apply to an LV.

    LVM provides flexibility for managing storage, and the size of an LV can be tailored to your needs. For example, 3 drives of 1 TB each can be merged into a single VG, and 2 LVs of different sizes created from it: one of 1 TB and one of 2 TB.

    Once all drives are merged into a VG, different combinations of LVs can be created. For example, an LV can take all the available space in a VG; in that case, the LV is spread across all disks. If the LV is about to become full, more drives can be added to the VG and the LV can then be expanded. If the LV is no longer needed, it can be deleted and its space returned to the owning VG.

    How to Create Volume Group (VG) and Logical Volume (LV)

    Let’s see a configuration example of LVM. We have a server with 2 free disks and 1 free partition, as shown by lsblk.

    In our example, one VG is created. It contains the drives nvme0n1 (8 GB) and nvme0n2 (8 GB) and the sdb2 partition (2 GB). The VG is used to create one LV of 15 GB. The LV contains an xfs filesystem and is mounted under the directory /data2.

    Create PVs

    The first step is to label the drives as Physical Volumes (PV). This is done in batch through the pvcreate command as indicated below:

    pvcreate /dev/sdb2 /dev/nvme0n1 /dev/nvme0n2
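
    To verify that the three PVs were created, list them with the pvs command (pvdisplay shows more detail):

    pvs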

    Create a VG

    The second step is to create a VG called “data_vg” using the newly labeled PVs.

    vgcreate data_vg /dev/sdb2 /dev/nvme0n1 /dev/nvme0n2

    To list information about the available VGs, we can use two commands: vgs, which displays brief information about the VGs, and vgdisplay, which shows more detail.

    vgs

    vgs lists one VG per line and shows, for each VG, the number of PVs it contains, its total size, and its free space. The “centos” VG was created automatically by the system during installation.

    vgdisplay

    You may notice that the newly created VG data_vg has 17.99 GiB of free space instead of the 18 GB that is the sum of the storage available on all PVs.

    The VG size is often slightly smaller than the sum of the storage space available on the PVs. In LVM, PVs are divided into blocks called Physical Extents (PE), whose default size is 4 MB. These blocks are combined to build the VG, so the VG size is always a multiple of the PE size, which explains why the VG is not exactly 18 GB.

    Note that the PE size can be changed when the VG is created, but in our example we kept the default value.

    Create an LV

    Create an LV called data_lv using the command lvcreate. It takes the LV size, name, and the parent VG as arguments.

    lvcreate --size 15G --name data_lv data_vg

    To list available LVs, use the commands lvs or lvdisplay.
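
    For example, lvs can be restricted to a single VG by passing its name:

    lvs data_vg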

    The LV is a logical block device, hence it must have a corresponding file. When a VG is created, a directory with the VG’s name is created under /dev that holds a file for every created LV. In our example, the file that corresponds to data_lv is /dev/data_vg/data_lv. We can verify this by listing the content of /dev/data_vg.

    ls /dev/data_vg

    Now we create a filesystem on data_lv using mkfs.

    mkfs -t xfs /dev/data_vg/data_lv

    Finally, we create the mount point /data2, mount data_lv, and verify the mount with the df command.

    mkdir /data2
    mount /dev/data_vg/data_lv /data2
    df -h

    Now the LV is mounted under /data2 and we can use it to store data. If the LV is running out of space, the lvextend command can be used to allocate more space to it from the free space in the VG. If the VG itself is running out of space, a drive or a partition can be added to it with the vgextend command.
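
    As a minimal sketch, assuming a new 5 GB disk appears as /dev/nvme0n3 (a hypothetical device name), growing the VG, the LV, and its xfs filesystem could look like this:

    # label the new disk as a PV and add it to the VG (device name is an assumption)
    pvcreate /dev/nvme0n3
    vgextend data_vg /dev/nvme0n3
    # grow the LV by 4 GB, then grow the mounted xfs filesystem to match
    lvextend --size +4G /dev/data_vg/data_lv
    xfs_growfs /data2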

    Conclusion

    In this article, we introduced basic concepts and commands for working with storage on Linux. The key takeaways are:

    • Linux treats block storage devices as files stored under the /dev directory
    • The name of files under /dev depends on the drive characteristics
    • To use the storage space on a drive, we must first partition it, create a filesystem, and then mount it
    • /etc/fstab is used to create mounts that survive a reboot
    • LVM is a storage technology that consolidates storage devices into pools from which flexible logical volumes are created.
    • Key commands: lsblk, gdisk, fdisk, mkfs, blkid, mount, df, pvcreate, vgcreate, lvcreate.

    LVM was only briefly described in this article. It also supports more advanced features, such as RAID, snapshots, and thin provisioning, which are beyond the scope of this article.

    Man pages and distribution documentation are a great source of information that you should read to deepen your knowledge about storage.

    Having the foundational knowledge discussed in this article will enable you to understand and configure more advanced storage technologies in Linux, such as Network File System (NFS), iSCSI, Virtual Disk Optimizer (VDO), Stratis, and many more.
