Tuesday, April 1, 2014

Advanced File Systems and Logical Volume Management

What is LVM, why do people use it and how does it work? 

With LVM there are three terms which are important to understand. The first term is physical volume. A physical volume is a hard drive or a partition which has been set aside for use by LVM. The second term is volume group. A volume group is a collection of physical volumes. If I have three hard drives (A, B and C) and I link drives A and B together, I can consider them a volume group. Another hard drive, such as C, could be made into a separate volume group consisting of a single physical volume. The third term is logical volume. A logical volume is basically a virtual partition which exists inside a volume group and which we can format with a file system. If this is difficult to visualize I find it helps to think about cookies. 

Traditional file systems are like baking cookies. We scoop out some raw dough onto a pan. Each cookie is physically separate from all other cookies. Once we put the pan in the oven the cookies harden and come out of the oven as fixed-sized individual snacks. A cookie and a traditional file system are both of a fixed size, separate from all other cookies or partitions. They cannot be merged once made and resizing them is difficult. If you make eight cookies and ten friends come to visit you cannot simply make each cookie smaller, freeing up dough for the extra two guests. Likewise, if six people arrive you cannot dynamically erase two cookies and make the remaining six cookies bigger to satisfy your guests.

Now, let's re-imagine cookie baking with LVM. With LVM what we do is take all of the cookie dough and spread it onto the pan as one big block. We put the block of dough in the oven and, when it comes out, we have a solid sheet, a giant cookie that we can then carve into as many pieces of any size we wish. It doesn't matter how many people show up now, because we can dynamically carve the block of cookie so each person gets a fair share. LVM lets us group all of our storage devices (cookie dough) into one big block so that we can carve up the block into separate, dynamic file systems. 

By now you are probably hungry for an example. For the purposes of this tutorial I am going to say I have two hard drives (sda and sdb). I will also assume we have our distribution's LVM packages installed. First I am going to create an LVM-compatible partition on sda. This partition will be called sda1. To do this I launch cfdisk or another partition manager and create a partition which takes up the entire drive. I set the partition type to Linux LVM, which is numerically identified by the code 8E. 
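If you would rather script this step than work through cfdisk by hand, the same partition can be created non-interactively. What follows is a minimal sketch, assuming GNU parted is installed and that the disk holds nothing you want to keep, since writing a new partition table discards the old one. The function name make_lvm_partition and the DRY_RUN switch are my own inventions for illustration. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch: create a single LVM partition covering a whole disk using parted.
# DRY_RUN=1 (the default) only prints the command.
# Set DRY_RUN=0 and run as root to actually apply it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

make_lvm_partition() {
    disk="$1"
    # New MBR partition table, one primary partition spanning the disk,
    # with the lvm flag set (this corresponds to type 8E on an MBR disk).
    run parted --script "$disk" mklabel msdos mkpart primary 1MiB 100% set 1 lvm on
}

make_lvm_partition /dev/sda
```

An MBR (msdos) table matches the 8E type code mentioned above. For disks larger than 2TB a GPT label would be the usual choice instead.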

Our next step is to mark our device, sda1, as being a physical volume which can be used by LVM.

pvcreate /dev/sda1

Now we have a physical volume and we want to use it to create a volume group. We can create a volume group called datapool using the following command:

vgcreate datapool /dev/sda1

Now that we have a volume group consisting of one partition, sda1, we can divide the group into separate file systems or logical volumes. Here we create a logical volume called myhome and make it 50GB in size.

lvcreate -n myhome -L 50g datapool

Now we have a virtual partition, or logical volume, called myhome. The next thing we need to do is format it with a file system. In this example we use the ext3 file system to format myhome. Remember, the logical volume myhome exists within the volume group datapool.

mkfs.ext3 /dev/datapool/myhome

Finally, we get to mount the logical volume and start making use of it. Here we create a new mount point, called Data, and attach our new logical volume to the Data directory.

mkdir Data
mount /dev/datapool/myhome Data
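
The steps so far can be gathered into one small script. This is only a sketch of the sequence described above, using the same device, group and volume names. The setup_lv function and the DRY_RUN switch are hypothetical helpers of mine, not part of LVM. The script prints each command by default; setting DRY_RUN=0 and running it as root would execute them, stopping at the first command which fails.

```shell
#!/bin/sh
# Sketch: physical volume -> volume group -> logical volume -> file system -> mount.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_lv() {
    pv="$1"; vg="$2"; lv="$3"; size="$4"; mnt="$5"
    # Each step runs only if the previous one succeeded.
    run pvcreate "$pv" &&
    run vgcreate "$vg" "$pv" &&
    run lvcreate -n "$lv" -L "$size" "$vg" &&
    run mkfs.ext3 "/dev/$vg/$lv" &&
    run mkdir -p "$mnt" &&
    run mount "/dev/$vg/$lv" "$mnt"
}

setup_lv /dev/sda1 datapool myhome 50g Data
```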

Were we to run the df command right now we should see a file system of roughly 50GB mounted under the Data directory. This is great, but earlier we talked about resizing and how dynamic LVM can be. What if we want to make the logical volume myhome larger? We can do that by extending the logical volume and then resizing its file system. Here we grow the myhome volume by 100GB, which assumes the datapool volume group still has at least that much unallocated space.

lvextend -L +100g datapool/myhome
resize2fs /dev/datapool/myhome

We do not even need to take the file system off-line or reboot or anything of that nature. Simply running these commands expands the logical volume and then the file system on it, all while it remains mounted. We now have a 150GB file system mounted under the Data directory.
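
As a variation on the two commands above, lvextend has a -r (--resizefs) option which asks it to grow the file system in the same step. The sketch below assumes an LVM2 version which provides that option and a file system type it knows how to resize. The grow_lv function and the DRY_RUN switch are hypothetical helpers of mine. The script first shows the volume group, so we can confirm it has enough free space, and then extends the volume. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: grow a logical volume and its file system in one lvextend call.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

grow_lv() {
    vg="$1"; lv="$2"; amount="$3"
    # vgs reports the free space left in the group; lvextend -r then
    # extends the volume and resizes the file system on it.
    run vgs "$vg" &&
    run lvextend -r -L "+$amount" "$vg/$lv"
}

grow_lv datapool myhome 100g
```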

At the moment we just have one device, sda1, in our volume group. What if we run out of space and want to add a new hard drive to our storage pool? In that case the steps are similar to creating the volume group in the first place. We create a partition on our second disk, sdb, and set its type to Linux LVM. We then mark the new device as a physical volume.

pvcreate /dev/sdb1

Next we add the new device to our volume group.

vgextend datapool /dev/sdb1

This gives us a whole new device in our volume group which we can then assign to a logical volume. We can either create a new logical volume and assign it its own mount point or we could add the new storage to our existing myhome logical volume using the lvextend command, followed by resize2fs as before. If at any point we would like to see a list of physical volumes, volume groups or logical volumes we can run special list commands to display them and their sizes. The commands pvs, vgs and lvs list the existing physical volumes, volume groups and logical volumes, respectively. 
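
To show the second option, here is a sketch which adds sdb1 to datapool, hands all of the group's unallocated space to myhome and then lists the result. It assumes sdb1 has already been created with the Linux LVM type. The function names and the DRY_RUN switch are hypothetical helpers of mine. The +100%FREE argument tells lvextend to use whatever free space remains in the volume group. The script prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch: add a new partition to a volume group and grow an existing volume.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

add_disk_and_grow() {
    part="$1"; vg="$2"; lv="$3"
    # Mark the partition for LVM, add it to the group, give all free
    # space in the group to the volume, then grow the file system.
    run pvcreate "$part" &&
    run vgextend "$vg" "$part" &&
    run lvextend -l +100%FREE "$vg/$lv" &&
    run resize2fs "/dev/$vg/$lv"
}

report() {
    # List physical volumes, volume groups and logical volumes.
    run pvs && run vgs && run lvs
}

add_disk_and_grow /dev/sdb1 datapool myhome
report
```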

A word of warning about using LVM: It is a powerful and flexible technology which can be very useful in situations where storage requirements change. This makes LVM especially useful on servers where data can grow quickly and, sometimes, in unpredictable ways. However, there is a potential problem with using LVM: if one physical storage device fails we can lose all of the data stored in the volume group. For instance, if I have drives A, B and C in a volume group and drive C fails, I may have just lost all of the data stored in the entire volume group. For this reason it is very important to make regular backups of data stored on a volume group, as files may be stretched across any or all devices inside the group.
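
To make that warning concrete, the sketch below asks lvs to include its devices column, which shows the physical volumes each logical volume sits on, and then archives the Data directory with tar. The destination /mnt/backup/myhome.tar.gz is a made-up path which I am assuming lives on storage outside the volume group. The function names and the DRY_RUN switch are likewise hypothetical. The script prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch: see which devices a volume spans, then back up its contents.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

show_layout() {
    # The extra "devices" column lists the physical volumes behind each
    # logical volume in the given group.
    run lvs -o +devices "$1"
}

backup_dir() {
    src="$1"; dest="$2"
    # Archive the directory's contents into a compressed tarball.
    run tar -czf "$dest" -C "$src" .
}

show_layout datapool
backup_dir Data /mnt/backup/myhome.tar.gz   # hypothetical destination
```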