Dec 07, 2011

    Disclaimer: This is one of those semi-selfish posts, where I need to perform a task that I do only once in a while. So, to save myself from having to search these steps every time, I’m documenting it here. I hope someone else will find this useful as well.

    Recently, I decided to upgrade the drives in my desktop. Since I don’t like the idea of downtime due to a failed drive (my desktop is also my home file server and firewall/router/dhcp server), I use software RAID1. The cost of an extra drive has been well worth the time saved from the last couple of HD failures I’ve had (including the lousy refurbished Seagate RMA drives). My initial array was 2 Seagate ST3750330AS (750GB, 32MB cache, 3Gb/s). They’re reasonably good drives, and I’ve had them for 4 years or so. For the same price I originally paid for them, I managed to get 2 Western Digital WD1002FAEX-00Z3A0 (1TB, 64MB cache, 6Gb/s). So, a bit more space and faster drives.

    Now, the trick was to upgrade them without any downtime (other than the time to swap drives), and without resorting to copying data off and/or restoring from backups (backups are important, but that’s another topic). In my setup, I have 2 drives (/dev/sda and /dev/sdb) which are combined to make 2 MD devices: /dev/md0 (my /boot partition, which is 100MB) and /dev/md1 (a physical volume used for LVM, which takes up the rest of the drive). Here are the steps I had to take to do this:

    Step 1. Remove one drive from the array

    Flag one of the drives as ‘faulty’ so that it can be removed from the array. In this case, I’m going to start with /dev/sda, but I could just as easily have started with /dev/sdb. The commands below mark both partitions (sda1 and sda2) as ‘failed’ in their respective RAID devices (md0 and md1).

    [root@ ~]# mdadm /dev/md0 --fail /dev/sda1
    mdadm: set /dev/sda1 faulty in /dev/md0
    [root@ ~]# mdadm /dev/md1 --fail /dev/sda2
    mdadm: set /dev/sda2 faulty in /dev/md1
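    Before pulling the drive, it is worth confirming that the kernel really sees both partitions as failed. The sketch below is not from the original procedure; it assumes the usual /proc/mdstat layout, in which a failed member carries an "(F)" suffix (for example "sda1[0](F)"). The function name is made up for illustration.

```shell
# Print "array: member" for every member that the mdstat text on stdin
# flags as faulty, i.e. entries ending in "(F)" such as "sda1[0](F)".
faulty_members() {
    awk '/^md[0-9]+ :/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                name = $i
                sub(/\[.*/, "", name)    # strip the "[0](F)" part
                print $1 ": " name
            }
    }'
}

# On the live system:
#   faulty_members < /proc/mdstat
#
# mdadm can also detach a failed member from its array, a step the post
# does not show:
#   mdadm /dev/md0 --remove /dev/sda1
```

    Reading from stdin means the function can be tried on a saved copy of /proc/mdstat before trusting it on the real thing.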

    Step 2. Physically replace ‘failed’ drive

    I then safely removed the ‘failed’ disk from my tower (and kept it as a safety backup too), and replaced it with one of the 1TB drives.

    Step 3. Copy partition data from old drive to new

    I dumped the partition scheme from sdb (the remaining old drive) onto sda (the new drive). This would have been more important if I were replacing the drive with an identical one (in the case of a drive failure). However, it let me be sure that my md0 (/boot) partitions were the same size. IMPORTANT: Make sure you put the devices below in the right order (source first, destination second), otherwise very, very bad things will happen.

    [root@ ~]# sfdisk -d /dev/sdb | sfdisk /dev/sda

    If the above step looks too scary for you, then you might want to save the partition map to a file first, to at least have a chance of recovering it should you mess up. Here’s how:

    [root@ ~]# sfdisk -d /dev/sdb > partition_backup.txt
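    Since getting the direction wrong would overwrite the partition table of the only good drive, a guard around the copy may be worthwhile. This is only a sketch under a few assumptions: the function names and the MDSTAT variable are made up for illustration, drives use sdX-style names as in this post, and the check simply refuses to write to a disk that /proc/mdstat still lists as a healthy array member.

```shell
# Succeed if any partition of disk $1 (e.g. "sda") appears in the mdstat
# text on stdin as a member that is not flagged faulty "(F)".
is_active_member() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++)
                if ($i ~ ("^" disk "[0-9]+\\[") && $i !~ /\(F\)$/)
                    found = 1
        }
        END { exit !found }'
}

# Hypothetical wrapper: save the source table to a file, then write it to
# the destination, but only if the destination is not an active member.
# MDSTAT may point at a saved copy of /proc/mdstat for a dry run.
copy_ptable() {
    cp_src=$1
    cp_dst=$2
    if is_active_member "${cp_dst#/dev/}" < "${MDSTAT:-/proc/mdstat}"; then
        echo "refusing: $cp_dst is still an active array member" >&2
        return 1
    fi
    sfdisk -d "$cp_src" > partition_backup.txt &&
        sfdisk "$cp_dst" < partition_backup.txt
}

# Intended use for this step (old drive first, new drive second):
#   copy_ptable /dev/sdb /dev/sda
```

    With the arguments accidentally reversed, the destination would be sdb, which is still carrying the array, so the wrapper stops before sfdisk is ever run.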

    Step 4. Replace 2nd partition with bigger one

    Next I used fdisk to delete the 2nd partition and recreate it so that it fills the rest of the new drive (almost 1TB instead of almost 750GB), then set its type back to fd (Linux raid autodetect).

    [root@ ~]# fdisk /dev/sda
    Command (m for help): p
    Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
    255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *          63      208844      104391   fd  Linux raid autodetect
    /dev/sda2          208845  1465144064   732467610   fd  Linux raid autodetect
    Command (m for help): d
    Partition number (1-4): 2
    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4, default 2): 2
    First sector (208845-1953525167, default 208845): [enter]
    Using default value 208845
    Last sector, +sectors or +size{K,M,G} (208845-1953525167, default 1953525167): [enter]
    Using default value 1953525167
    Command (m for help): t
    Partition number (1-4): 2
    Hex code (type L to list codes): fd
    Changed system type of partition 2 to fd (Linux raid autodetect)
    Command (m for help): p
    Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
    255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *          63      208844      104391   fd  Linux raid autodetect
    /dev/sda2          208845  1953525167   976658161+  fd  Linux raid autodetect
    Command (m for help): w
    The partition table has been altered!
    Calling ioctl() to re-read partition table.
    Syncing disks.

    Now both drives have a first partition (for /boot) of exactly the same size, while the second partition on the new drive is almost 1TB instead of 750GB. Software RAID is OK with this: I can add both new partitions to my existing RAID devices, but each device will only be as large as its smallest member. I could have taken the opportunity to increase the size of md0 if I wanted to, but didn’t really need to.
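    The "Blocks" column in the fdisk listing can be checked by hand: it is the number of 512-byte sectors in the partition divided by two, since fdisk counts in 1 KiB blocks. A quick sketch (the function name is mine):

```shell
# Size of a partition in 1 KiB blocks, given its first and last sector
# (512-byte sectors). Integer division drops the odd half block that
# fdisk marks with a trailing "+".
part_blocks() {
    echo $(( ($2 - $1 + 1) / 2 ))
}

part_blocks 208845 1465144064    # old sda2 layout, prints 732467610
part_blocks 208845 1953525167    # new sda2 layout, prints 976658161
```

    Both results match the listing, including the "976658161+" entry, where the "+" stands for the leftover half block.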

    Step 5. Update MBR for Grub

    OK, this is going to assume you’re using grub as your bootloader and that it’s on the MBR. Since a drive has been replaced, the MBR needs to be updated, otherwise the machine might not reboot. I installed grub to both drives so that the machine can boot from either one.

    [root@mafalda ~]# grub
        GNU GRUB  version 0.97-75.fc15  (640K lower / 3072K upper memory)
    grub> root (hd0,0)
    root (hd0,0)
     Filesystem type is ext2fs, partition type 0xfd
    grub> setup (hd0)
    setup (hd0)
     Checking if "/boot/grub/stage1" exists... no
     Checking if "/grub/stage1" exists... yes
     Checking if "/grub/stage2" exists... yes
     Checking if "/grub/e2fs_stage1_5" exists... yes
     Running "embed /grub/e2fs_stage1_5 (hd0)"...  26 sectors are embedded.
    succeeded
     Running "install /grub/stage1 (hd0) (hd0)1+26 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
    Done.
    grub> root (hd1,0)
    root (hd1,0)
     Filesystem type is ext2fs, partition type 0xfd
    grub> setup (hd1)
    setup (hd1)
     Checking if "/boot/grub/stage1" exists... no
     Checking if "/grub/stage1" exists... yes
     Checking if "/grub/stage2" exists... yes
     Checking if "/grub/e2fs_stage1_5" exists... yes
     Running "embed /grub/e2fs_stage1_5 (hd1)"...  26 sectors are embedded.
    succeeded
     Running "install /grub/stage1 (hd1) (hd1)1+26 p (hd1,0)/grub/stage2 /grub/grub.conf"... succeeded
    Done.
    grub> quit
    quit
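    The same grub session can be scripted rather than typed. The sketch below only generates the commands; the function name is made up, and it assumes GRUB legacy (0.97, as above) with /boot on the first partition of each drive.

```shell
# Emit the GRUB legacy shell commands used above for each BIOS disk
# number given as an argument (0 -> hd0, 1 -> hd1, ...).
grub_setup_script() {
    for n in "$@"; do
        printf 'root (hd%s,0)\nsetup (hd%s)\n' "$n" "$n"
    done
    printf 'quit\n'
}

# Review the generated commands first:
#   grub_setup_script 0 1
# then feed them to GRUB legacy in batch mode:
#   grub_setup_script 0 1 | grub --batch
```

    Keeping generation separate from execution makes it easy to read what will be run before anything touches the MBR.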

    Step 6. Add drive back to array

    Next step is to add this drive back to the RAID array and rebuild.

    [root@ ~]# mdadm /dev/md0 --add /dev/sda1
    mdadm: added /dev/sda1
    [root@ ~]# mdadm /dev/md1 --add /dev/sda2
    mdadm: added /dev/sda2

    Step 7. Wait, wait, wait

    After that, I wait for the RAID to rebuild and show up in a 'clean' state. Since this is 750GB of data that needs to sync, it means several hours. So, I keep checking the state of the array until I see this:

    [root@mafalda ~]# mdadm --detail /dev/md1
    /dev/md1:
            Version : 0.90
      Creation Time : Fri Mar 14 23:36:49 2008
         Raid Level : raid1
         Array Size : 732467520 (698.54 GiB 750.05 GB)
      Used Dev Size : 732467520 (698.54 GiB 750.05 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 1
        Persistence : Superblock is persistent
        Update Time : Tue Nov 29 04:01:16 2011
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
               UUID : bad9d783:9b7b40be:9fd48ad3:7334f5c6
             Events : 0.3555449
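    Rather than re-running the command by hand, the check can be put in a loop. This is a rough sketch, not something from the original procedure: the function names are mine, and it assumes that the State line of mdadm --detail contains "degraded", "recovering" or "resyncing" for as long as the array is not fully in sync.

```shell
# Print the value of the "State :" line from "mdadm --detail" output
# supplied on stdin, with surrounding whitespace removed.
array_state() {
    awk '/^[ \t]*State :/ {
        sub(/^[^:]*:[ \t]*/, "")
        sub(/[ \t]+$/, "")
        print
        exit
    }'
}

# Poll until the array is no longer degraded or syncing.
# $1: array device, $2: seconds between checks (default 60).
wait_for_sync() {
    while :; do
        case $(mdadm --detail "$1" | array_state) in
            *degraded*|*recovering*|*resyncing*) sleep "${2:-60}" ;;
            *) break ;;
        esac
    done
}

# Example: wait_for_sync /dev/md1 300
# mdadm also has a built-in option for this: mdadm --wait /dev/md1
```

    Matching on the "bad" keywords instead of waiting for exactly "clean" avoids looping forever on a healthy array that happens to report "active" because it is being written to.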

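    Step 8. Rinse, lather and repeat

    Repeat steps 1-7 with sda and sdb swapped: fail sdb, replace it, and copy the partition table from sda to sdb. Since sda already has the larger second partition by now, fdisk in Step 4 should show it already in place on sdb.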

    Step 9. Resize the RAID device

    After both drives have been replaced, the second RAID device is still only 750GB. It needs to be modified to bring it up to 1TB. This is the command to do that:

    [root@ ~]# mdadm --grow /dev/md1 --size=max
    mdadm: Limited v0.90 array to 2TB per device
    mdadm: component size of /dev/md1 has been set to 976658048K

    Checking the status of the device will show its new size, and it's back to the waiting game.

    [root@ ~]# mdadm --detail /dev/md1
    /dev/md1:
            Version : 0.90
      Creation Time : Fri Mar 14 23:36:49 2008
         Raid Level : raid1
         Array Size : 976658048 (931.41 GiB 1000.10 GB)
      Used Dev Size : 976658048 (931.41 GiB 1000.10 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 1
        Persistence : Superblock is persistent
        Update Time : Tue Nov 29 04:01:27 2011
              State : active, resyncing
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
     Rebuild Status : 0% complete
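    The array ends up slightly smaller than the partition (976658048K versus 976658161 blocks in fdisk). As far as I understand the v0.90 metadata format, the superblock lives in the last 64 KiB-aligned 64 KiB chunk of each member, so the usable size is the partition size rounded down to a 64 KiB boundary, minus 64 KiB. A sketch of that arithmetic (the function name is mine):

```shell
# Usable size in KiB of a RAID member under v0.90 metadata: round the
# partition size down to a multiple of 64 KiB, then subtract the 64 KiB
# reserved for the superblock. $1: partition size in KiB.
md090_usable_kib() {
    echo $(( $1 / 64 * 64 - 64 ))
}

md090_usable_kib 732467610    # old 750GB partition, prints 732467520
md090_usable_kib 976658161    # new 1TB partition, prints 976658048
```

    Both results agree with the Array Size values that mdadm --detail reports before and after the grow.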

    Step 10. Resize the LVM physical volume

    Now that the RAID device has been resized, the next step is to resize the LVM physical volume with pvresize. The pvdisplay command confirms the new size:

    [root@ ~]# pvresize /dev/md1
      Physical volume "/dev/md1" changed
      1 physical volume(s) resized / 0 physical volume(s) not resized
    [root@ ~]# pvdisplay
      --- Physical volume ---
      PV Name               /dev/md1
      VG Name               mafalda_gv
      PV Size               931.41 GiB / not usable 7.44 MiB
      Allocatable           yes
      PE Size               32.00 MiB
      Total PE              29805
      Free PE               1053
      Allocated PE          28752
      PV UUID               SoHDMC-kf1N-YgsY-D5wq-tTE0-Z1sq-MKlTCR

    Step 11. Adding the extra space to my /home partition

    I've pretty much done all the necessary things that this post has set out to do. If I wanted to add 200GB of this space to, say, my /home partition (the 'home_lv' logical volume in the 'mafalda_vg' volume group), this is what the command would be:

    [root@ ~]# lvresize -L +200G /dev/mapper/mafalda_vg-home_lv

    Then I would need to grow the ext2/ext3/ext4 filesystem to fill the enlarged logical volume, which can also be done online:

    [root@ ~]# resize2fs /dev/mapper/mafalda_vg-home_lv

    That would effectively add 200GB to my /home partition. All done with minimal downtime (the time to swap physical drives).
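    Before growing a logical volume, it helps to check how much unallocated space the volume group actually has. LVM hands out space in physical extents, so the free space is simply Free PE times PE Size from pvdisplay. A small sketch with made-up function names, using the figures from Step 10:

```shell
# Convert a number of physical extents to MiB.
# $1: extent count, $2: extent size in MiB.
pe_to_mib() {
    echo $(( $1 * $2 ))
}

# Succeed if a request of $1 GiB fits into $2 free extents of $3 MiB each.
fits_in_free_pe() {
    [ $(( $1 * 1024 )) -le $(( $2 * $3 )) ]
}

pe_to_mib 1053 32     # Free PE from Step 10, prints 33696 (about 32.9 GiB)

if fits_in_free_pe 200 1053 32; then
    echo "a +200G extension fits"
else
    echo "a +200G extension does not fit in the free extents shown"
fi
```

    Going by those numbers, only about 32.9 GiB was still unallocated when the pvdisplay output was captured, presumably because the volumes had already been extended by then, so the +200G figure above is best read as an illustration. lvresize refuses a request that does not fit, so nothing breaks either way, but it is nicer to know beforehand.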
