Growing a RAID1 array without taking the system down
With SATA I figured hot swapping ought to work, so I actually tried it, and it did.
Since the filesystem is ext3, it can be resized while mounted, so there is no downtime at all. I like this.
First, check the current state.

    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
      Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Tue Apr 30 16:57:06 2013
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31026
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8       17        1      active sync   /dev/sdb1

It is a 2 TB RAID1 array built from sdb1 and sdc1.
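As a rough sketch (not part of the session above), the member list can also be pulled out of the report with a one-line filter. The function name `active_members` is made up for illustration; it only assumes that healthy members are printed on lines containing "active sync", as in the output above.

```shell
# List the member devices that `mdadm --detail` reports as "active sync".
# Reads the report on stdin, so it also works on saved output.
active_members() {
    awk '/active sync/ { print $NF }'
}

# Example (needs root and a real array):
#   sudo mdadm --detail /dev/md0 | active_members
```

Reading from stdin keeps the helper harmless: it never touches the array itself.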
I also checked how the drives are partitioned.

    $ sudo fdisk /dev/sdb
    The device presents a logical sector size that is smaller than
    the physical sector size. Aligning to a physical sector (or optimal
    I/O) size boundary is recommended, or performance may be impacted.
    WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
             switch off the mode (command 'c') and change display units to
             sectors (command 'u').
    Command (m for help): p
    Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
    255 heads, 63 sectors/track, 243201 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Disk identifier: 0xa5128fe1
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1               1      243201  1953512001   fd  Linux raid autodetect
    Partition 1 does not start on physical sector boundary.
    Command (m for help): q

The whole disk is a single Linux raid autodetect partition. Easy to follow.
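The "does not start on physical sector boundary" warning comes down to simple arithmetic. The sketch below assumes the partition starts at sector 63, the classic DOS-compatible start; fdisk is showing cylinders here, so the real start sector is not visible in the output. `is_aligned` is a hypothetical helper name.

```shell
# Succeed if a partition starting at sector $1 begins on a physical-sector
# boundary. $2 = logical sector size, $3 = physical sector size, in bytes
# (the defaults match the fdisk output above: 512 / 4096).
is_aligned() {
    start=$1 logical=${2:-512} physical=${3:-4096}
    [ $(( start * logical % physical )) -eq 0 ]
}

# 63 * 512 = 32256 bytes, which is not a multiple of 4096.
is_aligned 63   && echo "63: aligned"   || echo "63: misaligned"
# 2048 * 512 = 1 MiB, which is.
is_aligned 2048 && echo "2048: aligned" || echo "2048: misaligned"
```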
Detaching one side of the RAID1

    $ sudo mdadm /dev/md0 -f /dev/sdc1
    mdadm: set /dev/sdc1 faulty in /dev/md0

First mark it as faulty,

    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
      Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Tue Apr 30 18:00:37 2013
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 1
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31028
        Number   Major   Minor   RaidDevice State
           0       0        0        0      removed
           1       8       17        1      active sync   /dev/sdb1
           2       8       33        -      faulty spare   /dev/sdc1

It now shows as faulty.
Remove it.

    $ sudo mdadm /dev/md0 -r /dev/sdc1
    mdadm: hot removed /dev/sdc1 from /dev/md0
    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
      Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
       Raid Devices : 2
      Total Devices : 1
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Tue Apr 30 18:01:00 2013
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 0
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31032
        Number   Major   Minor   RaidDevice State
           0       0        0        0      removed
           1       8       17        1      active sync   /dev/sdb1

Confirmed that it is gone.
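The two steps can also be given to mdadm in a single call, since manage mode accepts several operations at once. A minimal sketch, with a made-up wrapper `detach_member` that only prints the command unless `RUN=1` is set (the dry-run switch is my own convention, not an mdadm feature):

```shell
# Print (or, with RUN=1, execute) the mdadm call that marks a member
# faulty and hot-removes it in one invocation.
detach_member() {
    array=$1 member=$2
    set -- mdadm "$array" --fail "$member" --remove "$member"
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

detach_member /dev/md0 /dev/sdc1
# prints: mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1
```

Doing it in two steps, as above, has the advantage that the degraded state can be inspected before the member is actually removed.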
Check the device numbers.

    $ dmesg
    [ 1.863625] scsi 2:0:0:0: Direct-Access ATA PATRIOT MEMORY 3 02.1 PQ: 0 ANSI: 5
    [ 1.863876] scsi 4:0:0:0: Direct-Access ATA ST2000DM001-9YN1 CC46 PQ: 0 ANSI: 5
    [ 1.871566] scsi 5:0:0:0: Direct-Access ATA WDC WD20EARS-00M 51.0 PQ: 0 ANSI: 5
    [ 1.879340] sd 2:0:0:0: [sda] 62586880 512-byte logical blocks: (32.0 GB/29.8 GiB)
    [ 1.879433] sd 2:0:0:0: [sda] Write Protect is off
    [ 1.879438] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
    [ 1.879476] sd 2:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
    [ 1.879625] sd 4:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
    [ 1.879629] sd 4:0:0:0: [sdb] 4096-byte physical blocks
    [ 1.879710] sd 4:0:0:0: [sdb] Write Protect is off
    [ 1.879714] sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    [ 1.879752] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [ 1.879758] sd 5:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
    [ 1.879874] sd 5:0:0:0: [sdc] Write Protect is off
    [ 1.879879] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
    [ 1.879916] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

That gives me the drive models, their SCSI addresses, and their device names (sda, sdb, sdc).
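Picking the address out of dmesg by eye works, but it can be scripted too. This is only a sketch that assumes the `sd H:C:T:L: [sdX]` line format shown above; `scsi_addr_of` is a hypothetical name.

```shell
# Print the SCSI address (host:channel:target:lun) that dmesg most recently
# associated with a device name such as "sdc". Reads dmesg text on stdin.
scsi_addr_of() {
    dev=$1
    sed -n "s/.*sd \([0-9:]*\): \[$dev\].*/\1/p" | tail -n 1
}

# Example:
#   dmesg | scsi_addr_of sdc
```

On a live system the same information should also be available from sysfs, where `/sys/block/sdc/device` is a symlink whose target directory is named after the address.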
Stop sdc, which is now out of the array, so that it can be pulled.

    # echo 1 > /sys/class/scsi_device/5\:0\:0\:0/device/delete
    $ dmesg
    [15244924.978492] sd 5:0:0:0: [sdc] Synchronizing SCSI cache
    [15244924.978885] sd 5:0:0:0: [sdc] Stopping disk
    [15244925.414845] ata6.00: disabled
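The command above is run from a root shell. With sudo, the redirection would still be performed by the unprivileged shell, so a common workaround is to pipe into `tee`. A small sketch, where `scsi_delete_path` is a made-up helper that only builds the path:

```shell
# Build the sysfs "delete" path for a SCSI address such as 5:0:0:0.
# Writing 1 to that file (as root) makes the kernel flush the cache, stop
# the disk and drop the device, as the dmesg lines above show.
scsi_delete_path() {
    printf '/sys/class/scsi_device/%s/device/delete\n' "$1"
}

# Non-root variant of the echo above (not executed here):
#   echo 1 | sudo tee "$(scsi_delete_path 5:0:0:0)" > /dev/null
scsi_delete_path 5:0:0:0
```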
It has stopped, so I gently pull it out.

    [15245059.522558] ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
    [15245059.522607] ata6: irq_stat 0x00400040, connection status changed
    [15245059.522636] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
    [15245059.522670] ata6: hard resetting link
    [15245060.244267] ata6: SATA link down (SStatus 0 SControl 300)
    [15245060.244278] ata6: EH complete

Ooh!
It came out without any trouble.
Plug in the new disk.

    $ dmesg
    [15245236.970765] ata6: exception Emask 0x50 SAct 0x0 SErr 0x40d0802 action 0xe frozen
    [15245236.970813] ata6: irq_stat 0x00000040, connection status changed
    [15245236.970843] ata6: SError: { RecovComm HostInt PHYRdyChg CommWake 10B8B DevExch }
    [15245236.970894] ata6: hard resetting link
    [15245237.915795] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    [15245238.097541] ata6.00: ATA-9: WDC WD30EZRX-00DC0B0, 80.00A80, max UDMA/133
    [15245238.097545] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
    [15245238.098367] ata6.00: configured for UDMA/133
    [15245238.098375] ata6: EH complete
    [15245238.098487] scsi 5:0:0:0: Direct-Access ATA WDC WD30EZRX-00D 80.0 PQ: 0 ANSI: 5
    [15245238.098813] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
    [15245238.098817] sd 5:0:0:0: [sdc] 4096-byte physical blocks
    [15245238.098866] sd 5:0:0:0: [sdc] Write Protect is off
    [15245238.098869] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
    [15245238.098889] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [15245238.099077] sdc: unknown partition table
    [15245238.127170] sd 5:0:0:0: [sdc] Attached SCSI disk

The new WD30EZRX was recognized.
Partitioning

    $ sudo fdisk /dev/sdc
    Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
    Building a new DOS disklabel with disk identifier 0x5ea3473a.
    Changes will remain in memory only, until you decide to write them.
    After that, of course, the previous content won't be recoverable.
    Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
    WARNING: The size of this disk is 3.0 TB (3000592982016 bytes).
    DOS partition table format can not be used on drives for volumes
    larger than (2199023255040 bytes) for 512-byte sectors. Use parted(1) and GUID
    partition table format (GPT).
    The device presents a logical sector size that is smaller than
    the physical sector size. Aligning to a physical sector (or optimal
    I/O) size boundary is recommended, or performance may be impacted.
    WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
             switch off the mode (command 'c') and change display units to
             sectors (command 'u').
    Command (m for help):

Hm?
It seems to be saying that a DOS partition table cannot handle anything larger than 2199023255040 bytes.
Use parted with GPT instead, it says.
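The odd-looking figure in that warning is just (2^32 - 1) sectors of 512 bytes. A quick sketch of the arithmetic, with hypothetical helper names `mbr_limit_bytes` and `needs_gpt` (it relies on the shell having 64-bit arithmetic, which is the usual case on amd64):

```shell
# Size limit (bytes) that fdisk quotes for a DOS label:
# (2^32 - 1) sectors of $1 bytes each (default 512).
mbr_limit_bytes() {
    sector=${1:-512}
    echo $(( (4294967296 - 1) * sector ))
}

# Succeed when a disk of $1 bytes is too big for a DOS label.
needs_gpt() {
    limit=$(mbr_limit_bytes "${2:-512}")
    [ $(( $1 > limit )) -eq 1 ]
}

mbr_limit_bytes                      # 2199023255040
needs_gpt 3000592982016 && echo "3 TB disk: use GPT"
needs_gpt 2000398934016 || echo "2 TB disk: a DOS label still fits"
```

The two byte counts are the ones fdisk printed for the new 3 TB disk and the old 2 TB disk.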
Starting with installing parted

    $ sudo aptitude install parted
    The following NEW packages will be installed:
      libparted0debian1{a} parted
    0 packages upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
    Need to get 498 kB of archives. After unpacking 991 kB will be used.
    Do you want to continue? [Y/n/?] Y
    Get:1 http://ftp2.jp.debian.org/debian/ squeeze/main libparted0debian1 amd64 2.3-5 [341 kB]
    Get:2 http://ftp2.jp.debian.org/debian/ squeeze/main parted amd64 2.3-5 [156 kB]
    Fetched 498 kB in 0s (835 kB/s)
    Selecting previously deselected package libparted0debian1.
    (Reading database ... 91556 files and directories currently installed.)
    Unpacking libparted0debian1 (from .../libparted0debian1_2.3-5_amd64.deb) ...
    Selecting previously deselected package parted.
    Unpacking parted (from .../parted_2.3-5_amd64.deb) ...
    Processing triggers for man-db ...
    Setting up libparted0debian1 (2.3-5) ...
    Setting up parted (2.3-5) ...

Partitioning, take two

    $ sudo parted /dev/sdc
    GNU Parted 2.3
    Using /dev/sdc
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    Warning: The existing disk label on /dev/sdc will be destroyed and all data on this disk will be lost. Do you want to continue?
    Yes/No? Yes
    (parted) unit GB
    (parted) print
    Model: ATA WDC WD30EZRX-00D (scsi)
    Disk /dev/sdc: 3001GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: gpt
    Number  Start  End  Size  File system  Name  Flags
    (parted) mkpart primary 0 3001
    (parted) set 1 raid on
    (parted) print
    Model: ATA WDC WD30EZRX-00D (scsi)
    Disk /dev/sdc: 3001GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: gpt
    Number  Start   End     Size    File system  Name     Flags
     1      0.00GB  3001GB  3001GB               primary  raid
    (parted) q
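The same partitioning can probably be done non-interactively with parted's script mode. This is a sketch under assumptions, not what was run above: the 1MiB start and `-a optimal` are my own choices to keep the partition aligned to the 4096-byte physical sectors, and `make_raid_partition` is a made-up wrapper that only prints the command unless `RUN=1` is set.

```shell
# Print (or, with RUN=1, execute) a scripted parted call that writes a GPT
# label and creates one RAID-flagged partition spanning the disk.
# WARNING: with RUN=1 this destroys the existing partition table on $1.
make_raid_partition() {
    disk=$1
    set -- parted -s -a optimal "$disk" mklabel gpt mkpart primary 1MiB 100% set 1 raid on
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# /dev/sdX is a placeholder; substitute the real device with care.
make_raid_partition /dev/sdX
```

On a GPT label, the word `primary` ends up as the partition name, which matches the "Name" column in the print output above.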
Add the freshly prepared sdc1 to the array.

    $ sudo mdadm /dev/md0 -a /dev/sdc1
    mdadm: added /dev/sdc1

From here on it is a long wait.
The rebuild felt slow, as if some limit were in effect, and when I looked into it that was indeed the case.
So, speed it up.

    # echo 75000 > /proc/sys/dev/raid/speed_limit_min
    $ cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdc1[2] sdb1[1]
          1953508352 blocks [2/1] [_U]
          [>....................]  recovery =  0.7% (14022784/1953508352) finish=511.3min speed=63213K/sec

It certainly got a lot faster, but I still have to wait more than 500 minutes.
Oh well, patience.
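The `finish=` figure can be reproduced from the other numbers on the recovery line: remaining blocks (1 KiB each) divided by the speed in K/sec. A minimal sketch with a hypothetical helper `rebuild_eta_min`:

```shell
# Estimate the remaining rebuild time in whole minutes.
# $1 = blocks done, $2 = total blocks, $3 = speed in K/sec,
# all three as shown on the recovery line of /proc/mdstat.
rebuild_eta_min() {
    awk -v d="$1" -v t="$2" -v s="$3" 'BEGIN { print int((t - d) / s / 60) }'
}

rebuild_eta_min 14022784 1953508352 63213    # 511

# The limit itself can also be set through sysctl instead of echo (as root):
#   sysctl -w dev.raid.speed_limit_min=75000
```

The estimate naturally moves around as the speed changes over the course of the rebuild.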
    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
      Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Wed May  1 06:25:44 2013
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31140
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8       17        1      active sync   /dev/sdb1

Replacing the other drive is the same routine.

    $ sudo mdadm /dev/md0 -f /dev/sdb1
    mdadm: set /dev/sdb1 faulty in /dev/md0
    $ cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdc1[0] sdb1[2](F)
          1953508352 blocks [2/1] [U_]
    unused devices: <none>
    $ sudo mdadm /dev/md0 -r /dev/sdb1
    mdadm: hot removed /dev/sdb1 from /dev/md0
    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
      Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
       Raid Devices : 2
      Total Devices : 1
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Wed May  1 16:38:48 2013
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 0
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31144
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       0        0        1      removed
    # echo 1 > /sys/class/scsi_device/4\:0\:0\:0/device/delete
    $ dmesg
    [15325486.325343] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
    [15325486.325564] sd 4:0:0:0: [sdb] Stopping disk
    [15325487.329696] ata5.00: disabled
Ready. Pull the drive.

    $ dmesg
    [15325663.380698] ata5: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
    [15325663.380746] ata5: irq_stat 0x00400040, connection status changed
    [15325663.380775] ata5: SError: { PHYRdyChg 10B8B DevExch }
    [15325663.380807] ata5: hard resetting link
    [15325664.100625] ata5: SATA link down (SStatus 0 SControl 300)
    [15325664.100636] ata5: EH complete

Plug in the new drive.

    $ dmesg
    [15325954.141923] ata5: link is slow to respond, please be patient (ready=0)
    [15325957.607452] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    [15325957.798290] ata5.00: ATA-9: WDC WD30EZRX-00DC0B0, 80.00A80, max UDMA/133
    [15325957.798295] ata5.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
    [15325957.799002] ata5.00: configured for UDMA/133
    [15325957.799010] ata5: EH complete
    [15325957.799122] scsi 4:0:0:0: Direct-Access ATA WDC WD30EZRX-00D 80.0 PQ: 0 ANSI: 5
    [15325957.799370] sd 4:0:0:0: [sdb] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
    [15325957.799373] sd 4:0:0:0: [sdb] 4096-byte physical blocks
    [15325957.799419] sd 4:0:0:0: [sdb] Write Protect is off
    [15325957.799422] sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    [15325957.799442] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [15325957.799572] sdb: unknown partition table
    [15325957.825131] sd 4:0:0:0: [sdb] Attached SCSI disk

After confirming that it was recognized: partition it, then add it to the array.

    $ sudo parted /dev/sdb
    GNU Parted 2.3
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) p
    Model: ATA WDC WD30EZRX-00D (scsi)
    Disk /dev/sdb: 3001GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
    Number  Start  End  Size  Type  File system  Flags
    (parted) mklabel gpt
    Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
    Yes/No? Yes
    (parted) unit GB
    (parted) p
    Model: ATA WDC WD30EZRX-00D (scsi)
    Disk /dev/sdb: 3001GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: gpt
    Number  Start  End  Size  File system  Name  Flags
    (parted) mkpart primary 0 3001
    (parted) set 1 raid on
    (parted) p
    Model: ATA WDC WD30EZRX-00D (scsi)
    Disk /dev/sdb: 3001GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: gpt
    Number  Start   End     Size    File system  Name     Flags
     1      0.00GB  3001GB  3001GB               primary  raid
    (parted) q
    Information: You may need to update /etc/fstab.
    $ sudo mdadm /dev/md0 -a /dev/sdb1
    mdadm: added /dev/sdb1
From here on, another long wait.

    $ cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdb1[2] sdc1[0]
          2147480704 blocks [2/1] [U_]
          [>....................]  recovery =  0.0% (908480/2147480704) finish=236.2min speed=151413K/sec
    unused devices: <none>
    $ dmesg
    [15326374.226174] md: bind<sdb1>
    [15326374.405108] RAID1 conf printout:
    [15326374.405112]  --- wd:1 rd:2
    [15326374.405116]  disk 0, wo:0, o:1, dev:sdc1
    [15326374.405120]  disk 1, wo:1, o:1, dev:sdb1
    [15326374.405240] md: recovery of RAID array md0
    [15326374.405244] md: minimum _guaranteed_ speed: 75000 KB/sec/disk.
    [15326374.405247] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
    [15326374.405251] md: using 128k window, over a total of 1953508352 blocks.

Huh, faster than I expected, about twice as fast in fact.
Changing dev.raid.speed_limit_min made a big difference. (How the partition was created probably plays a part as well.)
Apparently it is best to put this setting in sysctl.conf.

    $ sudo vim /etc/sysctl.conf

Append:

    dev.raid.speed_limit_min = 50000
    dev.raid.speed_limit_max = 200000
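Instead of editing by hand, the two lines could be written by a small helper that replaces an existing entry rather than appending a duplicate. This is a sketch only: `set_conf_key` is a made-up function, and the example runs against a scratch file rather than the real /etc/sysctl.conf.

```shell
# Set "key = value" in a sysctl-style file, replacing any earlier line for
# the same key so that repeated runs do not pile up duplicates.
# (The key is used as a regex, so its dots match any character.)
set_conf_key() {
    file=$1 key=$2 value=$3
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" || true
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Try it on a scratch file first.
conf=$(mktemp)
set_conf_key "$conf" dev.raid.speed_limit_min 50000
set_conf_key "$conf" dev.raid.speed_limit_max 200000
cat "$conf"
```

For the real file, the helper would have to run as root, and `sysctl -p` then loads the values without a reboot.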
Even so, that is a wait of about 240 minutes for 2 TB.

    $ cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdb1[1] sdc1[0]
          1953508352 blocks [2/2] [UU]
    unused devices: <none>
    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
      Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Thu May  2 06:25:44 2013
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31242
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8       17        1      active sync   /dev/sdb1

Both drives have now been replaced.
As it stands, part of each drive goes unused, so grow the array.

    $ sudo mdadm /dev/md0 --grow --size=max
    mdadm: component size of /dev/md0 has been set to 2930265024K
    $ cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdb1[1] sdc1[0]
          2930265024 blocks [2/2] [UU]
          [==============>......]  resync = 73.3% (2148814720/2930265024) finish=136.6min speed=95286K/sec
    unused devices: <none>

Growing by roughly 1 TB took about 140 minutes.

    $ sudo mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Sun Aug 22 12:23:53 2010
         Raid Level : raid1
         Array Size : 2930265024 (2794.52 GiB 3000.59 GB)
      Used Dev Size : -1
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 0
        Persistence : Superblock is persistent
        Update Time : Fri May  3 10:54:29 2013
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
               UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
             Events : 0.31380
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8       17        1      active sync   /dev/sdb1
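To sanity-check the numbers, the "Array Size" figure is in KiB, and dividing by 1024 twice gives the GiB value mdadm prints in parentheses. A tiny sketch with a hypothetical helper `kib_to_gib`:

```shell
# Convert a size in KiB (the unit of mdadm's "Array Size") to GiB.
kib_to_gib() {
    awk -v k="$1" 'BEGIN { printf "%.2f\n", k / 1024 / 1024 }'
}

kib_to_gib 1953508352                          # 1863.01 (before)
kib_to_gib 2930265024                          # 2794.52 (after)
kib_to_gib $(( 2930265024 - 1953508352 ))      # 931.51  (gained)
```

So the "roughly 1 TB" gained is about 931.5 GiB.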
Once the array is larger, grow the filesystem to match.

    $ sudo resize2fs /dev/md0
    resize2fs 1.41.12 (17-May-2010)
    Filesystem at /dev/md0 is mounted on /srv/store; on-line resizing required
    old desc_blocks = 117, new_desc_blocks = 175
    Performing an on-line resize of /dev/md0 to 732566256 (4k) blocks.
    The filesystem on /dev/md0 is now 732566256 blocks long.

This takes a fair while too.

    sudo resize2fs -p /dev/md0

Running it like this, so that progress is displayed, is probably a good idea. (I forgot to this time.)
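Put together, the last stage is three commands in a fixed order. The sketch below only prints them unless `RUN=1` is set; `grow_steps` is a made-up name, and inserting `mdadm --wait` (which blocks until the resync has finished) before the resize is my own cautious assumption, not something stated above. As far as I know, the resize2fs man page describes `-p` in terms of offline resizes, so it may show little during an online one.

```shell
# Print (or, with RUN=1, execute) the final steps in order:
# grow the array, wait for the resync, then grow the filesystem.
grow_steps() {
    md=$1
    for step in "mdadm --grow $md --size=max" "mdadm --wait $md" "resize2fs -p $md"; do
        if [ "${RUN:-0}" = 1 ]; then $step || return 1; else echo "$step"; fi
    done
}

grow_steps /dev/md0

# Cross-check: the new filesystem size in 4 KiB blocks, times 4, should
# equal the array size in KiB that mdadm reported.
echo $(( 732566256 * 4 ))   # 2930265024
```

The cross-check confirms that the filesystem now fills the whole array.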
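Going from a 2 TB RAID1 to 3 TB takes about two days, but almost all of that is waiting. As long as the drives are easy to pull and insert, expanding storage has become really painless. Not having to stop anything is the best part.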