Growing a RAID1 array without taking the system down


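Now that the drives are SATA, I wondered whether hot-swapping would work, so I tried it, and it did.
Because the filesystem is ext3, it can also be resized while mounted, so there is no downtime. I like this.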

First, check the current state.

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
  Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Apr 30 16:57:06 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31026

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       17        1      active sync   /dev/sdb1

The array is a 2 TB RAID1 built from sdb1 and sdc1.
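
As a side note on reading that output: mdadm reports "Array Size" in 1 KiB blocks and then shows the same value in GiB and GB. The small sketch below redoes that conversion; the function name `kib_to_units` is my own, not part of mdadm.

```shell
# Convert an mdadm "Array Size" value (1 KiB blocks) to GiB and GB.
# awk does the floating-point division; shell arithmetic is integer-only.
kib_to_units() {
    awk -v kib="$1" 'BEGIN {
        bytes = kib * 1024
        printf "%.2f GiB %.2f GB\n", bytes / (1024 ^ 3), bytes / (1000 ^ 3)
    }'
}

# Value taken from the mdadm --detail output above.
kib_to_units 1953508352
```

The result should agree with the figures mdadm prints in parentheses.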

Also check how the drives are partitioned.

$ sudo fdisk /dev/sdb
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): p

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0xa5128fe1

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      243201  1953512001   fd  Linux raid autodetect
Partition 1 does not start on physical sector boundary.

Command (m for help): q

The whole disk is a single Linux raid autodetect partition. Easy to follow.
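
fdisk also complains that partition 1 does not start on a physical sector boundary. The sketch below shows what that check amounts to on a drive with 512-byte logical and 4096-byte physical sectors. It assumes the old partition starts at sector 63, the traditional start in DOS-compatible mode (the output above only shows cylinder 1, so this is an assumption), and uses sector 2048 as an example of an aligned start. The helper name `is_aligned` is hypothetical.

```shell
# Succeed if a partition starting at the given logical sector begins on a
# physical sector boundary. Defaults: 512-byte logical, 4096-byte physical.
is_aligned() {
    start_sector=$1
    logical=${2:-512}
    physical=${3:-4096}
    [ $(( start_sector * logical % physical )) -eq 0 ]
}

# 63 is the assumed legacy start; 2048 is a common 1 MiB-aligned start.
for s in 63 2048; do
    if is_aligned "$s"; then
        echo "sector $s: aligned"
    else
        echo "sector $s: not aligned"
    fi
done
```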

Removing one side of the RAID1

$ sudo mdadm /dev/md0 -f /dev/sdc1
mdadm: set /dev/sdc1 faulty in /dev/md0

First, mark it as faulty.

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
  Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Apr 30 18:00:37 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31028

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1

       2       8       33        -      faulty spare   /dev/sdc1

It is now faulty.

Remove it.

$ sudo mdadm /dev/md0 -r /dev/sdc1
mdadm: hot removed /dev/sdc1 from /dev/md0

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
  Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Apr 30 18:01:00 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31032

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1

Confirmed that it has been removed.

Check the SCSI device numbers.

$ dmesg
[    1.863625] scsi 2:0:0:0: Direct-Access     ATA      PATRIOT MEMORY 3 02.1 PQ: 0 ANSI: 5
[    1.863876] scsi 4:0:0:0: Direct-Access     ATA      ST2000DM001-9YN1 CC46 PQ: 0 ANSI: 5
[    1.871566] scsi 5:0:0:0: Direct-Access     ATA      WDC WD20EARS-00M 51.0 PQ: 0 ANSI: 5
[    1.879340] sd 2:0:0:0: [sda] 62586880 512-byte logical blocks: (32.0 GB/29.8 GiB)
[    1.879433] sd 2:0:0:0: [sda] Write Protect is off
[    1.879438] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.879476] sd 2:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[    1.879625] sd 4:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[    1.879629] sd 4:0:0:0: [sdb] 4096-byte physical blocks
[    1.879710] sd 4:0:0:0: [sdb] Write Protect is off
[    1.879714] sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    1.879752] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.879758] sd 5:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[    1.879874] sd 5:0:0:0: [sdc] Write Protect is off
[    1.879879] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    1.879916] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

This shows each drive's model, its SCSI number, and its device name (drive letter).
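
Picking the SCSI address out of dmesg by eye is easy to get wrong, so here is a minimal sketch that extracts the "address, device name" pairs with sed. The function name `scsi_map` is my own, and the pattern only assumes the log format shown above. The sample input is copied from that dmesg output so the block runs without access to the kernel log.

```shell
# Print "H:C:T:L name" pairs from kernel log lines such as
#   [    1.879758] sd 5:0:0:0: [sdc] 3907029168 512-byte logical blocks: ...
scsi_map() {
    sed -n 's/.*sd \([0-9]*:[0-9]*:[0-9]*:[0-9]*\): \[\(sd[a-z]*\)] .*logical blocks.*/\1 \2/p'
}

# Sample lines taken from the dmesg output above.
scsi_map <<'EOF'
[    1.879340] sd 2:0:0:0: [sda] 62586880 512-byte logical blocks: (32.0 GB/29.8 GiB)
[    1.879433] sd 2:0:0:0: [sda] Write Protect is off
[    1.879625] sd 4:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[    1.879758] sd 5:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
EOF
```

On a live system the same function could be fed with `dmesg | scsi_map`.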

Stop sdc, which was just removed from the array, so that it can be unplugged.

# echo 1 > /sys/class/scsi_device/5\:0\:0\:0/device/delete
$ dmesg
[15244924.978492] sd 5:0:0:0: [sdc] Synchronizing SCSI cache
[15244924.978885] sd 5:0:0:0: [sdc] Stopping disk
[15244925.414845] ata6.00: disabled

It has stopped, so I gently pull it out.

[15245059.522558] ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[15245059.522607] ata6: irq_stat 0x00400040, connection status changed
[15245059.522636] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
[15245059.522670] ata6: hard resetting link
[15245060.244267] ata6: SATA link down (SStatus 0 SControl 300)
[15245060.244278] ata6: EH complete

Ooh.
It came out without any trouble.

Plug in the new disk.

$ dmesg
[15245236.970765] ata6: exception Emask 0x50 SAct 0x0 SErr 0x40d0802 action 0xe frozen
[15245236.970813] ata6: irq_stat 0x00000040, connection status changed
[15245236.970843] ata6: SError: { RecovComm HostInt PHYRdyChg CommWake 10B8B DevExch }
[15245236.970894] ata6: hard resetting link
[15245237.915795] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[15245238.097541] ata6.00: ATA-9: WDC WD30EZRX-00DC0B0, 80.00A80, max UDMA/133
[15245238.097545] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[15245238.098367] ata6.00: configured for UDMA/133
[15245238.098375] ata6: EH complete
[15245238.098487] scsi 5:0:0:0: Direct-Access     ATA      WDC WD30EZRX-00D 80.0 PQ: 0 ANSI: 5
[15245238.098813] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[15245238.098817] sd 5:0:0:0: [sdc] 4096-byte physical blocks
[15245238.098866] sd 5:0:0:0: [sdc] Write Protect is off
[15245238.098869] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[15245238.098889] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[15245238.099077]  sdc: unknown partition table
[15245238.127170] sd 5:0:0:0: [sdc] Attached SCSI disk

The new WD30EZRX was recognized.

Creating the partition

$ sudo fdisk /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x5ea3473a.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: The size of this disk is 3.0 TB (3000592982016 bytes).
DOS partition table format can not be used on drives for volumes
larger than (2199023255040 bytes) for 512-byte sectors. Use parted(1) and GUID
partition table format (GPT).

The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): 

Hm?
It seems to be saying that a DOS partition table cannot handle more than 2199023255040 bytes,
and that I should use parted with GPT instead.
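
That odd-looking number is just the largest sector count that fits in 32 bits multiplied by the 512-byte sector size. A short check of the arithmetic, using the disk size fdisk printed for the new drive:

```shell
# Largest size a DOS (MBR) partition table can describe with 32-bit
# sector counts and 512-byte sectors, as quoted in the fdisk warning.
sector_size=512
max_sectors=$(( (1 << 32) - 1 ))
mbr_limit=$(( max_sectors * sector_size ))
disk_bytes=3000592982016   # size fdisk printed for the new 3 TB disk

echo "MBR limit: $mbr_limit bytes"
if [ "$disk_bytes" -gt "$mbr_limit" ]; then
    echo "disk exceeds the MBR limit by $(( disk_bytes - mbr_limit )) bytes: use GPT"
fi
```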

Start by installing parted.

$ sudo aptitude install parted
以下の新規パッケージがインストールされます:
  libparted0debian1{a} parted
更新: 0 個、新規インストール: 2 個、削除: 0 個、保留: 0 個。
498 kB のアーカイブを取得する必要があります。展開後に 991 kB のディスク領域が新たに消費されます。
先に進みますか? [Y/n/?] Y
取得:1 http://ftp2.jp.debian.org/debian/ squeeze/main libparted0debian1 amd64 2.3-5 [341 kB]
取得:2 http://ftp2.jp.debian.org/debian/ squeeze/main parted amd64 2.3-5 [156 kB]
498 kB を 0秒 秒でダウンロードしました (835 kB/s)
未選択パッケージ libparted0debian1 を選択しています。
(データベースを読み込んでいます ... 現在 91556 個のファイルとディレクトリがインストールされています。)
(.../libparted0debian1_2.3-5_amd64.deb から) libparted0debian1 を展開しています...
未選択パッケージ parted を選択しています。
(.../parted_2.3-5_amd64.deb から) parted を展開しています...
man-db のトリガを処理しています ...
libparted0debian1 (2.3-5) を設定しています ...
parted (2.3-5) を設定しています ...

Partition the disk again, this time with parted.

$ sudo parted /dev/sdc
GNU Parted 2.3
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
Warning: The existing disk label on /dev/sdc will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) unit GB
(parted) print
Model: ATA WDC WD30EZRX-00D (scsi)
Disk /dev/sdc: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start  End  Size  File system  Name  Flags

(parted) mkpart primary 0 3001
(parted) set 1 raid on
(parted) print
Model: ATA WDC WD30EZRX-00D (scsi)
Disk /dev/sdc: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      0.00GB  3001GB  3001GB               primary  raid

(parted) q

Add the prepared sdc1 to the array.

$ sudo mdadm /dev/md0 -a /dev/sdc1
mdadm: added /dev/sdc1

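This is where the long wait begins.
The rebuild seemed slow, as if it were being throttled, and when I looked into it that was indeed the case.

So, speed it up.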

# echo 75000 > /proc/sys/dev/raid/speed_limit_min

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[2] sdb1[1]
      1953508352 blocks [2/1] [_U]
      [>....................]  recovery =  0.7% (14022784/1953508352) finish=511.3min speed=63213K/sec

It did get a good deal faster, but I still have to wait more than 500 minutes.
Oh well, time to be patient.
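
The finish estimate in /proc/mdstat can be roughly reproduced by hand: remaining blocks divided by the current speed. A minimal sketch with integer arithmetic, so the result is rounded down to whole minutes; the function name `rebuild_eta_min` is hypothetical.

```shell
# Estimate the remaining rebuild time in whole minutes from three
# /proc/mdstat fields: blocks done, total blocks (both in KiB) and the
# current speed in KiB/s.
rebuild_eta_min() {
    done_kib=$1
    total_kib=$2
    speed_kib_s=$3
    echo $(( (total_kib - done_kib) / speed_kib_s / 60 ))
}

# Values from the mdstat output above (finish=511.3min).
rebuild_eta_min 14022784 1953508352 63213
```

Since the speed changes during the rebuild, this is only a snapshot, just like the figure mdstat itself shows.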

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
  Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed May  1 06:25:44 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31140

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       17        1      active sync   /dev/sdb1

Replace the other drive the same way, step by step.
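
For reference, the detach steps used for the first drive can be summarised as a dry-run helper. This is only a sketch: it prints the commands instead of running them, the function name `print_detach_steps` is my own, and the SCSI address has to be looked up in dmesg first.

```shell
# Print (do not run) the commands for detaching one RAID member.
# Arguments: md device, member partition, SCSI address of the member's disk.
print_detach_steps() {
    md=$1
    part=$2
    scsi=$3
    echo "mdadm $md -f $part"      # mark the member faulty
    echo "mdadm $md -r $part"      # hot-remove it from the array
    echo "echo 1 > /sys/class/scsi_device/$scsi/device/delete"   # stop the disk
}

print_detach_steps /dev/md0 /dev/sdb1 4:0:0:0
```

Printing first and pasting the lines by hand keeps a typo in the device name from failing the wrong disk.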

$ sudo mdadm /dev/md0 -f /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[0] sdb1[2](F)
      1953508352 blocks [2/1] [U_]

unused devices: <none>


$ sudo mdadm /dev/md0 -r /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
  Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed May  1 16:38:48 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31144

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       0        0        1      removed


# echo 1 > /sys/class/scsi_device/4\:0\:0\:0/device/delete


$ dmesg
[15325486.325343] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
[15325486.325564] sd 4:0:0:0: [sdb] Stopping disk
[15325487.329696] ata5.00: disabled

Ready. Pull the drive out.

$ dmesg
[15325663.380698] ata5: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
[15325663.380746] ata5: irq_stat 0x00400040, connection status changed
[15325663.380775] ata5: SError: { PHYRdyChg 10B8B DevExch }
[15325663.380807] ata5: hard resetting link
[15325664.100625] ata5: SATA link down (SStatus 0 SControl 300)
[15325664.100636] ata5: EH complete

Plug in the new drive.

$ dmesg
[15325954.141923] ata5: link is slow to respond, please be patient (ready=0)
[15325957.607452] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[15325957.798290] ata5.00: ATA-9: WDC WD30EZRX-00DC0B0, 80.00A80, max UDMA/133
[15325957.798295] ata5.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[15325957.799002] ata5.00: configured for UDMA/133
[15325957.799010] ata5: EH complete
[15325957.799122] scsi 4:0:0:0: Direct-Access     ATA      WDC WD30EZRX-00D 80.0 PQ: 0 ANSI: 5
[15325957.799370] sd 4:0:0:0: [sdb] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[15325957.799373] sd 4:0:0:0: [sdb] 4096-byte physical blocks
[15325957.799419] sd 4:0:0:0: [sdb] Write Protect is off
[15325957.799422] sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[15325957.799442] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[15325957.799572]  sdb: unknown partition table
[15325957.825131] sd 4:0:0:0: [sdb] Attached SCSI disk

After confirming that it was recognized, partition it and add it to the array.

$ sudo parted /dev/sdb
GNU Parted 2.3
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA WDC WD30EZRX-00D (scsi)
Disk /dev/sdb: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos

Number  Start  End  Size  Type  File system  Flags

(parted) mklabel gpt
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) unit GB
(parted) p
Model: ATA WDC WD30EZRX-00D (scsi)
Disk /dev/sdb: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start  End  Size  File system  Name  Flags

(parted) mkpart primary 0 3001
(parted) set 1 raid on
(parted) p
Model: ATA WDC WD30EZRX-00D (scsi)
Disk /dev/sdb: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      0.00GB  3001GB  3001GB               primary  raid

(parted) q
Information: You may need to update /etc/fstab.


$ sudo mdadm /dev/md0 -a /dev/sdb1
mdadm: added /dev/sdb1

Another long wait starts here.

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[2] sdc1[0]
      2147480704 blocks [2/1] [U_]
      [>....................]  recovery =  0.0% (908480/2147480704) finish=236.2min speed=151413K/sec

unused devices: <none>


$ dmesg
[15326374.226174] md: bind<sdb1>
[15326374.405108] RAID1 conf printout:
[15326374.405112]  --- wd:1 rd:2
[15326374.405116]  disk 0, wo:0, o:1, dev:sdc1
[15326374.405120]  disk 1, wo:1, o:1, dev:sdb1
[15326374.405240] md: recovery of RAID array md0
[15326374.405244] md: minimum _guaranteed_  speed: 75000 KB/sec/disk.
[15326374.405247] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[15326374.405251] md: using 128k window, over a total of 1953508352 blocks.

Oh, it is faster than I expected, about twice as fast.
Raising dev.raid.speed_limit_min made a big difference (though how the partition was created probably plays a part too).

Apparently it is a good idea to put these settings in sysctl.conf.

$ sudo vim /etc/sysctl.conf
 Append:
dev.raid.speed_limit_min = 50000
dev.raid.speed_limit_max = 200000

Even so, 2 TB still means a wait of about 240 minutes.

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sdc1[0]
      1953508352 blocks [2/2] [UU]

unused devices: <none>


$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 1953508352 (1863.01 GiB 2000.39 GB)
  Used Dev Size : 1953508352 (1863.01 GiB 2000.39 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu May  2 06:25:44 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31242

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       17        1      active sync   /dev/sdb1

Both drives have now been replaced.
In this state part of each drive goes unused, so grow the array.

$ sudo mdadm /dev/md0 --grow --size=max
mdadm: component size of /dev/md0 has been set to 2930265024K

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sdc1[0]
      2930265024 blocks [2/2] [UU]
      [==============>......]  resync = 73.3% (2148814720/2930265024) finish=136.6min speed=95286K/sec

unused devices: <none>

Growing the array by roughly 1 TB takes about 140 minutes.
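
The "roughly 1 TB" can be checked from the two component sizes mdadm reported, both in KiB. A quick sketch of the subtraction, rounded down to whole GB:

```shell
old_kib=1953508352   # component size before --grow (from mdadm --detail)
new_kib=2930265024   # component size set by --grow --size=max
added_kib=$(( new_kib - old_kib ))

# 1 KiB = 1024 bytes, 1 GB = 10^9 bytes; integer division rounds down.
echo "added: $added_kib KiB (~$(( added_kib * 1024 / 1000000000 )) GB)"
```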

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Aug 22 12:23:53 2010
     Raid Level : raid1
     Array Size : 2930265024 (2794.52 GiB 3000.59 GB)
  Used Dev Size : -1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri May  3 10:54:29 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 893b086a:cdc70ecc:4b94b77b:7c0d0721 (local to host tohya)
         Events : 0.31380

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       17        1      active sync   /dev/sdb1

Once the array is larger, grow the filesystem on it.

$ sudo resize2fs /dev/md0
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/md0 is mounted on /srv/store; on-line resizing required
old desc_blocks = 117, new_desc_blocks = 175
Performing an on-line resize of /dev/md0 to 732566256 (4k) blocks.
The filesystem on /dev/md0 is now 732566256 blocks long.

This also takes quite a while.

$ sudo resize2fs -p /dev/md0

Running it like this to show progress is probably a good idea. (I forgot to this time.)
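
One sanity check worth doing afterwards is to confirm that the new filesystem really covers the whole array. The sketch below compares the block count printed by resize2fs (4 KiB blocks) with the component size printed by mdadm (KiB); both numbers are copied from the output above.

```shell
fs_blocks=732566256      # "(4k) blocks" reported by resize2fs
fs_block_kib=4           # filesystem block size in KiB
array_kib=2930265024     # component size reported by mdadm --grow

fs_kib=$(( fs_blocks * fs_block_kib ))
if [ "$fs_kib" -eq "$array_kib" ]; then
    echo "filesystem fills the array: $fs_kib KiB"
else
    echo "unused space: $(( array_kib - fs_kib )) KiB"
fi
```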

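Moving a RAID1 from 2 TB to 3 TB takes a good two days, but almost all of that is waiting. As long as the drives are easy to pull and insert, expanding storage has become very painless. Not having to shut anything down is really nice.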
