I have a QNAP TS-453A NAS with four 6TB disks configured as a RAID5 storage pool. It is currently running firmware 22.214.171.1242.
Yesterday I discovered that one of the disks (disk 3) had failed, and that the storage pool was in a "Degraded" state.
I quickly got a new disk. The old disk was a WD Red 6TB WD60EFRX, the new one is a WD Red 6TB WD60EFAX. I hot-swapped the disks. According to the documentation, the new disk should be detected automatically, and the storage pool should automatically start rebuilding ("Rebuilding" state). But nothing happened.
I checked the UI, Storage & Snapshots tool. The storage pool was still in degraded state, but all four disks were now green and healthy. However, disk 3 was listed as "not a member" of the storage pool. When I selected to Manage the pool, I could do nothing. The only action that was not disabled was "Rebuild RAID Group", but when I tried that there were no free disks to add to the RAID group.
So the problem appeared to be that disk 3 had been detected and was in use, but still it was listed as "not a member" of the storage pool. No actions were available in the UI to fix the situation. Pulling out the disk and inserting it again did not change anything. Googling for help showed that others have encountered similar situations, but no solutions helped me.
I decided to have a look "under the hood" to see if I could figure out what was wrong.
ssh admin@mynas
[~] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid5 sda3 sdd3 sdb3
      17551701504 blocks super 1.0 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]
md256 : active raid1 sdc2(S) sdd2(S) sdb2 sda2
      530112 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md13 : active raid1 sdc4 sda4 sdd4 sdb4
      458880 blocks super 1.0 [24/4] [UUUU____________________]
      bitmap: 1/1 pages [4KB], 65536KB chunk
md9 : active raid1 sdc1 sda1 sdd1 sdb1
      530048 blocks super 1.0 [24/4] [UUUU____________________]
      bitmap: 1/1 pages [4KB], 65536KB chunk
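As a side note, the degraded array can be picked out of this kind of output mechanically. The following is only a minimal sketch: `degraded_arrays` is a helper name I made up, and it assumes the `[slots/active]` counter format shown in the output above. On this NAS md9 and md13 report `[24/4]`, so they would match as well; the optional level argument narrows the result to the data array.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays that have fewer active members than slots.
# Reads /proc/mdstat-style text on stdin, so it can be tried on saved output.
# Optional $1 restricts the result to one RAID level, e.g. "raid5".
degraded_arrays() {
    awk -v want="${1:-}" '
        # Header line, e.g. "md1 : active raid5 sda3 sdd3 sdb3"
        /^md[0-9]+ :/ { name = $1; lvl = $4 }
        # Detail line, e.g. "... blocks super 1.0 ... [4/3] [UU_U]"
        name != "" && /blocks/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^\[[0-9]+\/[0-9]+\]$/) {
                    split(substr($i, 2, length($i) - 2), c, "/")
                    if (c[2] + 0 < c[1] + 0 && (want == "" || want == lvl))
                        print name, lvl, $i
                }
            }
            name = ""
        }
    '
}

# Typical use on the NAS itself:
#   degraded_arrays raid5 < /proc/mdstat
```

Reading from stdin rather than opening /proc/mdstat directly is a deliberate choice, so the function can be checked against a copy of the output before trusting it on a live system.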
OK, so /dev/md1 is my RAID5 storage pool. Only /dev/sda3, /dev/sdb3 and /dev/sdd3 are part of the group. /dev/sdc3 is missing. Let's check the group:
[~] # mdadm --misc --detail /dev/md1
/dev/md1:
        Version : 1.0
  Creation Time : Tue Aug 23 05:48:30 2016
     Raid Level : raid5
     Array Size : 17551701504 (16738.61 GiB 17972.94 GB)
  Used Dev Size : 5850567168 (5579.54 GiB 5990.98 GB)
   Raid Devices : 4
  Total Devices : 3
    Persistence : Superblock is persistent
    Update Time : Sat Apr 4 18:10:54 2020
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 512K
           Name : 1
           UUID : f82504b7:2c60d9bd:5676ec84:0a5ba214
         Events : 27378
    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       4       0        0        4      removed
       3       8       51        3      active sync   /dev/sdd3
OK, so 3 active devices, and /dev/sdc3 is missing, as expected. Let's check if the disk exists and is formatted like the other disks:
[~] # parted
GNU Parted 3.1
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) select /dev/sda
select /dev/sda
Using /dev/sda
(parted) print
print
Model: WDC WD60EFRX-68L0BN1 (scsi)
Disk /dev/sda: 6001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system     Name     Flags
 1      20.5kB  543MB   543MB   ext3            primary
 2      543MB   1086MB  543MB   linux-swap(v1)  primary
 3      1086MB  5992GB  5991GB                  primary
 4      5992GB  5993GB  543MB   ext3            primary
 5      5993GB  6001GB  8554MB  linux-swap(v1)  primary
(parted) select /dev/sdc
select /dev/sdc
Using /dev/sdc
(parted) print
print
Model: WDC WD60EFAX-68SHWN0 (scsi)
Disk /dev/sdc: 6001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system     Name     Flags
 1      20.5kB  543MB   543MB   ext3            primary
 2      543MB   1086MB  543MB                   primary
 3      1086MB  5992GB  5991GB                  primary
 4      5992GB  5993GB  543MB   ext3            primary
 5      5993GB  6001GB  8554MB                  primary
(parted) quit
quit
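Comparing the two listings by eye works for five partitions, but the same check can be scripted. This is a rough sketch under a few assumptions: the helper names `partition_layout` and `same_layout` are mine, the input is saved `parted` print output in the format above, and only partition number, start, end and size are compared. The file system column is ignored on purpose, since the swap partitions on the new disk are not yet labelled in the listing above.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two saved "parted ... print" listings.

# Keep only the partition rows and print: number start end size.
partition_layout() {
    awk '$1 ~ /^[0-9]+$/ { print $1, $2, $3, $4 }'
}

# Succeeds when both files describe the same partition geometry.
# $1 and $2 are files holding saved parted output.
same_layout() {
    [ "$(partition_layout < "$1")" = "$(partition_layout < "$2")" ]
}

# Possible use, assuming the parted build on the NAS accepts script mode (-s):
#   parted -s /dev/sda print > /tmp/sda.txt
#   parted -s /dev/sdc print > /tmp/sdc.txt
#   same_layout /tmp/sda.txt /tmp/sdc.txt && echo "layouts match"
```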
OK, so the new disk seems to be formatted correctly. Let's just try to add the missing disk partition to the RAID group:
[~] # mdadm --manage /dev/md1 --add /dev/sdc3
mdadm: added /dev/sdc3
[~] # mdadm --misc --detail /dev/md1
/dev/md1:
        Version : 1.0
  Creation Time : Tue Aug 23 05:48:30 2016
     Raid Level : raid5
     Array Size : 17551701504 (16738.61 GiB 17972.94 GB)
  Used Dev Size : 5850567168 (5579.54 GiB 5990.98 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent
    Update Time : Sat Apr 4 18:18:17 2020
          State : active, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 512K
 Rebuild Status : 0% complete
           Name : 1
           UUID : f82504b7:2c60d9bd:5676ec84:0a5ba214
         Events : 27846
    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       4       8       35        2      spare rebuilding   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3
Great! The RAID group is recovering! The NAS emitted two beeps and the status light started blinking red/green. In the UI, the storage pool state changed to "Rebuilding".
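If you would rather follow the rebuild from the shell than from the UI, the "Rebuild Status" line in the `mdadm --detail` output can be polled. A small sketch, assuming that line keeps the "N% complete" format shown above; `rebuild_percent` is a hypothetical helper name, not an mdadm feature.

```shell
#!/bin/sh
# Hypothetical helper: extract the rebuild percentage from "mdadm --detail"
# output read on stdin. Prints nothing when no rebuild is in progress.
rebuild_percent() {
    sed -n 's/^ *Rebuild Status : *\([0-9][0-9]*\)% complete.*$/\1/p'
}

# Possible polling loop on the NAS (stops once the line disappears):
#   while pct=$(mdadm --misc --detail /dev/md1 | rebuild_percent); [ -n "$pct" ]; do
#       echo "md1 rebuild: ${pct}%"
#       sleep 60
#   done
```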
For some reason the NAS did not correctly add the /dev/sdc3 disk partition to the storage pool. The disk had been correctly partitioned and the partitions formatted, and the other RAID arrays had apparently recovered, but not /dev/md1. Adding /dev/sdc3 manually to /dev/md1 fixed the problem.
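Before running `--add` by hand, it may be worth confirming that the array is merely degraded. mdadm refuses `--add` on an array it considers failed, and there is no point adding a disk to an array that is already recovering. The check below is a cautious sketch: `array_state` and `safe_to_add` are names I invented, and the logic relies only on the "State :" line of `mdadm --detail` as it appears in the outputs in this post.

```shell
#!/bin/sh
# Hypothetical pre-flight check, reading "mdadm --detail" output on stdin.

# Print the value of the first "State :" line, e.g. "clean, degraded".
array_state() {
    sed -n 's/^ *State : *//p' | head -n 1
}

# Succeed only when the array is degraded but neither failed nor
# already recovering. Anything unexpected counts as "not safe".
safe_to_add() {
    state=$(array_state)
    case "$state" in
        *FAILED*|*recovering*) return 1 ;;
        *degraded*)            return 0 ;;
        *)                     return 1 ;;
    esac
}

# Possible use on the NAS:
#   if mdadm --misc --detail /dev/md1 | safe_to_add; then
#       mdadm --manage /dev/md1 --add /dev/sdc3
#   else
#       echo "md1 is not in a plain degraded state; not adding anything"
#   fi
```

Treating every unknown state as unsafe is intentional: a false "no" only costs a manual look at the output, while a false "yes" could touch an array that needs different handling.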
One more thing: it looks like /etc/config/mdadm.conf and /etc/config/raidtab are missing. /etc/mdadm.conf and /etc/raidtab existed as symbolic links to the non-existent files. I'm not sure that they are needed, but as a precaution I created them. mdadm.conf is created like this:
[~] # mdadm --detail --scan >> /etc/config/mdadm.conf
and this is the content of raidtab:
[~] # cat /etc/config/raidtab
raiddev /dev/md1
        raid-level      5
        nr-raid-disks   4
        nr-spare-disks  0
        chunk-size      4
        persistent-superblock   1
        device  /dev/sda3
        raid-disk       0
        device  /dev/sdb3
        raid-disk       1
        device  /dev/sdc3
        raid-disk       2
        device  /dev/sdd3
        raid-disk       3
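To find out whether your own NAS has the same dangling links, a plain shell test is enough. A tiny sketch; `is_dangling_symlink` is a made-up helper name, and the two paths in the usage comment are simply the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeed when $1 is a symbolic link whose target
# does not exist. "-L" tests the link itself, "-e" follows it.
is_dangling_symlink() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Possible use on the NAS:
#   for f in /etc/mdadm.conf /etc/raidtab; do
#       if is_dangling_symlink "$f"; then
#           echo "$f points to a missing file: $(readlink "$f")"
#       fi
#   done
```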
Same problem, and this worked for me!
I ran into the same problem with my RAID 1 (two SSDs). When I restarted my ts551, one SSD failed. The light was on and the system could detect the disk, but it raised a "not a member" error when I tried to rebuild the RAID 1. I could not erase the disk either. Then I followed these instructions and it worked!
Thanks for your blog!!!
This worked for me as well!!
Thanks a million!!
Alternate solution - possibly
I had all the same issues as described in the post, on a QNAP 431P-1. One of the disks (disk 2) had failed, and the storage pool was in a "Degraded" state. Disk 2 should have been OK: the SMART info was all fine. I ran a scan for bad blocks overnight; it came back OK and the disk was green. However, the RAID would not rebuild, and there were no free disks to add to the RAID group. I would have followed the instructions here, but at step 3 below I could not get a browser connection to the NAS to enable SSH. By the time I had followed steps 4 through 6 it had started rebuilding.
1) Powered off the NAS.
2) Pulled all disks out of the NAS.
3) Powered on the NAS and waited about 10 to 15 minutes (this could maybe be done more quickly).
4) Powered off the NAS.
5) Pushed all disks back into the NAS.
6) Powered on the NAS. The status light flashed red and green for a few minutes.
On logging into the NAS it is now rebuilding the RAID.
I'll watch the disk to see if it fails again, in which case I'll have a NEW replacement to hand.
Same issue here as well. I tried some other suggestions involving a "frozen" state of the RAID disk, but no disks were "frozen"...
Following this procedure simply worked fine for me and now my RAID disk is rebuilding ;-)
QNAP support helped to get my degraded RAID 5 array back online, but didn't fix the replacement disk showing as not being a member.
Your information has been excellent, allowing me to add the disk back in as described.
Sincerely grateful for the clear description. All the best, Damian
Can't add the new drive
Thanks for this tutorial!
I had the same issue with RAID rebuilding. My Disk3 failed and I swapped it for a new drive, but the rebuild is not happening. Now my second disk is already in a warning condition, so I am trying hard to get this RAID rebuilt with Disk3.
I followed the steps in this tutorial, but the error below pops up:
mdadm --manage /dev/md1 --add /dev/sdc3
mdadm: /dev/md1 has failed so using --add cannot work and might destroy
mdadm: data on /dev/sdc3. You should stop the array and re-assemble it.
[/] # mdadm --misc --detail /dev/md1
Version : 1.0
Creation Time : Wed Jan 20 00:45:40 2016
Raid Level : raid5
Array Size : 11691190848 (11149.59 GiB 11971.78 GB)
Used Dev Size : 3897063616 (3716.53 GiB 3990.59 GB)
Raid Devices : 4
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Wed Apr 26 10:59:13 2023
State : active, FAILED, Rescue
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : 1
UUID : a378f780:be36cd12:3aec18ed:36d7afcb
Events : 81301
Number Major Minor RaidDevice State
5 8 19 0 active sync /dev/sdb3
1 8 51 1 active sync /dev/sdd3
4 8 3 2 faulty /dev/sda3
6 0 0 6 removed
Any fix is appreciated!