I need help with RAID error

edited September 2021 in General

Hello, I've never had to deal with RAID issues before (I'm new to dedicated servers since we switched from VPSes).

This is the error that I received by email:

A Fail event had been detected on md device /dev/md/2.

It could be related to component device /dev/nvme0n1p3.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md2 : active raid1 nvme1n1p3[1] nvme0n1p3[0](F)
      3745885504 blocks super 1.2 [2/1] [_U]
      bitmap: 3/28 pages [12KB], 65536KB chunk

md0 : active raid1 nvme1n1p1[1] nvme0n1p1[0]
      4189184 blocks super 1.2 [2/2] [UU]

md1 : active raid1 nvme1n1p2[1] nvme0n1p2[0]
      523264 blocks super 1.2 [2/2] [UU]

unused devices: <none>

A Fail event had been detected on md device /dev/md/1.

It could be related to component device /dev/nvme0n1p2.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md2 : active raid1 nvme1n1p3[1]
      3745885504 blocks super 1.2 [2/1] [_U]
      bitmap: 4/28 pages [16KB], 65536KB chunk

md0 : active raid1 nvme1n1p1[1] nvme0n1p1[0](F)
      4189184 blocks super 1.2 [2/1] [_U]

md1 : active raid1 nvme1n1p2[1] nvme0n1p2[0](F)
      523264 blocks super 1.2 [2/1] [_U]

unused devices: <none>

And there's another email with:

A Fail event had been detected on md device /dev/md/0.

It could be related to component device /dev/nvme0n1p1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md2 : active raid1 nvme1n1p3[1]
      3745885504 blocks super 1.2 [2/1] [_U]
      bitmap: 4/28 pages [16KB], 65536KB chunk

md0 : active raid1 nvme1n1p1[1]
      4189184 blocks super 1.2 [2/1] [_U]

md1 : active raid1 nvme1n1p2[1]
      523264 blocks super 1.2 [2/1] [_U]

unused devices: <none>

What should I do? Thanks.

Amadex.com Domainer + IT Supporter | Brbljaona Balkan Chat Website | ICT Jobs Croatia

Comments

  • Here's more info, gathered with commands I found by Googling:

    [root@blue ~]# smartctl -H /dev/nvme1n1p2
    smartctl 7.1 2020-08-23 r5080 [x86_64-linux-4.18.0-305.17.1.lve.el8.x86_64] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    [root@blue ~]# smartctl -H /dev/nvme1n1p1
    smartctl 7.1 2020-08-23 r5080 [x86_64-linux-4.18.0-305.17.1.lve.el8.x86_64] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    [root@blue ~]# smartctl -H /dev/nvme1n1p3
    smartctl 7.1 2020-08-23 r5080 [x86_64-linux-4.18.0-305.17.1.lve.el8.x86_64] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    [root@blue ~]# fdisk -l /dev/nvme1n1p1 /dev/nvme1n1p2 /dev/nvme1n1p3
    Disk /dev/nvme1n1p1: 4 GiB, 4294967296 bytes, 8388608 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 131072 bytes / 131072 bytes
    
    
    Disk /dev/nvme1n1p2: 512 MiB, 536870912 bytes, 1048576 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 131072 bytes / 131072 bytes
    
    
    Disk /dev/nvme1n1p3: 3.5 TiB, 3835922030080 bytes, 7492035215 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 131072 bytes / 131072 bytes
    
    


  • edited September 2021

    And the partitions:

    [root@blue ~]# df -h
    Filesystem      Size  Used Avail Use% Mounted on
    devtmpfs         63G     0   63G   0% /dev
    tmpfs            63G     0   63G   0% /dev/shm
    tmpfs            63G  952K   63G   1% /run
    tmpfs            63G     0   63G   0% /sys/fs/cgroup
    /dev/md2        3.5T   50G  3.3T   2% /
    /dev/md1        485M  332M  128M  73% /boot
    tmpfs            13G     0   13G   0% /run/user/0
    

    The server is Hetzner AX 101

    and I did the partitioning following their tutorial for disks larger than 2 TB, using installimage (from the rescue system, during the OS install):

    PART swap swap 4G
    PART /boot ext3 512M
    PART / ext4 all
    


  • SGraf (Hosting Provider, Services Provider)
    edited September 2021

    Helped a bit via Discord... mdadm should be back in sync for now.

    Thanked by (1)Not_Oles

    MyRoot.PW ★ Dedicated Servers ★ LIR-Services ★ Web-Hosting ★
    ★ Locations: Austria + Netherlands + USA ★

  • @SGraf Thanks for helping! Thanks also for posting about the resolution so the thread wouldn't be left just dangling. :)

    @Amadex @SGraf Could you guys please post a brief note about

    • what the problem was,

    • what caused the problem, and

    • how you fixed it?

    Best wishes and kindest regards from a clueless™ guy in the desert! 🏜️

    Tom. 穆坦然. Not Oles. Happy New York City guy visiting Mexico! How is your 文言文?
    The MetalVPS.com website runs very speedily on MicroLXC.net! Thanks to @Neoon!

  • SagnikS (Hosting Provider, OG)
    edited September 2021

    You should get the NVMe that was removed from the array replaced ASAP. I had that happen too and put it back in the RAID array because a badblocks and SMART test came out clean. A few hours later, the node started behaving extremely weirdly (high iowait) and eventually crashed.

    Thanked by (1)Not_Oles
  • I don't understand why in 2021 people still do mdadm arrays... Logical Volume Groups, btrfs, or ZFS are the way to go. There's too many issues with write holes and desyncing on mdadm that require manual intervention for my tastes.

    Thanked by (1)Not_Oles

    Cheap dedis are my drug, and I'm too far gone to turn back.

  • @SagnikS said: the NVMe that was removed from the array

    Sorry, how do we know that an NVMe was removed from the array? :)

    Thanked by (1)SagnikS


  • SagnikS (Hosting Provider, OG)

    @Not_Oles said:

    @SagnikS said: the NVMe that was removed from the array

    Sorry, how do we know that an NVMe was removed from the array? :)

    Got an alert from our monitoring software, and an email too. cat /proc/mdstat will also show your array as degraded.
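For anyone who wants to check this from a script rather than by eye, here is a minimal Python sketch that reads /proc/mdstat-style text and reports which arrays are degraded. The sample text is the first alert email from this thread; the function name `degraded_arrays` and the returned dictionary layout are made up for illustration, and the parsing only assumes the usual `[expected/active] [U_]` status line format.

```python
import re

# Sample taken from the first alert email earlier in this thread.
MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 nvme1n1p3[1] nvme0n1p3[0](F)
      3745885504 blocks super 1.2 [2/1] [_U]
      bitmap: 3/28 pages [12KB], 65536KB chunk

md0 : active raid1 nvme1n1p1[1] nvme0n1p1[0]
      4189184 blocks super 1.2 [2/2] [UU]

md1 : active raid1 nvme1n1p2[1] nvme0n1p2[0]
      523264 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array: info} for every array running with fewer members than expected."""
    result = {}
    current = None   # array whose status line we are still waiting for
    members = ""     # text after "mdX : " on the header line
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+) : (.*)$", line)
        if header:
            current, members = header.group(1), header.group(2)
            continue
        # Status line looks like "... [2/1] [_U]": expected members / active members.
        status = re.search(r"\[(\d+)/(\d+)\] \[([U_]+)\]", line)
        if current and status:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                # Members flagged (F) are the ones md has marked as failed.
                failed = re.findall(r"(\S+)\[\d+\]\(F\)", members)
                result[current] = {
                    "expected": expected,
                    "active": active,
                    "failed": failed,
                }
            current = None
    return result


if __name__ == "__main__":
    for name, info in degraded_arrays(MDSTAT).items():
        print(name, info)
```

On a live box you would pass in the contents of /proc/mdstat instead of the sample string. Note that once a failed member has been removed from the array (as in the later emails), the `failed` list comes back empty while the array is still reported as degraded.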

  • SagnikS (Hosting Provider, OG)

    @CamoYoshi said:
    I don't understand why in 2021 people still do mdadm arrays... Logical Volume Groups, btrfs, or ZFS are the way to go. There's too many issues with write holes and desyncing on mdadm that require manual intervention for my tastes.

    LVM RAID uses the same underlying driver iirc, btrfs is still not stable and ZFS has a performance overhead/you need to tune it properly. mdadm still works just fine ootb and is a tested solution.

    Thanked by (3)Falzo mfs ialexpw
  • @SGraf helped me a lot. Thanks again! 🙌

    @Not_Oles
    Problem: I got an email saying that the RAID had failed
    What caused the problem: dunno
    Fixed: @SGraf did the troubleshooting

    Idk if I should replace the disk or keep it for now.


  • SGraf (Hosting Provider, Services Provider)

    @Amadex said:
    @SGraf helped me a lot. Thanks again! 🙌

    @Not_Oles
    Problem: I got an email saying that the RAID had failed
    What caused the problem: dunno
    Fixed: @SGraf did the troubleshooting

    Idk if I should replace the disk or keep it for now.

    As I said in the chat, get that disk replaced. Just because we put the mdadm RAID back together for now doesn't mean it will be stable in the future.

    We saw I/O write errors on the SSD before the system dropped it completely. The disk came back after a reboot, and we re-added and synced it.
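The exact commands used were not posted in this thread, so the following is only a sketch of what re-adding a dropped member typically looks like with mdadm, written as a small Python dry run that prints the commands instead of executing them. The array/member pairs come from the alert emails above; the helper name `readd_command` and the `full_resync` switch are hypothetical.

```python
import shlex

# Array/member pairs as reported in the alert emails in this thread.
DROPPED = {
    "/dev/md0": "/dev/nvme0n1p1",
    "/dev/md1": "/dev/nvme0n1p2",
    "/dev/md2": "/dev/nvme0n1p3",
}


def readd_command(array, member, full_resync=False):
    """Build (but do not run) the mdadm command that returns a member to an array.

    --re-add lets md use a write-intent bitmap, if the array has one, to resync
    only what changed; --add treats the device as new and resyncs everything.
    """
    action = "--add" if full_resync else "--re-add"
    return ["mdadm", "--manage", array, action, member]


if __name__ == "__main__":
    for array, member in DROPPED.items():
        # Dry run: print the command for review instead of executing it.
        print(shlex.join(readd_command(array, member)))
```

In the mdstat output posted in this thread only md2 shows a bitmap line, so a bitmap-based fast resync would presumably only have been possible there. Either way, watch /proc/mdstat until every array is back to [UU] before trusting the mirror again.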


  • edited September 2021

    Run smartctl -a against your NVMe drives to get an idea of how worn out they are...
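As a rough guide to which lines of that output matter for wear, here is a minimal Python sketch that pulls a few fields out of the SMART/Health section of `smartctl -a` text. The sample values match what gets posted in the next reply; the function names and the pass/fail rule are assumptions for illustration, not an official health criterion.

```python
# Excerpt of the SMART/Health section of `smartctl -a` for an NVMe drive,
# using the values that are posted in the next reply in this thread.
SMART_TEXT = """\
Critical Warning:                   0x00
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
"""


def nvme_health(text):
    """Extract the wear-related fields from smartctl's NVMe health output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    def pct(name):
        return int(fields[name].rstrip("%"))

    def num(name):
        # smartctl prints large counters with thousands separators.
        return int(fields[name].replace(",", ""))

    return {
        "critical_warning": int(fields["Critical Warning"], 16),
        "spare_pct": pct("Available Spare"),
        "spare_threshold_pct": pct("Available Spare Threshold"),
        "used_pct": pct("Percentage Used"),
        "media_errors": num("Media and Data Integrity Errors"),
        "error_log_entries": num("Error Information Log Entries"),
    }


def looks_worn_or_faulty(health):
    """Simple illustrative rule: any warning flag, low spare, or media errors."""
    return (
        health["critical_warning"] != 0
        or health["spare_pct"] <= health["spare_threshold_pct"]
        or health["media_errors"] > 0
        or health["used_pct"] >= 100
    )


if __name__ == "__main__":
    health = nvme_health(SMART_TEXT)
    print(health, "suspicious:", looks_worn_or_faulty(health))
```

A clean result here only says the drive's own counters look fine. As noted earlier in the thread, a drive can pass SMART and still drop out of the array again.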

  • @Falzo

    [root@blue ~]# smartctl -a /dev/nvme0n1
    smartctl 7.1 2020-08-23 r5080 [x86_64-linux-4.18.0-305.17.1.lve.el8.x86_64] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF INFORMATION SECTION ===
    Model Number:                       SAMSUNG MZQL23T8HCLS-00A07
    Serial Number:                      S64HNE0R514262
    Firmware Version:                   GDC5302Q
    PCI Vendor/Subsystem ID:            0x144d
    IEEE OUI Identifier:                0x002538
    Total NVM Capacity:                 3,840,755,982,336 [3.84 TB]
    Unallocated NVM Capacity:           0
    Controller ID:                      6
    Number of Namespaces:               32
    Namespace 1 Size/Capacity:          3,840,755,982,336 [3.84 TB]
    Namespace 1 Utilization:            154,588,065,792 [154 GB]
    Namespace 1 Formatted LBA Size:     512
    Local Time is:                      Thu Sep 30 12:26:00 2021 CEST
    Firmware Updates (0x17):            3 Slots, Slot 1 R/O, no Reset required
    Optional Admin Commands (0x005f):   Security Format Frmw_DL NS_Mngmt Self_Test MI_Snd/Rec
    Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
    Maximum Data Transfer Size:         512 Pages
    Warning  Comp. Temp. Threshold:     80 Celsius
    Critical Comp. Temp. Threshold:     83 Celsius
    Namespace 1 Features (0x1a):        NA_Fields No_ID_Reuse *Other*
    
    Supported Power States
    St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
     0 +    25.00W   14.00W       -    0  0  0  0       70      70
     1 +     8.00W  0.0800W       -    1  1  1  1       70      70
    
    Supported LBA Sizes (NSID 0x1)
    Id Fmt  Data  Metadt  Rel_Perf
     0 +     512       0         0
     1 -    4096       0         0
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    SMART/Health Information (NVMe Log 0x02)
    Critical Warning:                   0x00
    Temperature:                        37 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          10%
    Percentage Used:                    0%
    Data Units Read:                    10,941,488 [5.60 TB]
    Data Units Written:                 1,419,512 [726 GB]
    Host Read Commands:                 14,992,414
    Host Write Commands:                7,351,883
    Controller Busy Time:               118
    Power Cycles:                       5
    Power On Hours:                     74
    Unsafe Shutdowns:                   0
    Media and Data Integrity Errors:    0
    Error Information Log Entries:      0
    Warning  Comp. Temperature Time:    0
    Critical Comp. Temperature Time:    0
    Temperature Sensor 1:               37 Celsius
    Temperature Sensor 2:               47 Celsius
    
    Error Information (NVMe Log 0x01, max 64 entries)
    No Errors Logged
    
    [root@blue ~]# smartctl -a /dev/nvme1n1
    smartctl 7.1 2020-08-23 r5080 [x86_64-linux-4.18.0-305.17.1.lve.el8.x86_64] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF INFORMATION SECTION ===
    Model Number:                       SAMSUNG MZQL23T8HCLS-00A07
    Serial Number:                      S64HNE0R514263
    Firmware Version:                   GDC5302Q
    PCI Vendor/Subsystem ID:            0x144d
    IEEE OUI Identifier:                0x002538
    Total NVM Capacity:                 3,840,755,982,336 [3.84 TB]
    Unallocated NVM Capacity:           0
    Controller ID:                      6
    Number of Namespaces:               32
    Namespace 1 Size/Capacity:          3,840,755,982,336 [3.84 TB]
    Namespace 1 Utilization:            3,840,755,978,240 [3.84 TB]
    Namespace 1 Formatted LBA Size:     512
    Local Time is:                      Thu Sep 30 12:26:38 2021 CEST
    Firmware Updates (0x17):            3 Slots, Slot 1 R/O, no Reset required
    Optional Admin Commands (0x005f):   Security Format Frmw_DL NS_Mngmt Self_Test MI_Snd/Rec
    Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
    Maximum Data Transfer Size:         512 Pages
    Warning  Comp. Temp. Threshold:     80 Celsius
    Critical Comp. Temp. Threshold:     83 Celsius
    Namespace 1 Features (0x1a):        NA_Fields No_ID_Reuse *Other*
    
    Supported Power States
    St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
     0 +    25.00W   14.00W       -    0  0  0  0       70      70
     1 +     8.00W  0.0800W       -    1  1  1  1       70      70
    
    Supported LBA Sizes (NSID 0x1)
    Id Fmt  Data  Metadt  Rel_Perf
     0 +     512       0         0
     1 -    4096       0         0
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    SMART/Health Information (NVMe Log 0x02)
    Critical Warning:                   0x00
    Temperature:                        36 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          10%
    Percentage Used:                    0%
    Data Units Read:                    135,580 [69.4 GB]
    Data Units Written:                 12,285,985 [6.29 TB]
    Host Read Commands:                 645,231
    Host Write Commands:                22,273,118
    Controller Busy Time:               30
    Power Cycles:                       5
    Power On Hours:                     74
    Unsafe Shutdowns:                   0
    Media and Data Integrity Errors:    0
    Error Information Log Entries:      0
    Warning  Comp. Temperature Time:    0
    Critical Comp. Temperature Time:    0
    Temperature Sensor 1:               36 Celsius
    Temperature Sensor 2:               46 Celsius
    
    Error Information (NVMe Log 0x01, max 64 entries)
    No Errors Logged
    
    
    [root@blue ~]# sudo sfdisk -l /dev/nvme0n1
    Disk /dev/nvme0n1: 3.5 TiB, 3840755982336 bytes, 7501476528 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 131072 bytes / 131072 bytes
    Disklabel type: gpt
    Disk identifier: 8A555D66-C2F6-4F35-8677-55BDFC148408
    
    Device           Start        End    Sectors  Size Type
    /dev/nvme0n1p1    4096    8392703    8388608    4G Linux RAID
    /dev/nvme0n1p2 8392704    9441279    1048576  512M Linux RAID
    /dev/nvme0n1p3 9441280 7501476494 7492035215  3.5T Linux RAID
    /dev/nvme0n1p4    2048       4095       2048    1M BIOS boot
    
    Partition table entries are not in disk order.
    [root@blue ~]# sudo sfdisk -l /dev/nvme1n1
    Disk /dev/nvme1n1: 3.5 TiB, 3840755982336 bytes, 7501476528 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 131072 bytes / 131072 bytes
    Disklabel type: gpt
    Disk identifier: 95B72003-3BE5-4905-ABFB-F6DB8851BA89
    
    Device           Start        End    Sectors  Size Type
    /dev/nvme1n1p1    4096    8392703    8388608    4G Linux RAID
    /dev/nvme1n1p2 8392704    9441279    1048576  512M Linux RAID
    /dev/nvme1n1p3 9441280 7501476494 7492035215  3.5T Linux RAID
    /dev/nvme1n1p4    2048       4095       2048    1M BIOS boot
    
    Partition table entries are not in disk order.
    
    


  • [root@blue ~]# cat /proc/mdstat
    Personalities : [raid1]
    md2 : active raid1 nvme0n1p3[0] nvme1n1p3[1]
          3745885504 blocks super 1.2 [2/2] [UU]
          bitmap: 6/28 pages [24KB], 65536KB chunk
    
    md0 : active raid1 nvme0n1p1[2] nvme1n1p1[1]
          4189184 blocks super 1.2 [2/2] [UU]
    
    md1 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
          523264 blocks super 1.2 [2/2] [UU]
    
    unused devices: <none>
    
    


  • @Amadex said:

    These NVMe drives are brand new. There's no real argument for having them replaced.

    Did you power off the server via the panel at some point after the first installation? Maybe a power loss led to a broken initial RAID sync in the first place...

  • edited September 2021

    @Falzo said:

    @Amadex said:

    These NVMe drives are brand new. There's no real argument for having them replaced.

    Did you power off the server via the panel at some point after the first installation? Maybe a power loss led to a broken initial RAID sync in the first place...

    I never did that. After the Plesk installation and the CentOS 8 > CloudLinux 8 conversion, I did a normal reboot via SSH. The server was bought on 27.09.2021, and everything was installed and rebooted that day. Since then I've touched nothing.

    Thanked by (1)Falzo


  • @Amadex said:

    I never did that. After the Plesk installation and the CentOS 8 > CloudLinux 8 conversion, I did a normal reboot via SSH. The server was bought on 27.09.2021, and everything was installed and rebooted that day. Since then I've touched nothing.

    Weird... I doubt, however, that there is anything wrong with either of the NVMes at all, whatever hiccup that was.

  • @Falzo said:

    Weird... I doubt, however, that there is anything wrong with either of the NVMes at all, whatever hiccup that was.

    I will wait and see if it happens again. Thanks, everyone, for the replies.

    Thanked by (1)Falzo


  • edited September 2021

    @SagnikS said:

    LVM RAID uses the same underlying driver iirc, btrfs is still not stable and ZFS has a performance overhead/you need to tune it properly. mdadm still works just fine ootb and is a tested solution.

    LVM at least has the benefit of self-healing in RAID1 scenarios despite calling mdadm for the underlying RAID functionality, and offers greater flexibility over mdadm.

    btrfs is considered stable for RAID1: https://btrfs.wiki.kernel.org/index.php/Status - Performance tuning just needs to happen to the code and then it'll be a lot more viable, but that being said it is quite performant as it is now.

    ZFS performance overhead is way overblown; the "1TB of storage needs 1GB of RAM" is only for enterprise level applications with many clients simultaneously reading and writing to the array, and the recommended tuning settings are well documented and understood.

    ZFS performance tuning official recommendations: https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html

    ZFS developer on the "1GB for 1TB" rule:
    https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrh5iv/
    https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrngar/


  • SagnikS (Hosting Provider, OG)

    @CamoYoshi said:

    LVM at least has the benefit of self-healing in RAID1 scenarios despite calling mdadm for the underlying RAID functionality, and offers greater flexibility over mdadm.

    btrfs is considered stable for RAID1: https://btrfs.wiki.kernel.org/index.php/Status - Performance tuning just needs to happen to the code and then it'll be a lot more viable, but that being said it is quite performant as it is now.

    ZFS performance overhead is way overblown; the "1TB of storage needs 1GB of RAM" is only for enterprise level applications with many clients simultaneously reading and writing to the array, and the recommended tuning settings are well documented and understood.

    ZFS performance tuning official recommendations: https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html

    ZFS developer on the "1GB for 1TB" rule:
    https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrh5iv/
    https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrngar/

    I'm not sure what you're referring to by LVM's self healing :sweat_smile:.

    I'm referring to Btrfs's stability. A few weeks back, a friend lost data after a power loss (openSUSE + Btrfs). This rarely happens with ext4.

    And yes, not sure where that RAM requirement came from. I run a few large (100+TB) Proxmox servers that run ZFS (that's the only logical choice, xfs has been having problems with files disappearing apparently, ext4 is limited) and they run on very little RAM. I find it really nice that the writehole problems are patched there, but there are a few quirks. IOWait was much higher when the VM was in a zvol than when it was in a file in the same zpool that the zvol was in. Another problem with ZFS is the lack of mainstream linux support atm. It should improve in future though hopefully.

    I have high hopes for a project called bcachefs, it seems really cool :smiley:
