RAID Woes - Is My Disk Failing?

Lol. Well, the server was working this entire time, so it’s not like the machine was unusable. The first response to the ticket came within a couple of hours, which is reasonable. At first he was going to find a replacement server for me to migrate to, so I’m guessing they were out, which ended up with him offering the disk replacement. From the time I gave the go-ahead to having a working machine again was just a few hours.

Could the ticket responses have come quicker? I guess so, if he had multiple staffers handling tickets. Could the replacement process have been faster? Sure, he could have immediately scheduled the disk replacement. Would I be paying what I’m currently paying (peanuts) for my machine to make the above happen? Certainly not.

I’m already paying dirt cheap pricing for a server that already has free upgrades (SSDs instead of HDDs), and a similarly spec’d machine would easily cost 2-3 times as much elsewhere.

Rest in pieces… two other drives kicked themselves out of the RAID array this morning. Guess I’m shit out of luck with recovering any of my data, then. Here goes round two.

Well, the free upgrades were good while they lasted. Obviously your only option now is to wait for them to put in spares and then rebuild/restore from backups.

If you feel like this is a bit much even for bottom-of-the-barrel pricing, you may wish to consider a value-priced data center instead, to limit hardware failures in the future while not paying ridiculous prices for “all of the trimmings”. Especially if what’s on it is used in any sort of production.

Or even just if your time could be put to better use. Say you work somewhere and get paid $20/hour, and say you spend 3 hours a month on this on average. In that situation it may be wise to consider paying up to $60/month more for a node elsewhere with better hardware and/or services, so you can spend that time on something more worthwhile.
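To put numbers on that, here is a quick back-of-the-envelope sketch in Python. The $20/hour and 3 hours a month are the figures from the example; the function name is made up purely for illustration.

```python
def break_even_premium(hourly_rate, hours_lost_per_month):
    """Extra monthly spend that costs the same as the time lost babysitting the server."""
    return hourly_rate * hours_lost_per_month


# Figures from the example: $20/hour, 3 hours a month.
premium = break_even_premium(20, 3)
print(f"Paying up to ${premium}/month extra breaks even")
```

Plug in your own rate and hours. Anything you pay below that number for a more reliable box is money ahead, at least if you count your free time as worth something.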

Yep - my thoughts exactly. Already picked up a replacement machine that is over twice the cost of this one elsewhere to test out and see if it’ll be a suitable replacement for the long term.

I never really had any major issues with my machine with them in the 2+ years I’ve had it until these RAID issues started popping up (after the CPU error that made me move machines to begin with), so it’s a shame that I’ve had to put so much of my free time into getting things sorted out. I’ll likely be cancelling in the near future as a result.

I am glad that you’re investigating this, and I hope you find someone better at a price you can afford for the long term. This issue and the related ones have frankly become a bit much.

Since I run Ceph and experiment with GlusterFS, I don’t worry about data integrity anymore.

Make sure you’ve got ECC RAM on all nodes, and put them in different locations.

Speed is nice, but in the end, it’s about keeping your data safe.

The latest version comes with a nice dashboard: Ceph

https://www.youtube.com/playlist?list=PLrBUGiINAakNCnQUosh63LpHbf84vegNu

ZXHost remembers.

EDIT: the disaster.

From the Hetzner Wiki:

What is the difference between local and ceph disks for servers on the CX models?

Servers with local disks keep all data on a local RAID mirror on the host system. They are optimized for high I/O performance and low latency and are especially suited for applications which require fast access to disks with low latency, such as databases.

Servers with ceph disks store their data on a remote filesystem. Each block is stored on three different servers. They are especially suited for higher availability needs: If the local host hardware fails, we will boot the server on a different machine.
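For anyone wondering what "each block is stored on three different servers" buys you, here is a toy sketch in Python. To be clear, this is NOT how Ceph actually places data (it uses CRUSH); the hash-based placement, the server names and the block names are all made up for illustration. It only shows that with three distinct copies, any two servers can fail and every block is still readable.

```python
import hashlib
from itertools import combinations

SERVERS = ["s1", "s2", "s3", "s4", "s5"]  # hypothetical host names
REPLICAS = 3


def place(block_id, servers=SERVERS, replicas=REPLICAS):
    """Pick `replicas` distinct servers for a block (toy hash placement, not CRUSH)."""
    start = int(hashlib.sha256(block_id.encode()).hexdigest(), 16) % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]


def readable(block_id, failed):
    """A block is still readable if at least one replica sits on a live server."""
    return any(s not in failed for s in place(block_id))


blocks = [f"block-{n}" for n in range(100)]

# Kill every possible pair of servers and check that each block keeps a live copy.
survives_two = all(
    readable(b, set(pair)) for pair in combinations(SERVERS, 2) for b in blocks
)
print(survives_two)
```

Lose all three servers holding a given block and that block is gone, which is why the replica count and where the replicas sit both matter.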

Ceph uses a fast kernel driver:
https://docs.ceph.com/docs/master/cephfs/kernel/

That was what ZXHost was using with the storage plans, IIRC.

And Hetzner, OVH, Tilaa and DO too.

Yeah, they did. Though, I doubt they had the amount of redundancy that Hetzner employs. Remember how dirt cheap their storage plans were?

https://ceph.com/ceph-blog/ceph-user-survey-2018-results/

Geo redundancy is nice and all, but you have to remember it’s still FAULT TOLERANCE, NOT A BACKUP strategy.

Cheap out on one or the other for the application in question and you are at risk of compromising your data ACCESS or the data ITSELF. I don’t care if you have Ceph set up with 5x redundancy around the entire globe or whatever. If it’s not backed up, then you value your data at precisely ZERO.
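If the difference isn’t obvious, here is a minimal Python sketch of it. The `ReplicatedStore` class is a made-up toy, not any real Ceph or GlusterFS API. The point is only that replication copies your mistakes just as faithfully as your data, while a backup is a separate point-in-time copy.

```python
import copy


class ReplicatedStore:
    """Toy store that mirrors every change to all replicas immediately."""

    def __init__(self, replicas=3):
        self.replicas = [{} for _ in range(replicas)]

    def put(self, key, value):
        for r in self.replicas:
            r[key] = value

    def delete(self, key):
        # The delete is replicated just as faithfully as the write was.
        for r in self.replicas:
            r.pop(key, None)

    def get(self, key):
        return self.replicas[0].get(key)


store = ReplicatedStore()
store.put("photos.tar", "important bytes")

# A backup is a separate point-in-time copy that later changes do not touch.
backup = copy.deepcopy(store.replicas[0])

store.delete("photos.tar")  # fat-fingered rm, ransomware, buggy script...

print(store.get("photos.tar"))   # looked up in the replicated store
print(backup.get("photos.tar"))  # looked up in the backup
```

After the delete, no replica has the file anymore, no matter how many of them you run. Only the backup copy still does.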

Anyways, I don’t get why this was added, since the OP was having issues with the hardware quality, not with the data itself having been compromised. I am sure they knew that they were responsible for backing it up or whatnot. :)

it shows

Yeah, I do. Happy, happy days until it broke. I ran my first Nextcloud server with them. Now it’s on an OVH ARM server, but I suspect I will need to move it again soon.

Day 1 with emulated drives went well. Configured RAID 1 and played around. Will be doing other RAID levels next. Anything else you’d advise?

Do a hard power-off and remove one of the drives. Then recover with a new disk image.
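If you want to see what that drill is exercising before you do it on the emulated drives, here is a toy model of a two-disk mirror in Python. It is only a sketch: the `Mirror` class and its methods are invented for illustration and have nothing to do with mdadm or any real RAID tooling.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy member disks."""

    def __init__(self, size):
        self.disks = [bytearray(size), bytearray(size)]

    def write(self, offset, data):
        for d in self.disks:
            if d is not None:
                d[offset:offset + len(data)] = data

    def read(self, offset, length):
        # Serve the read from the first disk that is still present.
        for d in self.disks:
            if d is not None:
                return bytes(d[offset:offset + length])
        raise IOError("no healthy disk left in the array")

    def pull(self, index):
        """Simulate yanking a drive out of the array."""
        self.disks[index] = None

    def degraded(self):
        return any(d is None for d in self.disks)

    def rebuild(self, index):
        """Add a replacement disk and copy everything over from the survivor."""
        survivor = next(d for d in self.disks if d is not None)
        self.disks[index] = bytearray(survivor)


array = Mirror(16)
array.write(0, b"hello raid")

array.pull(0)                      # drive removed
still_there = array.read(0, 10)    # data served from the surviving disk
array.write(10, b"!")              # writes keep working while degraded

array.rebuild(0)                   # new disk image added and resynced
print(array.degraded(), array.disks[0] == array.disks[1])
```

On the real setup, the things worth checking are the same: the array reports degraded, reads and writes still work on one disk, and after the rebuild both members match again.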