Best hosting or data center story

Over the years I am sure everyone has a good story to tell about a customer or provider.

One of my favorite stories to tell is about a time I toured a data center looking for a colo solution, long before we had our own data centers. As we walked through the facility I noticed a puddle of water on the raised floor, and when I asked about it I was told, verbatim, “ah, that’s just water and nothing to worry about.” I was like “um, it is actually something to worry about, and why I just asked you about it.”

As we moved on I noticed it was quite warm and there was no airflow coming out of the raised-floor perforated tiles. The answer I was given was…wait for it…”that particular tile is for the air return, which is why you don’t feel air coming out.” Me: “Wait, what…so you guys are able to push and pull air under the same raised floor? And shouldn’t the return be up high where the hot air is? And even if you were somehow using the raised floor as a return, that doesn’t answer why the heck it is so hot in here?”

So they basically tried to pass off their non-working AC, and the resulting lack of air coming out of the floor, as the raised floor being used as the return. I was flabbergasted. The blatant bs was so easy to disprove it was frankly awkward. I got the heck out of there, and long story short the tour did not go well for anyone.

6 Likes

This is excellent! Did you ever get to the bottom of why there was water on the floor, or has this remained a mystery ever since? I think it’s safe to assume you went ahead with an alternative facility instead?

I feel in this situation it would’ve just been better to say ‘Our AC is being repaired’ facepalm.

I’ve mentioned it before, but back in the never-neverland times of yore (early 2000s), I had a few racks of my own.

One time I went to install a new SPARC, and I saw one of their 1st/2nd-tier NOCites sitting perplexed in front of an unpopulated rack, reading a “For Dummies” book. If memory serves, it was TCP/IP For Dummies.


At one point (after a near-brush at HotMail), I worked for one of the largest colo backbones in Santa Clara, not counting pure transit points, and with the way the dotcom era was heating up, folks were wanting to cash out and move on. We got acquired. The large company that bought us sent in management first, who immediately brought their gear in and tried to use our ports with their network.

They knocked out the TFTP/netboot setup used by all of the NOC, first…

It was a fun few months, but the first week was not fun: you’d be dealing with an issue, and JoePHB would decide to take your IP address at random, knocking you both offline.


Then, there’s the infamous time a “Linux only” guy was asked to perform a service on an IRIX box. Why they blindly gave him root, I have no clue. Yep, he used “killall” (which on IRIX, as on other SysV-style systems, is the shutdown helper that signals every process it can reach, not the kill-by-name command Linux folks expect), and it went silent. Luckily enough, the system needed a reboot for a recent update anyway, but his face went incredibly pale when he was told that it’d be his task to dig up hardware and a 13W3-capable monitor if it didn’t come up…


At one point, a middle-manager at Motorola called up fuming and was immediately sent to the top of the queue (mine). It’s not uncommon for people to be a bit heated, but this guy was just seconds from giving himself a heart attack.

Long story short, their uplink ran through us at work, and he couldn’t reach us from his home, so it was obviously our fault.

Cutting to the chase, he had a half-broken NAT setup at home (this predated UPnP and other awful broken standards) and refused to open the port for his VPN into his internal network, and it was still our fault, even when their network engineer explained several times that it wasn’t.

We ended up faxing him RFC1918 when he suggested that it was still our fault, and that anyone he had spoken to would be sued personally, blah, blah, blah.

Never did hear from him again during my reign there.

2 Likes

Never got an answer on the water. I think I was in a state of shock, or I would have pressed further. I believe it may have been this tour that gave us the kick in the butt we needed to just do it ourselves and eventually open up TPA1.

2 Likes

My guess is that they had at least one unit die horribly, and with inadequate drainage (and no way for it to exit with the fans not spinning), the condensation found a direct pathway.

Of course, that design would have been obliterated if it were near the water during the torrential season. Was it in Florida, by chance?


Thought of another one!

There was that time when they were doing the usual (every six or so months) “cutover” test at GC, kicking clients onto battery, then diesel. Rolling blackouts hadn’t happened yet, but they would soon.

Well, the place was overloaded and poorly designed, and they had thrown something like four MORE full-sized conditioners on the roof to keep things coolish. They didn’t bother redesigning for the temporary battery load carried before the generators kicked in, so everything died hard, and without the batteries to kickstart the automatic starter on the diesel (why those were integrated, I have no clue), things stayed offline.

It was about 48 hours before their data center was back to normal. The next time I went there, most of the racks were empty. Gee, I wonder why.

E: I really need to learn to edit offline. Clicking “Save” several times just makes my post look scattered in viewing the changes. :smiley:

Best story:

yum update

Result:

Every reseller server down, at one of the largest hosts in the world, for 12+ hours.

6 Likes

why do i feel like this was at a certain crocodile flavored host

2 Likes

…that doesn’t really narrow it down all that much, though!

1 Like
[embedded video]

2 Likes

Please tell me this was playing on loop on every TV in the office all the time.

3 Likes

https://tubedubber.com/?q=_JzDaxj-c6c:q_qUiytLYRc:0:100:0:0:1

they definitely missed a hit with this one

One of my favorite pics of all time was the infamous cardboard box server chassis. This was reported on WHT waaaaay back in the day. Some disgruntled employee or customer of some hosting company posted this pic on their way out the door. Actually they dated the box…May 27, 2009. The fact they took the time to date it kills me.

4 Likes

[embedded image]

2 Likes

One of my best memories comes from just a few weeks after I started here at my current employer.
I guess it is pretty standard, but anyway.

We had a quite large colo customer where one of their techs phoned in at 3 AM, screaming (in an Indian accent):
“HEEEEELP, I TYPED ‘rm -rf /’”.
That was a fun couple of hours, halting the server, restoring from backups and getting everything back up.
Cost them quite a lot of mooonies :wink:

I get how you’d accidentally type ‘exitr’ instead of ‘exit’ in your terminal, but I never understood how one could accidentally type in ‘rm -rf /’… reminds me of this:

1 Like

The only time I can see this happening is if they hit enter before they entered the directory they wanted to remove. However, nowadays rm refuses to recurse on / unless you pass --no-preserve-root, so this is an impressive feat.
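For the curious (and only inside a disposable VM, please), a rough sketch of that failsafe as it behaves with a reasonably recent GNU coreutils rm; the exact message wording varies a bit by version:

rm -rf /
# rm: it is dangerous to operate recursively on '/'
# rm: use --no-preserve-root to override this failsafe

# The check only matches "/" itself, so the shell-expanded variant
# rm -rf /*   (which becomes /bin /boot /etc …) slips right past it.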

That looks like 123-Reg’s Nottingham “DC”, pre-Host Europe takeover.

You missed the “physically RAIDed” drives, which were a pair of drives strapped together with electrical tape.

Thankfully FDC got out of the crapbox hosting business.

Now that they’ve moved (almost?) everything to Cogent facilities, they’re forced to use proper rack mounts. I know there was a big hubbub from their Denver customers who all of a sudden had to source 1Us with IPMI.

Francisco

2 Likes