[Poll] What Do You Use To Monitor Your Servers/Websites?

Michael · June 17, 2019, 7:09am

Hate to be that guy… I’m still rocking Nagios…

Solaire · June 17, 2019, 7:16am

PHP Server Monitor + LibreNMS (incl. Nagios plugin) + Smokeping + UptimeRobot.

Ympker · June 18, 2019, 10:31am

Also using Freenom for the .cf? Hehe

Mason · June 18, 2019, 11:36am

You bet’cha

WSS · June 18, 2019, 12:05pm

But rsyslog can do that tho…

FHR · June 18, 2019, 12:07pm

Yeah, one line in rsyslog and all system logs fly away to Graylog <3

WSS · June 18, 2019, 12:09pm

I was just being a dick, but I plan on looking further into Graylog. Thanks for the tip.

FHR · June 18, 2019, 12:09pm

Hm. Technically speaking, wouldn’t HetrixTools be classified as a “self hosted” tool for you?

WSS · June 18, 2019, 12:12pm

It has to be. I was teasing someone else about “blaming their tools”, and @Andrei jumped into the comments about it being HIS tool, rather than the person commenting.

E: 100% across the board this week. Then ns1 networking bounced several times afterwards for no apparent reason. I’d still like a way to centralize testing on HT before it sends me “END OF THE WORLD” notices, but I’m quite happy with it.

Andrei · June 18, 2019, 2:32pm

Technically true indeed, it’d fall under both the ‘HetrixTools’ and ‘Self-hosted’ categories, since it’s both.

Guilty as charged.

Any more info on how you’d see such a feature function? Sort of an escalation feature? Or…?

WSS · June 18, 2019, 2:49pm

Well, not explicitly, but having your services able to pool together and say ‘well, this isn’t working in Atlantis, but it works fine in Constantinople, so it might just be local’ would be awesome. I literally got paged 5 times about ns1 going down within an hour because a local monitor was having shitfits to that location.

ns1:~$ uptime
 07:48:39 up 274 days, 18:04,  load average: 0.00, 0.00, 0.00

It’s just fine.

Andrei · June 18, 2019, 3:14pm

All locations data is being pooled before any notifications are sent out. Just one location facing issues cannot trigger an alert under any circumstances.
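To make the pooling idea concrete, here is a minimal sketch of a quorum rule: an alert fires only when several independent locations agree the target is down. This is not HetrixTools' actual logic. The `ProbeResult` type, the `should_alert` function, and both thresholds are hypothetical names and values picked for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    location: str  # hypothetical location label, e.g. "atlantis"
    ok: bool       # True if the check from that location succeeded


def should_alert(results, min_failed=2, min_failed_ratio=0.5):
    """Return True only when enough locations agree the target is down.

    Both thresholds are assumptions for this sketch: at least `min_failed`
    locations must fail, and they must make up at least `min_failed_ratio`
    of all locations that reported.
    """
    if not results:
        return False  # no data is not treated as an outage
    failed = sum(1 for r in results if not r.ok)
    return failed >= min_failed and failed / len(results) >= min_failed_ratio


if __name__ == "__main__":
    probes = [
        ProbeResult("atlantis", ok=False),
        ProbeResult("constantinople", ok=True),
        ProbeResult("elsewhere", ok=True),
    ]
    # Only one location failed, so this is treated as a local problem.
    print(should_alert(probes))
```

With a rule like this, a single flaky location can never page anyone on its own, which matches the behaviour described above.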

I’d suggest you look through your Location Fail Log because I’m seeing a lot of locations timing out towards your NS1 monitor, all trying to connect multiple times within the same minute. Different providers, different geo-locations, trying multiple times, virtually impossible to be a false positive.

And your Network Diagnostics (which are collected after your monitor is declared as being down) also show network issues from multiple locations.

The ‘uptime’ command does not take into account network connectivity issues to/from your server.
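As a rough illustration of the difference, the sketch below tests whether a TCP port can actually be reached, which is closer to what an external monitor sees than `uptime` is. It only tells you something when run from a different machine than the one being checked. The function name `tcp_reachable`, the host `ns1.example.org`, and the choice of port 53 are assumptions for the example, not details from this thread.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical target: a name server answering DNS over TCP on port 53.
    print(tcp_reachable("ns1.example.org", 53))
```

A box can report 274 days of uptime and a load of 0.00 while a check like this fails from several networks at once.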

WSS · June 18, 2019, 3:55pm

I really appreciate the work you did, but what I was asking about was more the frequency of the alerts than whether they are pooled across all monitors.

Shit, you’re my only SaaS other than MailCheap. That should mean something about your product.

imok · June 18, 2019, 6:22pm

Self-hosted SaaS

beagle · June 18, 2019, 6:34pm

Jumping on the bandwagon of feature suggestions: I’d really like to be able to set up recurring maintenance slots. I run my Nextcloud backups every day at the same time, so it would be good to be able to set these as maintenance slots and not be alerted about the outage.
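For what it's worth, the core of a recurring slot is just a time-of-day check. Below is a minimal sketch of how a monitor could decide whether "now" falls inside a daily window and suppress the alert if so. The function name `in_daily_window` and the 02:00 to 03:00 backup window are made up for the example; this is not an existing feature of any tool mentioned here.

```python
from datetime import datetime, time


def in_daily_window(now, start, end):
    """Return True if now's time of day falls in [start, end).

    Windows that cross midnight (start later than end) are handled too.
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    # Window wraps past midnight, e.g. 23:30 to 00:30.
    return t >= start or t < end


if __name__ == "__main__":
    # Hypothetical nightly backup window from 02:00 to 03:00.
    backup_start, backup_end = time(2, 0), time(3, 0)
    if in_daily_window(datetime.now(), backup_start, backup_end):
        print("inside maintenance window, suppress alerts")
    else:
        print("outside maintenance window, alert as usual")
```

The end of the window is exclusive, so alerts resume exactly at the end time.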

Harambe · June 18, 2019, 8:37pm

A mix of HetrixTools + UptimeRobot, plus recently put up an Observium instance to play around with that a bit.

thagoat · June 18, 2019, 8:49pm

Smoke signals and a monkey that flings poo when he sees downtime. Great setup!

WSS · June 18, 2019, 9:14pm

Hold on sport- I haven’t signed the contract yet.

Daniel · June 18, 2019, 9:33pm

I use UptimeRobot for most things (status page at https://uptime.d.sb/), but I’ve been meaning to move some of the monitors to HetrixTools. I also use Prometheus + blackbox_exporter + Grafana to monitor ping times for a few servers.

Andrei · June 18, 2019, 10:45pm

Hey, if this is in regards to HetrixTools, you can schedule your maintenances with a simple cronjob to wget or curl our API’s maintenance endpoint: