Having a strange issue on one of my Plex boxes: all the damn CPUs are stuck at 800MHz. The box has had strangely high load lately, so I went looking for the cause. The RAID array is fine and IO load is relatively low, but the CPUs are maxed out (which doesn’t make sense for what it was doing at the time). CPU temps are fine (low 40s).
CPU clock speed readings (during 100% CPU use):
~$ cat /proc/cpuinfo | grep MHz
cpu MHz : 798.217
cpu MHz : 798.214
cpu MHz : 798.217
cpu MHz : 798.216
cpu MHz : 798.215
cpu MHz : 798.215
cpu MHz : 798.215
cpu MHz : 798.216
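For anyone who wants to watch this over time rather than eyeball eight lines, here is a minimal sketch that condenses the same "cpu MHz" lines into one summary line. The function name `cpu_mhz_summary` is made up for illustration; it only assumes the usual /proc/cpuinfo layout shown above.

```shell
# Summarise the "cpu MHz" lines of a cpuinfo-style file.
# Usage: cpu_mhz_summary [file]   (defaults to /proc/cpuinfo)
# cpu_mhz_summary is a hypothetical helper name, not a standard tool.
cpu_mhz_summary() {
    awk -F: '
        /^cpu MHz/ {
            mhz = $2 + 0                      # numeric value after the colon
            if (n == 0 || mhz < min) min = mhz
            if (n == 0 || mhz > max) max = mhz
            n++
        }
        END {
            if (n == 0) { print "no cpu MHz lines found"; exit 1 }
            printf "cores=%d min=%.0f max=%.0f MHz\n", n, min, max
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Running `cpu_mhz_summary` while the box is under load gives a single line per sample. If min and max both stay near the bottom of the range while CPU use is at 100%, every core is being held down, not just one.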
This is an E3 processor transcoding only 2 files, yet load is extremely high, CPU usage is at 100%, and CPU frequency is being limited to ~800MHz for some reason.
The issue presented itself after the machine had been running for a good long while (many months) without interruption, then suddenly it started. I regularly apply updates to this machine so it’s possible something was updated just prior to the issue starting.
I’ll check the BIOS now. Here’s cpufreq output:
analyzing CPU 7:
driver: intel_pstate
CPUs which run at the same hardware frequency: 7
CPUs which need to have their frequency coordinated by software: 7
maximum transition latency: 4294.55 ms.
hardware limits: 800 MHz - 3.80 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 800 MHz and 3.80 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 798 MHz.
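The output above says the policy allows up to 3.80 GHz while the core sits at 798 MHz. A rough sketch of the same comparison straight from the standard cpufreq sysfs files (`scaling_cur_freq`, `scaling_max_freq`, `cpuinfo_min_freq`, all in kHz) is below. The function name `check_cpufreq` and the 5% margin are my own assumptions, and "pinned" only means something while the machine is actually under load.

```shell
# Report whether one cpufreq policy directory shows a core sitting at its
# hardware minimum even though the OS policy allows more.
# Usage: check_cpufreq /sys/devices/system/cpu/cpu0/cpufreq
# check_cpufreq is a hypothetical helper; the 5% margin is an arbitrary choice.
check_cpufreq() {
    dir=$1
    cur=$(cat "$dir/scaling_cur_freq") || return 2
    pol_max=$(cat "$dir/scaling_max_freq") || return 2
    hw_min=$(cat "$dir/cpuinfo_min_freq") || return 2
    # All values are in kHz. "Within 5% of the hardware minimum" counts as pinned.
    if [ "$pol_max" -gt "$hw_min" ] && [ $((cur * 100)) -le $((hw_min * 105)) ]; then
        echo "pinned: cur=${cur}kHz hw_min=${hw_min}kHz policy_max=${pol_max}kHz"
    else
        echo "ok: cur=${cur}kHz policy_max=${pol_max}kHz"
    fi
}
```

To check every core: `for d in /sys/devices/system/cpu/cpu*/cpufreq; do check_cpufreq "$d"; done`. If the policy maximum is high and the cores still report "pinned" under load, the limit is probably not coming from the governor settings.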
Well, back in the old AMD days, the AMD FX had the same issue.
The CPU gets its power through VRMs (voltage regulator modules), and the CPU could get throttled even when the CPU itself was only at about 40-50 degrees Celsius.
The issue was that the VRMs had gone over 50 degrees; once they go over 50, they reduce power delivery, which reduces the CPU clock speed. No idea if this carries over to Intel, but I think it’s likely.
Maybe power down the machine for 30 minutes, boot it up and see if you get normal clock speeds.
Otherwise, let them swap the mainboard; it may be bad VRMs or a bad PSU.
Not sure how I can check its health or if it is acting properly, but I do see a steep drop off in power consumption just a few days ago when this all started.
Thanks, I’ll give that a shot now. If no luck, I’ll put in a ticket as suggested. Just wanted to make sure I do my due diligence before contacting the provider for an issue that might have been fixable on my end.
I somehow missed the part where this is about a hosted server. For some reason I thought this was about a server back at your own place.
Anyhow, the drop in power consumption can be explained by the CPU locking at 800 MHz or the other way around… Only way to really find out is swapping it out for a different one.
It can be throttled down in hardware. For example, when you remove a PSU from a MicroCloud unit, all blades will go down to 800 MHz to conserve energy and prevent tripping the other power line. This ignores all requests from the OS.
Yeah, could be something like that. The server is a blade in a larger chassis, not sure what model he uses. I’ve put in a ticket with all my troubleshooting steps and diagnostic info, and I’ll update this thread when a solution is found.
Getting migrated to a new machine. They are prepping a new system for me to migrate my data over to, then they’ll swap IPs over to the new machine, and I’ll be on my merry way. Hopefully should be back in action by the day’s end.