Question: How do I troubleshoot high CPU Load? (Resource Issues)

Question

Today’s question comes from Facebook!

“Hi folks, how would I go about troubleshooting this high CPU load? Thanks!”

-Awesome Facebook user! 😉

Depending on the situation, this can be normal. Why? Because that’s what PHP does. A PHP process can take up to 100% CPU usage because, by default, nothing limits how much of the system’s resources it can use.

Understanding Linux Load Averages

So let’s look at “load average”. Your load average tells you the average load on your system over the last 1, 5, and 15 minutes, which is what you should be concerned about. You can read more in this awesome article:

Brendan Gregg’s Blog – Linux Load Averages: Solving the Mystery

Some interpretations:

If the averages are 0.0, then your system is idle.

If the 1-minute average is higher than the 5 or 15-minute averages, then the load is increasing.

If the 1-minute average is lower than the 5 or 15-minute averages, then the load is decreasing.

If they are higher than your CPU count, then you might have a performance problem (it depends).

Source: Brendan Gregg’s Blog – Linux Load Averages: Solving the Mystery
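As a rough illustration, the rules of thumb above could be applied with a short Python script. This is only a sketch: the function name `interpret_load` and the wording of the notes are made up for this example, and "higher than the 5 or 15-minute averages" is read here as "higher than either one".

```python
import os


def interpret_load(load1, load5, load15, cpu_count):
    """Apply the rules of thumb above to one set of load averages."""
    notes = []
    if load1 == 0 and load5 == 0 and load15 == 0:
        notes.append("system is idle")
    if load1 > load5 or load1 > load15:
        notes.append("load is increasing")
    elif load1 < load5 or load1 < load15:
        notes.append("load is decreasing")
    if max(load1, load5, load15) > cpu_count:
        notes.append("load exceeds CPU count; possible performance problem")
    return notes


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute averages on Unix-like systems.
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load: {one:.2f} {five:.2f} {fifteen:.2f} on {cpus} CPU(s)")
    for note in interpret_load(one, five, fifteen, cpus):
        print("-", note)
```

Comparing against the CPU count matters because a load of 2.0 means something very different on a 1-core instance than on an 8-core one.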

So now that you know what’s going on with the resources, what’s actually causing the load spike?

Investigating Server and Website Resource Issues

You now have two options: either upgrade your server resources or investigate what is causing the increased load.

1. Understanding Server Resources

As you can see from the htop command in the screenshot, this instance has 1 CPU core and 2GB of memory.

Personally, 1 CPU core and 2GB of memory is below the bare minimum I would consider, even for a single site. I don’t fault this person. This isn’t common knowledge, and again it’s my personal preference.

However… it’s possible to run a single site or multiple sites on this configuration. You would have to put considerable effort into tweaking memory and CPU consumption: turning off services, tuning MySQL buffers, etc. Typically the outlay of time and energy isn’t worth it for me.

I’d simply upgrade to an instance with 2 CPU cores and 2GB or 4GB of memory. This leaves plenty of room for system services and MySQL/Redis/Memcache for single or multiple sites. But now we’re getting into instance sizing, which is another whole conversation.

2. Review Your Website’s Configuration and Resource Consumption

The following is a list that you can use to try to narrow down potential issues with your WordPress site and the increase in resource consumption.

  1. Illegitimate traffic.
  2. Non-performant Code on Action
  3. Non-performant Timed Event Code (WordPress Cron)
  4. A legitimate request.

Let’s go through them all.

a. Illegitimate traffic.

Illegitimate traffic, such as bots hammering your login or XML-RPC endpoints, drives up resource usage without serving real visitors. Enabling rate limiting, blocking requests, and enabling security features at the server level is going to save you resources. Specifically, you want to look at the following.

  • Throttle or block XML-RPC requests.
  • Throttle or block login attempts (fail2ban).
  • There’s more, so review your access logs to see what else is hitting your site.
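To get a feel for what reviewing the access logs might look like, here is a minimal Python sketch that counts requests per client IP to xmlrpc.php and wp-login.php. It assumes your web server writes the common/combined log format; the sample lines, the IP addresses (documentation ranges), and the function name `count_watched_hits` are invented for illustration.

```python
import re
from collections import Counter

# Matches the start of a common/combined log format line:
# client IP, two fields, [timestamp], then "METHOD /path PROTOCOL".
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')

# Endpoints that illegitimate traffic frequently targets on WordPress sites.
WATCHED_PATHS = ("/xmlrpc.php", "/wp-login.php")


def count_watched_hits(lines):
    """Count requests per (client IP, watched path)."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, _method, path = match.groups()
        path = path.split("?", 1)[0]  # ignore query strings
        if path in WATCHED_PATHS:
            hits[(ip, path)] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines; in practice, pass an open log file instead.
    sample = [
        '203.0.113.5 - - [30/Aug/2022:10:00:00 +0000] "POST /xmlrpc.php HTTP/1.1" 200 420 "-" "bot"',
        '203.0.113.5 - - [30/Aug/2022:10:00:01 +0000] "POST /xmlrpc.php HTTP/1.1" 200 420 "-" "bot"',
        '198.51.100.7 - - [30/Aug/2022:10:00:02 +0000] "GET /wp-login.php?action=x HTTP/1.1" 200 900 "-" "bot"',
        '198.51.100.9 - - [30/Aug/2022:10:00:03 +0000] "GET / HTTP/1.1" 200 5120 "-" "browser"',
    ]
    for (ip, path), count in count_watched_hits(sample).most_common(10):
        print(f"{count:5d}  {ip}  {path}")
```

An IP that shows up with hundreds of hits on either endpoint is a good candidate for throttling or blocking at the server level.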

You will also want to check out our Cloudflare guides which can provide some much-needed protection before requests make it to your site and server.

b. Non-performant Code on Action

Any action towards your WordPress site can initiate functions that may trigger a resource spike. This can be the case for logged-in or logged-out users.

For logged-in users, you can test this with the Query Monitor plugin. Note that this plugin can increase loading times in your browser and on your site and server. Query Monitor only works when you’re logged in, and won’t track requests from non-logged-in users or WordPress cron actions. So you’ll have to try replicating the actions that you think might be causing an increase in resource usage.

Once you’ve found the specific action, you can use the Query Monitor plugin to dig further into PHP functions, database queries, and outside API requests that might be slowing down requests or spiking resources.

If the issue is related to a plugin, you can screenshot the specific function or query and send it to the developer for review.

c. Non-performant Timed Event Code (WordPress Cron)

The WordPress cron, by default, is triggered whenever a new request to your WordPress site occurs. This means that a visitor’s page load can kick off the WordPress cron. WordPress does this by sending a non-blocking request to wp-cron.php rather than using a true background scheduler, so the cron events still run in a PHP process on the same server and compete with visitor requests for CPU and PHP workers. Heavy or slow cron events can therefore delay page loads or result in a time-out.

For timed events through the WordPress cron system that occur in the background, you will need something that records how long these events take to complete. Unfortunately, there isn’t anything native in the WordPress core or a plugin that measures how long the WordPress cron events take to execute.

It’s suggested to set up the WordPress cron to run as a Linux cronjob, stopping the WordPress cron events from running during visitor page loads. In the post linked below, you can read more about setting up the WordPress cron to run using the Linux cron. The article includes how to log the WordPress cron output, and measure how long the WordPress cron events take to complete.
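As a rough sketch of the measuring part, a small Python wrapper can time whatever command your Linux cronjob runs and print the duration for your log. Which command to time is an assumption on my part: if WP-CLI is installed, `wp cron event run --due-now` is one option. The function name `time_command` is hypothetical, and with no arguments the script times a harmless placeholder instead.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed


if __name__ == "__main__":
    # Assumption: the real cron command is passed as arguments, for example
    #   wp cron event run --due-now
    # With no arguments, a do-nothing Python command is timed instead.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    code, seconds = time_command(command)
    print(f"exit={code} duration={seconds:.2f}s command={' '.join(command)}")
```

Appending that output to a log file on every run gives you a simple history of how long the cron events take, which makes slow events easier to spot.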

Another option is to use New Relic to record and measure everything from PHP functions to MySQL queries. Other APM (Application Performance Monitoring) tools are available, such as Zoho’s Site24x7, just to name one alternative.

d. A legitimate request.

Most legitimate requests should be served from cache, so set up full page caching or find out why the request isn’t being cached. You can also set up object caching, but that’s for another article.
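One way to find out whether a request is being cached is to look at the response headers. The sketch below is a heuristic only: which headers appear depends entirely on your caching layer. `CF-Cache-Status` is what Cloudflare sends, `X-Cache` is used by several CDNs and proxies, and `Age` and `Cache-Control` are standard HTTP headers. The function name `cache_verdict` and its return values are made up for this example.

```python
# Cache status headers to check; your page cache may use a different one.
STATUS_HEADERS = ("cf-cache-status", "x-cache")


def cache_verdict(headers):
    """Guess from response headers whether a page came from a cache."""
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}

    # An explicit cache status header is the strongest signal.
    for name in STATUS_HEADERS:
        value = normalized.get(name, "")
        if "hit" in value:
            return "hit"
        if value:
            return "miss"

    # A positive Age means a cache has been holding the response for a while.
    age = normalized.get("age", "")
    if age.isdigit() and int(age) > 0:
        return "hit"

    # These directives tell shared caches not to store or reuse the response.
    cache_control = normalized.get("cache-control", "")
    if any(flag in cache_control for flag in ("no-store", "no-cache", "private")):
        return "uncacheable"

    return "unknown"


if __name__ == "__main__":
    print(cache_verdict({"CF-Cache-Status": "HIT"}))
    print(cache_verdict({"Cache-Control": "no-store, private"}))
```

The headers can come from your browser’s developer tools, or from `urllib.request.urlopen(url).headers.items()` if you want to script the check. A result of "uncacheable" on a page that should be public is usually the place to start digging.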

Other Potential Causes for High Resource Usage

Here are a few other things to check.

  • A broken cronjob.
  • Transients that aren’t being saved correctly.
  • Misconfiguration of your WordPress site.
  • Custom mu-plugins causing issues.
  • Debug or verbose logging or script actions (for example, Elementor Debug mode).
  • System tasks run via cronjobs outside of WordPress.
  • Backup jobs running.

Conclusion

So what can you take away from this? Overall, find the root cause first; treating the symptoms won’t address the problem.

Short term. If you want to throw money at the problem, upgrade your server resources in small increments, monitoring and reviewing to see if the problem goes away. However, something as simple as blocking xmlrpc.php requests can often bring resource usage back down.

Long term. You need to be aware of your server resource usage and apply best practices to your websites to ensure that bad actors, illegitimate traffic, and plugins aren’t causing resource spikes.

I hope to provide more articles on this subject, explicitly tracking resources and identifying resource issues. Stay tuned!

Changelog

  • 08-30-2022 – Rewrote the entire article.