My VPS is slow. How do I find out why?
What this is
"Slow" is a symptom, not a diagnosis. Behind it there is always a specific saturated resource, or a network path problem pretending to be one, and each has a check that takes minutes. This page is the systematic walk-through: where slowness actually comes from, how to measure each candidate, and what to do about what you find.
One thing to rule out first, because it reframes the search: our platform runs with deliberate headroom. Nodes are provisioned and monitored to keep ample free CPU, memory, storage throughput, and uplink capacity at all times; that's the design (and the reason we don't oversell). "The host is overloaded" is therefore the least likely explanation, and in practice slowness almost always turns out to live in one of two places: the network path between you and the VPS, or inside the VPS itself. The method below checks them in order of likelihood.
Step 1: is the server slow, or the path to it?
Separate the two with one observation: does the VPS feel slow from within?
- Over SSH, run something purely local: time ls -R /usr/share > /dev/null, or watch htop respond. If local commands are instant and keystrokes only lag over the network, the server is fine; the path is your problem.
- Confirm from a second vantage point: run your VPS IP through ping.pe (it pings from dozens of worldwide locations at once; green everywhere means the VPS and our network are fine), our Looking Glass, or your phone on mobile data. Fast from everywhere except your location means the problem sits between your location and us.
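The local check can be put into a number. This is a minimal sketch, assuming GNU date (needed for the %N nanosecond format); local_ms is a hypothetical helper name, not a tool the platform ships.

```shell
#!/bin/sh
# local_ms: run a command and print how long it took, in whole milliseconds.
# Hypothetical helper; assumes GNU date, which supports %N (nanoseconds).
local_ms() {
    start=$(date +%s%N)
    "$@" > /dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Purely local work: nothing here depends on the network path to you.
echo "local listing: $(local_ms ls -R /usr/share) ms"
```

If the number is small while typing over SSH still lags, the delay is on the path rather than on the server.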
Long international routes deserve a special mention: from parts of South and East Asia, trans-continental paths to US or EU datacenters routinely carry higher latency and congestion at peak hours, which reads as "the server is slow" when the server is idle. The network problems guide shows how to prove where the path degrades with mtr, and the speed testing guide covers throughput specifically. If the path is the finding, the network problems guide is the one to follow: no change on the VPS will fix a mid-path problem.
If the VPS is slow from inside too, continue.
Step 2: memory pressure and OOM kills
Exhausted memory feels less like an error than like slowness and instability: services restarting for no visible reason, requests stalling, a database that "crashes sometimes".
- Check the real headroom: free -h, and read the available column (how to read it correctly: high "used" alone is normal).
- Check for OOM kills, where the kernel kills your processes when memory truly runs out. Important platform fact: on Linux VPS, OOM kills are not visible inside your VPS; you won't find them in your own logs or dmesg. The My VPS is Down page surfaces them: run it for the VPS and it reports recent OOM kills. (On Premium VPS, which is full KVM, dmesg | grep -i "out of memory" works as usual.)
- Found OOM kills or near-zero available memory? Identify the consumer with ps aux --sort=-%mem | head, then either fix its appetite (database buffer sizes are the usual oversized culprit) or upgrade the plan. Our Linux VPS run without swap by design (memory is fully dedicated), so sustained overcommitment ends in kills rather than silent disk-thrash (and swap wouldn't help).
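The headroom check above can be reduced to one figure. The sketch below reads MemAvailable and MemTotal from /proc/meminfo (the same source free uses on Linux); mem_available_pct is a hypothetical helper name, and the 10% warning threshold is an illustrative assumption, not a platform rule.

```shell
#!/bin/sh
# mem_available_pct: print MemAvailable as a whole percentage of MemTotal.
# Reads a meminfo-format file; defaults to /proc/meminfo on Linux.
mem_available_pct() {
    awk '/^MemTotal:/     {total = $2}
         /^MemAvailable:/ {avail = $2}
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

pct=$(mem_available_pct)
echo "available memory: ${pct}%"
# The 10% threshold is an illustrative assumption, not a platform rule.
if [ "${pct:-100}" -lt 10 ]; then
    echo "memory is nearly exhausted; rank consumers with: ps aux --sort=-%mem | head"
fi
```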
Step 3: CPU saturation, and its four usual causes
Measure first:
- htop (or top): overall CPU, and which processes hold it.
- Load average (in htop's header or uptime): roughly, how many processes want CPU time at once. Compare it to your vCPU count: a load persistently above the number of cores means work is queuing and everything waits. Load 6 on a 2-core VPS is saturation; load 1.5 on 4 cores leaves plenty of headroom.
- ps aux --sort=-%cpu | head for the ranked list.
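The load-versus-cores comparison is simple enough to script. A minimal sketch, assuming a Linux system with /proc/loadavg and nproc available; load_verdict is a hypothetical helper, and it uses the 15-minute figure because the rule above is about persistent load rather than a momentary spike.

```shell
#!/bin/sh
# load_verdict LOAD CORES: print "saturated" when LOAD exceeds CORES, else "ok".
load_verdict() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        # "+ 0" forces a numeric comparison
        if (load + 0 > cores + 0) print "saturated"; else print "ok"
    }'
}

cores=$(nproc)
# /proc/loadavg fields 1-3 are the 1-, 5- and 15-minute load averages.
load15=$(cut -d ' ' -f 3 /proc/loadavg)
echo "15-minute load $load15 on $cores cores: $(load_verdict "$load15" "$cores")"
```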
What you find is nearly always one of these four:
1. A cryptominer, meaning you've been compromised. All cores pinned at 100% around the clock by a process you don't recognize (miners masquerade under random or system-looking names; xmrig is the honest one). Killing it is not the fix: something let it in, and modern bots respawn what you kill. Follow the full recovery procedure: My VPS was hacked.
2. Container sprawl. Automation that spawns containers (node-running and airdrop-farming stacks are notorious for quietly scaling to dozens or hundreds) will consume any CPU it can see. Check reality: docker ps -q | wc -l (the -q flag drops the header row, so the count is exact) and docker stats (live per-container CPU/memory). If the workload is legitimate, cap it with --cpus and memory limits per container (or deploy.resources in compose), so one stack can't starve the VPS. If you didn't launch it, see cause 1.
3. Overlapping cron jobs. A script scheduled every minute that sometimes takes longer than a minute starts to pile up: two copies, then ten, then hundreds, each slower than the last; a self-inflicted fork bomb in slow motion. Diagnose with ps aux | grep yourscript: if multiple generations are running, that's it. The fix is a lock so a new run refuses to start while the old one lives: flock -n /tmp/job.lock your-command in the crontab line, or a systemd timer driving a service (systemd won't start a second instance while the first is still running).
4. An unoptimized script or app. One process, pinned, legitimately yours. This is a code problem: profile it, batch its work, add the missing index (see the database section below), or accept that it needs more cores; Extra vCPU exists for exactly this.
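For cause 3, the same flock lock can also live inside the script instead of the crontab line. This is a sketch under the assumption that flock from util-linux is installed; run_locked is a hypothetical wrapper name, and /tmp/job.lock is the example path used above.

```shell
#!/bin/sh
# run_locked LOCKFILE COMMAND...: run COMMAND only if nobody else holds the
# lock on LOCKFILE; otherwise skip this run and return 1.
run_locked() {
    lockfile=$1
    shift
    (
        # -n: fail immediately instead of waiting for the lock.
        flock -n 9 || { echo "previous run still active, skipping" >&2; exit 1; }
        "$@"
    ) 9>"$lockfile"
}

run_locked /tmp/job.lock echo "job ran"
```

The lock is tied to file descriptor 9, so it is released when the command exits, including when it crashes; no stale lock file has to be cleaned up.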
Step 4: storage I/O
Least common on NVMe, but cheap to check: install sysstat and run iostat -x 2. Sustained %util near 100 with high await (shown as r_await and w_await in recent sysstat versions) means something is hammering the disk, usually a database missing indexes (full-table scans are disk work), runaway logging, or a backup job at the wrong hour. Also confirm the disk isn't simply full (df -h for space, df -i for inodes): a full or inode-exhausted filesystem produces spectacular, misleading slowness.
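The "is it simply full?" check can be scripted against df's portable output format. A minimal sketch: over_threshold is a hypothetical helper, the 90% limit is an illustrative assumption, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# over_threshold LIMIT: read `df -P`-style output on stdin and print the
# mount points whose use percentage (column 5) is at or above LIMIT.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6 " " $5
    }'
}

echo "filesystems at or above 90% of space:"
df -P | over_threshold 90
echo "filesystems at or above 90% of inodes:"
df -Pi | over_threshold 90
```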
Step 5: check the Graphs for history
Everything above measures now. Your VPS's Graphs tab holds the long view: CPU, memory, disk, and traffic over time. Two questions to ask it:
- Does the slow period line up with a resource spike? A CPU plateau starting Tuesday matches "slow since Tuesday" and points inward.
- Are the graphs flat during the slowness? Quiet graphs during a slow spell push suspicion firmly back to the network path (Step 1) or to something the app is waiting on externally (a third-party API, a remote database).
Step 6: still nothing obvious? Monitor continuously
Intermittent slowness (fine now, awful at 3 AM) can't be caught with a live htop. Put an agent on the VPS and let it record:
- Netdata is the low-effort choice: one-line install, per-second metrics for every subsystem (CPU, memory, disk, per-process, containers, MySQL/PostgreSQL plugins), a built-in dashboard, and it's free on a single server. The next time slowness strikes, you scroll back to the exact minute and see what moved.
- Prefer assembling your own? Prometheus node_exporter + Grafana is the standard DIY stack: more setup, more control.
An agent turns "it was slow yesterday, no idea why" into a chart with the answer on it, and it's the evidence that makes any escalation (to us, or to a developer) immediately actionable.
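Until an agent is installed, even a crude recorder captures the basics. The sketch below is a stopgap, not a substitute for Netdata or node_exporter; snapshot is a hypothetical helper, and it assumes a Linux /proc filesystem.

```shell
#!/bin/sh
# snapshot: print one line with a UTC timestamp, the 1/5/15-minute load
# averages, and available memory in MiB. Linux-only: reads /proc.
snapshot() {
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    load=$(cut -d ' ' -f 1-3 /proc/loadavg)
    avail=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)
    echo "$ts load=[$load] mem_available_mib=$avail"
}

snapshot
```

Run from cron once a minute with its output appended to a log file of your choosing, this leaves a timeline to scroll back through after a bad night.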
If "slow" means your website
When the server itself checks out but the site is sluggish, the wins are at the application layer, in descending order of effect-per-effort:
- Cache at the edge. Put the site behind a CDN (Cloudflare, free tier included, or Bunny) so static assets (images, CSS, JS) are served from a location near each visitor instead of from your VPS on every request. Visitors far from your datacenter feel this most: it converts geography (Step 1's unfixable problem) into cache hits. Bonus: your origin IP gets hidden, which also matters under attack.
- PHP sites: verify OPcache. It caches compiled PHP bytecode and is the single biggest free win for WordPress and friends: php -i | grep opcache.enable should say On (it usually is; if not, enable it in the PHP ini). Add an object cache (Redis or Memcached, bound to localhost) so the CMS stops recomputing queries on every page view.
- MySQL/MariaDB: find the slow queries instead of guessing. Enable the slow query log (slow_query_log = 1, long_query_time = 1), let it collect during real traffic, then EXPLAIN the offenders; the overwhelming majority resolve to a missing index. For tuning beyond that, size innodb_buffer_pool_size to hold your working set (a common starting point is 50 to 70% of RAM on a database-only server, much less when it shares the box), and MySQLTuner gives a sane automated review.
- PostgreSQL: same discipline, native tools. Enable pg_stat_statements and query it for the top total-time statements, EXPLAIN ANALYZE those, and fix with indexes first. Check shared_buffers (the common starting point is ~25% of RAM) and confirm autovacuum is on; a table that never gets vacuumed degrades into exactly this kind of creeping slowness.
- If the app is still slow with warm caches and indexed queries, the remaining time is in the code path itself (N+1 query patterns, synchronous external API calls), which an APM view in Netdata or the framework's debug toolbar will show.
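The memory starting points quoted in the database bullets are plain percentages of RAM. The sketch below turns them into numbers for a database-only server; suggest_sizes is a hypothetical helper, the results are starting points rather than tuned values, and a box shared with other services needs much less.

```shell
#!/bin/sh
# suggest_sizes TOTAL_MB: print starting-point memory settings for a
# database-only server with TOTAL_MB of RAM (integer arithmetic, rounded down).
suggest_sizes() {
    total=$1
    # MySQL/MariaDB: 50 to 70% of RAM
    echo "innodb_buffer_pool_size: $(( total * 50 / 100 ))M to $(( total * 70 / 100 ))M"
    # PostgreSQL: about 25% of RAM
    echo "shared_buffers: $(( total * 25 / 100 ))MB"
}

total_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
suggest_sizes "$total_mb"
```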
When to bring it to us
If you've localized a problem you can't act on (the path degrades mid-route, in which case send both mtrs, or your measurements genuinely point at the platform), open a ticket with what you measured: when it happens, what htop/Graphs/Netdata showed, and any mtr output. We check the node side quickly, and with evidence attached, the first reply is usually the fix. Given the headroom we run, though, the wins above are where the answer is found in the great majority of cases.
Related questions
- "Why is my VPS slow?"
- "How do I check what's using my CPU or memory?"
- "What does load average mean and what's a normal value?"
- "How do I know if my VPS was OOM killed?"
- "Why are there hundreds of Docker containers on my VPS?"
- "My cron job made the server unusable, how do I stop it overlapping?"
- "How do I find slow MySQL or PostgreSQL queries?"
- "How do I make my website on a VPS faster?"
- "What monitoring should I install to catch intermittent slowness?"