Fixing 502, 503, and 504 errors on your VPS

What this is

Your site shows 502 Bad Gateway, 503 Service Unavailable, or 504 Gateway Timeout. Before anything else, notice what the error proves: something answered the request, so your web server or proxy (nginx, Apache, Traefik, Cloudflare) is up and reachable. What failed is the thing behind it (the application, PHP-FPM, or a container) when the front door asked it for the page. That cuts the search space in half before you've run a single command: this is almost always an application-layer problem inside the VPS, not an outage.

First, if you're behind Cloudflare

A Cloudflare-branded error page (their logo, or codes like 521/522/523) means the failure may be between Cloudflare and your origin, not inside it. Isolate by asking the VPS directly, from your machine:

curl -I -H "Host: yourdomain.com" http://YOUR.VPS.IP/

If that returns your site fine, the problem is the Cloudflare↔origin leg (origin firewall rules, TLS mode mismatch). If it returns the same 502, 503, or 504, the problem is on the VPS; continue below.
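The same comparison can be scripted. This is a minimal sketch, assuming curl is installed on your machine. The function names origin_status and where_is_the_fault are illustrative, and the "000" branch relies on curl printing 000 as the status when it gets no HTTP response at all.

```shell
# Print only the HTTP status the origin returns when asked directly,
# bypassing Cloudflare. Arguments: site hostname, VPS IP address.
origin_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' -H "Host: $1" "http://$2/"
}

# Turn that status into the next step (hypothetical helper).
where_is_the_fault() {
    case "$1" in
        502|503|504) echo "on the VPS: continue with the diagnosis" ;;
        000)         echo "no HTTP answer from the origin: check firewall and network" ;;
        *)           echo "origin answered with $1: look at the Cloudflare-to-origin leg" ;;
    esac
}

# Usage (same placeholders as the curl command above):
#   where_is_the_fault "$(origin_status yourdomain.com YOUR.VPS.IP)"
```

A redirect (301/302) from the origin counts as "answered" here. If your origin only serves HTTPS, repeat the test against https:// with curl's -k flag.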

Decoding the three errors

  • 502 Bad Gateway: the proxy asked the backend and got nothing usable. The app has crashed, PHP-FPM is down, or the proxy is pointed at a port/socket where nothing is listening.
  • 503 Service Unavailable: the backend is refusing work. Typical causes are an app in maintenance mode or a worker pool with nothing free.
  • 504 Gateway Timeout: the backend is alive but too slow. Look for a hung database query, a stalled external API call, or work that legitimately exceeds the proxy's timeout.

The five-minute diagnosis

1. Is the backend actually running?

systemctl status php8.2-fpm     # or php-fpm, gunicorn, your app's service

Dead? systemctl restart it, then find out why it died before calling it fixed: journalctl -u <service> -n 50 shows its last words.
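Step 1 can be wrapped into one helper. This is a sketch only: backend_check is an illustrative name, and the unit name you pass in depends on your own stack.

```shell
# Report whether a systemd service is running; if it is not,
# show its last 50 log lines so the cause is visible before a restart.
# Argument: the unit name, e.g. php8.2-fpm (adjust to your own service).
backend_check() {
    if systemctl is-active --quiet "$1"; then
        echo "$1 is running"
    else
        echo "$1 is NOT running; last log lines:"
        journalctl -u "$1" -n 50 --no-pager
    fi
}

# Usage:
#   backend_check php8.2-fpm
```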

2. Read the proxy's error log; it names the culprit.

tail -20 /var/log/nginx/error.log

The message is diagnostic on its own:

  • connect() failed (111: Connection refused) means nothing is listening at the target (backend down, or wrong port).
  • connect() to unix:/run/php/php-fpm.sock failed (2: No such file or directory) means a wrong socket path (or FPM not running).
  • (13: Permission denied) on a socket means an ownership mismatch.
  • upstream timed out is the 504 case.
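The mapping from log message to cause can be written down as a small lookup. This is a sketch; explain_upstream_error is an illustrative name, and it only knows the four messages listed above.

```shell
# Map one nginx error.log line to the likely cause.
explain_upstream_error() {
    case "$1" in
        *"(111: Connection refused)"*)      echo "nothing listening at the target: backend down or wrong port" ;;
        *"(2: No such file or directory)"*) echo "wrong socket path, or PHP-FPM not running" ;;
        *"(13: Permission denied)"*)        echo "socket ownership mismatch" ;;
        *"upstream timed out"*)             echo "backend too slow: the 504 case" ;;
        *)                                  echo "not one of the known upstream errors" ;;
    esac
}

# Usage: explain the most recent upstream-related line in the log.
#   explain_upstream_error "$(grep -E 'upstream|connect\(\)' /var/log/nginx/error.log | tail -n 1)"
```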

3. Check the proxy's target against reality. What does the config point at (grep -r proxy_pass /etc/nginx/, or the fastcgi_pass line), and is anything listening there (ss -tulnp)? A version bump that renamed the PHP-FPM socket (php8.1-fpm.sock → php8.2-fpm.sock) is a classic silent 502.
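For unix-socket targets, the comparison between config and reality can be automated. This is a sketch: socket_targets and check_sockets are illustrative names, and TCP targets (host:port) are not covered, so those still need ss -tulnp.

```shell
# Read nginx config text on stdin and print every unix-socket path
# named in a fastcgi_pass, proxy_pass, or uwsgi_pass line.
socket_targets() {
    grep -E '(fastcgi|proxy|uwsgi)_pass' |
        grep -oE 'unix:[^;:[:space:]]+' |
        cut -d: -f2 |
        sort -u
}

# Read socket paths on stdin and report whether each one exists as a socket.
check_sockets() {
    while read -r sock; do
        if [ -S "$sock" ]; then
            echo "ok       $sock"
        else
            echo "MISSING  $sock"
        fi
    done
}

# Usage:
#   grep -rh '_pass' /etc/nginx/ | socket_targets | check_sockets
```

A MISSING line for php8.1-fpm.sock next to a running php8.2-fpm service is the renamed-socket case described above.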

4. Check the two silent assassins. Backends rarely die unprovoked:

  • Out of memory. The kernel kills the biggest process, which is usually your app or database. On Linux VPS, OOM kills don't show in your own logs, so run My VPS is Down, which reports them (on Premium, dmesg shows them). If OOM is the story, fixing the 502 means fixing the memory, not just restarting.
  • Full disk (or inodes). Apps crash or refuse writes in strange ways when storage is gone. It takes ten seconds to rule out: run df -h and df -i, and follow the disk-full guide if either is at 100%.
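The disk and inode check can be reduced to "show me only what is nearly full". This is a sketch: nearly_full is an illustrative name, the 90% default is an arbitrary threshold, and it assumes mount points without spaces in their names.

```shell
# Read `df -P` or `df -Pi` output on stdin and print the mount point and
# usage of every filesystem at or above a threshold (percent, default 90).
nearly_full() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)          # "97%" becomes 97; "-" becomes 0 below
            if (use + 0 >= limit) print $6, $5
        }'
}

# Usage:
#   df -P  | nearly_full 90    # disk space
#   df -Pi | nearly_full 90    # inodes
```

No output from either command means storage is not the cause.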

If your app runs in Docker: container-to-container 502s

A very common modern setup: a reverse-proxy container forwards to app containers by name (proxy_pass http://app:3000). When the proxy can't reach the app across Docker's internal network, the visitor sees a 502 even though both containers show as "running". Check, in order:

  • Are they on the same Docker network? Name resolution only works within a shared network: docker network inspect <network> should list both containers. Containers on different networks (or one on the default bridge) can't resolve each other, and the proxy logs "host not found in upstream".
  • Can the proxy actually reach the app? Test from inside the proxy container: docker exec <proxy> wget -qO- http://app:3000/ (or curl). Failure here confirms the problem is between containers, not in your site config.
  • Is the app listening on the right interface inside its container? An app bound to 127.0.0.1 inside the container is unreachable from other containers. Inside a container it should listen on 0.0.0.0, and you control public exposure at publish time instead (the 127.0.0.1 publish rule).
  • Did the app container restart and change IP? nginx resolves proxy_pass hostnames once at startup and caches the IP, so a recreated app container leaves nginx forwarding to a stale address. Quick fix: restart the proxy container. Durable fix: use a variable with Docker's resolver so nginx re-resolves (resolver 127.0.0.11; set $up http://app:3000; proxy_pass $up;). Traefik and Caddy handle this dynamically on their own.
  • And check the app container itself: docker logs <app>. A crash-looping app produces exactly this kind of intermittent 502.
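The network, reachability, and crash-loop checks can be run in one pass. This is a sketch: check_proxy_to_app is an illustrative name, and it assumes the proxy image ships wget (swap in curl if yours does not). It does not test the bind-address or stale-IP cases.

```shell
# Arguments: proxy container, app container, Docker network,
# and the URL the proxy forwards to (e.g. http://app:3000/).
check_proxy_to_app() {
    proxy=$1 app=$2 net=$3 url=$4

    # 1. Are both containers attached to the same network?
    members=$(docker network inspect -f '{{range .Containers}}{{.Name}} {{end}}' "$net")
    for c in "$proxy" "$app"; do
        case " $members " in
            *" $c "*) echo "network: $c is on $net" ;;
            *)        echo "network: $c is NOT on $net" ;;
        esac
    done

    # 2. Can the proxy fetch the app from inside its own container?
    if docker exec "$proxy" wget -qO- "$url" >/dev/null 2>&1; then
        echo "reach: $proxy can fetch $url"
    else
        echo "reach: $proxy cannot fetch $url"
    fi

    # 3. Is the app crash-looping? Show restart count and recent logs.
    echo "restarts: $(docker inspect -f '{{.RestartCount}}' "$app")"
    docker logs --tail 20 "$app"
}

# Usage (container and network names are placeholders):
#   check_proxy_to_app proxy app web http://app:3000/
```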

Docker Swarm and panels built on it (EasyPanel)

If your stack runs on Docker Swarm, including panels that use Swarm under the hood such as EasyPanel, there's a platform-specific cause that looks exactly like a mysterious, persistent 502: Swarm's default virtual-IP routing relies on ipvs, which our Linux VPS platform doesn't provide. Services must run in dnsrr endpoint mode instead. The full explanation, the commands, and how to convert all of EasyPanel's services are in Running Docker Swarm on Linux VPS. If you're on Swarm and every other check on this page comes up clean, that's your answer.

Fixing a real 504

If the log says upstream timed out, the backend is too slow, and the durable fix is making it faster, not making the proxy wait longer: find the slow query or hung external call (the database section of the slow-VPS guide covers exactly this). Raise proxy_read_timeout/fastcgi_read_timeout only for work that is legitimately long (an export, a report), and raise it for that route, not globally.
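To decide which route is slow, and which route (if any) deserves a longer timeout, it helps to count timeouts per path. This is a sketch, assuming nginx's usual error-log layout in which each line carries a request: "METHOD /path HTTP/x" field. timed_out_routes is an illustrative name.

```shell
# Read nginx error.log lines on stdin and count "upstream timed out"
# errors per request path (query string dropped), busiest path first.
timed_out_routes() {
    grep 'upstream timed out' |
        sed -n 's/.*request: "[A-Z]* \([^ ?"]*\).*/\1/p' |
        sort | uniq -c | sort -rn
}

# Usage:
#   timed_out_routes < /var/log/nginx/error.log
```

If one export or report path dominates the list, that is the route to consider for a longer timeout. If timeouts are spread across ordinary pages, the backend itself is slow.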

For 503s from a worker pool

PHP-FPM logs "server reached pm.max_children" when every worker is busy; requests then queue and eventually return 503. Raising pm.max_children helps only if RAM allows it: each worker costs memory, so size it as workers × per-worker usage against your plan. If workers are busy because of slow queries, fix those first, or more workers just pile onto the database.
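The sizing arithmetic, workers × per-worker usage against the plan, can be sketched as follows. Both function names are illustrative, the figures in the example are made up, and the process name php-fpm8.2 in the usage line is an assumption that varies by distribution. RSS also counts shared memory, so the result is a conservative estimate.

```shell
# Read resident-set sizes in KB on stdin (one per worker) and print the
# average in whole MB.
avg_rss_mb() {
    awk '{ sum += $1 } END { if (NR) printf "%d\n", sum / NR / 1024 }'
}

# Estimate a ceiling for pm.max_children. Arguments, all in MB:
#   $1 RAM on the plan
#   $2 RAM reserved for everything else (OS, database, proxy)
#   $3 average memory per PHP-FPM worker
max_children_for() {
    echo $(( ($1 - $2) / $3 ))
}

# Usage:
#   ps -o rss= -C php-fpm8.2 | avg_rss_mb
#   max_children_for 4096 1536 64     # prints 40
```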

Prevent the next one

  • Let systemd resurrect your app: Restart=always (plus RestartSec=2) in the service unit turns a crash into a blip instead of an outage.
  • Put an agent on it: Netdata records exactly what happened at the moment the backend died (memory spike, disk, crash loop), so the 3 AM 502 is diagnosable at 9 AM.
  • Keep the disk and memory hygiene from their own guides; most "random" backend deaths are one of those two.
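The systemd restart settings from the first bullet can be applied as a drop-in, without editing the unit file itself. This is a sketch: write_restart_dropin is an illustrative name, restart.conf is an arbitrary file name, and "myapp" stands in for your own service.

```shell
# Write a drop-in that makes systemd restart the service after a crash.
# Argument: the drop-in directory, normally
# /etc/systemd/system/<service>.service.d
write_restart_dropin() {
    mkdir -p "$1" &&
    cat > "$1/restart.conf" <<'EOF'
[Service]
Restart=always
RestartSec=2
EOF
}

# Usage (as root; "myapp" is a placeholder):
#   write_restart_dropin /etc/systemd/system/myapp.service.d
#   systemctl daemon-reload && systemctl restart myapp
```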

Is it ever the platform?

A 502/503/504 is almost always application-layer: by the time you see it, the network delivered your request and your web server answered it. A platform problem presents differently: the whole VPS is unreachable and nothing answers at all. If that's what you're seeing, check ping.pe and run My VPS is Down instead, and see network problems.

Still need help?

You can open a support ticket. So we can help on the first reply, it's worth mentioning:

  • the VPS and the site's domain,
  • which error it is (502, 503, or 504) and the matching line from the proxy's error log,
  • whether the backend runs directly or in Docker.
Last reviewed: 2026-07-02