Troubleshooting

This section captures the most common pitfalls and how to resolve them quickly.

Client does not appear in the UI

Symptoms: Client starts, but no node shows up in the dashboard.

Checks:
- Verify the client can reach the host on --host-discover-port.
- Confirm the host is reachable from the client (use a non-loopback --host-ip).
- Check firewall rules between client and host.
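The first check can be scripted from the client machine. The following is a minimal sketch, not part of the tool itself: can_reach is a hypothetical helper name, and it only confirms that a TCP connection opens, not that registration succeeds.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, DNS failure, and timeouts.
        return False
```

For example, can_reach("192.0.2.10", 47528) tests the default discover port, where 192.0.2.10 is a placeholder for your host IP. A False result points to a wrong address, a wrong port, or a firewall between the two machines.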

UI loads but data is empty

Symptoms: UI opens, but no nodes or deployments show up.

Checks:
- Ensure the backend is running on --host-backend-port.
- Verify VITE_BACKEND_HOST in host/frontend/.env matches your host IP.
- If you changed ports, restart the frontend service.
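To confirm the value the frontend will actually see, you can read it back from the .env file. This sketch assumes the file uses plain KEY=VALUE lines; read_env_value is a hypothetical helper, not something the project ships.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_path: str, key: str) -> Optional[str]:
    """Return the value assigned to `key` in a simple KEY=VALUE file, or None if absent."""
    for raw in Path(env_path).read_text().splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop optional surrounding quotes.
            return value.strip().strip('"').strip("'")
    return None
```

Calling read_env_value("host/frontend/.env", "VITE_BACKEND_HOST") and comparing the result with your host IP catches stale or mistyped entries.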

Blank page behind reverse proxy

Symptoms: The UI shows a blank page when accessed through Nginx or another reverse proxy.

Checks:
- If you proxy under a path (for example /vllm/), set VITE_BASE_PATH=/vllm/ in host/frontend/.env.
- Restart the frontend service so Vite picks up the new base path.

Consul port confusion

Symptoms: Clients fail to register when using the Consul default port (8500).

Explanation:
- The host maps Consul's container port 8500 to a host port (default 47528).
- Clients must use the host port (--host-discover-port, default 47528).

CUDA detection fails

Symptoms: Client install fails with an error about CUDA detection.

Checks:
- Ensure nvcc or nvidia-smi is on PATH.
- Verify NVIDIA drivers are installed and the GPU is visible.
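A quick way to see what PATH resolves to for both tools is sketched below. find_cuda_tools is a hypothetical helper; how the installer itself detects CUDA is not shown here.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_cuda_tools(tools: Iterable[str] = ("nvcc", "nvidia-smi")) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path on PATH, or None if it is not found."""
    return {tool: shutil.which(tool) for tool in tools}
```

If both values are None, neither tool is visible to the process running the check, so fix PATH or install the drivers and toolkit before retrying the install.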

vLLM wheel not found

Symptoms: Install fails after detecting CUDA.

Checks:
- The vLLM wheel must exist for your CUDA version and CPU architecture.
- If no wheel matches your exact CUDA version, the installer automatically tries the highest compatible version. If that also fails, consider installing a supported CUDA version or building vLLM from source.
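The fallback described above can be pictured roughly as follows. This is an illustration only: it assumes "compatible" means the highest available version that is not newer than the detected one, which may differ from the installer's real rule, and pick_cuda_version is a hypothetical name.

```python
from typing import Iterable, Optional, Tuple


def _parse(version: str) -> Tuple[int, ...]:
    """Turn '12.4' into (12, 4) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def pick_cuda_version(detected: str, available: Iterable[str]) -> Optional[str]:
    """Prefer an exact match; otherwise pick the highest available version <= detected."""
    candidates = list(available)
    if detected in candidates:
        return detected
    target = _parse(detected)
    # Assumption: a wheel built for an older CUDA version counts as compatible.
    older = [v for v in candidates if _parse(v) <= target]
    return max(older, key=_parse) if older else None
```

With detected CUDA 12.4 and wheels available for 11.8, 12.1, and 12.6, this sketch selects 12.1. When nothing qualifies it returns None, which corresponds to the case where you need a supported CUDA version or a source build.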

uv not found when running as a systemd service

Symptoms: Deploying a model fails with "uv is not installed or not on PATH".

Explanation:
- Systemd services run with a minimal PATH that may not include user-local directories.

Checks:
- Verify uv is installed: run which uv, or check ~/.local/bin/uv and ~/.cargo/bin/uv.
- The client automatically searches common locations (/usr/local/bin, /usr/bin, /home/*/.local/bin, /home/*/.cargo/bin). If uv is installed elsewhere, add its directory to the systemd service's Environment=PATH=... line.
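To reproduce that search by hand, a sketch like the one below checks PATH first and then the listed directory patterns. It is an approximation under assumptions, not the client's actual lookup code; find_executable and COMMON_DIR_PATTERNS are hypothetical names.

```python
import glob
import os
import shutil
from typing import Iterable, Optional

# Directory patterns taken from the list above; /home/* expands to every user's home.
COMMON_DIR_PATTERNS = (
    "/usr/local/bin",
    "/usr/bin",
    "/home/*/.local/bin",
    "/home/*/.cargo/bin",
)


def find_executable(name: str, dir_patterns: Iterable[str] = COMMON_DIR_PATTERNS) -> Optional[str]:
    """Return the first executable called `name` on PATH or in the given directories."""
    on_path = shutil.which(name)
    if on_path:
        return on_path
    for pattern in dir_patterns:
        for directory in sorted(glob.glob(pattern)):
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None
```

Running find_executable("uv") as the same user the service runs under shows whether uv is discoverable. A None result means its directory has to be added to the service's PATH.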

Deployment stuck in "loading"

Symptoms: A deployment stays in the loading state and never transitions to running.

Checks:
- Open the deployment logs to see venv creation or vLLM startup errors.
- Common causes: network issues downloading packages, insufficient GPU memory, model not found on Hugging Face, or missing HF_TOKEN for gated models.
- The readiness check polls /health and /v1/models on the deployment port. Ensure no firewall blocks localhost access on the client node.
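You can imitate the readiness check from the client node to see which endpoint is failing. The sketch below assumes that "ready" means both endpoints answer with HTTP 200; the real check's criteria, retry count, and interval are not documented here, and the function names are hypothetical.

```python
import http.client
import time
import urllib.request


def endpoint_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses, so 4xx/5xx land here too.
        return False


def wait_until_ready(port: int, attempts: int = 30, delay: float = 2.0,
                     host: str = "127.0.0.1") -> bool:
    """Poll /health and /v1/models until both respond, or give up after `attempts` tries."""
    base = f"http://{host}:{port}"
    for _ in range(attempts):
        if endpoint_ok(base + "/health") and endpoint_ok(base + "/v1/models"):
            return True
        time.sleep(delay)
    return False
```

If endpoint_ok succeeds for /health but not for /v1/models, the server process is up but the model has probably not finished loading. If both fail, check the deployment logs and local firewall rules.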

Deployments missing after backend restart

Symptoms: Running models disappear from the dashboard after restarting the host backend.

Explanation:
- The sync loop automatically rediscovers running deployments from clients within ~10 seconds.
- If deployments still don't appear, check that the client nodes are reachable from the host.

Data disappears after host down

Symptoms: Previously created deployments are gone after shutdown.

Explanation:
- host down runs docker compose down -v, which removes the Postgres volume along with the containers.
- To keep data across shutdowns, remove the -v flag from that command in the code.

Tip

When debugging, start the host in the foreground to see backend and frontend logs in the terminal.