Getting Started
This guide takes you from a clean host to a working cluster with at least one client.
Prerequisites
Host:
- Docker + Docker Compose plugin (configure the docker group so no sudo is required)
- Node.js + npm
- Python 3.10–3.14
- uv (Python package manager)
Client:
- NVIDIA GPU with CUDA
- nvcc or nvidia-smi on PATH
- Python 3.10–3.14 + python3-dev and build-essential (Debian/Ubuntu)
- uv (Python package manager)
Install uv if you don't already have it:
curl -LsSf https://astral.sh/uv/install.sh | sh
On Debian/Ubuntu, install the Python build dependencies:
sudo apt update
sudo apt install -y python3-dev build-essential
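The prerequisite lists above can be sanity-checked with a short script. This is only a sketch, not part of the tool: it reports whether each command is on PATH (nvidia-smi and nvcc matter only on client machines) and whether python3 falls in the supported 3.10–3.14 range.

```shell
# Sketch: report which prerequisite tools are on PATH.
for tool in docker node npm uv python3 nvidia-smi nvcc; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done

# If python3 exists, check it is within the supported 3.10-3.14 range.
if command -v python3 >/dev/null 2>&1; then
  python3 -c 'import sys; ok = (3, 10) <= sys.version_info[:2] <= (3, 14); print("python ok" if ok else "python unsupported:", sys.version.split()[0])'
fi
```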
Install (uv)
Create and activate a virtual environment:
uv venv
source .venv/bin/activate
Then install the package into the environment:
uv pip install vllm-cluster-manager
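After installing, you can confirm the console script landed on PATH. This is a generic check (not a documented command of the tool itself):

```shell
# Sketch: confirm the vllm-cluster-manager entry point is visible from the
# activated virtual environment.
if command -v vllm-cluster-manager >/dev/null 2>&1; then
  echo "installed at: $(command -v vllm-cluster-manager)"
else
  echo "not found: re-activate the venv (source .venv/bin/activate) and retry the install"
fi
```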
Start the host
Foreground (no sudo):
vllm-cluster-manager host up --host-ip 0.0.0.0 --host-frontend-port 5173 --host-discover-port 11400
Persistent service (systemd):
vllm-cluster-manager host up --service --host-ip 0.0.0.0 --host-frontend-port 5173 --host-discover-port 11400
--host-discover-port sets the discovery port that clients use to register with the host. Use --host-backend-port to override the backend API port (default 8000).
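Once the host is up, you can check that something is listening on the expected ports. This is a sketch using the example values above (5173 frontend, 8000 backend, 11400 discovery) and iproute2's ss; it is not a feature of the CLI, and if discovery happens to use UDP the -u flag below covers that too:

```shell
# Sketch: check whether any socket is listening on the example ports.
for port in 5173 8000 11400; do
  if ss -ltun 2>/dev/null | grep -q ":$port "; then
    echo "listening:     $port"
  else
    echo "not listening: $port"
  fi
done
```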
Start a client
Foreground (no sudo), replacing 1.2.3.4 with your host's reachable address:
vllm-cluster-manager client up --host-ip 1.2.3.4 --host-discover-port 11400
Persistent service (systemd):
vllm-cluster-manager client up --service --host-ip 1.2.3.4 --host-discover-port 11400
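When --service is used, the CLI presumably installs systemd units. Their exact names are not documented here, so rather than guessing a unit name, search for matching units:

```shell
# Sketch: list any systemd units related to the tool. The unit names are an
# assumption, so match broadly instead of hard-coding one.
systemctl list-units --all --no-pager 2>/dev/null | grep -i vllm || echo "no matching units found"
```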
Note
If the client cannot register, verify firewall rules and that the host is reachable from the client on the discovery port.
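The reachability check above can be scripted from the client with bash's /dev/tcp, so no extra tools are needed. The address and port below are placeholders you should replace; note this probes TCP only:

```shell
# Sketch: from the client, test TCP reachability of the host's discovery port.
HOST_IP=127.0.0.1   # replace with your host's address
PORT=11400          # replace with your --host-discover-port value
if timeout 3 bash -c "exec 3<>/dev/tcp/$HOST_IP/$PORT" 2>/dev/null; then
  echo "reachable"
else
  echo "unreachable: check host firewall rules (ufw/iptables) and network routing"
fi
```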
Stop services
vllm-cluster-manager host down
vllm-cluster-manager client down
Verify the UI
Open the UI at http://<host-ip>:<host-frontend-port>.
Common first-run checks:
- The UI loads without a network error.
- The host shows up as healthy.
- The client appears under Nodes within ~30 seconds.
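The first of these checks can be scripted with curl. This sketch assumes plain HTTP and that you substitute your own host IP and frontend port for the placeholders:

```shell
# Sketch: probe the frontend over HTTP; -f fails on HTTP error status codes,
# -m 5 caps the wait at five seconds.
HOST_IP=127.0.0.1   # replace with your host's address
PORT=5173           # replace with your --host-frontend-port value
if curl -sf -m 5 -o /dev/null "http://$HOST_IP:$PORT/"; then
  echo "UI reachable"
else
  echo "UI not reachable: check that the host service is up and the port is open"
fi
```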
Next steps
Once a client node appears in the dashboard, you are ready to deploy models. See the Deployments page for details on vLLM version selection, GPU assignment, extra packages, and plugins.