OpenClaw Server Setup
Step-by-step guide for deploying an OpenClaw gateway on a VPS (e.g. Hetzner Cloud) and connecting it to ibl.ai as a chat runner.
Architecture
Student (browser) → ibl.ai Platform (Django Channels / ASGI)
│
ClawLLMRunner
│
OpenClawClient (WSS + Ed25519 device identity signing)
│
Caddy (on host, TLS via Let's Encrypt)
│ reverse proxy to localhost:18789
▼
OpenClaw Gateway (systemd service, loopback only)
│
LLM Provider (Anthropic, etc.)
Why Caddy on the host (not Docker): Caddy must run directly on the host so that TCP connections to OpenClaw arrive from 127.0.0.1. This preserves loopback auto-approval for device identity. If Caddy ran in a Docker container, it would connect via Docker bridge (172.x.x.x) and OpenClaw would treat it as a remote connection.
Why device identity signing: On vanilla OpenClaw, the gateway requires Ed25519 device identity in the WebSocket connect handshake. Without it, connections succeed but the gateway grants zero scopes -- effectively treating the client as unauthenticated. This is the root cause of "missing scope: operator.read" failures. The platform backend signs each connect with its own Ed25519 keypair.
Prerequisites
Before starting, you need:
- A VPS or dedicated server -- Hetzner CX22 (2 vCPU, 4 GB RAM, ~$4/mo) is sufficient. OpenClaw is lightweight; the LLM API call is the bottleneck, not local compute. Use the Ashburn location for US East proximity.
- A domain or subdomain pointing to the server's actual IP (not an elastic IP -- see Snags Reference).
- Anthropic API key (or another LLM provider key).
- Ports 80 and 443 open on your cloud firewall before installing Caddy.
Critical: DNS and firewall must be ready first
Let's Encrypt ACME challenges will fail if:
- DNS points to an elastic IP that isn't routing to the actual server
- Port 443 is not open on the cloud firewall (for example, when only port 80 was opened)
After 5 failed attempts, Let's Encrypt rate-limits the domain for 1 hour. All three of these must be correct before Caddy's first start:
- DNS A record → server's real IP (verify with dig your-domain.example.com +short)
- Port 80 open inbound from 0.0.0.0/0 (for http-01 ACME challenge)
- Port 443 open inbound (for tls-alpn-01 fallback and actual HTTPS traffic)
Also: don't toggle firewall rules while Caddy is retrying -- each failed attempt counts against the rate limit.
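The three preconditions above can be checked from another machine before Caddy's first start. The following is a minimal sketch, not part of OpenClaw or Caddy: preflight, resolve_ipv4 and port_reachable are hypothetical helper names. It treats a refused connection as "reachable", on the assumption that a refusal usually means the packet got past the firewalls and simply found no listener yet, while a timeout suggests the port is filtered.

```python
import socket
from typing import Callable, List


def resolve_ipv4(domain: str) -> str:
    """Return the first IPv4 address the local resolver gives for the domain."""
    return socket.gethostbyname(domain)


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if host:port accepts or actively refuses a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        # Assumption: a refusal means the packet reached the host (no listener yet).
        return True
    except OSError:
        # Timeouts and unreachable networks usually indicate a filtered port.
        return False


def preflight(
    domain: str,
    expected_ip: str,
    resolve: Callable[[str], str] = resolve_ipv4,
    is_open: Callable[[str, int], bool] = port_reachable,
) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    try:
        actual_ip = resolve(domain)
    except OSError as exc:
        return [f"DNS lookup for {domain} failed: {exc}"]
    problems = []
    if actual_ip != expected_ip:
        problems.append(f"DNS points to {actual_ip}, expected {expected_ip}")
    for port in (80, 443):
        if not is_open(expected_ip, port):
            problems.append(f"port {port} is not reachable on {expected_ip}")
    return problems
```

Calling preflight("your-domain.example.com", "203.0.113.10") with your own domain and server IP returns an empty list when DNS and both ports look correct. The resolver and port check are injectable so the logic can be exercised without network access.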
Part 1: Install OpenClaw
1.1 -- SSH in and install Node.js 22
ssh root@
# Install Node.js 22 (skip if already installed -- check with node --version)
curl -fsSL https://deb.nodesource.com/setup_22.x | bash -
apt-get install -y nodejs
node --version # should show v22.x.x
1.2 -- Install OpenClaw
npm install -g openclaw@latest
openclaw --version
1.3 -- Generate a gateway token
export OPENCLAW_GATEWAY_TOKEN=$(openssl rand -hex 32)
echo "$OPENCLAW_GATEWAY_TOKEN"
Save this token -- you need it when connecting to the ibl.ai platform.
Immediately persist it to ~/.bashrc so CLI commands work in future SSH sessions. Running openclaw devices list in a new SSH session will fail with MissingEnvVarError: Missing env var "OPENCLAW_GATEWAY_TOKEN" if the token was only exported in the original shell:
echo "export OPENCLAW_GATEWAY_TOKEN=$OPENCLAW_GATEWAY_TOKEN" >> ~/.bashrc
1.4 -- Write the full config
Writing the full config upfront skips the interactive onboarding wizard entirely.
mkdir -p ~/.openclaw
cat > ~/.openclaw/openclaw.json << 'CONF'
{
"meta": {
"lastTouchedVersion": ""
},
"wizard": {
"lastRunVersion": "",
"lastRunCommand": "onboard",
"lastRunMode": "local"
},
"auth": {
"profiles": {
"anthropic:default": {
"provider": "anthropic",
"mode": "api_key"
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "anthropic/claude-sonnet-4-6"
},
"workspace": "/root/.openclaw/workspace"
}
},
"commands": {
"native": "auto",
"nativeSkills": "auto",
"restart": true,
"ownerDisplay": "raw"
},
"session": {
"dmScope": "per-channel-peer"
},
"gateway": {
"port": 18789,
"mode": "local",
"bind": "loopback",
"controlUi": {
"allowedOrigins": [
"https://YOUR-DOMAIN-HERE"
]
},
"auth": {
"mode": "token",
"token": "${OPENCLAW_GATEWAY_TOKEN}"
},
"tailscale": {
"mode": "off",
"resetOnExit": false
}
}
}
CONF
Set meta.lastTouchedVersion and wizard.lastRunVersion to the output of openclaw --version (e.g. 2026.3.13). Replace YOUR-DOMAIN-HERE with your actual domain. Change the model in agents.defaults.model.primary if needed (OpenClaw normalizes date-stamped IDs to short aliases, e.g. claude-sonnet-4-20250514 → claude-sonnet-4-6).
The wizard and meta fields tell OpenClaw that onboarding already ran, so openclaw onboard won't re-prompt. Setting session.dmScope to "per-channel-peer" is a security best practice for multi-user deployments: each DM conversation gets its own session scope.
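Only the two version fields and the allowed origin vary per server, so they can be filled in programmatically after the file is written. This is a rough sketch under that assumption: fill_config and patch_file are hypothetical helpers, and the key paths are taken from the config shown in this step.

```python
import json
from pathlib import Path


def fill_config(config: dict, version: str, domain: str) -> dict:
    """Return a copy of the config with the version fields and allowed origin set."""
    patched = json.loads(json.dumps(config))  # simple deep copy for JSON-only data
    patched["meta"]["lastTouchedVersion"] = version
    patched["wizard"]["lastRunVersion"] = version
    patched["gateway"]["controlUi"]["allowedOrigins"] = [f"https://{domain}"]
    return patched


def patch_file(path: Path, version: str, domain: str) -> None:
    """Rewrite openclaw.json in place with the per-server values filled in."""
    config = json.loads(path.read_text())
    updated = fill_config(config, version, domain)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For example, patch_file(Path.home() / ".openclaw" / "openclaw.json", "2026.3.13", "your-domain.example.com"). The "${OPENCLAW_GATEWAY_TOKEN}" reference is left untouched so the token itself never lands in the file.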
Optional: model fallbacks -- to prevent hard failures when the primary LLM provider has an outage, add fallback models:
"model": {
"primary": "anthropic/claude-sonnet-4-6",
"fallbacks": ["anthropic/claude-haiku-4-5", "openai/gpt-5"]
}
This is especially recommended for multi-agent setups where the probability of hitting an API error scales with the number of agents.
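To make the scaling concrete, here is a back-of-the-envelope sketch. It assumes each agent's call fails independently with the same probability p, which is a simplification (real provider outages are correlated); under that assumption the chance that at least one of n calls fails is 1 - (1 - p)^n. The function name and the 1% failure rate are illustrative only.

```python
def any_call_fails(p: float, n: int) -> float:
    """Probability that at least one of n independent calls fails (each with probability p)."""
    if not 0.0 <= p <= 1.0 or n < 0:
        raise ValueError("p must be in [0, 1] and n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# With an assumed 1% per-call failure rate:
for agents in (1, 3, 10):
    print(agents, round(any_call_fails(0.01, agents), 4))
# 1 0.01
# 3 0.0297
# 10 0.0956
```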
1.5 -- Set the Anthropic API key
export ANTHROPIC_API_KEY=
echo "export ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY" >> ~/.bashrc
1.6 -- Create systemd service and start
# Create workspace directory
mkdir -p /root/.openclaw/workspace
# Enable lingering so user services survive SSH logout
loginctl enable-linger root
# Start the gateway
openclaw gateway --port 18789 &
# Or use the onboard wizard just for the systemd service:
# openclaw onboard --install-daemon
# (It will detect existing config and skip most prompts)
Verify the gateway is running:
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:18789/
# Expected: 200
Note: If you want the systemd service auto-created, you can still run openclaw onboard --install-daemon -- it will detect the existing config ("Use existing values"), skip most prompts, and just install the service at ~/.config/systemd/user/openclaw-gateway.service.
Why loginctl enable-linger root? OpenClaw installs a user-level systemd service. Without lingering, the service dies when the last SSH session closes. The openclaw onboard --install-daemon wizard handles this automatically, but if you skip the wizard you must run it yourself. Verify with: loginctl show-user root 2>/dev/null | grep Linger -- should show Linger=yes. See Snag #11 for what happens when this is missed.
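The lingering check can also be scripted, for instance as part of a provisioning test. A small sketch, assuming loginctl show-user prints one key=value property per line; linger_enabled and check_linger are hypothetical names.

```python
import subprocess


def linger_enabled(show_user_output: str) -> bool:
    """Parse `loginctl show-user` key=value output and report whether Linger=yes."""
    for line in show_user_output.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "Linger":
            return value.strip() == "yes"
    return False


def check_linger(user: str = "root") -> bool:
    """Run loginctl for the user; a non-zero exit code is treated as 'not enabled'."""
    result = subprocess.run(
        ["loginctl", "show-user", user],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and linger_enabled(result.stdout)
```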
Part 2: Install Caddy (Reverse Proxy + TLS)
2.1 -- Install Caddy
apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' \
| gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' \
| tee /etc/apt/sources.list.d/caddy-stable.list
apt update && apt install caddy
2.2 -- Configure Caddyfile
cat > /etc/caddy/Caddyfile << 'EOF'
your-domain.example.com {
handle /api/status {
rewrite * /
reverse_proxy localhost:18789
}
reverse_proxy localhost:18789
}
EOF
systemctl restart caddy
systemctl status caddy
The /api/status rewrite shim maps the platform's health check path to / (the OpenClaw Control UI page), which returns 200 when the gateway is up. Vanilla OpenClaw has no /api/status endpoint -- this shim maintains compatibility with the ibl.ai platform's connectivity checks.
After restart, Caddy will automatically obtain a Let's Encrypt TLS certificate. Check the logs if it doesn't work:
journalctl -u caddy --no-pager -n 50
2.3 -- Control UI origin allowlist
OpenClaw's Control UI only allows connections from the gateway's own host (localhost) by default. If the Control UI shows "origin not allowed (open the Control UI from the gateway host or allow it in gateway.controlUi.allowedOrigins)", the config needs updating.
The full config in step 1.4 already includes controlUi.allowedOrigins with your domain. If you wrote a minimal config instead, or need to add origins after the fact:
openclaw config set gateway.controlUi.allowedOrigins '["https://your-domain.example.com"]'
systemctl --user restart openclaw-gateway
Part 3: Firewall
Cloud firewall (Hetzner, AWS, etc.)
Set these rules in your cloud provider's firewall console:
| Direction | Protocol | Port | Source | Purpose |
|---|---|---|---|---|
| Inbound | TCP | 22 | Management IPs | SSH |
| Inbound | TCP | 80 | 0.0.0.0/0 | ACME challenge (Let's Encrypt) |
| Inbound | TCP | 443 | 0.0.0.0/0 or allowlist | HTTPS (Caddy → OpenClaw) |
If restricting port 443 to specific IPs, you must include:
- The ibl.ai platform server's outbound IP -- find it with curl -s ifconfig.me from the platform server
- Your own IP -- for Control UI browser access
- Any VPN egress IPs used by your team
If the cloud firewall restricts port 443 to specific IPs and a user's IP isn't in the allowlist, the browser will show ERR_CONNECTION_TIMED_OUT. Dev containers also won't reach the server unless connected to a VPN with an allowlisted IP.
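When diagnosing a timeout, it helps to confirm whether a given client IP is actually covered by the allowlist. The sketch below uses Python's standard ipaddress module; is_allowlisted is a hypothetical helper and the addresses are documentation-range placeholders, not real platform IPs.

```python
import ipaddress
from typing import Iterable


def is_allowlisted(client_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any single IP or CIDR block in the allowlist."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False) for entry in allowlist
    )


# Placeholder allowlist: one platform server IP and one VPN egress range.
ALLOWLIST = ["203.0.113.10", "198.51.100.0/24"]
print(is_allowlisted("198.51.100.77", ALLOWLIST))  # True
print(is_allowlisted("192.0.2.1", ALLOWLIST))      # False
```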
Host firewall (UFW)
ufw allow 22/tcp
ufw allow 80/tcp
ufw allow 443/tcp
ufw --force enable
Both the cloud firewall (network level, outside the server) and UFW (host level) must allow traffic for it to reach Caddy.
Part 4: Validate
4.1 -- Health check
curl -s -o /dev/null -w "%{http_code}" https://your-domain.example.com/api/status
# Expected: 200
4.2 -- Control UI
Open in browser: https://your-domain.example.com/?token=
The first browser access through Caddy will show "pairing required". Browser devices connecting through the reverse proxy are not auto-approved -- only loopback connections are. Approve the browser device:
# On the server:
openclaw devices list
openclaw devices approve
Each browser profile generates a unique device ID. This is a one-time step per browser. Do not use dangerouslyDisableDeviceAuth -- the docs call it a "severe security downgrade" and it only affects the Control UI, not programmatic WebSocket connections.
4.3 -- Chat test
In the Control UI, send a test message. You should get a response from the configured LLM.
Full stack confirmed: Browser → Caddy (TLS/Let's Encrypt) → OpenClaw Gateway → Anthropic API.
Part 5: Connect to ibl.ai
5.1 -- Register claw instance
Register through the ibl.ai platform API. See Platform Integration for the full API reference.
5.2 -- Generate and store device keypair
The ibl.ai platform backend needs an Ed25519 keypair for device identity signing. Without it, config push will fail with "missing scope: operator.read/write/admin".
Generate one:
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, NoEncryption, PrivateFormat
key = Ed25519PrivateKey.generate()
pem = key.private_bytes(Encoding.PEM, PrivateFormat.PKCS8, NoEncryption()).decode()
print(pem)
Store the private key in the claw instance's connection_params:
{
"device_identity": {
"private_key_pem": "-----BEGIN PRIVATE KEY-----\n\n-----END PRIVATE KEY-----\n"
}
}
Set this via the ibl.ai platform API (PATCH the instance's connection_params field) or through your admin interface.
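Embedding a multi-line PEM in JSON by hand is error-prone because every newline must be escaped. A minimal sketch that lets json.dumps do the escaping; build_connection_params is a hypothetical helper, and how the result is sent to the platform API is not covered here.

```python
import json


def build_connection_params(private_key_pem: str) -> str:
    """Wrap an unencrypted PKCS8 PEM private key in the connection_params JSON structure."""
    pem = private_key_pem.strip()
    if not (
        pem.startswith("-----BEGIN PRIVATE KEY-----")
        and pem.endswith("-----END PRIVATE KEY-----")
    ):
        raise ValueError("expected an unencrypted PKCS8 PEM private key")
    # json.dumps escapes the PEM's newlines as \n, matching the form shown above.
    payload = {"device_identity": {"private_key_pem": pem + "\n"}}
    return json.dumps(payload, indent=2)
```

Pass in the pem string printed by the key generation snippet and use the returned JSON as the body of the connection_params update.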
How it works: OpenClawClient.connect() receives a connect.challenge from the gateway, signs it with the Ed25519 key using the v2 payload format (v2|deviceId|clientId|clientMode|role|scopes|signedAtMs|token|nonce), and includes the device object in the connect params. Fresh keypairs are auto-approved on loopback, so in principle no manual openclaw devices approve step is needed for backend connections; in practice, forwarded headers from Caddy can make the gateway treat the backend as remote and require approval (see Device Re-Pairing). Each connect signs fresh (no session token caching).
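The string that gets signed can be assembled as follows. This is only a sketch of the field order quoted above: build_v2_payload is a hypothetical name, joining scopes with commas is an assumption, and how deviceId is derived and how the signature is encoded are not specified in this guide.

```python
import time
from typing import Optional, Sequence


def build_v2_payload(
    device_id: str,
    client_id: str,
    client_mode: str,
    role: str,
    scopes: Sequence[str],
    token: str,
    nonce: str,
    signed_at_ms: Optional[int] = None,
) -> str:
    """Assemble the pipe-delimited v2 string in the documented field order."""
    if signed_at_ms is None:
        signed_at_ms = int(time.time() * 1000)  # current time in milliseconds
    fields = [
        "v2",
        device_id,
        client_id,
        client_mode,
        role,
        ",".join(scopes),  # assumption: scopes are comma-separated
        str(signed_at_ms),
        token,
        nonce,  # taken from the gateway's connect.challenge
    ]
    return "|".join(fields)
```

With the cryptography package used in the key generation snippet, the signature over this string would be produced by key.sign(payload.encode()) on the Ed25519PrivateKey.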
5.3 -- Push config
Push configuration through the ibl.ai platform API. See Platform Integration.
Verify in the logs that the push completed without "missing scope" errors. A successful push calls agents.files.set (IDENTITY.md, SOUL.md), config.get, and config.patch -- the gateway restarts itself after config.patch.
5.4 -- Test chat through the platform
- Open the mentor in any ibl.ai application (Mentor AI, Skills AI, etc.)
- Select the claw-backed mentor
- Send a test message (e.g. "Hello, say hi in 5 words")
- Verify:
- "Connected." acknowledgment appears
- Response streams in token-by-token
- Response completes (EOS received)
- Message persists on page refresh (chat history saved)
Multi-Agent Setup (Optional)
The default config creates a single agent (main). To run multiple agents on the same gateway (e.g. tutor, course-creator, admissions), add them via the CLI:
openclaw agents add tutor-agent
openclaw agents add course-creator-agent
Each agent gets its own workspace (~/.openclaw/workspace-) and agent directory (~/.openclaw/agents/). The agents appear in agents.list in openclaw.json. You can also add them by editing the config directly.
Note: More agents means more concurrent LLM API calls, which increases the chance of hitting provider rate limits or outages. Consider adding model fallbacks (see Step 1.4) if running multiple agents.
Keeping OpenClaw Updated
Check for updates periodically:
openclaw --version # current version
openclaw update # update to latest
The gateway logs a notice on startup when an update is available. After updating, restart the service:
systemctl --user restart openclaw-gateway
Caution: OpenClaw updates may wipe the paired devices list, requiring re-pairing. See Device Re-Pairing. Back up ~/.openclaw/ before major version upgrades.
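The backup can be as simple as copying the whole state directory to a timestamped folder before running the update. A sketch using only the standard library; backup_openclaw_dir is a hypothetical helper, and copying the entire directory sidesteps the question of exactly which file holds the paired devices.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def backup_openclaw_dir(
    source: Path, backup_root: Path, stamp: Optional[str] = None
) -> Path:
    """Copy the OpenClaw state directory into a timestamped folder and return its path."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"openclaw-{stamp}"
    # copytree creates missing parent directories and raises FileExistsError
    # if the target already exists, so an earlier backup is never overwritten.
    shutil.copytree(source, target)
    return target
```

For example, backup_openclaw_dir(Path.home() / ".openclaw", Path("/root/backups")), where /root/backups is a placeholder location.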
Monitoring and Diagnostics
Live log tailing
Run in separate SSH sessions to watch both during operation:
# Gateway logs (WebSocket connects, chat requests, Anthropic API errors)
journalctl --user -u openclaw-gateway -f
# Caddy logs (incoming HTTPS requests, TLS issues)
journalctl -u caddy -f
What to look for in gateway logs
| Log pattern | Meaning |
|---|---|
| protocol 3 | WebSocket handshake succeeded |
| chat.send | Chat request sent to LLM provider |
| error / ECONNREFUSED | Anthropic API call failed (key issue, rate limit, outage) |
| close 4008 | WebSocket proxy issue |
| missing scope | Device identity signing not working -- check keypair config |
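The table above can be turned into a small triage filter for piped log output. This is a sketch only: the patterns and meanings are copied from the table, the exact format of real gateway log lines is not assumed, and classify is a hypothetical helper that does plain case-insensitive substring matching.

```python
from typing import Optional

# (pattern, meaning) pairs from the table; more specific patterns come first
# so that a line such as "Error: missing scope ..." is not reported as a generic error.
LOG_PATTERNS = [
    ("missing scope", "Device identity signing not working"),
    ("close 4008", "WebSocket proxy issue"),
    ("ECONNREFUSED", "LLM provider call failed"),
    ("protocol 3", "WebSocket handshake succeeded"),
    ("chat.send", "Chat request sent to LLM provider"),
    ("error", "LLM provider call failed"),
]


def classify(line: str) -> Optional[str]:
    """Return the meaning of the first matching pattern, or None if nothing matches."""
    lowered = line.lower()
    for pattern, meaning in LOG_PATTERNS:
        if pattern.lower() in lowered:
            return meaning
    return None
```

It can be fed line by line from journalctl --user -u openclaw-gateway -f, printing only the lines for which classify returns a value.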
Quick health checks (no restart needed)
# Gateway alive?
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:18789/
# Expected: 200
# Structured health status
openclaw health --json
# Connected devices
openclaw devices list
# Caddy + TLS working?
curl -s -o /dev/null -w "%{http_code}" https://your-domain.example.com/api/status
# Expected: 200
# Anthropic key still valid?
curl -s -o /dev/null -w "%{http_code}" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
https://api.anthropic.com/v1/models
# Expected: 200
# Disk/memory
df -h / && free -h
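The two HTTP checks above can be wrapped for use in a cron job or monitoring script. A minimal sketch with the standard library; http_status and gateway_checks are hypothetical names, and unlike plain curl, urllib follows redirects before reporting a status.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, TLS error, or timeout


def gateway_checks(domain: str) -> dict:
    """Run the local gateway check and the public Caddy/TLS check."""
    return {
        "gateway_local": http_status("http://127.0.0.1:18789/") == 200,
        "caddy_tls": http_status(f"https://{domain}/api/status") == 200,
    }
```

If gateway_local is True but caddy_tls is False, the problem is likely in Caddy, DNS, or the firewalls rather than in OpenClaw itself.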
Verbose logging (not recommended during live use)
openclaw config set OPENCLAW_LOG_LEVEL debug
systemctl --user restart openclaw-gateway
# Remember to set back to info after debugging:
# openclaw config set OPENCLAW_LOG_LEVEL info
Device Re-Pairing After Gateway Restarts / Updates
The problem
OpenClaw gateway updates (npm install -g openclaw@latest) or gateway restarts can wipe the paired devices list. When this happens, the platform backend's device identity is no longer recognized by the gateway, and all mentors on that server fail with PAIRING_REQUIRED / NOT_PAIRED errors.
Users see: "The mentor is starting up, please wait..." → "The mentor is currently unavailable. Please try again later."
This affects all mentors linked to the server -- the device identity is per claw instance, not per mentor. One re-pairing fixes all mentors on that server.
How to re-pair manually
- Trigger a connection attempt -- send any message to any mentor linked to the affected server. This creates a pending pairing request on the gateway.
- SSH into the OpenClaw server and approve:
# List devices -- look for the "Pending" section
openclaw devices list
# Approve the pending request (use the requestId, NOT the device ID)
openclaw devices approve
- Retry the chat -- the next message should connect successfully. All mentors on this server are now fixed.
Why loopback auto-approval doesn't work
The design intent was that Caddy (on the same host) proxies to OpenClaw at localhost:18789, so connections arrive from 127.0.0.1 and are auto-approved. However, Caddy adds X-Forwarded-For headers with the remote client's IP, and OpenClaw uses those to determine the "real" client IP. Since the platform backend connects from a remote server, OpenClaw sees a non-loopback IP and requires manual approval.
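The behavior described above can be modeled in a few lines. This illustrates the described logic only and is not OpenClaw's actual code: it assumes the gateway trusts the leftmost X-Forwarded-For entry when present, and both function names are hypothetical.

```python
import ipaddress
from typing import Mapping


def effective_client_ip(peer_ip: str, headers: Mapping[str, str]) -> str:
    """Prefer the first X-Forwarded-For entry over the TCP peer address."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or peer_ip


def auto_approved(peer_ip: str, headers: Mapping[str, str]) -> bool:
    """Loopback auto-approval applies only if the effective client IP is loopback."""
    return ipaddress.ip_address(effective_client_ip(peer_ip, headers)).is_loopback


# Caddy connects from 127.0.0.1 but forwards the platform server's address
# (203.0.113.10 is a documentation placeholder).
print(auto_approved("127.0.0.1", {}))                                   # True
print(auto_approved("127.0.0.1", {"X-Forwarded-For": "203.0.113.10"}))  # False
```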
Solution options (for review and automation phase)
The goal is a robust, non-fragile solution so that gateway restarts and OpenClaw updates do not require manual re-pairing. The options below are proposed for evaluation -- no implementation is specified here.
A. OpenClaw upstream (preferred if feasible)
- A1. Persist paired devices across restarts and version upgrades -- OpenClaw stores paired devices in ~/.openclaw/devices.json (or equivalent) and loads this file on startup.
- A2. Config-based trusted device registration -- gateway.trustedDevices or gateway.trustedDevicePublicKeys in config, survives restarts because it lives in config.
- A3. Admin API for device approval -- POST /api/admin/devices/approve protected by gateway token.
B. Platform-side automation
- B1. Health check detects NOT_PAIRED and alerts -- WebSocket connect attempt in health check, notify admins on failure.
- B2. Admin action: "Trigger re-pair" -- triggers connect attempt and shows the requestId.
- B3. Automated re-pair via agent on OpenClaw host -- a small service that auto-approves known devices.
C. Infrastructure / deployment
- C1. Backup and restore devices.json -- backup before update, restore after.
- C2. External persistence (e.g. R2 / S3) -- persist device state externally.
D. Reverse-proxy behavior (Caddy)
- D1. Strip forwarded headers so OpenClaw sees loopback -- Caddy strips X-Forwarded-For and X-Real-Ip. OpenClaw sees 127.0.0.1 and auto-approves. Tradeoff: no real client IP in gateway logs.
Recommendation: Pursue A1 and/or A2 with the OpenClaw project so device pairing survives restarts by design. Short term, manual re-pair plus B1 (detect and alert) reduces silent outage.
Device identity scope
- Device identity is per claw instance (stored in connection_params.device_identity.private_key_pem)
- All mentors linked to the same server share the same device
- One re-pairing approval covers all mentors on that server
- Each server with a different keypair needs its own pairing
Snags Reference
Issues encountered during initial deployments, collected here for quick reference.
| # | Issue | Root cause | Fix |
|---|---|---|---|
| 1 | Let's Encrypt ACME challenges fail | DNS pointed to elastic IP not routing to server; port 443 not open | Point DNS to actual server IP; open ports 80+443 before Caddy starts |
| 2 | Let's Encrypt rate limit (1 hour) | 5 failed ACME attempts from the above | Wait for cooldown; don't toggle firewall while Caddy retries |
| 3 | Control UI "origin not allowed" | OpenClaw only allows localhost origins by default | openclaw config set gateway.controlUi.allowedOrigins '["https://..."]' |
| 4 | Control UI "pairing required" | Browser device not auto-approved through reverse proxy | openclaw devices approve (one-time per browser) |
| 5 | Browser ERR_CONNECTION_TIMED_OUT | Cloud firewall restricting port 443; user IP not in allowlist | Add IP to cloud firewall allowlist |
| 6 | OPENCLAW_GATEWAY_TOKEN not found in new SSH sessions | Token only exported in original shell | Add export OPENCLAW_GATEWAY_TOKEN=... to ~/.bashrc |
| 7 | Config push "missing scope: operator.read" | OpenClawClient was omitting device identity from connect handshake | Implement Ed25519 device signing (see Part 5.2) |
| 8 | Dev container can't reach server | Cloud firewall restricts port 443; dev IP not allowlisted | Connect via VPN with allowlisted IP, or broaden firewall rule |
| 9 | Model ID mismatch | OpenClaw normalizes claude-sonnet-4-20250514 → claude-sonnet-4-6 | Use short alias in agent config |
| 10 | NOT_PAIRED after gateway update | Update/restart wiped paired devices; Caddy forwards X-Forwarded-For so auto-approval doesn't work | Manual re-pair (see Device Re-Pairing) |
| 11 | Gateway dies when SSH session ends | loginctl enable-linger root was skipped during manual setup | Run loginctl enable-linger root (see Step 1.6). Verify with loginctl show-user root 2>/dev/null \| grep Linger |