Scaling and Performance¶

This page covers scaling targets, load test plans, sizing recommendations, and monitoring guidance for OpenClaw Enterprise deployments.

Scaling Target¶

OpenClaw Enterprise is designed to support 500 concurrent users per deployment with acceptable response times across all operations.

Performance Targets¶

Operation	Target	Notes
Policy hot-reload	< 60 seconds	Time from PolicyBundle CR update to OPA reflecting new policies
Audit queries	< 10 seconds	Complex queries across partitioned audit tables
Briefing generation	Within cron window	Depends on configured schedule; default 30-minute windows
Auto-response classification + response	< 30 seconds	End-to-end from message receipt to response dispatch
Policy evaluation (per call)	< 100ms (P95)	Single OPA policy evaluation via REST API

Load Test Plan¶

The load test plan validates that the system meets performance targets under full load. Tests use k6 for HTTP load generation and a custom harness for WebSocket/sessions_send testing. Metrics are collected via Prometheus and visualized in Grafana.

Test Environment¶

Component	Configuration
Gateway Replicas	3 (minimum HA configuration)
OPA Sidecars	1 per gateway pod
PostgreSQL	16.x with PgBouncer connection pooling
Redis	7.x cluster mode
Cluster	Production-equivalent K8s cluster

Scenario 1: Briefing Generation Under Load¶

Parameter	Value
Concurrent Users	500
Action	Each user triggers `generate_briefing` simultaneously
Pass Criteria	P95 < 30s, P99 < 60s, zero errors

This scenario validates that the briefing generation pipeline (task scanner, correlator, scorer) can handle all users requesting briefings at the same time, such as during a morning start-of-day spike.

Scenario 2: Policy Evaluation Throughput¶

Parameter	Value
Concurrent Users	500
Action	Each user triggers 10 tool invocations (5,000 total `policy.evaluate` calls)
Pass Criteria	P95 < 100ms, P99 < 500ms

This scenario validates that the OPA sidecar can handle burst policy evaluation load without becoming a bottleneck.

Scenario 3: Audit Log Write Throughput¶

Parameter	Value
Concurrent Users	500
Action	Each action generates an audit entry (5,000+ writes)
Pass Criteria	Zero dropped entries, P95 write < 50ms

This scenario validates the append-only audit writer under sustained write pressure. No audit entries may be dropped under any circumstances.

Scenario 4: Connector Read Burst¶

Parameter	Value
Concurrent Users	100 (connector rate limits apply)
Action	Each user reads from all 5 connectors simultaneously
Pass Criteria	No crashes, rate-limited requests queued or retried

This scenario validates graceful degradation when external connector API rate limits are hit. The system must not crash or lose data; requests should be queued and retried.

Scenario 5: Mixed Workload (Realistic)¶

Parameter	Value
Concurrent Users	500
Duration	30 minutes sustained
Error Rate Target	< 0.1%

Workload distribution:

Activity	Percentage
Reading briefings/tasks	40%
Triggering auto-responses	25%
Querying audit log	20%
Admin operations (policy CRUD)	10%
OCIP agent-to-agent exchanges	5%

Pass criteria: Error rate < 0.1%, P95 < 5s for reads, P95 < 10s for writes.

Load Test Results Template¶

Results will be recorded after execution against a production-equivalent environment:

Scenario	P50	P95	P99	Error Rate	Status
Briefing Generation	--	--	--	--	Not run
Policy Evaluation	--	--	--	--	Not run
Audit Write	--	--	--	--	Not run
Connector Burst	--	--	--	--	Not run
Mixed Workload	--	--	--	--	Not run

Scaling Recommendations¶

Gateway Pod Replicas¶

Deployment Size	Minimum Replicas	Recommended	Notes
Development	1	1	`deploymentMode: single`
Small (< 50 users)	2	2	Basic redundancy
Production (< 500 users)	3	3--5	`deploymentMode: ha`
Large (500+ users)	3	5+	Use HPA (see below)

Set replicas in the OpenClawInstance CR:

spec:
  deploymentMode: ha
  replicas: 3

Database Connection Pooling (PgBouncer)¶

PgBouncer is strongly recommended for production deployments to prevent connection exhaustion.

Scale	`max_client_conn`	`default_pool_size`	`reserve_pool_size`
10 users	100	10	5
100 users	200	25	10
500 users	500	50	20

Example PgBouncer configuration:

[databases]
openclaw = host=postgres-primary port=5432 dbname=openclaw

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
pool_mode = transaction
max_client_conn = 500
default_pool_size = 50
reserve_pool_size = 20
server_tls_sslmode = verify-full
server_tls_ca_file = /etc/ssl/certs/pg-ca.crt

Redis Cluster Sizing¶

Scale	Nodes	Memory per Node	Max Connections
10 users	1 (standalone)	256Mi	100
100 users	3 (cluster)	512Mi	500
500 users	6 (3 masters + 3 replicas)	1Gi	2000

OPA Sidecar Resource Limits¶

The OPA sidecar runs in the same pod as each gateway replica. Adjust resource limits based on policy complexity and evaluation frequency.

Scale	CPU Request	CPU Limit	Memory Request	Memory Limit
Small (< 50 users)	100m	500m	128Mi	256Mi
Medium (< 200 users)	250m	1	256Mi	512Mi
Large (500 users)	500m	1	256Mi	512Mi

Note: OPA's memory usage is primarily driven by the number and complexity of loaded policies, not by query throughput. Monitor actual usage and adjust accordingly.

Horizontal Pod Autoscaler (HPA)¶

For production deployments, configure an HPA to automatically scale gateway pods based on CPU and memory utilization.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: production-gateway-hpa
  namespace: openclaw-enterprise
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-gateway
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120

Key HPA settings:

Setting	Value	Rationale
`minReplicas`	3	Minimum for HA
`maxReplicas`	10	Cap to prevent runaway scaling
CPU target	70%	Scale before saturation
Memory target	80%	Memory is less spiky than CPU
Scale-up window	60s	React quickly to load increases
Scale-down window	300s	Avoid flapping during variable load

Pod Disruption Budget¶

To maintain availability during voluntary disruptions (node upgrades, cluster scaling):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: production-gateway-pdb
  namespace: openclaw-enterprise
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: openclaw-enterprise
      app.kubernetes.io/instance: production
      app.kubernetes.io/component: gateway

Metrics to Monitor¶

Application Metrics¶

Metric	Target	Alert Threshold
Request throughput (req/s)	> 1,000	< 500 sustained
P50 response time	< 500ms	> 1s sustained
P95 response time	< 5s	> 10s sustained
P99 response time	< 15s	> 30s sustained
Error rate	< 0.1%	> 1% sustained

Infrastructure Metrics¶

Metric	Target	Alert Threshold
Gateway CPU utilization	< 70%	> 85% sustained for 5m
Gateway memory utilization	< 80%	> 90% sustained for 5m
OPA sidecar CPU	< 50%	> 80% sustained for 5m
OPA sidecar memory	< 70%	> 85% sustained for 5m

Database Metrics¶

Metric	Target	Alert Threshold
DB connection pool utilization	< 70%	> 85% sustained
DB query P95 latency	< 100ms	> 500ms sustained
DB active connections	< `default_pool_size`	> 80% of pool size
DB disk usage	< 80% of provisioned	> 85% of provisioned

Business Metrics¶

Metric	Target	Alert Threshold
Audit entry completeness	100%	Any gap
Policy evaluation success rate	100%	Any failure
Connector sync success rate	> 99%	< 95% over 1 hour

Recommended Monitoring Stack¶

graph LR
    GW[Gateway Pods] -->|metrics| PROM[Prometheus]
    OPA[OPA Sidecars] -->|metrics| PROM
    OP[Operator] -->|metrics| PROM
    PG[(PostgreSQL)] -->|pg_exporter| PROM
    RD[(Redis)] -->|redis_exporter| PROM
    PROM --> GRAF[Grafana]
    PROM --> AM[Alertmanager]
    AM --> PD[PagerDuty / Slack / Email]

Use the following exporters:

Exporter	Target
kube-state-metrics	Kubernetes resource state
node-exporter	Node-level CPU, memory, disk
postgres_exporter	PostgreSQL connection pool, query stats
redis_exporter	Redis memory, connections, command stats
OPA built-in metrics	Policy evaluation latency and counts