Self-hosting is an enterprise-only feature. Please contact us to get started.
Hardware Requirements
At a minimum, the training component requires a machine with an A100, H100, or H200 GPU. In cloud environments, we recommend the following instance types as a minimum:

| Cloud Provider | Instance Type |
|---|---|
| AWS | p5.4xlarge |
| Azure | Standard_NC40ads_H100_v5 |
| GCP | a2-ultragpu-1g |
Scaling the Deployment
Typically, we scale the deployment by breaking out several components from the single instance into separate instances or managed services, in this order:
ClickHouse DB
Scale out ClickHouse to a dedicated instance for data redundancy and performance.
Router & Logger
Run the AI Gateway, Request Logger, and Evaluator Runner on separate instances for horizontal scaling.
Inference Workers
Deploy inference workers separately for easier horizontal scaling and autoscaling.
Quick Start
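Since the deployment needs a hostname other than localhost that resolves through /etc/hosts (see the note below), a pre-flight check might look like the following sketch. The hostname `datawizz.local` and the helper name `has_hosts_entry` are illustrative placeholders, not names defined by the project, and the IP address to map the hostname to depends on your environment.

```shell
# Return success if hostname $1 has an entry in the hosts file $2
# (defaults to /etc/hosts). Comments are ignored.
has_hosts_entry() {
  awk -v name="$1" '
    { sub(/#.*/, "") }                                      # strip comments
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 } # field 1 is the IP
    END { exit found ? 0 : 1 }
  ' "${2:-/etc/hosts}"
}

# "datawizz.local" is a placeholder hostname.
if has_hosts_entry datawizz.local; then
  echo "datawizz.local has an /etc/hosts entry"
else
  echo "datawizz.local has no /etc/hosts entry yet"
fi
```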
localhost cannot be used due to Docker networking constraints. The app container needs to reach the host machine, which requires a resolvable hostname mapped via /etc/hosts.
Architecture
Profiles
Docker Compose profiles allow you to selectively enable optional services:

| Profile | Services | Use Case |
|---|---|---|
| (none) | Core Supabase + App | Minimal deployment with external ClickHouse |
| `clickhouse` | + ClickHouse | Local ClickHouse instance |
| `gateway` | + AI Gateway, Request Logger, Evaluator Runner, ClickHouse | LLM inference proxy with logging |
| `studio` | + Supabase Studio | Database management UI |
| `workers` | + Evaluation/Training Workers | Local GPU workers (requires NVIDIA GPU) |
| `full` | All services | Complete local deployment |
Usage Examples
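As a rough sketch, profiles from the table above are enabled with Docker Compose's standard, repeatable `--profile` flag. The helper `compose_up_cmd` is hypothetical; it only prints the command for a given set of profiles so it can be reviewed before running.

```shell
# Print the "docker compose ... up -d" command for the given profiles.
# With no arguments, only the core services (no profile) are started.
compose_up_cmd() {
  cmd="docker compose"
  for profile in "$@"; do
    cmd="$cmd --profile $profile"
  done
  printf '%s up -d\n' "$cmd"
}

compose_up_cmd                  # docker compose up -d
compose_up_cmd clickhouse       # docker compose --profile clickhouse up -d
compose_up_cmd gateway studio   # docker compose --profile gateway --profile studio up -d
compose_up_cmd full             # docker compose --profile full up -d
```

Copy a printed line into your terminal to start the corresponding set of services.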
Configuration
Required Environment Variables
Edit `.env` and configure at minimum:

| Variable | Description |
|---|---|
| `POSTGRES_PASSWORD` | PostgreSQL password |
| `JWT_SECRET` | JWT signing secret (min 32 chars) |
| `ANON_KEY` | Supabase anonymous API key |
| `SERVICE_ROLE_KEY` | Supabase service role key |
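Before starting the stack, it can help to confirm that these four variables are set and that `JWT_SECRET` meets the 32-character minimum. The sketch below assumes `.env` uses plain `KEY=value` lines without quoting; `env_value` and `check_env` are hypothetical helper names.

```shell
# Print the value of variable $1 from env file $2 (empty if unset).
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Report missing required variables and a too-short JWT_SECRET.
# Returns non-zero if any problem was found.
check_env() {
  file="${1:-.env}"
  rc=0
  for var in POSTGRES_PASSWORD JWT_SECRET ANON_KEY SERVICE_ROLE_KEY; do
    if [ -z "$(env_value "$var" "$file")" ]; then
      echo "missing: $var"
      rc=1
    fi
  done
  secret=$(env_value JWT_SECRET "$file")
  if [ -n "$secret" ] && [ "${#secret}" -lt 32 ]; then
    echo "too short: JWT_SECRET (${#secret} chars, need at least 32)"
    rc=1
  fi
  return "$rc"
}

if [ -f .env ]; then
  check_env .env || echo "fix the values above before starting the stack"
fi
```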
Generating Secure Values
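One possible way to produce random values for `POSTGRES_PASSWORD` and `JWT_SECRET` is sketched below. It relies only on `/dev/urandom` and standard tools; the helper name `random_hex` is hypothetical, and the byte counts are a choice made for this example rather than a project requirement.

```shell
# Print N random bytes as 2*N lowercase hex characters.
# ("openssl rand -hex N" is an equivalent if OpenSSL is installed.)
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \t\n'
}

POSTGRES_PASSWORD=$(random_hex 24)   # 48 hex characters
JWT_SECRET=$(random_hex 32)          # 64 hex characters, above the 32-char minimum

# Print lines that can be pasted into .env
echo "POSTGRES_PASSWORD=$POSTGRES_PASSWORD"
echo "JWT_SECRET=$JWT_SECRET"
```

In a typical self-hosted Supabase setup, `ANON_KEY` and `SERVICE_ROLE_KEY` are JWTs signed with `JWT_SECRET`, so they usually need to be regenerated whenever the secret changes; that step is not shown here.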
External Services
To use external managed services instead of local containers:
External ClickHouse (see Scaling Out ClickHouse for full details):
AI Gateway
For running the gateway on separate instances, see Scaling Out the Router & Logger. The `gateway` profile enables a complete LLM inference proxy stack:
| Service | Description | URL |
|---|---|---|
| AI Gateway | Proxies LLM requests, handles routing, caching | http://{domain}/gw/{project}/{endpoint}/... |
| Request Logger | Batches and stores inference logs | Internal only |
| Evaluator Runner | Runs custom evaluators on inference results | http://{domain}/eval/... |
- Clients send LLM requests to the AI Gateway
- Gateway routes requests to configured model providers
- Responses are logged via the Request Logger to ClickHouse
- Evaluators can automatically score inference results
Using Pre-built Images
By default, `docker compose up` builds images from local source code. For production deployments, you can use pre-built images from a container registry instead.
Configure image sources in .env:
If you omit `--no-build`, Docker Compose will attempt to build from source even if the image variables are set.
URL Routing
Caddy routes requests based on URL path:

| Path | Destination | Description |
|---|---|---|
| `/sbapi/*` | Kong → Supabase | Supabase API (auth, database, storage) |
| `/gw/*` | AI Gateway | LLM inference proxy (`gateway` profile) |
| `/eval/*` | Evaluator Runner | Evaluator testing (`gateway` profile) |
| `/*` | Datawizz App | Main application (catch-all) |
- Supabase Studio: `http://{domain}:3001` (`studio` profile)
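To make the path-based routing concrete, the table can be modelled as a prefix match in which the first matching pattern wins and everything else falls through to the app. This is an illustration of the routing rules only, not the project's Caddy configuration; `route_for` and the example paths are hypothetical.

```shell
# Print the destination that the routing table assigns to a request path.
route_for() {
  case "$1" in
    /sbapi/*) echo "Kong -> Supabase" ;;
    /gw/*)    echo "AI Gateway" ;;
    /eval/*)  echo "Evaluator Runner" ;;
    *)        echo "Datawizz App" ;;      # catch-all
  esac
}

route_for /gw/my-project/my-endpoint/v1   # AI Gateway
route_for /sbapi/auth/v1/health           # Kong -> Supabase
route_for /dashboard                      # Datawizz App
```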
Production Deployment
Enable HTTPS
For production, set your domain in `.env`:
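A minimal sketch of that setting, assuming the variable is `DOMAIN` as named in the security checklist. The hostname is a placeholder; use the public name whose DNS record points at this server.

```shell
# .env
# Placeholder value; replace with your real public hostname.
DOMAIN=datawizz.example.com
```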
Security Checklist
- Change all default passwords in `.env`
- Generate secure `JWT_SECRET` (min 32 chars)
- Generate new `ANON_KEY` and `SERVICE_ROLE_KEY`
- Set `DOMAIN` to your actual domain
- Review `DISABLE_SIGNUP` setting
- Configure SMTP for email delivery
Common Operations
View Logs
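Logs can be read with the standard `docker compose logs` command. The wrapper below is a hypothetical convenience (the name `dw_logs` is not part of the project); the service names in the examples are the ones used in the troubleshooting section.

```shell
# Follow logs for every service, or only for the services named as arguments.
# --tail limits how much history is printed before following.
dw_logs() {
  docker compose logs --follow --tail 100 "$@"
}

# Examples:
#   dw_logs
#   dw_logs ai-gateway request-logger evaluator-runner
```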
Restart Services
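A sketch of two common variants, using hypothetical helper names. A plain restart reuses the existing container configuration, so changes to `.env` generally require recreating the containers instead.

```shell
# Restart all running services, or only the named ones.
dw_restart() {
  docker compose restart "$@"
}

# Recreate containers so that configuration changes take effect.
dw_recreate() {
  docker compose up -d --force-recreate "$@"
}

# Examples:
#   dw_restart caddy
#   dw_recreate kong
```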
Update Services
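How an update is applied depends on whether the deployment uses pre-built images or builds from local source. The following sketch covers both cases under that assumption; the helper names are hypothetical, and optional services are only included if their profiles are active (for example via Compose's `COMPOSE_PROFILES` environment variable).

```shell
# Pre-built images: fetch newer images, then recreate changed containers
# without building from source.
dw_update_images() {
  docker compose pull &&
  docker compose up -d --no-build
}

# Local source: rebuild images after fetching the new source code,
# then recreate changed containers.
dw_update_source() {
  docker compose up -d --build
}
```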
Upgrading / Re-running Migrations
Database migrations run automatically on first startup. They won’t re-run on subsequent `docker compose up` calls (Docker caches completed containers).
When to re-run migrations:
- After pulling a new version with database schema changes
- After changing ClickHouse connection settings in `.env`
- If migrations failed and you’ve fixed the issue
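One way to force a completed one-shot container to run again is to recreate it explicitly. The sketch below assumes the migration service is `supabase-migrations`, the name used in the troubleshooting section; the helper name is hypothetical, and any other migration services in your compose file would need the same treatment.

```shell
# Recreate the migration container so it runs again, leaving the
# services it depends on untouched, then show its output.
dw_rerun_migrations() {
  docker compose up -d --no-deps --force-recreate supabase-migrations &&
  docker compose logs supabase-migrations
}
```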
Check Service Health
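A short sketch combining the two checks that appear elsewhere on this page: container status from `docker compose ps`, and the gateway's internal health endpoint reached from the Caddy container. The helper names are hypothetical, and the second check only applies when the `gateway` profile is running.

```shell
# Show the state and health of all containers in the project.
dw_status() {
  docker compose ps
}

# Query the AI Gateway health endpoint over the internal Docker network.
dw_gateway_health() {
  docker exec datawizz-caddy wget -qO- http://ai-gateway:3000/health
}
```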
Access Database
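A possible way to open a SQL shell is to run `psql` inside the database container. The service name `db` comes from the troubleshooting section; the `postgres` user is an assumption based on common Supabase defaults, and `dw_psql` is a hypothetical helper name.

```shell
# Open psql in the database container; extra arguments are passed to psql.
dw_psql() {
  docker compose exec db psql -U postgres "$@"
}

# Examples:
#   dw_psql                      # interactive shell
#   dw_psql -c 'select now()'    # run one statement
```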
Backup & Restore
Quick Backup
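As a sketch, a logical dump of the PostgreSQL database can be written to a timestamped file with `pg_dump`. The service name `db` appears in the troubleshooting section; the user and database name `postgres` are assumptions, `dw_backup` is a hypothetical helper, and ClickHouse data is not covered.

```shell
# Dump the PostgreSQL database to a file (default: timestamped name).
# -T disables TTY allocation so the dump can be redirected cleanly.
dw_backup() {
  out="${1:-backup-$(date +%Y%m%d-%H%M%S).sql}"
  docker compose exec -T db pg_dump -U postgres -d postgres > "$out" &&
  echo "wrote $out"
}
```

Calling `dw_backup` with no argument writes to a file such as `backup-<date>-<time>.sql` in the current directory; pass a path to choose the destination.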
Troubleshooting
Services won’t start
- Check logs: `docker compose logs`
- Verify `.env` file exists and has required values
- Check Docker daemon is running
Database connection errors
- Wait for database to be healthy: `docker compose ps`
- Check PostgreSQL logs: `docker compose logs db`
- Verify `POSTGRES_PASSWORD` in `.env`
App shows “initializing” or errors
- Migrations may still be running - wait a few seconds
- Check migration logs: `docker compose logs supabase-migrations`
- Verify Kong is healthy: `docker compose logs kong`
Workers fail to start
- Workers require NVIDIA GPU with Docker GPU support
- Verify GPU is available: `nvidia-smi`
- Check Docker GPU runtime: `docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi`
Gateway services not responding
- Check all gateway services are healthy: `docker compose ps | grep -E "gateway|logger|evaluator"`
- Restart Caddy to pick up new routes: `docker compose restart caddy`
- Test internal connectivity: `docker exec datawizz-caddy wget -qO- http://ai-gateway:3000/health`
- Check gateway logs: `docker compose logs ai-gateway request-logger evaluator-runner`
SSL certificate issues (production)
- Ensure domain DNS points to your server
- Check Caddy logs: `docker compose logs caddy`
- Ports 80 and 443 must be accessible from the internet