Getting Started with Cerebrium

Install the Cerebrium CLI and deploy a training or inference workload directly from your terminal:

# Install CLI
pip install cerebrium

# Deploy a training script on 8x H100 GPUs
cerebrium run training_script.py::train --hardware HOPPER_100:8

Cerebrium handles containerization, scheduling, and scaling automatically — no Kubernetes manifests or Terraform required.
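
For illustration, the `training_script.py::train` target in the command above appears to name a function `train` inside `training_script.py`. The sketch below is a hypothetical stand-in for such a script, not code from Cerebrium's documentation: the function body, its parameters (`epochs`, `lr`), and the toy dataset are all assumptions, and a real job would use a GPU framework rather than plain Python.

```python
# training_script.py (hypothetical contents, for illustration only)


def train(epochs: int = 3, lr: float = 0.1) -> float:
    """Toy training loop: fit a weight w so that w * x approximates y = 2x."""
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    w = 0.0
    for epoch in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
        # Report the loss after the update so progress shows up in the job logs.
        loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
        print(f"epoch={epoch} loss={loss:.6f}")
    return w


if __name__ == "__main__":
    # Allows a quick local run with `python training_script.py` before deploying.
    train()
```

Keeping the entry point a plain function with default arguments means the same file can be smoke-tested locally before it is submitted to remote hardware.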

Endpoints: REST API, streaming, and WebSocket endpoints supported out of the box
Custom Dockerfiles: Bring your own image; Cerebrium runs it exactly as defined
Autoscaling: Instant scale-out with concurrency & batching controls; no capacity planning needed
GPU Types: 12+ options including H100 (Hopper), with multi-GPU job support
Persistent Storage: Distributed storage available for checkpoints and artifacts
Observability: Full OpenTelemetry integration for metrics, logs, and scaling events
CI/CD: Gradual rollouts, versioned deployments, and secrets management built in
Security: gVisor isolation per workload, SOC 2 / HIPAA / GDPR / ISO certified, data residency controls
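
As a rough sketch of how a client might call a deployed REST endpoint, the snippet below builds a JSON POST request with Python's standard library. Everything specific here is an assumption: the URL is a non-resolving placeholder, the bearer-token header is one common authentication scheme rather than a confirmed Cerebrium one, and the `prompt` field is a hypothetical payload. The request is constructed but not sent.

```python
import json
import urllib.request

# Placeholder values (assumptions): substitute the URL and token of a real deployment.
ENDPOINT_URL = "https://example.invalid/v1/predict"
API_TOKEN = "YOUR_API_TOKEN"


def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a deployed endpoint."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )


req = build_request("Hello")
print(req.get_method(), req.full_url)
# Once the placeholders are replaced, send it with: urllib.request.urlopen(req)
```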

Integrations and supported tools: vLLM, SGLang, TensorRT-LLM, Triton Inference Server, Pipecat, LiveKit, WandB, Gradio, Stable Diffusion XL, Twilio