NVIDIA DGX Cloud Lepton

Connect developers to a global network of GPU compute for building and deploying AI

Paid·Technical·Powered by NVIDIA·API available

Key strengths

  • Global GPU compute network unification
  • Designed for AI-native teams and model builders
  • Powered by NVIDIA DGX-class hardware (Blackwell, Hopper)
  • Supports full ML lifecycle: build, train, deploy
  • No infrastructure management required
Paid only
Santa Clara, USA

Developer Documentation: NVIDIA DGX Cloud Lepton

Platform Access

Authenticate via your NVIDIA account. API keys and CLI tooling are available to interact with the Lepton platform programmatically.
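
As a minimal sketch of programmatic access, the snippet below reads an API key from the environment and turns it into an HTTP authorization header. The variable name LEPTON_API_TOKEN and the bearer-token scheme are assumptions made for illustration; neither is confirmed by the text above, so check the platform documentation for the actual mechanism.

```python
import os


def auth_headers(env=os.environ, var_name="LEPTON_API_TOKEN"):
    """Build HTTP headers carrying the API key.

    Assumptions: the environment variable name and the "Bearer" scheme
    are hypothetical and used here only for illustration.
    """
    token = env.get(var_name, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set {var_name} to your API key before calling the API")
    return {"Authorization": f"Bearer {token}"}
```

Keeping the key in the environment rather than in source code avoids committing it to version control by accident.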

Key Concepts

  • Unified Compute: Lepton aggregates GPU resources across multiple data centers and cloud providers into a single control plane.
  • Job Scheduling: Submit training or inference jobs via API or UI; the platform handles GPU allocation and cluster orchestration using NVIDIA Run:ai under the hood.
  • GPU Architectures Supported: H100 (Hopper) and Blackwell-class systems such as GB200/GB300 NVL72, all with NVLink and NVSwitch interconnects for multi-GPU workloads.
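
The idea of a single control plane over several providers can be illustrated with a toy model. The sketch below is not Lepton's or Run:ai's scheduling logic; the GpuPool type, the provider names, and the "most free GPUs" rule are all hypothetical and only show what choosing capacity across aggregated pools might look like.

```python
from dataclasses import dataclass


@dataclass
class GpuPool:
    provider: str   # hypothetical provider or data center name
    gpu_type: str   # e.g. "H100" or "GB200"
    free_gpus: int  # GPUs currently available in this pool


def pick_pool(pools, gpu_type, gpus_needed):
    """Return the matching pool with the most free GPUs, or None."""
    candidates = [
        p for p in pools
        if p.gpu_type == gpu_type and p.free_gpus >= gpus_needed
    ]
    return max(candidates, key=lambda p: p.free_gpus, default=None)


# Illustrative inventory spanning three made-up providers.
pools = [
    GpuPool("provider-a", "H100", 4),
    GpuPool("provider-b", "H100", 16),
    GpuPool("provider-c", "GB200", 72),
]

choice = pick_pool(pools, "H100", 8)
print(choice.provider if choice else "no capacity")  # prints "provider-b"
```

From the caller's side, the request names only a GPU type and a count; which provider serves it is decided behind the control plane.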

Example Workflow (CLI)

# Install Lepton CLI
pip install leptonai

# Authenticate
lep login

# Deploy a model as an inference endpoint
lep photon run --name my-model --model hf:meta-llama/Llama-3-8b

# List running deployments
lep deployment list

Key Parameters

  • --model: Specifies the model source (Hugging Face, custom, etc.)
  • --resource-shape: Selects the GPU type and count (e.g., gpu.a10, gpu.h100)
  • --replicas: Sets the number of inference replicas for horizontal scaling
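
The parameters above can be combined in one command. The following sketch assembles the argument list in Python so that a script can validate inputs before invoking the CLI. The helper build_photon_run is hypothetical, and the flags are used exactly as listed above without being verified against any particular CLI version.

```python
import shlex


def build_photon_run(name, model, resource_shape=None, replicas=None):
    """Assemble a `lep photon run` argument list from the listed flags."""
    cmd = ["lep", "photon", "run", "--name", name, "--model", model]
    if resource_shape is not None:
        cmd += ["--resource-shape", resource_shape]
    if replicas is not None:
        if replicas < 1:
            raise ValueError("replicas must be at least 1")
        cmd += ["--replicas", str(replicas)]
    return cmd


cmd = build_photon_run("my-model", "hf:meta-llama/Llama-3-8b", "gpu.h100", 2)
# Show the command as it would be typed in a shell, without running it.
print(shlex.join(cmd))
```

Once the CLI is installed and authenticated, the list could be passed to subprocess.run(cmd, check=True). Building it as a list rather than a single string avoids shell quoting problems.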

Integrations

Works with NVIDIA AI Enterprise Suite, CUDA-X libraries, Base Command Manager, and NVIDIA Run:ai for orchestration.