NVIDIA Open Source

AI Cluster Runtime

Tooling for optimized, validated, and reproducible GPU-accelerated Kubernetes.

AICR makes it easy to stand up GPU-accelerated Kubernetes clusters with version-locked recipes you can deploy anywhere.

Install AICR
$ brew tap NVIDIA/aicr
$ brew install aicr
or
$ curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
Why We Built This

Running GPU-accelerated Kubernetes clusters reliably is hard.

Small differences in kernel versions, drivers, container runtimes, operators, and Kubernetes releases can cause failures that are difficult to diagnose and expensive to reproduce.

Historically, this knowledge has lived in internal validation pipelines and runbooks. AI Cluster Runtime makes it available to everyone.

[Stat] of organizations run AI workloads on Kubernetes
[Stat] are fully in production
Source: CNCF Survey
How It Works

Define a Working GPU Kubernetes Cluster in Minutes

You describe your target (cloud, GPU, OS, workload intent), and AICR generates a version-locked configuration you can deploy through your existing pipeline.

[Diagram: AICR end-to-end workflow — Ingest, Recipe Generation, Recipe, Bundling, Deploy, Validate]
Recipe
Generate an optimized, version-locked configuration for your specific environment.
Bundle
Convert the recipe into deployment-ready artifacts for Helm, ArgoCD, or OCI images.
Deploy
Apply through your existing CD pipeline. No new tooling required.
Validate
Run deployment, performance, and conformance checks against your live cluster.
Recipes

Every AICR recipe is

Optimized

Tuned for a specific combination of hardware, cloud, OS, and workload intent.

Validated

Passes automated constraint and compatibility checks before publishing.

Reproducible

Same inputs produce identical deployments every time.

Composable

Recipes compose from layered overlays: base defaults, cloud, accelerator, OS, and workload-specific tuning.
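As a rough sketch, the overlay stack for an EKS training cluster might compose like this — the layer names here are hypothetical, not AICR's actual file layout:

```yaml
# Hypothetical overlay stack — layer names are illustrative only.
overlays:
  - base                 # defaults shared by every recipe
  - cloud/eks            # cloud-specific settings
  - accelerator/h100     # GPU-specific tuning
  - os/ubuntu            # OS-level configuration
  - workload/training    # workload-intent tuning
```

Later layers override earlier ones, so a workload overlay only needs to declare what differs from the base.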

Secure

SLSA Level 3 provenance, SPDX SBOMs, and Sigstore cosign attestations on every release.

Standards Based

Built on existing standards. Recipes are YAML, bundles produce Helm charts, and deployment works through Helm, ArgoCD, or any CD pipeline.
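For illustration, a generated recipe might look roughly like the following. The service, accelerator, and intent fields mirror the CLI flags shown in the quick start; the field names and component pins are placeholders, not AICR's actual schema:

```yaml
# Hypothetical sketch of a recipe — layout is illustrative, not the real schema.
metadata:
  service: eks          # from: aicr recipe --service eks
  accelerator: h100     # from: --accelerator h100
  intent: training      # from: --intent training
components:
  gpu-driver: "<pinned version>"         # every component is version-locked
  gpu-operator: "<pinned version>"
  container-runtime: "<pinned version>"
  kubernetes: "<pinned version>"
```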

Supported Environments

Configure any Kubernetes cluster.

AICR generates recipes for managed or self-hosted Kubernetes deployments. Current recipes are optimized for EKS, GKE, and self-managed clusters with H100 and GB200 GPUs on Ubuntu and COS. Support for additional environments and accelerators is on the roadmap.

Cloud or On-Premises
Amazon EKS, GKE, self-managed
GPUs
NVIDIA H100, GB200
Workloads
Training (Kubeflow), Inference (Dynamo)
Open Source

Don't see your environment? Add it.

AI Cluster Runtime is Apache 2.0. We welcome contributions from CSPs, OEMs, platform teams, and individual operators: new recipes, bundler formats, validation checks, or bug reports.

Copy an existing overlay, update the criteria and component configuration, run make qualify, and open a PR.


Quick Start

Install and generate your first recipe in under two minutes.

$ brew tap NVIDIA/aicr
$ brew install aicr
or
$ curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
then
$ aicr recipe --service eks --accelerator h100 --intent training --output recipe.yaml
$ aicr bundle --recipe recipe.yaml --deployer helm --output ./bundle
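Since bundles produce Helm charts, one way to deploy through ArgoCD is to commit the generated bundle to a Git repository and point an Application at it — a minimal sketch, assuming a hypothetical repo URL, chart path, and target namespace:

```yaml
# Minimal ArgoCD Application — repoURL, path, and namespaces are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: aicr-bundle
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/your-org/cluster-config.git  # hypothetical repo
    path: bundle                    # directory produced by `aicr bundle`
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: gpu-operator         # hypothetical target namespace
  syncPolicy:
    automated: {}
```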
NVIDIA AI Cluster Runtime
Released under the Apache 2.0 License.