NVIDIA Open Source

AI Cluster Runtime

Tooling for optimized, validated, and reproducible GPU-accelerated Kubernetes.

AICR makes it easy to stand up GPU-accelerated Kubernetes clusters with version-locked recipes you can deploy anywhere.

Install AICR
$ brew tap NVIDIA/aicr
$ brew install aicr
or
$ curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
Why We Built This

Running GPU-accelerated Kubernetes clusters reliably is hard.

Small differences in kernel versions, drivers, container runtimes, operators, and Kubernetes releases can cause failures that are difficult to diagnose and expensive to reproduce.

Historically, this knowledge has lived in internal validation pipelines and runbooks. AI Cluster Runtime makes it available to everyone.

[Stat] of organizations run AI workloads on Kubernetes
[Stat] are fully in production
Source: CNCF Survey
How It Works

Define a Working GPU Kubernetes Cluster in Minutes

You describe your target (cloud, GPU, OS, workload intent), and AICR generates a version-locked configuration you can deploy through your existing pipeline.

[Diagram: AICR end-to-end workflow — Ingest, Recipe Generation, Recipe, Bundling, Deploy, Validate]
Recipe
Generate an optimized, version-locked configuration for your specific environment.
Bundle
Convert the recipe into deployment-ready artifacts for Helm, ArgoCD, or OCI images.
Deploy
Apply through your existing CD pipeline. No new tooling required.
Validate
Run deployment, performance, and conformance checks against your live cluster.
Recipes

Every AICR recipe is

Optimized

Tuned for a specific combination of hardware, cloud, OS, and workload intent.

Validated

Passes automated constraint and compatibility checks before publishing.

Reproducible

Same inputs produce identical deployments every time.

Composable

Recipes compose from layered overlays: base defaults, cloud, accelerator, OS, and workload-specific tuning.
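As a rough sketch, the overlay stack for an EKS training cluster might compose like this — the layer names here are hypothetical, not AICR's actual file layout:

```yaml
# Hypothetical overlay stack — layer names are illustrative only.
overlays:
  - base                 # defaults shared by every recipe
  - cloud/eks            # cloud-specific settings
  - accelerator/h100     # GPU-specific tuning
  - os/ubuntu            # OS-level configuration
  - workload/training    # workload-intent tuning
```

Later layers override earlier ones, so a workload overlay only needs to declare what differs from the base.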

Secure

SLSA Level 3 provenance, SPDX SBOMs, and Sigstore cosign attestations on every release.

Standards Based

Built on existing standards. Recipes are YAML, bundles produce Helm charts, and deployment works through Helm, ArgoCD, or any CD pipeline.
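For illustration, a generated recipe might look roughly like the following. The service, accelerator, and intent fields mirror the CLI flags shown in the quick start; the field names and component pins are placeholders, not AICR's actual schema:

```yaml
# Hypothetical sketch of a recipe — layout is illustrative, not the real schema.
metadata:
  service: eks          # from: aicr recipe --service eks
  accelerator: h100     # from: --accelerator h100
  intent: training      # from: --intent training
components:
  gpu-driver: "<pinned version>"         # every component is version-locked
  gpu-operator: "<pinned version>"
  container-runtime: "<pinned version>"
  kubernetes: "<pinned version>"
```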

Supported Environments

Configure any Kubernetes cluster.

AICR generates recipes for managed or self-hosted Kubernetes deployments. Current recipes are optimized for EKS, GKE, and self-managed clusters with H100 and GB200 GPUs on Ubuntu and COS. Support for additional environments and accelerators is on the roadmap.

Cloud or On-Premises
Amazon EKS, GKE, self-managed
GPUs
NVIDIA H100, GB200
Workloads
Training (Kubeflow), Inference (Dynamo)
Open Source

Don't see your environment? Add it.

AI Cluster Runtime is Apache 2.0. We welcome contributions from CSPs, OEMs, platform teams, and individual operators: new recipes, bundler formats, validation checks, or bug reports.

Copy an existing overlay, update the criteria and component configuration, run make qualify, and open a PR.


Quick Start

Install and generate your first recipe in under two minutes.

$ brew tap NVIDIA/aicr
$ brew install aicr
or
$ curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
then
$ aicr recipe --service eks --accelerator h100 --intent training --output recipe.yaml
$ aicr bundle --recipe recipe.yaml --deployer helm --output ./bundle
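Since bundles produce Helm charts, one way to deploy through ArgoCD is to commit the generated bundle to a Git repository and point an Application at it — a minimal sketch, assuming a hypothetical repo URL, chart path, and target namespace:

```yaml
# Minimal ArgoCD Application — repoURL, path, and namespaces are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: aicr-bundle
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/your-org/cluster-config.git  # hypothetical repo
    path: bundle                    # directory produced by `aicr bundle`
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: gpu-operator         # hypothetical target namespace
  syncPolicy:
    automated: {}
```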
NVIDIA AI Cluster Runtime
Released under the Apache 2.0 License.