
GitHub Actions runners for vLLM

vLLM serving tests need a real GPU with continuous batching; CPU fallback isn't exercised in CI. Cirun runs your vLLM CI on a right-sized cloud VM on your own account — pick AWS, GCP, Azure, Oracle, DigitalOcean, OpenStack or on-prem, billed by your cloud, never per CI minute.


Why this fits

vLLM serving tests exercise continuous batching, which needs a real GPU; the CPU fallback path isn't what CI covers.
Cirun is cloud-neutral — every supported cloud has at least one SKU that fits vLLM; pick whichever account you already have.
Ephemeral by default — no state from the previous PR leaks into yours, no flaky-cache surprises.

.cirun.yml (AWS)

runners:
  - name: vllm-runner
    cloud: aws
    instance_type: p4d.24xlarge
    # Use AWS Deep Learning AMI GPU PyTorch on Ubuntu 22.04, or the
    # Cirun-published NVIDIA AMI.
    machine_image: ami-04823729c75214919
    labels:
      - cirun-vllm

.cirun.yml (GCP)

runners:
  - name: vllm-runner
    cloud: gcp
    instance_type: a2-highgpu-1g
    # Use a GCP Deep Learning VM image family
    # (deeplearning-platform-release) for pre-installed CUDA + drivers.
    machine_image: projects/deeplearning-platform-release/global/images/family/pytorch-latest-cu121
    labels:
      - cirun-vllm

.cirun.yml (Azure)

runners:
  - name: vllm-runner
    cloud: azure
    instance_type: Standard_ND96asr_v4
    # Use Microsoft's HPC/DSVM Ubuntu image for pre-installed CUDA +
    # drivers.
    machine_image: microsoft-dsvm:ubuntu-hpc:2204:latest
    labels:
      - cirun-vllm

Pick the variant for your cloud and save it as .cirun.yml in your repo root. The first workflow job whose runs-on matches the runner label (cirun-vllm here) spins this configuration up on your cloud account.
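To show how a workflow requests the runner label, here is a minimal sketch of a GitHub Actions workflow that targets cirun-vllm. The file name, job name, install step and test path are assumptions for illustration, not part of Cirun or vLLM; only the runs-on value has to match a label from your .cirun.yml.

```yaml
# .github/workflows/vllm-gpu.yml  (hypothetical file name)
name: vllm-gpu-tests

on:
  pull_request:

jobs:
  serving-tests:            # hypothetical job name
    # Must match an entry under `labels:` in .cirun.yml
    runs-on: cirun-vllm
    steps:
      - uses: actions/checkout@v4

      # Sanity check: fails fast if the image has no working NVIDIA driver
      - name: Check that the GPU is visible
        run: nvidia-smi

      # Assumed install and test commands; replace with your project's own
      - name: Run serving tests
        run: |
          pip install vllm pytest
          pytest tests/
```

Running nvidia-smi before the tests keeps an image or driver problem from being mistaken for a vLLM test failure.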


Ready to run your CI here?

Cirun is free for open source. For private repos, flat monthly plans by repo count — never per CI minute.


Sources · verified 2026-05-25