H100 cloud comparison · April 2026
Where to actually get NVIDIA H100 capacity — 16 clouds compared on on-demand price, availability and cluster size. From $1.99/h.
The NVIDIA H100 is the dominant accelerator for serious LLM training and high-throughput inference in 2026. Compared to the A100, it delivers ~3× the FP16 throughput, and up to ~6× via FP8 with the Transformer Engine — but availability is the bottleneck, not performance.
Across the 16 GPU clouds with on-demand H100s, hourly pricing spans $1.99/h to $4.10/h for identical hardware. The choice is rarely just price — it's where you can actually get H100 capacity right now.
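To make that spread concrete, here is a quick back-of-the-envelope calculation (plain Python; the rates are the headline figures above, while the continuous month-long 8-GPU job is an illustrative assumption):

```python
# Monthly cost spread for an 8x H100 node at the cheapest vs. priciest
# on-demand rates in this comparison. The continuous month-long job is
# an illustrative assumption.
GPUS = 8
HOURS = 24 * 30  # one month, running around the clock

cheapest = 1.99 * GPUS * HOURS  # ~$11,462
priciest = 4.10 * GPUS * HOURS  # ~$23,616

print(f"cheapest: ${cheapest:,.0f}  priciest: ${priciest:,.0f}  "
      f"delta: ${priciest - cheapest:,.0f}")  # delta ~$12,154
```

Same silicon, roughly a $12K/month difference per node.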
Specialist clouds win on price. RunPod, Lambda Labs and CoreWeave dominate on-demand H100 availability and cost 40–60% less than AWS p5 / GCP A3 / Azure ND H100 v5 for equivalent compute.
| Provider | Starting Price | Top GPUs | Max VRAM | Highlights | Rating | CTA |
|---|---|---|---|---|---|---|
| Vast.ai (Editor's Choice) | from $0.10/h | RTX 3090, RTX 4090, A100 | ≤80GB | Cheapest GPU cloud — peer-to-peer marketplace for budget training; widest GPU variety including consumer cards | ★★★★☆ | View pricing |
| Hyperstack | from $0.11/h | RTX A6000, A100 80GB, H100 | ≤80GB | Global GPU cloud specialist — H100, A100 80GB and L40 from $0.11/h | ★★★★☆ | View pricing |
| RunPod (Editor's Choice) | from $0.20/h | RTX 3090, RTX 4090, A100 80GB | ≤80GB | Best value GPU cloud — huge selection, community + secure cloud | ★★★★★ | View pricing |
| TensorDock | from $0.21/h | RTX 4090, RTX 3090, A100 80GB | ≤80GB | Marketplace GPU cloud — RTX 4090 from $0.21/h, H100 from $1.99/h | ★★★★☆ | View pricing |
| Massed Compute | from $0.35/h | RTX A6000, A40, A100 80GB | ≤80GB | Workstation-grade GPUs for AI/ML/VFX — A100 from $1.79/h | ★★★★☆ | View pricing |
| Jarvis Labs | from $0.39/h | RTX 6000 Ada, A100 40GB, A100 80GB | ≤80GB | On-demand H100 / A100 / RTX 6000 Ada from $0.39/h | ★★★★☆ | View pricing |
| Lyceum (Editor's Choice) | from $0.39/h | A100 80GB, H100, H200 | ≤141GB | | ★★★★☆ | View pricing |
| Crusoe | from $0.40/h | H100, H200, B200 | ≤192GB | | ★★★★☆ | View pricing |
| Scaleway | from €0.83/h | L4, L40S, H100 | ≤80GB | | ★★★★☆ | View pricing |
| Lambda Labs (Editor's Choice) | from $1.10/h | A100 40GB, A100 80GB, H100 | ≤80GB | | ★★★★★ | View pricing |
| Together AI | from $1.49/h | H100, H200, A100 80GB | ≤141GB | | ★★★★☆ | View pricing |
| Nebius (Editor's Choice) | from $1.55/h | H100, H200, B200 | ≤192GB | | ★★★★★ | View pricing |
| CoreWeave | from $2.06/h | H100 SXM, A100 SXM, A40 | ≤80GB | | ★★★★☆ | View pricing |
| Google Cloud GPU | from $2.48/h | A100 40GB, A100 80GB, H100 | ≤80GB | | ★★★★☆ | View pricing |
| Azure GPU (NCv3/NDA) | from $2.94/h | A100, H100, V100 | ≤80GB | | ★★★★☆ | View pricing |
| AWS GPU (EC2) | from $3.06/h | A100, H100, V100 | ≤80GB | | ★★★★☆ | View pricing |
RunPod Secure Cloud at $1.99/h is the cheapest on-demand H100 80GB. RunPod Community can be cheaper but is interruptible. For reserved/long-term commits, Lambda Labs and CoreWeave can quote significantly lower than the $1.99/h on-demand rate.
AWS p5 (8× H100) instances are concentrated in select regions (us-east-1, us-west-2, eu-west-1) and are heavily reserved by enterprise customers. On-demand stockouts are common during US working hours. Specialist clouds like RunPod and CoreWeave have larger free-pool inventories.
For Llama-3 70B fine-tuning or large-scale training, the H100 is 2–3× faster and, despite costing more per hour, often cheaper per training run. For inference on <13B models or research workloads, the A100 80GB is more cost-effective.
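A minimal sketch of that per-run economics, assuming the A100 80GB headline rate above ($1.79/h), the cheapest H100 rate ($1.99/h), and a 2.5× speedup as the midpoint of the 2–3× range; the 100-hour baseline run is a hypothetical workload:

```python
# Per-run cost: a pricier-per-hour H100 vs. a cheaper A100 80GB.
# Rates come from the listings above; the 2.5x speedup and the
# 100-hour A100 baseline are illustrative assumptions.
a100_rate, h100_rate = 1.79, 1.99   # $/GPU-hour
a100_hours = 100.0                  # hypothetical wall-clock on A100
h100_hours = a100_hours / 2.5       # assumed 2.5x H100 speedup

print(f"A100 run: ${a100_rate * a100_hours:.0f}")   # $179
print(f"H100 run: ${h100_rate * h100_hours:.0f}")   # $80

# Break-even: H100 is cheaper per run whenever its speedup exceeds
# the price ratio, here 1.99 / 1.79 ~= 1.11x.
```

At these rates the H100 only has to be ~11% faster to win on total run cost, which is why it usually does.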
For full fine-tuning: 8× H100 (one DGX-equivalent node) for ~12–24 hours per epoch with 100K samples. For QLoRA: 1× H100 80GB suffices for ~6–8 hours. CoreWeave and Lambda Labs are best for multi-node H100 jobs (InfiniBand interconnect).
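Translating those sizing rules into a rough dollar budget (the $1.99/h figure is the cheapest on-demand H100 rate above; the three-epoch run is an assumption for illustration):

```python
# Rough budget for the two fine-tuning paths sketched above.
h100_rate = 1.99  # $/GPU-hour, cheapest on-demand H100 in this comparison
epochs = 3        # illustrative assumption

# Full fine-tune: 8x H100, ~12-24 hours per epoch (~100K samples).
full_low  = 8 * 12 * epochs * h100_rate   # ~$573
full_high = 8 * 24 * epochs * h100_rate   # ~$1,146

# QLoRA: a single H100 80GB for ~6-8 hours total.
qlora_low, qlora_high = 6 * h100_rate, 8 * h100_rate  # ~$12-$16

print(f"full fine-tune: ${full_low:,.0f}-${full_high:,.0f}")
print(f"QLoRA:          ${qlora_low:.0f}-${qlora_high:.0f}")
```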
H100 SXM (used by CoreWeave, AWS p5, GCP A3) offers NVLink at up to 900 GB/s for multi-GPU jobs, while H100 PCIe (RunPod, Lambda) is limited to PCIe Gen5 at ~128 GB/s but is ~10–15% cheaper. SXM is essential for ≥4-GPU training; PCIe is fine for single-GPU inference and ≤2-GPU training.
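To see why the interconnect dominates at ≥4 GPUs: data-parallel training all-reduces gradients every step, and a ring all-reduce pushes roughly 2 × (N−1)/N × payload bytes through each GPU's links. A minimal sketch under the bandwidth figures above (the 7B-parameter fp16 model is an illustrative assumption):

```python
# Per-step gradient all-reduce time: H100 SXM (NVLink) vs. H100 PCIe.
# Ring all-reduce traffic per GPU is ~2*(N-1)/N * payload bytes; the
# 7B-parameter fp16 model is an illustrative assumption.
def allreduce_seconds(payload_gb: float, n_gpus: int, link_gb_s: float) -> float:
    traffic_gb = 2 * (n_gpus - 1) / n_gpus * payload_gb
    return traffic_gb / link_gb_s

grads_gb = 7e9 * 2 / 1e9  # 7B params * 2 bytes (fp16) = 14 GB of gradients

for name, bw in [("SXM / NVLink", 900.0), ("PCIe Gen5", 128.0)]:
    ms = allreduce_seconds(grads_gb, n_gpus=8, link_gb_s=bw) * 1e3
    print(f"{name:>12}: {ms:6.1f} ms per step")
# NVLink: ~27 ms vs. PCIe: ~191 ms; over millions of steps the PCIe
# node spends most of its time waiting on gradient sync.
```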