GPU compute cost estimator · April 2026
GPU Cloud Cost Calculator
Estimate what your training run or inference workload will cost across 20 GPU clouds. Side-by-side comparison, with savings vs the most expensive provider.
Cost across 20 GPU cloud providers
| Rank | Provider | GPU | Time | Total Cost | vs Most Expensive | Action |
|---|---|---|---|---|---|---|
Estimates use public TFLOPS specs (NVIDIA datasheets), empirical utilization factors from MLPerf, and our verified April 2026 pricing. Actual cost depends on framework, batch size, sequence length, and provider availability. Always test on your workload before committing.
How the calculator works
Training cost formula
FLOPs needed = M × params × tokens
hours = FLOPs / (gpu_TFLOPS × 1e12 × util × 3600)
cost = hours × gpu_price_per_hour

Where M is the model multiplier: 6 for a full fine-tune, 0.6–1.2 for LoRA/QLoRA (Hu et al. 2021; Dettmers et al. 2023). The 1e12 factor converts the GPU's TFLOPS rating to FLOPs per second.
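A minimal sketch of this arithmetic in Python. The model size, utilization, and hourly price below are illustrative assumptions, not verified figures; the 989 TFLOPS figure is the H100's BF16 dense rating from NVIDIA's datasheet.

```python
def training_cost(params, tokens, m, gpu_tflops, util, price_per_hour):
    """Estimate GPU-hours and cost from the training formula above."""
    flops = m * params * tokens                       # total training FLOPs
    flops_per_hour = gpu_tflops * 1e12 * util * 3600  # effective FLOPs per GPU-hour
    hours = flops / flops_per_hour
    return hours, hours * price_per_hour

# Assumed numbers: QLoRA (M = 1.0) on an 8B-param model over 1B tokens,
# one H100 at 989 TFLOPS BF16 dense, 35% utilization, $2.49/h
# (a placeholder price, not a quoted rate).
hours, cost = training_cost(params=8e9, tokens=1e9, m=1.0,
                            gpu_tflops=989, util=0.35, price_per_hour=2.49)
print(f"{hours:.1f} GPU-hours -> ${cost:.2f}")  # ~6.4 GPU-hours -> ~$16
```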
Inference cost formula
seconds = tokens / tokens_per_second
hours = seconds / 3600
cost = hours × gpu_price_per_hour

Throughput (tokens/sec) comes from public benchmarks at typical batch sizes. Real throughput varies with batch size and sequence length.
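The same arithmetic for inference, as a hedged sketch; the throughput and price here are assumptions, not benchmark results.

```python
def inference_cost(total_tokens, tokens_per_second, price_per_hour):
    """Estimate cost of serving a token budget at a given throughput."""
    hours = total_tokens / tokens_per_second / 3600
    return hours * price_per_hour

# Assumed numbers: 100M tokens at 2,500 tok/s aggregate batched
# throughput, $2.49/h per GPU (illustrative, not a quote).
print(f"${inference_cost(100e6, 2500, 2.49):.2f}")  # ~$27.67
```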
Why specialist clouds win
For identical hardware (e.g. H100 80GB), price spans $1.99/h to $11.06/h across providers. Specialist clouds (RunPod, Lambda, Vast.ai) skip enterprise overhead. Hyperscalers price for ecosystem value, not raw compute.
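Using the H100 span quoted above, the "vs Most Expensive" column reduces to simple arithmetic; the 100 GPU-hour job size below is hypothetical.

```python
# H100 80GB hourly span from the paragraph above; job size is hypothetical.
cheapest, priciest = 1.99, 11.06  # $/h
gpu_hours = 100
low, high = cheapest * gpu_hours, priciest * gpu_hours
print(f"${low:.0f} vs ${high:.0f}: {1 - low / high:.0%} saved")
# -> $199 vs $1106: 82% saved
```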
Source data
- GPU TFLOPS: NVIDIA datasheets (A100, H100, RTX 4090)
- Utilization: MLPerf v4.0 results, FlashAttention-3 paper
- Pricing: Verified April 30, 2026 — see llms-full.txt for full price matrix