kaichicken

gpu memory calculator

Will it fit? Pick a model, turn the knobs, and watch where the gigabytes go — before your GPU finds out the hard way.

8.0B params · 32 layers · 4096 hidden · 8 KV heads
weights
kv cache
gpus
estimated memory
16.4GB

fits on a single RTX 4090 (24 GB)

where the gigabytes go
  • weights15.0 GB
  • KV cache0.50 GB
  • workspace0.17 GB
  • CUDA context0.75 GB
will it fit?
  • RTX 306012 GBdoes not fit
  • RTX 409024 GBfits
  • RTX 509032 GBfits
  • A600048 GBfits
  • A10080 GBfits
  • H10080 GBfits
  • H200141 GBfits
  • B200192 GBfits
  • M4 Max128 GB unifiedfits
  • M3 Ultra512 GB unifiedfits

estimates for dense transformers with flash attention — MoE models are a different animal and not modeled. activations are ±20%; real frameworks fragment and buffer, so keep ~5% free. formulas live in lib/vram.ts of this site’s repo. spot an error? tell me in the roost.