Is the earlier point that people should see what they can do with "common household ingredients" before assuming they need to pay cloud providers for bigger/more iron?
I agree, and I have a 3090 for that purpose, and once wrote a tutorial for others wanting to do ML stuff on a GPU at home rather than rent from a cloud provider.
But a consumer GPU (or an older Tesla card from eBay) can't do everything that a rental pool of H100s and A100s can, and other readers here and I will sometimes want to do those other things.
I didn't want to give people the impression that all they'd ever need is to buy up a bunch of random retired Ethereum-mining GPUs, no matter what they want to do with ML.
Don't get me wrong, sites like this still have a lot of value. My overall point is that the assumption you need an A100/H100 to make productive use of LLMs is inaccurate. You can go a very long way with a single 3090 in a workstation for $1,000 (as frequently noted on HN, including in this thread, on r/LocalLLaMA, etc.). Or you can rent nearly anything you want on various platforms (most of the consumer RTX hardware is typically only available on platforms like Vast.ai). Whatever works for you in your situation, but the fairly common belief (especially in the more mainstream press) that you need at least 40-80GB of VRAM to do anything LLM-related is flat-out wrong.
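To make that concrete, here's a minimal sketch of what a single 24GB card like a 3090 buys you, assuming the Hugging Face transformers and bitsandbytes libraries: a 7B-class model quantized to 4 bits fits with plenty of headroom. The model ID and prompt below are illustrative placeholders, not recommendations.

    # Minimal sketch: a 7B-class LLM, 4-bit quantized, on one 24GB GPU.
    # Assumes transformers + bitsandbytes installed; model ID is a placeholder.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # any 7B-13B model fits at 4-bit

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # NF4 weights, roughly 4GB for a 7B model
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, weights stay 4-bit
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # the whole model lands on the single GPU
    )

    prompt = "Explain KV caching in one paragraph."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

At 4-bit you can even squeeze roughly 30B-class models into 24GB, which is exactly why the "40-80GB minimum" claim doesn't hold up.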
The other benefit I'd add to buying your own GPUs is availability. They are yours, always. In a commercial application with deadlines, it's a real risk to depend on getting the necessary on-demand GPU compute from various cloud platforms at any given point in time. There is nothing worse than logging into a cloud provider console and seeing "no availability" when you really need to get something done. For me personally, this is what pushed me to buying over cloud: I ended up in scenarios where Vast.ai was the only option left, and I haven't had the best experiences with Vast.ai in terms of reliability and performance (I'm fairly sure many of the benchmarks are gamed, although I'm not sure how).
Speaking of performance, I've also seen very real issues with virtualized CPUs, what I assume is network-attached storage, etc. feeding data to high-end GPUs quickly enough (again, noted elsewhere in this thread). In benchmarking I've done across various cloud providers, unless you pay for the much more expensive options on GCP and elsewhere with directly attached NVMe storage, a single NVMe drive and a decent CPU in a workstation will run circles around many of them.
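If you want to sanity-check the storage side yourself, a crude sequential-read benchmark is enough to see the gap. A rough sketch with a placeholder path; use a file larger than RAM (or drop the page cache between runs) so you measure the disk rather than memory:

    # Rough sketch: sequential read throughput, which approximates how fast
    # a data pipeline can feed a GPU. PATH is a placeholder; use a file
    # larger than RAM so the OS page cache doesn't inflate the number.
    import time

    PATH = "/data/shard-000.tar"   # hypothetical large dataset file
    CHUNK = 64 * 1024 * 1024       # 64 MiB reads

    def seq_read_gbps(path: str) -> float:
        total = 0
        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while chunk := f.read(CHUNK):
                total += len(chunk)
        return total / (time.perf_counter() - start) / 1e9  # GB/s

    print(f"sequential read: {seq_read_gbps(PATH):.2f} GB/s")

A local PCIe 4.0 NVMe drive will typically report several GB/s here, while network-attached cloud volumes often come in well under 1 GB/s unless you're on the premium tiers.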