Hacker News

Do we have any estimate on the size of OpenAI top of the line models? Would they also fit in ~512GB of (V)RAM?

Also, this whole self-hosting of LLMs is a bit like cloud. Yes, you can do it, but it's a lot easier to pay for API access. And not just for small users. Personally, I don't even bother self-hosting transcription models, which are so small that they can run on nearly any hardware.



It's nice because a company can optionally provide a SOTA reasoning model for their clients without having to go through a middleman. E.g., an HR company can provide an LLM for their HRMS system for a small $2000 investment. Not $2000/month, just a one-time $2000 investment.


No one will be doing anything practical with a local version of Deepseek on a $2000 server. The token throughput of this thing is like, 1 token every 4 seconds. It would take over a minute just to produce a standard “Roses are red, violets are blue” poem. There’s absolutely no practical usage for that. It’s cool that you can do it, and it’s a step in the right direction, but self-hosting these won’t be a viable alternative to using providers like OpenAI for business applications for a while.
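For scale, a rough back-of-envelope calculation (the ~20-token poem length is an illustrative assumption, and 3.5 tok/s stands in for the 3-4 tok/s figure cited elsewhere in the thread):

```python
# Back-of-envelope generation time for a short poem at two throughputs.
# POEM_TOKENS is an assumed token count for a four-line poem.
POEM_TOKENS = 20

times = {tps: POEM_TOKENS / tps for tps in (0.25, 3.5)}
for tps, seconds in times.items():
    print(f"{tps} tok/s -> {seconds:.1f} s")
# 0.25 tok/s -> 80.0 s, 3.5 tok/s -> 5.7 s
```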


> but self-hosting these won't be a viable alternative to using providers like OpenAI for business applications for a while.

Why not? While 3-4 tok/s is on the lower end, it's still usable for any task that doesn't require real-time interaction with the model.

In other words, I don't mind waiting a minute for a good-enough response on a topic that would take me multiples of that to research and compile on my own. It's a clear net win.


If you have so little throughput that you don’t need more than 3 tokens a second, you are processing so little data that your costs from the LLM providers won’t even sniff the $2000 you will spend on the hardware to self host.
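A rough break-even sketch of that claim (the API price per million tokens here is an illustrative assumption, not any real vendor's pricing):

```python
# Break-even sketch: one-time self-hosted rig vs. pay-per-token API.
# All numbers are illustrative assumptions.

HARDWARE_COST = 2000.0       # one-time cost of the rig ($)
TOKENS_PER_SEC = 3.5         # throughput reported for the rig
API_PRICE_PER_MTOK = 2.50    # assumed API price per million tokens ($)

# Tokens the rig could produce running flat out for a month
seconds_per_month = 30 * 24 * 3600
max_tokens_per_month = TOKENS_PER_SEC * seconds_per_month

# What those same tokens would cost through an API
api_cost_per_month = max_tokens_per_month / 1e6 * API_PRICE_PER_MTOK

months_to_break_even = HARDWARE_COST / api_cost_per_month
print(f"Max tokens/month:            {max_tokens_per_month:,.0f}")
print(f"Equivalent API cost/month:   ${api_cost_per_month:.2f}")
print(f"Months to break even (100% utilization): {months_to_break_even:.0f}")
```

Even at 100% utilization the rig only displaces about $23/month of API spend under these assumptions, so the payback period runs to several years, and real utilization would be far lower.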


The OP said they were getting 3-4 TPS on their $2000 rig.


We crawl so we can learn to walk.


Is the size of OpenAI's top of the line models even relevant? Last I checked they weren't open source in the slightest.


It would make sense if you don't want somebody else to have access to all your code and customer data.




