
Also worth mentioning that tools have stable output. An LLM is not a tool in that sense – it’s not reproducible. Changing the model, retraining, input phrasing, etc. can dramatically change the output.

The best tools are transparent. They are efficient, fast and reliable, yes, but they’re also honest about what they do! You can do everything manually if you want, no magic, no hidden internal state, and with internal parts that can be broken up and tested in isolation.

With LLMs, even the simple act of comparing them side by side (to decide which to use) is probabilistic and ultimately based partly on feelings. Perhaps it comes with the territory, but this makes me extremely reluctant to integrate them into engineering workflows. Even if they had amazing abilities, they lower the bar significantly from a process perspective.



> An LLM is not a tool in that sense – it’s not reproducible.

LLMs are perfectly reproducible. Almost all public services providing them are not. The fact that changing the model changes the output doesn't make it not reproducible, in the same way reproducible software packages depend on a set version of the compiler. But you can run a local model with zero temperature and fixed starting conditions, and you'll get the same response every time.
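
To make the zero-temperature point concrete, here is a minimal sketch with a toy stand-in rather than a real model. Everything in it (`toy_logits`, `VOCAB`, `generate`) is hypothetical and only illustrates the decoding step: with temperature 0 the highest-scoring token is always picked, and with temperature above 0 the result depends only on the seed.

```python
import hashlib
import math
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["the", "tool", "model", "output", "is", "stable", "random", "."]


def toy_logits(context):
    """Stand-in for a model: fixed scores derived only from the context."""
    scores = []
    for token in VOCAB:
        key = (" ".join(context) + "|" + token).encode()
        digest = hashlib.sha256(key).digest()
        scores.append(digest[0] / 32.0)  # deterministic score in [0, 8)
    return scores


def next_token(context, temperature, rng):
    logits = toy_logits(context)
    if temperature == 0:
        # Greedy decoding: always take the highest score (first index on ties).
        best = max(range(len(VOCAB)), key=lambda i: logits[i])
        return VOCAB[best]
    # Softmax sampling: randomness comes only from the seeded generator.
    scaled = [score / temperature for score in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(VOCAB, weights=weights, k=1)[0]


def generate(prompt, temperature, seed, steps=8):
    rng = random.Random(seed)  # the "starting conditions"
    context = prompt.split()
    for _ in range(steps):
        context.append(next_token(context, temperature, rng))
    return " ".join(context)


# Temperature 0: the seed is irrelevant, the output is always the same.
print(generate("an LLM", temperature=0, seed=1))
print(generate("an LLM", temperature=0, seed=2))

# Temperature 1: the same seed reproduces the same output.
print(generate("an LLM", temperature=1.0, seed=42))
print(generate("an LLM", temperature=1.0, seed=42))
```

Real inference stacks have more moving parts than this, so treat it as an illustration of the argument, not as a claim about any particular runtime.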


I know it’s technically reproducible under the right conditions, and sure, that might help in some cases. But it matters little in practice – the issue is that the output is unstable with respect to unrelated parameters that you often have good reason to change, or change unintentionally. For instance, you can and will get vastly different output based on ordinary non-semantic variations in language. I’m not a logician, but this is probably even a necessity given the ginormous output space LLMs operate on.

My point is that it’s not a tool, because good tools reliably work the same way. If, for instance, a gun clicks when it’s supposed to fire, we would say that it malfunctioned. The same goes if it fires when the safety is on. We can define what should happen, and if something else happens, there is a fault.



