Also worth mentioning that tools have stable output. An LLM is not a tool in tha...

viraptor · on May 28, 2025

> An LLM is not a tool in that sense – it’s not reproducible.

LLMs are perfectly reproducible. Almost all public services providing them are not. The fact that changing the model changes the output doesn't make it not reproducible, in the same way reproducible software packages depend on a set version of the compiler. But you can run a local model with zero temperature and fixed starting conditions, and you'll get the same response every time.
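
To make that concrete, here is a minimal sketch of the decoding step, assuming a toy "model" that is just a fixed list of logits. The names `VOCAB`, `LOGITS`, `sample_token` and `generate` are made up for illustration and don't correspond to any real inference library. It only shows that greedy (zero temperature) selection ignores the random state, and that sampling with a fixed seed repeats exactly.

```python
import math
import random

# Hypothetical toy "model": fixed logits over a tiny vocabulary.
VOCAB = ["yes", "no", "maybe"]
LOGITS = [2.0, 1.5, 0.3]


def sample_token(logits, temperature, rng):
    """Return a token index; temperature 0 means greedy argmax."""
    if temperature == 0:
        # Greedy decoding: no randomness involved at all.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature, shifted by the max for numerical stability.
    scaled = [logit / temperature for logit in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(n, temperature, seed):
    """Draw n tokens using a private RNG seeded with `seed`."""
    rng = random.Random(seed)
    return [VOCAB[sample_token(LOGITS, temperature, rng)] for _ in range(n)]


# Zero temperature: identical output regardless of seed.
print(generate(5, temperature=0, seed=1) == generate(5, temperature=0, seed=2))

# Non-zero temperature: identical output as long as the seed is fixed.
print(generate(20, temperature=1.0, seed=42) == generate(20, temperature=1.0, seed=42))
```

Both comparisons print `True`. A real local setup would need the same model weights and settings pinned as well, which is the "set version of the compiler" part of the analogy.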

klabb3 · on May 28, 2025

I know it’s technically reproducible under the right conditions, and sure, it might help in some cases. But it matters little in practice – the issue is that it’s unstable relative to unrelated parameters you often have good reason to change, often unintentionally. For instance, you can and will get vastly different output from ordinary non-semantic variations in the wording of the input. I’m not a logician, but this is probably even a necessity given the ginormous output space LLMs operate on.

My point is that it’s not a tool, because good tools reliably work the same way. If, for instance, a gun clicks when it’s supposed to fire, or fires when the safety is on, we would say that it malfunctioned. We can define what should happen, and if something else happens, there is a fault.