Some countervailing forces off the top of my head:
* Hardware improvements will reduce costs
* Model training improvements (read: more efficient model training) will reduce costs
* Better models will reduce costs (more inference for less hardware time while keeping quality constant)
* Tooling and platforms will stabilize: less need to dump money into applications and backend systems as they mature; improvements in AI efficiency and quality will also lower the cost of maintenance and future feature development
* Energy buildout will stabilize (we will eventually have enough energy supply to meet AI demand)
* Chip market will stabilize (chip supply will catch up to AI demand, lowering hardware costs)
What's your time estimate for the last 2 points? Last I heard, TSMC is not willing to commit tens of billions to build new fabs for what might be a fad. Granted, they're not the only foundry, but that's a signal nonetheless. Given the current craziness around hardware, I doubt the stabilization will come soon. Probably not before token costs soar.
I'm far from an expert, but I'm pretty sure the chip bottleneck at the moment is memory chips, which have little to do with TSMC, and the memory industry is _dumping_ capital into increasing manufacturing capacity. I'm not sure when energy prices will stabilize.