

Right now it just makes more sense to use the subscriptions rather than the API directly; they seem to be better priced. Also, Anthropic seems to be more expensive…
How that will look in the future, I don’t know. I’m fairly sure they’re progressively increasing the price, or rather reducing the number of tokens you can use in the subscription (as they did with gpt 5.5 and claude opus 4.7). The Chinese competitors are getting increasingly interesting. It’s also quite impressive how well small models like qwen 3.6 27B already run on a (not so affordable) 24+ GB GPU. Unfortunately they’re still far from the quality of, say, gpt 5.5, but probably comparable to o4 or something like that, and certainly usable.






As much as I’d like this to be true (don’t believe all the benchmarks), in reality using e.g. gpt 5.5 is still a lot less of a pain in the ass. The local model needs more reprompting (gpt is just smarter and one-shots stuff more often) and is a lot slower (on an RTX 3090, for reference).
I’ve tried using it for some time, but I think I’m faster writing code by hand than using this (and the code is better, although that’s also true compared to gpt 5.5). Plus I need the valuable VRAM for other stuff, as I’m a graphics/shader programmer most of the time.
That said, it’s already fairly impressive how much progress these smaller models have made over the last year. It’s usable, and you can “vibe-code” at least simple stuff.