Long story short, there's a paper that shows GPT-4's capabilities got worse over time. Why's that? Let's get into the details.
Was GPT-4 so powerful that they nerfed it? 🧐 Yes, they did. Massively. 😿
An interesting paper from Stanford & UC Berkeley researchers was just published on this.
We might wonder why exactly this happened. Was it economic viability, performance or scalability issues? Or did some companies/organisations decide that the tool, as it was, gave too much power to the masses? The original GPT-4 helped bridge a gap that had previously been rigorously maintained between a select group of orgs/people and the rest of society, letting wide swathes of people do things that had been out of their reach.
This is pure speculation of course, we don't know that, but my tin-foil-hat sense is tingling here. Jokes aside, it's probably a little bit of both. Economic viability and scalability are important factors when running any software. On top of that, various orgs were probably annoyed that the fancy coding challenges they had worked on for years could, until recently, be thrown in the bin, and that the advantage they had from the scale effect got reduced too much. That's why I think we are observing this phenomenon.
Can't confirm the details, but AFAIK GPT-4 changed from being one 'big piece' into several smaller, more specialised 'pieces' tied together. And it turns out these smaller pieces don't play as well together as one central piece. Or some pieces got disabled, or were made smaller.
Also, I'd bet money that what we have access to as GA GPT-4 is one thing, but somewhere out there the original un-nerfed GPT-4 is still running, with only a select few having access to it.
Now, this, instead of bridging the gap, will only increase it.
Or it's just me and my conspiracy theories. Who knows.
Maybe it's just that suddenly the training data changed or whatever :)
The nerf, however, is quite considerable: "For example, GPT-4 (March 2023) was very good at identifying prime numbers (accuracy 97.6%) but GPT-4 (June 2023) was very poor on these same questions (accuracy 2.4%). Interestingly GPT-3.5 (June 2023) was much better than GPT-3.5 (March 2023) in this task. GPT-4 was less willing to answer sensitive questions in June than in March, and both GPT-4 and GPT-3.5 had more formatting mistakes in code generation in June than in March."
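To make those accuracy numbers a bit more concrete, here's a rough sketch of how a score like that could be computed: ask the model whether each number is prime, then compare its yes/no with the real answer. This is my own toy illustration, not the paper's code. The names `is_prime`, `accuracy` and `toy_answers` are made up for the example, and the answers are invented, not taken from the paper.

```python
from math import isqrt
from typing import Dict


def is_prime(n: int) -> bool:
    """Ground truth via trial division; fine for small numbers."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def accuracy(answers: Dict[int, bool]) -> float:
    """Fraction of numbers where the model's yes/no matches the ground truth."""
    if not answers:
        raise ValueError("no answers to score")
    correct = sum(is_prime(n) == said_prime for n, said_prime in answers.items())
    return correct / len(answers)


# Invented answers for illustration only (NOT data from the paper):
# number -> did the model say "yes, it's prime"?
toy_answers = {7: True, 9: True, 13: True, 15: False, 17: False}

print(f"accuracy: {accuracy(toy_answers):.1%}")  # prints "accuracy: 60.0%"
```

Run the same set of questions against two snapshots of a model and you get two numbers you can compare, which is roughly the kind of March vs June comparison the quote describes.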
Don't get me wrong - GPT-4, even nerfed, is still AMAZING. Just not as good as it used to be.
The paper mentioned can be found here.
Even with the reduced capabilities, it remains an amazing tool. IMO a lot of people, except tech-savvy enthusiasts, will not even notice. But I think this will set a precedent, so I expect more competition to pop up and greater diversification of models/companies at least. I doubt any of the truly powerful ones will be publicly available, though, unless open source catches up. So I'd expect the GA stuff to always be at least a step or two behind. Unless they release it too quickly, like they did with GPT-4, and only afterwards realise (or get pressured, because of economic viability or various powers) that it has to be dialed back.
I think the best stuff will be reserved for big enterprises/orgs, but that's good motivation to work on open-source efforts. LLaMa 2 has been released (and will soon be available through Azure as an on-prem option!), and Claude v2 just got released. We've had a couple of talks with e.g. Google's team, and they are also doing interesting stuff, with new updates incoming. IMO we're heading towards a future where OpenAI isn't the only sensible solution; maybe we are already there.
TL;DR: I think more diversity in this space is incoming.