Updated: Mar 24
For a brief primer, start here.
There are many articles describing the expanded capabilities of GPT-4 over previous versions: multi-modal input, enhanced language support, system prompts, and so on. This is not one of those articles.
Here we explore, with one concrete example, what the new model's increased size translates to in terms of finding deeper meaning in the training material, i.e. in 'Reasoning'.
(and conciseness, read on...)
During the GPT-4 Developer Livestream, Greg Brockman gave an example of a task that GPT-4 was finally able to complete, one that was too taxing for previous models. People who have been playing with large language models since last year (I'll include myself in that population) also have their favorite prompts that have been relatively fruitless until now. In those cases where we kept trying to push the model out of its comfort zone (and I'm not talking about jailbreaking), the conversation would go one of a few ways. The model would:
Refuse to answer, tell you to start over, or possibly error out.
Refuse to answer but raise objections you can address, e.g. the request is of an objectionable nature, or isn't specific enough. In those cases you can re-ask the question with the qualifiers added, which takes you in the direction of what Daniel Miessler refers to as AI Whispering.
Try to redirect you, like any good parent who wants to extinguish certain behaviors: "I can't answer that, but did you know about...?"
Answer as best it can, but veer into the territory of hallucination.
Or attempt to answer as best it can, but just miss the point.
We already know that ChatGPT can answer questions about movies, books, and anything in the public Internet zeitgeist, and we know that the new model has been specifically trained in disciplines like medicine, law, and philosophy. So, I asked a question about a book. If you're like me, you're enjoying the new AI superpower: the ability to dialogue with books and research papers, as if you were the ghost editor or the co-author asking "what if...?" at the eleventh hour.
GPT-3.5 understood that a reference was made in the book, but didn't quite grasp the teachable moment. Bing tried to search for the answer on the Internet but didn't find anything (as expected). Bard's answer was almost exactly the same as GPT-3.5's. However, GPT-4 was spot on:
GPT-4 reasoned like a reader, read between the lines, and proffered the answer I was looking for. I don't discount my confirmation bias, but notwithstanding that, I was glad that it didn't dance around the topic in search of a GI Joe "knowing is half the battle" teachable moment. It went right to the point. As we rely more on AI to do pre-processing for us, we (I) would like to know that it's gleaning the deeper meaning in the material.
As a parting thought, whenever I interact with AI, I am reminded of the scene where Picard and Data work together in Stellar Cartography, solving a complex problem, backed by the isolinear supercomputer that powers the Enterprise. Meanwhile, here in the 21st century, I remain impressed with what we can do with transistor technology.