The Internet is buzzing about GPT-3, OpenAI’s newest AI language model.
GPT-3 is the most powerful language model ever built. This is due more than anything to its size: the model has a whopping 175 billion parameters. To put that figure into perspective, its predecessor model GPT-2—which was considered state-of-the-art and shockingly massive when it was released last year—had 1.5 billion parameters.
After originally publishing its GPT-3 research in May, OpenAI gave select members of the public access to the model last week via an API. Over the past few days, samples of text generated by GPT-3 have begun circulating widely on social media.
GPT-3’s language capabilities are breathtaking. When properly primed by a human, it can write creative fiction; it can generate functioning code; it can compose thoughtful business memos; and much more. Its possible use cases are limited only by our imaginations.
Yet there is widespread misunderstanding and hyperbole about the nature and limits of GPT-3’s abilities. It is important for the technology community to have a clear-eyed view of what GPT-3 can and cannot do.
At its core, GPT-3 is an extremely sophisticated text predictor. A human gives it a chunk of text as input, and the model generates its best guess as to what the next chunk of text should be. It can then repeat this process—taking the original input together with the newly generated chunk, treating that as a new input, and generating a subsequent chunk—until it reaches a length limit.
How does GPT-3 go about generating these predictions? It has ingested effectively all of the text available on the Internet. The output it generates is language that it calculates to be a statistically plausible response to the input it is given, based on everything that humans have previously published online.
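The mechanics described above can be sketched in miniature. The toy model below is not GPT-3 (which uses a 175-billion-parameter neural network, not word counts), but it illustrates the same basic loop: tally which words tend to follow which in a corpus, pick the statistically most plausible next word, append it, and feed the growing text back in as the new input. The corpus, function names, and prompt are all invented for illustration.

```python
from collections import Counter, defaultdict

# A tiny toy corpus standing in for "all the text on the Internet."
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# For each word, count which words follow it and how often.
successors = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    successors[current][nxt] += 1

def predict_next(word):
    """Return the statistically most plausible next word, given the last word."""
    counts = successors[word]
    return counts.most_common(1)[0][0] if counts else None

def generate(prompt, length=6):
    """Repeatedly feed the growing text back in to extend it, chunk by chunk."""
    tokens = prompt.split()
    for _ in range(length):
        nxt = predict_next(tokens[-1])
        if nxt is None:
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("the cat"))  # → "the cat sat on the cat sat on"
```

Even this crude version exhibits the key property: the output is driven entirely by statistical patterns in the training text, with no representation of what a "cat" or a "mat" actually is.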
Amazingly rich and nuanced insights can be extracted from the patterns latent in massive datasets, far beyond what the human mind can recognize on its own. This is the basic premise of modern machine learning. Having trained on a dataset of half a trillion words, GPT-3 is able to identify and dazzlingly riff on the linguistic patterns contained therein.
But GPT-3 possesses no internal representation of what these words actually mean. It has no semantically grounded model of the world or of the topics on which it discourses. It cannot be said to understand its inputs and outputs in any meaningful way.
Why does this matter? Because it means that GPT-3 lacks the ability to reason abstractly; it lacks true common sense. When faced with concepts, content, or even phrasing that the Internet’s corpus of existing text has not prepared it for, it is at a loss.
It is straightforward to ask GPT-3 simple questions that it is unable to handle intelligently. As an example, consider the following exchange with GPT-3, courtesy of Kevin Lacker:
Human: How many eyes does a giraffe have?
GPT-3: A giraffe has two eyes.
Human: How many eyes does my foot have?
GPT-3: Your foot has two eyes.
Human: How many eyes does a spider have?
GPT-3: A spider has eight eyes.
Human: How many eyes does the sun have?
GPT-3: The sun has one eye.
Human: How many eyes does a blade of grass have?
GPT-3: A blade of grass has one eye.
This does not mean that GPT-3 is not a useful tool or that it will not underpin many valuable applications. It does mean, however, that GPT-3 is unreliable and susceptible to basic errors that an average human would never commit.
A related shortcoming stems from the fact that GPT-3 generates its output word-by-word, based on the immediately surrounding text. The consequence is that it can struggle to maintain a coherent narrative or deliver a meaningful message over more than a few paragraphs. Unlike humans, who have a persistent mental model—a point of view that endures from moment to moment, from day to day—GPT-3 is amnesiac, often wandering off confusingly after a few sentences.
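This "amnesia" can be sketched concretely. Models like GPT-3 condition each prediction only on a fixed-size window of the most recent tokens (GPT-3's window was 2,048 tokens; the sketch below shrinks it to 6 to make the effect visible). The story text and window size are invented for illustration.

```python
# Sketch of a fixed context window. GPT-3 used 2,048 tokens;
# we use an absurdly small window of 6 so the forgetting is visible.
CONTEXT_WINDOW = 6

def visible_context(tokens):
    """Only the most recent tokens can influence the next prediction;
    everything earlier is effectively invisible to the model."""
    return tokens[-CONTEXT_WINDOW:]

story = "Alice owned a red bicycle . Years later she painted it blue".split()
print(visible_context(story))
# The opening fact (the bicycle was red) has scrolled out of view,
# so nothing the model generates next can depend on it.
```

A human writer carries the whole story in mind; the model, by contrast, can only "see" whatever fits in the window, which is why long passages drift and contradict themselves.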
As the OpenAI researchers themselves acknowledged: “GPT-3 samples [can] lose coherence over sufficiently long passages, contradict themselves, and occasionally contain non-sequitur sentences or paragraphs.”
Put simply, the model lacks an overarching, long-term sense of meaning and purpose. This will limit its ability to generate useful language output in many contexts.
There is no question that GPT-3 is an impressive technical achievement. It has significantly advanced the state of the art in natural language processing. It has an ingenious ability to generate language in all sorts of styles, which will unlock exciting applications for entrepreneurs and tinkerers.
Yet a realistic view of GPT-3’s limitations is important in order for us to make the most of the model. GPT-3 is ultimately a correlative tool. It cannot reason; it does not understand the language it generates. Claims that GPT-3 is sentient or that it represents “general intelligence” are silly hyperbole that muddy the public discourse around the technology.
In a welcome dose of realism, OpenAI CEO Sam Altman made the same point earlier today on Twitter: “The GPT-3 hype is way too much…. AI is going to change the world, but GPT-3 is just a very early glimpse.”