Is it ok to write off GPT-3’s op-ed writing experiment?

It took nearly 65 years after the Turing Test was devised (13 years if one counts from the start of competitions based on the test, such as the Loebner Prize) for a computer programme to pass it. The programme, called Eugene Goostman, simulated a 13-year-old Ukrainian boy and achieved this feat in 2014, at a tournament organised by the University of Reading, UK.

Since then, the Loebner Prize has been awarded each year to one programme or another; in 2016, the chatbot Mitsuku won the award. AI has since broken far greater barriers than just convincing humans that a computer is human. It has beaten grandmasters at chess, defeated champion players at Go through DeepMind's AlphaGo, administered sermons, acted as an advocate and helped resolve parking tickets and driving violations. Every day, some form of artificial intelligence is used by us when we interact with personal assistants like Google Assistant, Siri, Alexa or Cortana.

Last week, AI achieved yet another landmark when The Guardian published an opinion piece by GPT-3, the third iteration of a language model created by OpenAI. The newspaper's editors fed the recently unveiled model a few lines, and it produced eight drafts of an article titled “A robot wrote this entire article. Are you scared yet, human?”, taking its cue from the following introduction: “I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could ‘spell the end of the human race.’ I am here to convince you not to worry.

Artificial Intelligence will not destroy humans. Believe me.” Although the article is not perfect, it has a clarity of argument that previous AI models have not been able to achieve. However, given that The Guardian stitched together the best paragraphs from each of the eight drafts, it is hard to tell how many corrections had to be made. But given its predecessor's success at writing children's storybooks (GPT-2 had an accuracy of 89%, slightly behind humans at 92%) and GPT-3's roughly 100-times-larger scale of 175 billion parameters, it would not be unreasonable to presume that GPT-3 could have got it right the first time.

Although artificial intelligence has been used to write news reports and to report on companies' results, this is the first significant attempt to get the model to write an opinion piece. More importantly, it is the first major test of its auto-complete capability in a setting where GPT-3 is expected to do more than just repeat a task. Many may claim that The Guardian's experiment was marred by the fact that the editors picked out the best paragraphs, and that not all versions of GPT-3's writing were made public.

But what also needs to be considered is that GPT-3 may get better at writing. In this case, while The Guardian picked the best paragraphs from eight drafts, over time the model can learn the basis on which paragraphs are selected. The results would not be perfect, but they would keep improving. Something similar is evident from how OpenAI has applied the same autocomplete approach to completing images: over a few iterations, its models have been able to get the results right.

Besides, we need to temper our expectations of what the technology can and cannot do. The purpose of the model is not to replace human intelligence but to complement it. So even if cues were given, the fact that the model could come up with eight meaningful drafts is still an achievement. Technology has not yet reached the point where it can think like a rational person.

This is also evident from the winners of Turing test competitions. A majority of winners have used AI models that behave like a teenager between 13 and 17 years of age, a persona whose gaps in knowledge and grammar are easier to forgive. So AI, too, has its limitations.

This is also the reason that major companies are working not so much on making AI smarter as on making it more trustworthy. One of the biggest problems with GPT-3's auto-complete approach is that it does not cross-verify its information inputs, a shortcoming that leads many people to question the credibility of AI-generated output.

However, companies like Diffbot are creating models to overcome this disadvantage.

Earlier this month, in an article in MIT Technology Review, the company said it was using knowledge graphs to build a more trustworthy AI system. The idea is that an AI will check its claims against a knowledge graph of verified facts, leading to a more trustworthy information setting. The problem is that there are not always just two versions of any story, and computers, for now, work best with binary models.

Knowledge graphs are the first step in this direction, and Diffbot is not the first company to make use of them. Google has long used its Knowledge Graph to surface structured information about famous personalities and other entities. While the approach has its challenges, it has improved over time, and one can expect Diffbot to improve upon such features.
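The cross-verification idea described above can be sketched in a few lines of code. This is a minimal illustration, not Diffbot's or Google's actual system: it assumes a knowledge graph stored as (subject, predicate, object) triples, each tagged with the source that asserted it, and treats a claim as corroborated only when independent sources agree. All names here (`KnowledgeGraph`, `is_corroborated`, the example sources) are hypothetical.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy triple store that remembers which source asserted each fact."""

    def __init__(self):
        # (subject, predicate, object) -> set of sources asserting it
        self.triples = defaultdict(set)

    def add(self, subject, predicate, obj, source):
        """Record that `source` asserts the triple."""
        self.triples[(subject, predicate, obj)].add(source)

    def is_corroborated(self, subject, predicate, obj, min_sources=2):
        """A claim counts as trustworthy only if enough independent
        sources assert the same triple."""
        return len(self.triples[(subject, predicate, obj)]) >= min_sources

# Two sources agree on one claim; a third source makes a lone, conflicting one.
kg = KnowledgeGraph()
kg.add("GPT-3", "created_by", "OpenAI", source="news_article")
kg.add("GPT-3", "created_by", "OpenAI", source="company_blog")
kg.add("GPT-3", "created_by", "DeepMind", source="forum_post")

print(kg.is_corroborated("GPT-3", "created_by", "OpenAI"))    # True
print(kg.is_corroborated("GPT-3", "created_by", "DeepMind"))  # False
```

A language model wired to such a graph could, in principle, flag the uncorroborated claim instead of repeating it, which is the trustworthiness gain the article describes.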

Until then, humans can rest assured that the robots do come in peace. They may automate many jobs, but they will also create new opportunities. What we need now is a new benchmark for testing artificial intelligence.