The GPT-3 model is making waves on the internet for its ability to generate human-like text. Will it live up to the hype? Running GPT-3 is expensive, and it can be misused to create content that passes as human-written or to spread hate.
GPT stands for Generative Pre-trained Transformer. GPT-3 is presently the world’s largest language model. It can write poems, articles, and books, sift through legal documents, translate, and even write code, sometimes as well as or better than humans.
GPT-3 was released on June 11 by OpenAI, an AI research company founded by Elon Musk (who resigned from the board but remains a co-chair) and others, as an Application Programming Interface for developers to test and build a host of smart software products. OpenAI is planning to commercialize the model.
The predecessor of GPT-3, the much smaller GPT-2, had 1.5 billion parameters and was trained on a dataset of 8 million web pages; only a smaller version was initially released to avoid potential misuse. Parameters are what allow machine-learning models (machine learning being a subset of artificial intelligence) to make predictions on new data. The weights in a neural network are one example.
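To make "parameters" concrete, here is a minimal sketch that counts the weights and biases of a small fully connected network. The layer sizes are made up for illustration and have nothing to do with GPT-3's actual architecture.

```python
def count_parameters(layer_sizes):
    """Count weights + biases in a fully connected network.

    Each pair of adjacent layers contributes a weight matrix
    (n_in * n_out values) and a bias vector (n_out values);
    all of these are trainable parameters.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix
        total += n_out         # bias vector
    return total

# A toy 3-layer network: 784 inputs -> 128 hidden units -> 10 outputs
print(count_parameters([784, 128, 10]))  # 784*128 + 128 + 128*10 + 10 = 101770
```

GPT-3's 175 billion parameters are counted the same way, just over a vastly larger transformer network.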
GPT-2 was trained on more than 10X the data of its predecessor, GPT, which was released in June 2018. It does not require any task-specific training data (such as Wikipedia, books, or news) to learn language tasks like question answering, reading comprehension, summarization, and translation from raw text. The reason is that data scientists can take a pre-trained model and apply a machine-learning technique called transfer learning to solve problems similar to the one the pre-trained model was built for.
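The idea behind transfer learning can be sketched with a deliberately tiny model: "pre-train" bigram counts on a general corpus, then continue training on a handful of task-specific sentences. The corpora and the bigram model are toy stand-ins, not how GPT-2 is actually trained, but the workflow (reuse learned statistics, then adapt them with little new data) is the same.

```python
from collections import Counter

def train_bigrams(text, counts=None):
    """Accumulate word-bigram counts; pass existing counts to fine-tune.

    A copy is made so the pre-trained model is left untouched.
    """
    counts = Counter() if counts is None else Counter(counts)
    words = text.split()
    counts.update(zip(words, words[1:]))
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word` under the model."""
    followers = {b: c for (a, b), c in counts.items() if a == word}
    return max(followers, key=followers.get) if followers else None

# "Pre-train" on a general corpus...
pretrained = train_bigrams("the cat sat on the mat the cat ran")
print(predict_next(pretrained, "the"))  # cat

# ...then fine-tune on a small task-specific corpus.
finetuned = train_bigrams("the dog the dog the dog", counts=pretrained)
print(predict_next(finetuned, "the"))  # dog
```

A few task-specific sentences were enough to shift the model's prediction, without retraining from scratch.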
India’s regional social media platform Sharechat pre-trained a GPT-2 model on a corpus constructed from Hindi Wikipedia and Hindi Common Crawl data to generate poetry in Hindi.
As per Debdoot Mukherjee, vice-president of AI at Sharechat, GPT-3 is a big leap for the NLP community. First, it does not bother with syntax parsing, grammar, and similar laborious tasks. Second, he said, one does not need to be a linguist or hold a Ph.D.: all that is required is some data in the target language and a knowledge of deep learning.
In a 22 July paper titled “Language Models are Few-Shot Learners,” the authors describe GPT-3 as an autoregressive language model with 175 billion parameters. Autoregressive models use past values to predict future ones.
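"Using past values to predict future ones" can be shown with a minimal autoregressive sketch: the next value is a weighted sum of the most recent observations. The coefficients below are invented for the example; GPT-3 does the analogous thing over tokens of text rather than numbers.

```python
def ar_predict(history, coeffs):
    """One-step autoregressive prediction.

    coeffs[0] weights the most recent value, coeffs[1] the one
    before it, and so on (an AR(p) model with p = len(coeffs)).
    """
    p = len(coeffs)
    recent = history[-p:]  # only the last p observations matter
    return sum(c * x for c, x in zip(coeffs, reversed(recent)))

# AR(2) with made-up coefficients: next = 0.5*last + 0.5*second-to-last
series = [1.0, 2.0, 3.0, 4.0]
print(ar_predict(series, [0.5, 0.5]))  # 0.5*4 + 0.5*3 = 3.5
```

In GPT-3 the "history" is the sequence of words so far, and the prediction is a probability distribution over the next word.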
Usually, humans learn a new language with the help of a few examples or simple instructions. They are also able to understand words in context. For instance, humans know that the word “bank” can refer either to a river or to finance, depending on the context. GPT-3 aims to combine this contextual ability with the transformer model, which reads the entire sequence of words in a single pass instead of word by word and thereby consumes less computing power, to achieve similar results.
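The mechanism that lets a transformer weigh every word against every other word in one pass is self-attention. Below is a stripped-down sketch: each position scores its similarity to all positions, turns the scores into weights with a softmax, and outputs a weighted average. Real transformers learn separate query/key/value projections; here, for simplicity, all three are just the input vectors.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Toy self-attention over a list of token vectors.

    Every position attends to every other position at once;
    queries, keys, and values are all the raw inputs here
    (a simplification of the learned projections in a real model).
    """
    d = len(seq[0])
    out = []
    for q in seq:  # all positions processed against the full sequence
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in seq]          # similarity to every token
        weights = softmax(scores)        # context-dependent mixing weights
        out.append([sum(w * v[i] for w, v in zip(weights, seq))
                    for i in range(d)])  # weighted average of the values
    return out

# Three 2-dimensional toy "word embeddings"
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = self_attention(tokens)
print(len(result), len(result[0]))  # 3 2
```

Because every output mixes information from the whole sequence, a word like “bank” ends up represented differently depending on its neighbours.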
GPT-3 is an extremely well-read AI language model. A human can, on average, read about 600-700 books in a lifetime (assuming 8-10 books a year for 70 years) and about 125,000 articles (assuming five every day for 70 years). It is humanly impossible for most of us to memorize this vast reading material and reproduce it on demand.
In contrast, GPT-3 has digested about 500 billion words from sources such as the internet and books. Common Crawl, one of those sources, is an open repository that anyone can access and analyze; it holds petabytes of data collected over eight years of web crawling. Moreover, GPT-3 can recall and instantly draw inferences from this repository.
Kashyap Kompella, CEO of the technology industry analyst firm RPA2AI Research, said that GPT-3 is no doubt a grand achievement: the scale is superlative and the AI is stupendous, but the wow factor wears off after a bit. GPT-3 can produce nonsensical or insensitive text, he said, which may create unnecessary headaches for those deploying it. He further voiced concern that it can be weaponized to create realistic phishing e-mails.
Ganesh Gopalan, CEO of gnani.ai, a deep-tech AI company, has a similar take. He said GPT-3 has revolutionized language models for domain-specific NLP tasks, since it requires only limited additional training for a domain compared with conventional models. However, OpenAI offers APIs rather than the complete model. If GPT-3 lives up to its hype, he added, it or a future enhanced model could put people such as content writers and traditional programmers out of work.
It can also be used to create human-like content that spreads hate and communal bias. These are valid concerns.
The authors of the GPT-3 paper themselves believe that AI language models can be misused in several ways.
They also point out that biases present in the training data can lead models to generate stereotyped or prejudiced content, and they note that large pre-trained language models are not grounded in other domains of experience and therefore lack a great deal of context about the world.