Abstract

OpenAI introduced a language model called GPT (Generative Pre-trained Transformer). The model is trained on a large text dataset and learns to predict the next word in a sequence from the context of the preceding words. GPT employs a transformer architecture, a class of neural networks that has been shown to perform exceptionally well on natural language understanding tasks. One of GPT's fundamental advances was pre-training, which allows the model to learn broad language representations from a vast amount of text before being fine-tuned on downstream tasks. Through this pre-training phase, the model learns to extract information from the data that can be applied to a variety of downstream tasks, such as information extraction, language translation, and summarization. Since its initial release, GPT has undergone several iterations: GPT-1, GPT-2, and GPT-3. GPT-2, launched in 2019 with a significantly larger model and a broader training dataset, performed better on a variety of tasks. GPT-3, the most recent version of the model, was released in 2020; with 175 billion parameters, it is the largest model to date and can generalize across a variety of domains. ChatGPT is a variant of the GPT language model created specifically for conversational AI. It is pre-trained on conversational data and tailored to tasks such as understanding dialogue and generating conversational responses. Overall, ChatGPT is a powerful conversational AI technology that has demonstrated promising results across a range of use cases and industries, including customer service, e-commerce, and entertainment.
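To make the next-word-prediction objective described above concrete, here is a minimal sketch (not part of the paper) using the publicly available GPT-2 checkpoint through the Hugging Face transformers library; the model name and prompt are illustrative assumptions, not drawn from the source.

```python
# Minimal sketch: next-token prediction with a pre-trained GPT-2 model.
# The checkpoint name "gpt2" and the prompt are illustrative choices.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The transformer architecture has been shown to"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits        # shape: (1, seq_len, vocab_size)
    next_token_id = logits[0, -1].argmax()  # most likely next token given the context

print(tokenizer.decode(next_token_id.item()))
```

The same pre-trained weights can then be fine-tuned on task-specific or conversational data, which is the pre-train-then-fine-tune paradigm the abstract attributes to GPT and ChatGPT.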
