GPT-2

Generative Pre-trained Transformer 2 by OpenAI

About GPT-2

GPT-2 is a large-scale language model developed by OpenAI and released in February 2019.

The GPT-2 model was first proposed in the paper Language Models are Unsupervised Multitask Learners by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.

It is a language model based on the transformer architecture, trained on a large dataset of approximately 40 GB of text.

The accompanying release summarizes GPT-2 as follows:

GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data.
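To make the "predict the next word, given all of the previous words" objective concrete, here is a minimal sketch using the Hugging Face transformers library and the publicly released 124M-parameter "gpt2" checkpoint; neither is mentioned in the original text, so treat both as assumptions for illustration only.

    # Minimal sketch of GPT-2's next-word prediction, assuming the Hugging Face
    # transformers and torch packages are installed (not part of the original text).
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # small 124M-parameter variant
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    prompt = "GPT-2 is a large transformer-based language"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")

    with torch.no_grad():
        # Logits have shape (1, sequence_length, vocab_size): a score for every
        # vocabulary token as the continuation of every prefix of the prompt.
        logits = model(input_ids).logits

    # Take the highest-scoring token after the final position, i.e. the model's
    # guess for the next word given all of the previous words.
    next_token_id = int(logits[0, -1].argmax())
    print(prompt + tokenizer.decode(next_token_id))

Repeating this step, feeding each predicted token back in as input, is how the model generates longer passages of text.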

GPT-2 screenshots

[Screenshot: GPT-2 - screen 1]
[Screenshot: GPT-2 - screen 2]
