Analytics India reports that, going by the GPT franchise's release cycle, the fourth generation is imminent, if not overdue. Last year, Sam Altman, the CEO of OpenAI, discussed the upcoming GPT-4 release during a Q&A session at the AC10 online meetup. The release is most likely scheduled for July-August of this year, but OpenAI has kept the date under wraps, and no definitive information is available in the public domain. One point Altman did address: GPT-4 will not have 100 trillion parameters.
GPT-3, released in May 2020, has 175 billion parameters. The third generation of the GPT-n series uses deep learning to produce human-like text. On September 22, 2020, Microsoft licensed exclusive use of GPT-3. We've compiled a list of expected GPT-4 improvements based on the available information and Sam Altman's comments during the Q&A session.
Apparently, size doesn’t matter.
Large language models like GPT-3 have achieved outstanding results without task-specific parameter updates: the task is specified in the prompt itself (few-shot, in-context learning). Though GPT-4 is likely to be bigger than GPT-3 in terms of parameters, Sam Altman has clarified that size won't be the differentiator for the next generation of OpenAI's autoregressive language model. The parameter count is likely to fall between GPT-3's and Gopher's: 175 billion to 280 billion.
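To make the "no parameter updates" point concrete, here is a minimal sketch of few-shot prompting, the in-context learning setup GPT-3 popularized. The prompt is the translation example from the GPT-3 paper; no model weights are touched.

```python
# Few-shot, in-context learning: the task is specified entirely in the
# prompt, and the model's weights are never updated. A completion-style
# language model is expected to continue the pattern ("fromage").
few_shot_prompt = """Translate English to French.

sea otter => loutre de mer
peppermint => menthe poivrée
cheese =>"""

print(few_shot_prompt)
```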
Future deep learning models will be multimodal; our multisensory brains reflect the multimodal world we live in. Perceiving the world through only one mode at a time significantly limits an AI's ability to navigate and comprehend it. Still, GPT-4 could be a text-only model, pushing language models to their limits before OpenAI moves on to multimodal AI.
Sparse models, which use conditional computation to route different inputs through different regions of the model (mixture-of-experts, or MoE), have been successful. Such models can scale beyond 1 trillion parameters without incurring proportionate computing costs. On very large models, however, the benefits of MoE techniques start to fade. GPT-4 will be a dense model, like GPT-2 and GPT-3: to put it another way, every parameter will be used to process any given input.
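A minimal sketch of the dense-versus-sparse distinction, assuming a toy single-token layer with a top-1 router; the dimensions, expert count, and gating scheme here are illustrative, not any particular MoE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 4, 1

x = rng.normal(size=d_model)                              # one token's activations
dense_w = rng.normal(size=(d_model, d_model))             # dense layer weights
experts = rng.normal(size=(n_experts, d_model, d_model))  # per-expert weights
router = rng.normal(size=(n_experts, d_model))            # gating/router weights

# Dense (GPT-2/GPT-3 style): every parameter participates for every input.
dense_out = dense_w @ x

# Sparse MoE: the router scores the experts and only the top-k run,
# so most parameters stay inactive for any given token.
scores = router @ x
chosen = np.argsort(scores)[-top_k:]
gates = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()
sparse_out = sum(g * (experts[i] @ x) for g, i in zip(gates, chosen))

print(dense_out.shape, sparse_out.shape)  # both (16,)
```

The sparse path computes only `top_k` of the `n_experts` matrix products, which is why parameter count can grow far faster than per-token compute.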
If GPT-4 is larger than GPT-3, the number of training tokens required to be compute-optimal (according to DeepMind's findings) might be approximately 5 trillion, an order of magnitude more than current datasets. The number of FLOPs needed to train the model to low training loss would be 10-20 times that of GPT-3. Altman stated in the Q&A that GPT-4 would necessitate more compute than GPT-3, and that OpenAI will prioritize optimizing variables other than model size over further scaling.
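As a back-of-the-envelope check, here is a sketch of that arithmetic using DeepMind's rule of thumb of roughly 20 training tokens per parameter and the standard C ≈ 6·N·D approximation for transformer training FLOPs. The GPT-4 sizes are the speculated range above, the GPT-3 baseline is its reported ~300 billion training tokens, and the exact multiplier depends on these assumptions.

```python
# Back-of-the-envelope compute-optimal estimate, following DeepMind's
# rule of thumb (~20 training tokens per parameter) and the standard
# C ~= 6 * N * D approximation for transformer training FLOPs.
# Parameter counts below are speculative, not confirmed GPT-4 figures.

GPT3_FLOPS = 6 * 175e9 * 300e9  # GPT-3: ~175B params, ~300B tokens

def compute_optimal(params):
    tokens = 20 * params           # compute-optimal token count
    flops = 6 * params * tokens    # approximate training compute
    return tokens, flops

for params in (175e9, 280e9):      # speculated GPT-4 size range
    tokens, flops = compute_optimal(params)
    print(f"{params / 1e9:.0f}B params -> {tokens / 1e12:.1f}T tokens, "
          f"~{flops / GPT3_FLOPS:.0f}x GPT-3's training FLOPs")
```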
A helpful AGI is OpenAI's north star, and OpenAI is expected to build on its InstructGPT models, which are trained with humans in the loop. Using techniques from its alignment research, OpenAI deployed InstructGPT as the default language model on its API; it is considerably better at following user intent than GPT-3, while also being more truthful and less toxic. However, the alignment effort was limited to OpenAI employees and English-speaking labellers. In comparison to GPT-3, GPT-4 is likely to be more aligned with human intent.
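For illustration, a minimal call to an InstructGPT-class model through the openai Python library of that era (the pre-1.0 Completion interface); the model name is one example from the InstructGPT family, and the API key is a placeholder.

```python
import openai

openai.api_key = "sk-..."  # placeholder: set your own API key

# "text-davinci-002" is one InstructGPT-class model served via the API;
# instruction-style prompts work directly, with no few-shot examples.
response = openai.Completion.create(
    model="text-davinci-002",
    prompt="Explain the moon landing to a 6-year-old in one sentence.",
    max_tokens=60,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```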