- Microsoft and OpenAI plan to launch GPT-4 this week
- No official confirmation has been made, but Microsoft Germany's CTO let it slip last week
- GPT-4 will reportedly be a multimodal language model
The tech world is buzzing after Microsoft Germany's Chief Technology Officer let it slip that GPT-4 would be up and running later this week.
In a conversation with the German news site Heise, Andreas Braun said, “We will introduce GPT-4 next week…we will have multimodal models that will offer completely different possibilities – for example videos.”
Multimodal language model
A multimodal language model is one that can take in and produce information across multiple modalities, not just text. Thus, GPT-4 would be able to reply to users with images, videos, or even music. According to reports, GPT-4 will have 100 trillion learning parameters, more than 500 times the roughly 175 billion parameters of the current GPT-3.5.
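A quick back-of-envelope calculation, using the rumored 100 trillion figure and GPT-3.5's publicly cited 175 billion parameters, shows where that "500 times" comparison comes from:

```python
# Back-of-envelope comparison of the reported parameter counts.
gpt35_params = 175e9    # GPT-3.5: ~175 billion parameters
gpt4_rumored = 100e12   # rumored (unconfirmed) GPT-4 figure: 100 trillion

ratio = gpt4_rumored / gpt35_params
print(f"Rumored GPT-4 would be ~{ratio:.0f}x the size of GPT-3.5")
```

Running this gives a ratio of roughly 570, so "more than 500 times" is, if anything, conservative — assuming the rumored figure holds up.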
More learning parameters mean larger neural networks and, thus, a much more capable AI. Most current models handle a single output modality: GPT-3.5 works strictly with text, while Dall-E generates only images from text prompts. For GPT-4 to integrate text with videos, images, and music would be unprecedented. We are talking about an AI that could write movie scripts, perhaps even make a movie based on a script prompt, or help authors produce a fully illustrated book from scratch.
Now, all of the above is speculation, but if the little we have gleaned from Microsoft Germany's CTO is to be believed, there is no doubt that GPT-4 will be a much more advanced AI than anything we have seen.
With OpenAI and Microsoft possibly making GPT-4 multimodal, the next logical step is creating an AI capable of learning by itself (emergence), rather than simply collating data from different sources.
There is no report that GPT-4 will be capable of this, but if its multimodal learning capabilities prove successful, an AI capable of self-improvement would surely follow. It's only a matter of time.