Google has recently announced Gemini 1.5, its next-generation AI model. The model is designed to analyze a huge amount of information at once and return accurate results efficiently. Google has already showcased some of Gemini 1.5's capabilities. This article may interest you if you are curious about multimodal AI technology.
Google states that Gemini 1.5 has “dramatically enhanced performance.”
In this article
Intro to Gemini | Accessing Gemini | Basics of AI | Performance | The Catch
Intro to Google Gemini
According to Demis Hassabis, CEO and Co-Founder of Google DeepMind, Gemini can “generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image, and video.” As a multimodal AI, Gemini can run on almost any platform, from data centers to mobile devices.
Google Gemini has 3 different sizes or variants: Gemini Ultra – for highly complex tasks, Gemini Pro – for scaling across a wide range of tasks, and Gemini Nano – for on-device tasks.
Accessing Gemini
To access Google Gemini on an Android device, first update Google Play, then download the Gemini application from it. After that, sign in to start chatting with Gemini. Gemini is now available in most regions, and the web version is available to all.
Basics of AI
Before we proceed, we need to understand a couple of terms used for AI models. This article covers two of them: Token and Context Window. We will update this article as Google Gemini adds new features, so you may want to bookmark this page.
What is Token of an AI model?
Answer: A token is a building block of information. AI models break words, images, videos, audio clips, and/or code into tokens in order to represent and process them.
What is Context Window of an AI model?
Answer: The Context Window is the space an AI model uses to hold Tokens while it works on them. A bigger Context Window means the model can receive and process more Tokens at a time, which generally leads to more accurate and consistent output.
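To make the two terms concrete, here is a minimal Python sketch. It assumes a naive whitespace split as a stand-in for tokenization (real models, Gemini included, use their own subword tokenizers, so actual counts will differ), and the function names `naive_tokenize` and `fit_to_window` are hypothetical, made up for this illustration. It shows what happens when the input is larger than the Context Window: only the tokens that fit are visible to the model.

```python
def naive_tokenize(text: str) -> list[str]:
    """Split text on whitespace. This is only an approximation:
    real models use subword tokenizers, so counts will differ."""
    return text.split()


def fit_to_window(tokens: list[str], window_size: int) -> list[str]:
    """Keep only the most recent tokens that fit in the context window."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]


tokens = naive_tokenize("the quick brown fox jumps over the lazy dog")
visible = fit_to_window(tokens, window_size=4)
print(len(tokens), visible)  # 9 ['over', 'the', 'lazy', 'dog']
```

With a window of 4 tokens, the first five words are simply out of reach. A larger window lets the model keep the whole input in view.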
Here’s a chart comparing the Context Window of Gemini 1.5 Pro with that of other AI models.
AI Model | Context Window |
---|---|
Gemini 1.0 Pro | 32k tokens |
GPT-4 Turbo | 128k tokens |
Claude 2.1 | 200k tokens |
Gemini 1.5 Pro | 1M tokens |
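As a rough illustration of what these numbers mean in practice, the sketch below checks which models from the chart could hold an input of a given size in a single pass. It assumes "k" means 1,000 and "M" means 1,000,000 tokens, and the name `models_that_fit` is hypothetical. It also ignores the room a model needs for its own output.

```python
# Context window sizes taken from the chart above (k = 1,000 is an assumption).
CONTEXT_WINDOWS = {
    "Gemini 1.0 Pro": 32_000,
    "GPT-4 Turbo": 128_000,
    "Claude 2.1": 200_000,
    "Gemini 1.5 Pro": 1_000_000,
}


def models_that_fit(token_count: int) -> list[str]:
    """Return the models whose context window can hold token_count tokens."""
    return [
        name
        for name, window in CONTEXT_WINDOWS.items()
        if token_count <= window
    ]


print(models_that_fit(150_000))  # ['Claude 2.1', 'Gemini 1.5 Pro']
```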
Google Gemini 1.5 Pro offers seamless analysis. It can classify and summarize a large amount of content within its Context Window. To put that in perspective, 1 million tokens mean around 700,000 words, codebases with over 30,000 lines of code, 11 hours of audio, and/or 1 hour of video.
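Google's figure of roughly 700,000 words for a 1 million token window works out to about 0.7 words per token. The sketch below uses that ratio to estimate how many tokens a text of a given length would need. Treat it as a back-of-the-envelope estimate only: the ratio is an average that varies with language and content, and `estimate_tokens` is a hypothetical helper, not part of any Gemini API.

```python
# Ratio derived from Google's stated figure: ~700,000 words per 1,000,000 tokens.
WORDS_PER_WINDOW = 700_000
TOKENS_PER_WINDOW = 1_000_000


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a word count, rounding up."""
    return (word_count * TOKENS_PER_WINDOW + WORDS_PER_WINDOW - 1) // WORDS_PER_WINDOW


print(estimate_tokens(70_000))  # 100000
```

By this estimate, a 70,000-word book would take about 100,000 tokens, a tenth of the 1M token window.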
Gemini 1.5 Pro Performance
Here is the 402-page transcript from the Apollo 11 Moon Mission. Gemini 1.5 Pro could read and analyze the entire PDF with 100% accuracy.
Gemini 1.5 Pro also showed a very promising response to multimodal inputs (combination of text and image). Here’s Gemini 1.5 Pro working with a 44-minute silent Buster Keaton movie. This clip also shows how efficiently Gemini 1.5 Pro extracted data from a video input.
The Catch
According to Google, Gemini 1.5 Pro outperforms Gemini 1.0 Pro on 87% of the benchmarks used, but it only “performs at a broadly similar level” to Gemini 1.0 Ultra. Moreover, Gemini 1.5 Pro with a 1M token context window will be available only to a limited group of developers and enterprise customers. Therefore, most people will not benefit directly from such a powerful AI model at the moment.
When it comes to performance, Gemini Ultra has aced it. According to Google, “With a score of 90.0%, Gemini Ultra is the first model to outperform human experts” on MMLU (massive multitask language understanding), a benchmark that uses “a combination of 57 subjects such as math, physics, history, law, medicine, and ethics for testing both world knowledge and problem-solving abilities.” To compare, GPT-4 scored 86.4% on the same benchmark.
Wrapping up
Google Gemini 1.5 seems on the right track to take multimodal AI to the next level. However, we believe only Gemini Ultra will be able to handle the most demanding tasks. Other versions of Gemini should be powerful enough to tackle our day-to-day endeavors. We are still testing Gemini. Stay tuned to learn more.
Share your thoughts in the comments below. Share this article with your family and friends if you find it interesting. Let us know if you have any leaks or want to share something with us. You can also suggest a technology for Xplnrs to explain.