Today, it is quite difficult to hold a voice or video call with someone who speaks a language we do not know. Without an interpreter, communication becomes almost impossible, and we depend on third parties to make ourselves understood. That is about to change: NVIDIA's artificial intelligence, Maxine, can now translate your voice in real time, among many other improvements.
Beyond this impressive real-time translation, NVIDIA Maxine has other benefits. One of them is the ability to redirect the speaker’s gaze toward the camera, all through artificial intelligence. This way, it will always appear as if you are looking at your listeners, even when you are looking away.
NVIDIA says that Maxine will soon be available to users worldwide, rather than to only a select few. Since its official introduction, this AI has caught the eye of technology enthusiasts, and rightly so. NVIDIA Maxine could offer an unprecedented improvement to the way we communicate with people around the world.
The main augmented reality features available in NVIDIA Maxine are the following:
- Face tracking
- Face landmark tracking
- Face mesh
- Body pose estimation
- Eye contact
- Facial expression estimation
NVIDIA Maxine arrives to change the game, but it still has a lot to improve
So far, those who have been able to test this AI have reached a fairly similar conclusion: it is a very interesting tool, but it still needs a lot of polish. However, considering that it is still in development and does not yet have an official release, NVIDIA Maxine is a pretty impressive proposition.
The feature called Eye Contact is one of its most striking aspects. With it activated, your gaze appears to look directly, yet naturally, at the camera. It also emulates blinks and the position and shape of the eyes, and it lets the eyes focus according to the position of the face. While all this is happening, you can look anywhere you like, since NVIDIA Maxine corrects your gaze for the rest of the participants.
Running NVIDIA Maxine locally, however, will not be possible for everyone. According to Alex Qi, one of the heads of the software team behind this AI, the tool has some basic requirements. One of them is a webcam, of course, but it also needs an NVIDIA RTX-series graphics card. That said, there are ways to use the tool on any computer by offloading the video signal to data centers that take care of the processing.
“NVIDIA Maxine is a suite of GPU-accelerated AI SDKs and cloud-native microservices for implementing AI features that enhance audio, video, and augmented reality effects in real time. Maxine’s next-generation models create high-quality effects that can be achieved with standard microphone and camera equipment.”
NVIDIA
An AI with a bright future
While Eye Contact is the most popular feature of NVIDIA Maxine, it’s not the only one. The AI also lets you improve audio by removing background noise and echo. In addition, you will be able to apply resolution enhancements, set a virtual background for your image, and instantly translate between languages such as English, French, Spanish, German, and more.
Maybe one day we will see this same technology integrated into programs like Zoom, Discord, Skype or Teams. After all, NVIDIA Maxine would make it easier for people to communicate, even if they don’t speak the same language.