Microsoft is betting heavily on GPT-3, the artificial intelligence model developed by OpenAI, for several of its applications and services, such as Bing and Word. The company led by Satya Nadella, however, is also developing its own models. One example is VALL-E, an AI capable of imitating any person’s voice after listening to just three seconds of audio.
VALL-E is a language model for text-to-speech (TTS) synthesis based on EnCodec, Meta’s audio codec, and is very similar to other AIs that generate audio from a short text description. Microsoft itself, in fact, has a similar service: Text to Speech, which converts text into synthesized speech. The difference is that VALL-E can analyze a person’s voice and then work out how that voice would sound saying different phrases, all while preserving the speaker’s intonation and emotion, the company claims. And it can achieve great results with just three seconds of voice.
Specifically, we train a neural codec language model (named VALL-E) using discrete codes derived from a standard neural audio codec model, and consider TTS as a conditional language modeling task rather than continuous signal regression as in previous work.
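To make the idea of "TTS as conditional language modeling" more concrete, here is a minimal toy sketch in Python. It is not Microsoft's implementation: the uniform quantizer merely stands in for a neural audio codec such as EnCodec, and the count-based model stands in for a neural network. The names `quantize`, `ConditionalCodeModel` and `synthesize` are hypothetical and chosen only for illustration. What the sketch shows is the framing itself: audio becomes a sequence of discrete codes, and new codes are predicted one at a time, conditioned on the text and on a short voice prompt.

```python
from collections import Counter, defaultdict


def quantize(samples, levels=8):
    """Toy stand-in for a neural audio codec: map floats in [-1, 1] to integer codes."""
    codes = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the valid range
        codes.append(min(levels - 1, int((s + 1.0) / 2.0 * levels)))
    return codes


class ConditionalCodeModel:
    """Toy count-based model of p(next code | text, previous code)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, pairs):
        # pairs: iterable of (text_tokens, codes) training examples
        for text, codes in pairs:
            key = tuple(text)
            for prev, nxt in zip(codes, codes[1:]):
                self.counts[(key, prev)][nxt] += 1

    def next_code(self, text, prev):
        options = self.counts.get((tuple(text), prev))
        if not options:
            return prev  # unseen context: repeat the last code
        # Greedy choice; ties are broken by the smallest code for determinism.
        return min(options, key=lambda c: (-options[c], c))


def synthesize(model, text, prompt_codes, n_new):
    """Extend a short acoustic prompt with n_new codes, conditioned on the text."""
    prev = prompt_codes[-1]
    generated = []
    for _ in range(n_new):
        prev = model.next_code(text, prev)
        generated.append(prev)
    return generated


if __name__ == "__main__":
    model = ConditionalCodeModel()
    model.fit([(["h", "i"], [1, 2, 3, 1, 2, 3]), (["y", "o"], [1, 5, 1, 5])])
    prompt = [3, 1]  # pretend these codes came from a three-second voice sample
    print(synthesize(model, ["h", "i"], prompt, 4))
    print(synthesize(model, ["y", "o"], prompt, 2))
```

In a real system, the generated codes would be passed back through the codec's decoder to produce a waveform. That step is omitted here because the toy quantizer has no learned decoder.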
VALL-E can be promising, but also very dangerous
Microsoft’s new voice-replicating AI can also be combined with other generative AI models, GPT-3 among them. Users could, for example, ask ChatGPT to imitate the voice of a specific individual.
The objective, therefore, is to generate speech from a text input. This, however, brings with it a major drawback: if VALL-E is eventually made available to the public, many could use it to impersonate people. On this point, Microsoft notes that “it is possible to build a detection model to discriminate if an audio clip has been synthesized by VALL-E”.
VALL-E is just one more example of what Microsoft plans to do with artificial intelligence. The firm founded by Bill Gates, as noted, is also interested in including models from other companies, such as OpenAI’s GPT, in some of its services. One of them is Bing, where the aim is to offer better search results and, in this way, compete against Google.