Meta unveils AI Speech Generator: Voicebox

• Meta unveiled an AI-based speech generation model called Voicebox.
• The model can convert text to speech and match audio style based on a two-second sample.
• It can also edit recordings, support six languages, and be used by virtual assistants, content creators, and users with accessibility needs.

Meta Unveils Speech Generation AI: Voicebox

Meta, the parent company of Facebook and Instagram, has announced Voicebox, a new AI-based speech generation model. According to the company, the model can convert text into speech while matching the audio style of a sample as short as two seconds.

Voicebox Features

Voicebox can edit existing recordings to remove background noise and generate speech modeled on diverse samples. It can also read text aloud in the sample speaker's voice in any of six languages: English, French, German, Spanish, Polish, or Portuguese.

Potential Uses for Voicebox

Voicebox could give virtual assistants in the metaverse more natural voices, and it could serve content creators who need realistic voices for their projects. The tool could also benefit users with accessibility needs who rely on text-to-speech technology to access digital content more easily.

Other AI Models

Beyond Voicebox, other AI models are being developed to improve voice technology in areas such as natural language processing (NLP) and automatic speech recognition (ASR). These models are designed to understand users' spoken commands and requests more accurately.


Overall, Meta’s new Voicebox AI model promises more realistic voices for virtual assistants and non-player characters, along with potential benefits for content creators and users with accessibility needs. With more advances in voice technologies such as NLP and ASR expected in the near future, developers continue to push the boundaries of what these tools can do.