
Alibaba’s Emote Portrait Alive (EMO) AI model can turn static images into talking and singing videos. Technically speaking, it’s a novel framework that synthesizes video directly from audio, bypassing the need for intermediate 3D models or facial landmarks.
“Just in 👀 this is the most amazing audio2video I have ever seen. It is called EMO: Emote Portrait Alive” pic.twitter.com/3b1AQMzPYu (Stelfie the Time Traveller, @StelfieTT, February 28, 2024)
EMO’s method ensures seamless frame transitions and consistent identity preservation throughout the video, giving the characters a more lifelike feel. Unlike similar AI models, it generates not only convincing speaking videos but also singing clips in various styles, with more expressiveness and realism than seen before.
“Our method ensures seamless frame transitions and consistent identity preservation throughout the video, resulting in highly expressive and lifelike animations,” said the researchers.
[Source]





