Hello, Dev Community!
I'm thrilled to bring you the latest updates on my open-source project, Wunjo AI v1.5. What started as a personal necessity has evolved into a tool with a wide array of features.
Backstory
My journey began with the goal of creating podcasts for another of my projects, «Infinite Neural Radio», where music is driven by the creativity of neural networks. I wrote about this on dev in a previous post. Over time, I went beyond just synthesizing my own voice, and Wunjo AI began offering remarkable possibilities for creating dialogue without noticeable limitations.
You can watch a video on YouTube to see what the first version of Wunjo AI looked like.
What’s Evolved?
The journey from animating photos and images to bringing them to life with speech, using Stable Diffusion, has been fascinating. The drive to keep improving led to features that let you animate lips in video clips, synchronize them with audio, and even experiment with different faces and environments to create signature effects.
The horizon is wide open: I plan to extend the application with text-prompt-driven video editing, again leveraging the power of Stable Diffusion. But let's dive into what is available now:
Wunjo AI v1.5 Features
- Voice Cloning: Now supporting English and Chinese, with Russian coming soon, this feature lets you clone voices from audio files or by recording directly in the app. You can listen on SoundCloud to a Russian synthesized voice that was cloned into English and Chinese.
- Multilingual Support: Wunjo AI is now more user-friendly, supporting multiple interface languages through Google Translate integration, so you can work in your native language.
- Video Face Swap: From face swaps based on a single photo to maintaining continuity across frames and angles, the application uses neural networks to deliver a seamless deepfake experience.
- Real-time Speech Recognition: Streamlining dialogue creation, this feature converts speech to text automatically, saving you from manual typing.
- Video Retouching and Object Removal: Improve the quality of your deepfakes or remove unwanted elements from your videos with this intuitive feature.
- Emotion Deepfake: Still in its early stages, this feature promises intriguing, if somewhat eerie, results as it creates deepfakes with different emotions, building on wav2lip technology.
- Additional Features: From improving the background and face to easily splitting and merging video frames and audio (see the sketch after this list), the tool aims to simplify your workflow.
- User Panel Updates: Track generation status, download models, and keep an eye on errors through a user-friendly panel for a transparent process.
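To give a feel for what the frame/audio splitting and merging automates for you, here is a minimal sketch of the underlying idea using ffmpeg from Python. This is only an illustration under the assumption that ffmpeg is installed; it is not Wunjo AI's actual implementation, and the file names are hypothetical.

```python
# Minimal sketch: split a video into frames + audio, then merge them back.
# Illustration only, assuming ffmpeg is installed; not Wunjo AI's own code.
import os
import subprocess

def split(video="input.mp4", frames_dir="frames", audio="audio.wav"):
    os.makedirs(frames_dir, exist_ok=True)
    # Extract every frame as a numbered PNG and the audio track as WAV.
    subprocess.run(["ffmpeg", "-i", video, f"{frames_dir}/%06d.png"], check=True)
    subprocess.run(["ffmpeg", "-i", video, "-vn", audio], check=True)

def merge(frames_dir="frames", audio="audio.wav", out="output.mp4", fps=30):
    # Reassemble the (possibly edited) frames and re-attach the audio.
    subprocess.run([
        "ffmpeg", "-framerate", str(fps), "-i", f"{frames_dir}/%06d.png",
        "-i", audio, "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-shortest", out,
    ], check=True)
```

In the app this kind of round trip happens behind the scenes, so you can retouch individual frames and get a finished video back without touching the command line.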
You can watch a video about the Wunjo AI v1.5 update.
Looking Forward
Wunjo AI v1.5 is a significant milestone, and I am eager to keep pushing boundaries. For the best experience, enable GPU acceleration by installing the CUDA drivers as described in the documentation.
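If you want to confirm that the GPU is actually visible after installing the drivers, a quick check like the one below can help. It assumes a GPU-enabled PyTorch build and a standard CUDA install; it is a general-purpose sketch, not part of Wunjo AI itself.

```python
# Quick sanity check that CUDA is visible to PyTorch after driver installation.
# Assumes a GPU-enabled PyTorch build; not part of Wunjo AI itself.
import torch

if torch.cuda.is_available():
    print("CUDA is available:", torch.cuda.get_device_name(0))
else:
    print("CUDA not found; generation will fall back to the CPU.")
```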
I also created a video showing how you can train a neural network on your own voice in Wunjo AI and use the resulting model to synthesize speech from text.
Wunjo AI is open source, so anyone can modify the project for their own needs, suggest improvements, or simply spread the word about it; there are many ways to support its development. Here is the project page on GitHub.
Get Involved
Start your content creation journey with Wunjo AI, an open-source, free, and local platform without restrictions or censorship. You can learn more about Wunjo AI on GitHub or on the official website, and follow the YouTube channel and my GitHub profile so you don't miss new articles about generative neural networks, open-source projects, deepfakes, and text-to-speech.
I look forward to seeing you on Dev.