Professional AI Voice Production Suite
Design, clone, and batch-produce AI voices locally.
No cloud. No subscription. Your GPU. Your creative control.
Windows 10/11 · NVIDIA GPU 8GB+ VRAM · ~15 GB disk space
Feature Walkthrough
A full tour of every engine, the Batch Studio workflow, and the plugin system — from first launch to finished audio.
Three Creative Engines
Three purpose-built synthesis engines for every production scenario — from scripted narration to zero-shot character creation.
Drive a library of professionally tuned vocal characters using plain-English style instructions. Output stays consistent and high-quality across every take, which suits narration, audiobooks, and dependable character voice-over.
Generate entirely new vocal identities from text descriptions alone. Define the body and the performance. The model constructs a unique vocal fingerprint from scratch — no reference audio required.
Capture any voice from just 3 to 10 seconds of reference audio. Feed it any script and the model delivers that same voice, performing your direction. The integrated Prep Station handles reference transcription automatically.
Batch Studio
A non-linear audio director for multi-voice scripts. Produce entire podcast episodes, game dialogue trees, or audiobook chapters in a single run.
Each script block carries its own speaker, engine, style, language, seed, temperature, and Top P. Mix any combination of engines and voices in a single scene — Auto-Switch handles all model transitions automatically.
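As a rough illustration of the per-block settings described above, the sketch below models a script block as a small Python dataclass and groups consecutive blocks by engine, which is one plausible way to see where model transitions would occur. The class name, field defaults, engine identifiers, and the `plan_engine_runs` helper are all hypothetical and are not taken from the application's actual code or file format.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass
class ScriptBlock:
    """One line of a multi-voice script (hypothetical structure)."""
    speaker: str
    engine: str              # hypothetical engine identifier
    text: str
    style: str = ""          # plain-English style instruction
    language: str = "en"
    seed: int = 0
    temperature: float = 0.7  # assumed default, not from the source
    top_p: float = 0.9        # assumed default, not from the source


def plan_engine_runs(blocks: List[ScriptBlock]) -> List[Tuple[str, int]]:
    """Group consecutive blocks that share an engine.

    Returns (engine, block_count) pairs in script order; each boundary
    between pairs is a point where a model switch would be needed.
    """
    return [
        (engine, len(list(group)))
        for engine, group in groupby(blocks, key=lambda b: b.engine)
    ]


scene = [
    ScriptBlock("Narrator", "preset_voice", "Welcome back to the show.", style="warm, calm"),
    ScriptBlock("Narrator", "preset_voice", "Today we have a guest."),
    ScriptBlock("Guest", "voice_clone", "Thanks for having me.", seed=42),
    ScriptBlock("Robot", "voice_design", "Greetings, humans.", temperature=0.9),
]

runs = plan_engine_runs(scene)
switches = len(runs) - 1
print(runs, "model switches:", switches)
```

Keeping every setting on the block itself means a scene can be reordered or partially re-rendered without relying on any global state.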
More Features
From production utilities to developer tooling, Qwen3 Studio ships with a complete ecosystem for serious voice work.
Enable, disable, create, and inline-edit all custom styles and Voice Design profiles from a dedicated tab. Changes sync live to every dropdown, with no restart needed.
GitHub-synced plugin hub with SHA-256 verification. Toggle features on/off without restarting. Pull the latest official plugins in one click, or ship your own headless extensions.
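For readers unfamiliar with SHA-256 verification, the following minimal sketch shows the general idea: hash the downloaded bytes and compare the result with a published digest. It uses only the Python standard library; the function names and the sample plugin bytes are illustrative assumptions and do not describe how the plugin hub is actually implemented.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_plugin(data: bytes, expected_hex: str) -> bool:
    """Check downloaded bytes against a published digest.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(sha256_hex(data), expected_hex.strip().lower())


# Hypothetical plugin payload and its published digest.
plugin_bytes = b"print('hello from a plugin')\n"
published_digest = sha256_hex(plugin_bytes)

ok = verify_plugin(plugin_bytes, published_digest)
tampered = verify_plugin(plugin_bytes + b"# injected", published_digest)
print("intact:", ok, "tampered copy accepted:", tampered)
```

A digest check of this kind detects corrupted or altered downloads, but it is only as trustworthy as the channel through which the expected digest is published.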
Pre-process documents and scripts into clean, segment-ready input. Strips timestamps, normalises formatting, and splits at natural sentence boundaries for consistent rendering.
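The pre-processing steps described above (strip timestamps, normalise whitespace, split at sentence boundaries) can be approximated with a few regular expressions. This is a simplified sketch under stated assumptions: the timestamp formats handled (`mm:ss`, `hh:mm:ss`, optional brackets and fractions) and the `prepare_script` function are hypothetical, and the app's own splitter may behave differently.

```python
import re
from typing import List

# Assumed timestamp shapes: 00:12, 1:02:03, [00:00:01], 00:00:01.500
TIMESTAMP = re.compile(r"\[?\b\d{1,2}:\d{2}(?::\d{2})?(?:[.,]\d{1,3})?\b\]?")
# Split on whitespace that follows sentence-ending punctuation.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def prepare_script(raw: str) -> List[str]:
    """Strip timestamps, collapse whitespace, and split into sentences."""
    text = TIMESTAMP.sub(" ", raw)
    text = re.sub(r"\s+", " ", text).strip()
    return [part for part in SENTENCE_END.split(text) if part]


raw = "[00:00:01] Hello   there.\n[00:00:04] How are you? 00:07 Fine!"
segments = prepare_script(raw)
print(segments)
```

A naive splitter like this one will also break after abbreviations such as "Dr." and will strip clock times that are part of the prose, so real input usually needs additional rules.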
Built-in guided tutorials walk through each engine, the batch workflow, and advanced voice design techniques — right inside the app, without leaving the UI.
Every tab has an inline help panel with practical tips, tone recipes, and action tag references. Always one click away — no separate documentation window needed.
Aggressive VRAM flush between every generation and take, real-time GPU memory indicator, meta-tensor safety guard, and an emergency Reset button that never hangs or crashes.
Audio Demo
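These samples were generated locally using the Voice Clone engine on the 12Hz High-Fidelity architecture. No cloud. No API call.
"Look, people ask me all the time — they say 'Sir, how is your voice so clear?' And I tell them, it's Qwen Studio..."
"Y déjenme decirles algo más. Hablo español perfectamente. Nadie habla español mejor que yo..." [English: "And let me tell you something else. I speak Spanish perfectly. Nobody speaks Spanish better than I do..."]
"Here we observe the modern content creator in their natural habitat... utilizing the new high-fidelity architecture..."
"Y observen la facilidad con la que cambia de piel. Ahora habla en la lengua de Cervantes, conservando su elegancia natural..." [English: "And observe how easily it changes its skin. Now it speaks in the language of Cervantes, keeping its natural elegance..."]
"Of all the GitHub repos in all the towns in all the world... she walks into mine. This is the one."
"Escúchame bien, muñeca. Esto no es un juego. Esto es calidad de estudio local." [English: "Listen to me carefully, doll. This is not a game. This is local studio quality."]
"Yo me paso años perfeccionando mi voz... y esta IA local la clona en cuatro segundos. Cuatro. Tengo sentimientos encontrados." [English: "I spend years perfecting my voice... and this local AI clones it in four seconds. Four. I have mixed feelings."]
"Mi manager me llamó muy alterado. Le dije que se tranquilizara... Luego le pregunté si sonaba mejor que yo en directo. Me colgó." [English: "My manager called me, very upset. I told him to calm down... Then I asked him if it sounded better than me live. He hung up on me."]
"J'ai d'abord refusé — je suis un artiste, pas une machine. Puis on m'a dit: sans abonnement, sans nuage. J'ai ouvert un Bordeaux... et j'ai dit oui." [English: "At first I refused: I am an artist, not a machine. Then they told me: no subscription, no cloud. I opened a Bordeaux... and I said yes."]
"Hanno copiato la mia voce senza internet, senza pagare ogni mese — solo la GPU che lavora come un pazzo. Amico mio, questo è genio puro." [English: "They copied my voice without the internet, without paying every month, just the GPU working like crazy. My friend, this is pure genius."]
"Bossa Nova não é sobre gritar. É sobre o silêncio. Esta inteligência artificial entende isso... sussurra a minha voz. Mas onde está o meu violão?" [English: "Bossa Nova is not about shouting. It is about silence. This artificial intelligence understands that... it whispers my voice. But where is my guitar?"]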
"You come to me... into my browser... and you ask me to clone a voice. You don't even offer me a GPU. I will make you an audio file you cannot refuse."
Documentation
From quick start to advanced plugin development — fully documented and kept up to date with every release.
Full documentation covering engine architecture, the Batch Studio workflow, VRAM management, stability features, and system integration.
View on GitHub ↗
Precision slider reference, in-script action tags, tone recipes, Voice Design formulas, Batch tips, and pro techniques for getting the best results.
View on GitHub ↗
Build your own tabs, background services, and automation extensions. Full API reference for the ModuleHub plugin system with working examples.
View on GitHub ↗
To me, good development is simply finding the best solution to a problem. Imagine you are in charge of finding the best way to get from your home to work. As a developer, I have to know all the options, whether that is walking, digging a tunnel, or taking a helicopter.
My task is to find the way that makes the journey as smooth as possible for the person traveling. I don't need to know how to build the helicopter to solve the problem; I just need to know exactly when to use it. Everything I learn from life helps me find a better way to get people where they need to go. And I am always open to listening and learning from anyone who thinks we should take a different path.