I’m using https://github.com/rhasspy/piper mostly to create some audiobooks and read some posts/news, but the voices available are not always comfortable to listen to.
Do you guys have any recommendation for a voice changer to process these audio files?
Preferably it’ll have a CLI so I can include it in my pipeline to process RSS feeds, but I don’t mind having to work through a UI.
Bonus points if it can process the audio streams.
That’s called text to speech, not a voice changer. A voice changer is the thing in the Darth Vader halloween masks.
There’s been discussion on TTS programs here recently: https://lemm.ee/search?q=tts&type=All&listingType=All&communityId=185&page=1&sort=TopAll
Or you can search via your local instance/interface.
Text to speech is what piper is doing.
What I’m looking for is called a voice changer, since I want to change a voice that has already read something. That’s exactly what I want: “the thing in the Darth Vader halloween masks”, but for Linux, preferably via a CLI that ingests audio files and lets me configure how the voice is changed, not only to Darth Vader.
Oh, I see. I think it would still be easier to either use a different voice in piper (the GitHub page talks about this) or use a different TTS program entirely.
So, all of the awkward pauses, the lack of inflection - you’re saying keep those, just change who it sounds like is speaking?
In case you wanted to try other TTS providers, here’s a leaderboard based on user votes.
I haven’t fully looked into creating a model for piper, but dealing with a dataset is not something I look forward to: gathering the data and everything that implies.
So I’m thinking it’s easier to take an existing model’s output and adjust it so it’s closer to what I’d like to hear constantly.
Check out Pied: https://github.com/Elleo/pied
I don’t want to manage piper voices, I can handle that directly in my file system as I only have a few.
The issue is none of the ones I’ve found are good for me, so what I need is something to change the voice once it has been generated by piper.
What you’re looking for is called RVC. It’s integrated into some voice-cloning GitHub projects, but I don’t use it. Here, for example: https://github.com/codename0og/rvc-realtime-voice-changer
There are a few voices included with Pied, which is why I suggested it.
Coincidentally, I just found this other thread that mentions EasyEffects: https://programming.dev/post/17612973
You might be able to use a virtual device to get it working for your use case.
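For the file-based case described in the original post (piper writes a WAV file, then a CLI tool changes the voice), here is a minimal sketch of how the two steps could be chained from a script. It assumes `piper` and SoX (`sox`) are both installed and on `PATH`. SoX is not mentioned anywhere in this thread; it is used here only as a stand-in for a simple pitch/tempo effect, not as RVC-style voice conversion. The model filename, output filenames, and function names are placeholders.

```python
import subprocess
from pathlib import Path


def piper_cmd(model: str, wav_out: Path) -> list[str]:
    """Build the piper command line; the text to speak is passed on stdin."""
    return ["piper", "--model", model, "--output_file", str(wav_out)]


def sox_cmd(wav_in: Path, wav_out: Path,
            pitch_cents: int = -200, tempo: float = 1.0) -> list[str]:
    """Build a SoX command that shifts pitch (in cents) and scales tempo."""
    return ["sox", str(wav_in), str(wav_out),
            "pitch", str(pitch_cents), "tempo", str(tempo)]


def speak_and_change(text: str, model: str, out_dir: Path,
                     pitch_cents: int = -200, tempo: float = 1.0) -> Path:
    """Run piper, then post-process its output with SoX. Returns the final file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    raw = out_dir / "raw.wav"          # placeholder name for piper's output
    final = out_dir / "changed.wav"    # placeholder name for the processed file
    subprocess.run(piper_cmd(model, raw), input=text.encode("utf-8"), check=True)
    subprocess.run(sox_cmd(raw, final, pitch_cents, tempo), check=True)
    return final


if __name__ == "__main__":
    # Assumed model file; substitute whichever piper voice you have downloaded.
    result = speak_and_change("Hello from the feed reader.",
                              "en_US-lessac-medium.onnx", Path("out"))
    print(result)
```

A pitch or tempo shift like this only changes how the existing voice sounds; the pauses and inflection that piper produced stay the same. Swapping the second command for an RVC tool's CLI, if it provides one, would keep the same two-step structure.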