
rpmurray

macrumors 68020
Original poster
Feb 21, 2017
2,147
4,330
Back End of Beyond
Still using Monterey 12.6, haven't "updated" to Ventura yet.

When I go into System Preferences and choose Accessibility and then Spoken Content I can download new voices. Is there any Mac app that will let you create a custom voice that you can then import? Let's say I was thinking of turning my own voice into a custom voice. I'm assuming I'd need an app that would essentially have me speak a lot of different words that would contain all the sounds in the words of spoken English, and then would be able to combine them to speak any word in the dictionary. Because this would be a bit hit or miss I also think it would need me to listen to sentences read in the created voice and tweak any sounds that don't sound correct, by perhaps speaking another list of words until it gets the right sound.

I'm sure this would take a bit of time, possibly weeks or maybe even months until the voice was trained to mimic my own voice. But it sounds like something fun I'd be interested in doing, especially during those down times when I don't have much else going on. It would be even better if there was an option to tweak the voice to make it sound better than the original.
 
That's an awesome question, and I know voice synthesis is an area that has seen plenty of advances lately. I've encountered a few companies (and even individuals) that have demonstrated technology that converts a reasonable amount of source audio into a voice, but I don't think I've found an app that's meant to apply that kind of process automatically to a sample a user would just record by themselves at home. I get the overall impression that even the best such app today would produce hit-and-miss results, which might be equally cool and amusing, but I don't think we're quite at the point of anyone being able to "deepfake" a voice, if you will.

I'd love to be surprised by anything anyone else has found, though.
 
That was true a year ago, but a lot has happened since then. Have a look at this video: it shows how to create English text-to-speech from a voice sample of only 5 seconds. The results are stunning. I tried it myself with a different voice, and you couldn't tell it was generated.

So yeah, technically it's already possible. Some podcast editing software already uses it, but with severe restrictions to prevent abuse: https://podcastle.ai/products/revoice

I guess that's why Apple is shying away from it: it's too hard to prevent abuse.
 