Experimenting with AI Voice Generation
As regular readers know, I am always looking at different passive income generation ideas, especially if linked to my main occupation of photography. I’ve been moving towards creating YouTube videos and trying to get to the scale that is required to start generating income from that site. The ability to show ads around my videos requires 1000 subscribers and 4000 viewing hours in the past 365 days. I’m a long way away from that at the moment, but it is still interesting to watch my numbers grow.
But back to the photography. I have been creating images for the stock agencies for years now – 17 years! I’ve been choosing my favorite images that I could see on a wall somewhere and putting those on Fine Art America, Pictorem and now in my own Etsy Store as well as a few specialist outlets. I’ve been writing illustrated articles using my best photos on BackyardImage.com and linking from there to one of the print outlets I use, partly so that someone who really likes an image can follow a link and purchase it, and partly so that my portfolio ranks higher in Google Image searches. I get around 80-100 unique visitors a day to my articles. And, finally, I’ve wondered if there is an opportunity to create travel-oriented videos based on a wider set of images that I have taken on a trip. I could just add music and hope that someone leaves the video playing on their TV to watch the photos, but that doesn’t sound very likely, so I have been creating narrated videos that add more value.
Narrating a script and getting good-quality audio out of it is not easy. I work in a small room that gives any recording a lot of reverberation, and then there is the issue of making mistakes while reading the script and having to spend time fixing them to get a good-quality output. So I thought I would try the AI route instead.
A friend in the stock video market (who used to be an editor for Greek TV) told me about his experiences with ElevenLabs, and I thought I would try it out to create an AI voice that I could just feed a script to and get back a pretty reasonable narration. It turns out it is quite a bit more difficult than it sounds, but I eventually got there!
It is a paid service with the following packages, and I do have an affiliate link to ElevenLabs if you are interested.

As I wanted to create a Professional Voice Clone (which requires between 20 minutes and two hours of high-quality audio), I joined as a creative. My plan was to record 45-60 minutes of narration to create that Professional voice and then use it for my own projects, but also license it to others for use in theirs. My first attempt was OK – I had a Shure microphone on a stand and narrated long passages from my articles about the Nile Cruise. Perhaps not the ideal subject, with some pretty strange names to pronounce. My voice was accepted and was licensed a few times, but it failed to get the High-Quality badge, which means that it would not be promoted and would probably not be very useful to someone wanting a high-quality narration. Their support told me that the issue was reverberation – echo from my room. I did manage to persuade them to let me try again. This time I used a condenser microphone and built myself a sound-deadening recording space with yoga mats, heavy curtains and some photography light stands!

This time I narrated a range of articles from my blog about different topics and places around the world. I created about an hour of recordings over 2 days, and then spent several weeks editing them: first reducing the slight background noise, then removing or lowering high peaks in my speech, getting rid of breathing and lip noises, and cutting passages where I thought the speech didn’t sound quite right. I reminded myself that the story I was narrating didn’t have to make sense, so removing a sentence that had a bad pronunciation didn’t matter overall. My new recordings added up to about 45 minutes, and this time the voice was accepted, then approved as a High-Quality voice, and also made available in their reader app, where someone can choose one of the voices to narrate a book. My voice is known as Steve – A native English senior male speaker with soft Northern English accent.
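If you want a quick sanity check on your own edited recordings before uploading them, a small Python sketch along these lines might help. It is not an ElevenLabs tool and not the process I used. The function names (clip_stats, summarise, total_in_range) are made up for illustration, it assumes your clips are saved as 16-bit PCM WAV files, and the only figures it relies on are the 20 minutes to two hours quoted above.

```python
import array
import math
import sys
import wave
from pathlib import Path

# Range quoted above for a Professional Voice Clone: 20 minutes to 2 hours.
MIN_SECONDS = 20 * 60
MAX_SECONDS = 2 * 60 * 60


def clip_stats(path):
    """Return (duration in seconds, peak level in dBFS) for a 16-bit PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"{path}: this sketch only handles 16-bit PCM")
        n_frames = wav.getnframes()
        duration = n_frames / wav.getframerate()
        samples = array.array("h", wav.readframes(n_frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    peak = max((abs(s) for s in samples), default=0)
    # 0 dBFS is full scale; silence is reported as minus infinity.
    peak_db = 20 * math.log10(peak / 32768) if peak else float("-inf")
    return duration, peak_db


def summarise(folder):
    """Total duration and loudest peak across every .wav file in a folder."""
    stats = [clip_stats(p) for p in sorted(Path(folder).glob("*.wav"))]
    total = sum(d for d, _ in stats)
    loudest = max((db for _, db in stats), default=float("-inf"))
    return total, loudest


def total_in_range(total_seconds):
    """True if the total length falls inside the quoted 20 minute to 2 hour range."""
    return MIN_SECONDS <= total_seconds <= MAX_SECONDS
```

Calling summarise on a folder of clips (the folder name is whatever you use) gives the total running time and the loudest peak, which is a rough way to spot a clip that still has a peak close to full scale. It says nothing about reverberation or breathing noises, which still need your ears.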
I’ve now used the new voice to create the voiceover for a video I put together from the photos taken on our recent trip to Washington State and the Cascades Mountain Loop drive. We did 1000 miles in total over 8 days and were really impressed with the places and, of course, the scenery we traveled through. That video is on my YouTube channel, but it is embedded here so that you can hear how the AI voice performs. It is not perfect. There are various markers you can use in ElevenLabs to change the way the voice reads certain paragraphs, but I didn’t bother with anything complex.
Earnings Potential from ElevenLabs
It is probably too early to really tell, as my voice was only accepted on September 11th. Since then, I have earned $10 or so (plus $4 from my earlier voice attempt). I’m hoping that it becomes more popular over time – there are already 354 people who have used it, and it is now available in six different languages. I’ll report from time to time if it has turned into an interesting passive income stream.
UPDATE – February 2026. This has improved! In February 2026, I actually received $74 in license fees for my voice. So this is turning into a “nice little earner!” If you are interested in trying this, I have an affiliate link for ElevenLabs for you to use here.


Do you still own your voice?
In the same way as when we license stock photos, we own the “copyright” but grant a license that allows others to use the voice and allows ElevenLabs to relicense it.
Excellent voice over and video!
Thanks Barbara! I was impressed with how easy it was to write the script (which was based on the original blog post) and also edit out bits of the text that I didn’t need because there is no background noise – you just cut out what you don’t want and it still sounds good.
Very interesting experience. AI voices can of course be an interesting option, but I always feel they sound machine-like, with little inflection and few natural highs and lows. Of course not everyone can be a pro speaker, but, apart from creating an income stream from selling the voice, I would refrain from doing this. It is not widely recognised how well we remember voices, and hearing one and the same voice style across unrelated programs and producers could backfire in the longer run.
I’m not sure I agree. I’m not expecting a lot from licensing, but if I want to create more travel videos on YouTube, then I need some way to narrate them. I could also perhaps narrate an audio version of my books. I could never do the narration for real as it would take too much time, but this voice doesn’t sound too machine-like to me.