It’s a craft exposed to the severe threat of dilution—possibly even extinction—in the face of Artificial Intelligence tools cloning voices.
In November 2022, a ruling by the Delhi High Court stated that actor Amitabh Bachchan’s voice cannot be used without his permission, after Bachchan moved the court appealing that his recognisable baritone not be misused for commercial gains. A year later, in September 2023, the same court debarred social media and e-commerce platforms from infringing on the personality and publicity rights of actor Anil Kapoor, which included his name, image, dialogues, and voice.
To put it simply, one of the biggest and most discernible assets of these artistes—their voices—has been insured against fraudulent usage, granting them not only legal protection but also a means of fair compensation when the asset is used legitimately. The same, however, can’t be said for voice artistes who earn a living by lending their voices to someone else’s face on screen.
Mumbai-based senior voice artiste Pawan Kalra’s initiation into the field 25 years ago was serendipitous. His brother Pankaj Kalra, also a voice artiste, sparked his curiosity about the art form and later mentored him, even though Pawan claims it wasn’t a job he had always dreamt of doing.
Soon enough, there was no turning back for Pawan. From voicing Sean Bean’s Boromir in Hindi for The Lord of the Rings franchise and Henry Cavill’s Sherlock Holmes in Enola Holmes, to Bob in the now-iconic animated series Bob the Builder, and Cavill’s Geralt of Rivia in The Witcher, Kalra is a veteran whose voice has been a fixture in our popular subconscious for decades. Evidently, the wizardry in the craft lies in the ability to become not just a character but an actor, simply through one’s voice. Today, this craft is exposed to the severe threat of dilution—possibly even extinction—in the face of Artificial Intelligence (AI) tools cloning voices.
Pawan Kalra is a veteran voice artist who has voiced Henry Cavill’s Sherlock Holmes in Enola Holmes. Image: Netflix
Sean Bean’s Boromir in Hindi for The Lord of the Rings franchise has also been voiced by Pawan Kalra. Image: IMDB
“I have lost some gigs, especially corporate ones, to studios already moving to AI cloning tools; this has been such a rapid change we have witnessed in only the past six months,” says Pawan. As the Vice President of the union body, the Association of Voice Artists (AVA) in Mumbai, Pawan mentions that consistent efforts are being made to secure the jobs and livelihoods of over 2,500 voice artistes working in Mumbai alone—and enrolled as members of the AVA—across cinema, television, ad films, and corporate projects. “We have been in conversation with studios, producers, legal counsels, and every stakeholder to make sure that more jobs than the ones we have already lost are not on the line. If this is the rate at which things progress, we may not even have a future generation of voice artistes,” he adds.
How exactly is AI replacing voice artistes?
Imagine an English-language film with ten characters: two leads and eight supporting characters, each with at least a few lines of dialogue. A project that would earlier entail casting voice artistes for all ten when it is dubbed in, say, Hindi may now put out a casting call for only the two leads. “[A] shift to any technology is always done keeping cost-cutting in mind. So if a studio thinks they can’t afford a voice artiste for an actor in a supporting role, they’ll say they’ll spend on only casting for the leads. It cumulatively takes away a few thousand jobs, if it happens for every project,” says Pawan.
Rakhee Sharma, another senior artiste who has lent her voice to the Hindi dubs of popular characters like Daphne from the animated Scooby Doo series, Cersei Lannister for Game of Thrones, and Monica/Stockholm for Money Heist, among a host of others, was also behind Vodafone’s Interactive Voice Response (IVR) system. “It was a lucrative means of earning money but that’s a thing of the past now,” she says, as corporate recordings that often require more mechanical and less emotive voices are being handed over to AI. “I understand why that might happen. We saw it even on Instagram Reels where some voiceovers would be done in robotic voices, but this is concerning if not checked,” points out Sharma.
Back in 2005, Tanuja Chandra remembers being introduced to a software called “Pro Tools” that could turn a less-than-average singer into a maestro with the click of a button. Image: Pexels
Rakhee Sharma, a senior artiste who has lent her voice to the Hindi dubs of popular animated characters like Daphne from the Scooby Doo series, was also behind Vodafone’s Interactive Voice Response (IVR) System. Image: Screencrush.com
The hyper-real creaselessness of AI-generated human figures and their steely glares—coupled with their sanitised voices—adheres to our modern-day ideas of perfection, and our consequent fixation with it. And it’s definitely not the first time the creative industries have fallen prey to such trappings. Back in 2005, filmmaker Tanuja Chandra remembers being introduced to a software called “Pro Tools” that could turn a less-than-average singer into a maestro with the click of a button. “I would see music directors and engineers in studios say they could fix someone if they went off-key after a song was recorded. You know, how it’s on auto-tune,” she says. People in decision-making positions don’t bat an eyelid before replacing veteran artistes with technology; they are going to go ahead and adopt these tools anyway, “and there’s nothing we can do about it. It’s a losing battle,” adds Chandra.
After all, a machine, even if it can read out words, is still unable to accurately capture the nuances of tone and emotion that the scenarios written into a script demand. Moreover, it raises questions of ethics.
According to Pawan, instances of artistes’ voices being used to train AI tools without their explicit consent have alerted the fraternity to make the legalese in their contracts more watertight. “Because if you oppose the technology of your times, only you stand to lose and get left behind,” he observes.
The ruthlessness of sanitised art
Tanuja Chandra agrees with Pawan, albeit with a hint of heartbreak in her voice. Only recently, while pitching an idea for a film to a producer, Chandra resorted to using imagery and footage generated by an AI tool—instead of doing the standard cut-and-paste from existing footage—to create a dummy that left her feeling queasy. “Then when I told my editor that I could record and send them my dub, they said they could generate that voice with AI! Imagine, the images are fake and so are the voices,” exclaims Chandra. She ultimately decided to scrap it.
“I HAVE LOST SOME GIGS, ESPECIALLY CORPORATE ONES, TO STUDIOS ALREADY MOVING TO AI CLONING TOOLS; THIS HAS BEEN SUCH A RAPID CHANGE WE HAVE WITNESSED IN ONLY THE PAST SIX MONTHS”
Pawan Kalra
Music producer and DJ duo Rupinder Nanda and Kedar Santwani, aka Tech Panda X Kenzani, however, approach the debate with less trepidation. They see potential in AI to buttress their craft by unlocking new possibilities, and they sense that the changes will set in rapidly. “We feel so far there are more benefits than concerns with the use of the technology,” says Nanda. “Voiceover apps have opened up new avenues for producers like us where they can now experiment with vocal elements without the need to hire a special voice talent for every project. It is helping producers break away from traditional barriers,” adds Santwani.
Being producers, they don’t overlook the rather tempting cost-effectiveness of the innovation, which allows them to “allocate resources more effectively while not compromising on quality,” according to Santwani. However, too much of anything is rarely a good thing, which makes having reasonable checks and balances in place imperative.
Indians on Instagram welcomed 2024 with a song from the Shah Rukh Khan-starrer Dunki (2023) recreated in Mohammad Rafi’s voice using AI by music producer Anshuman Sharma and singer Aditya Kalway. It almost immediately caught the Internet’s fancy and has garnered over eight million views to date. Inventive on the surface, the experiment opens a can of worms the moment you scratch it, leading to questions of provenance—a slippery slope that has riddled AI ever since it became the buzzword of our times.
“We feel it’s invasive, because when an artiste’s voice is cloned or used without their consent, issues related to originality and artistic integrity arise. We also truly feel that music is such an emotional connection for people that AI-generated voices will not be able to give that human connection to the listener,” shares Nanda.
Does it, then, boil down to legalese?
When Scarlett Johansson voiced Samantha, the AI voice assistant that Joaquin Phoenix’s character falls in love with in Spike Jonze’s now-prophetic sci-fi rom-com Her (2013), she didn’t quite expect the scenario to play out in her real life a decade later. OpenAI CEO Sam Altman approached the actor last year, asking her to voice ChatGPT, an offer she turned down. The company went ahead and used a soundalike anyway, leaving Johansson with a sour taste in her mouth. It led her to seek legal counsel to question Altman and co. on how they created that voice, and why it eerily resembled hers.
"VOICEOVER APPS HAVE OPENED UP NEW AVENUES FOR PRODUCERS LIKE US WHERE THEY CAN NOW EXPERIMENT WITH VOCAL ELEMENTS WITHOUT THE NEED TO HIRE A SPECIAL VOICE TALENT FOR EVERY PROJECT"
Kedar Santwani
So while Hollywood actors have the means to challenge the infringement of their personality assets, and some in Bollywood have had theirs insured against misuse, a voice artiste does not enjoy the same privileges. “We don’t have such famous, easily recognisable voices that someone hears it and says, oh, this is Rakhee Sharma,” says Sharma. As a result, one can’t just decide to have their voice protected against misuse on a whim without knowing the protocols that precede it. “Where does one go with these queries? How do you go about it? We don’t know any of that and it’s difficult to figure these things out during the early stages. Moreover, these changes are occurring so rapidly that we barely have time to catch up,” says Sharma, to which Chandra adds that if a voice artiste today were to register an official complaint against someone misusing or cloning their voice, the authorities wouldn’t even know which section that would fall under. “They’d wonder what the issue is actually all about, and which IPC section it’s covered by,” rues Chandra.
It’s perhaps the same as stealing someone’s identity, says voice artiste Aditya Mathur, who has been in the field since 2004 and has been the voice for channels like MTV, VH1, Comedy Central, and Cartoon Network. The only way forward, according to him, is more robust laws; the ones in place at present have no provisions to safeguard voice artistes’ livelihoods. In fact, corporations have already started leaning heavily on the technology. “I got a call from a corporate client the other day who sent me a rough cut (a sample audio recording) for me to refer to, and that cut was made using an AI tool, which was unheard of even until a few months ago,” shares Mathur.
The only way forward, according to Aditya Mathur, who has been in the field since 2004 and has been the voice for channels like MTV, VH1, Comedy Central, and Cartoon Network, is to have more robust laws to safeguard voice artistes’ livelihoods. Image: peacocktv.com
Sources at Meta revealed it’s in talks with actors Judi Dench, Awkwafina and comedian Keegan-Michael Key for the right to use their voices in a digital assistant product called MetaAI. Image: Pexels
This makes it even more essential for artistes to stand their ground and demand their due in the face of an aggressive technology threatening to leave them jobless. “We must unionise, if push comes to shove,” says Pinky Rajput, a dubbing artiste, director and producer who co-founded Mumbai’s Mayukhi In Sync Studios in 2008, which counts Netflix among its biggest clients.
Earlier this month, sources at Meta revealed it’s in talks with actors Judi Dench, Awkwafina and comedian Keegan-Michael Key for the right to use their voices in a digital assistant product called MetaAI. This conversation is likely to be a long-drawn one, as it comes exactly a year after the 118-day strike in which the Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA) took on the Alliance of Motion Picture and Television Producers (AMPTP)—the longest strike in the union’s history. The striking professionals negotiated several clauses on fair work and pay policy, including ways to deal with AI. They eventually closed a three-year deal with Hollywood studios, which stated that companies cannot use AI tools to create digital replicas of performers without their approval and without paying them.
Rajput cites this example to say it’s the only logical way forward, and that her company would never let AI do the talking, literally and otherwise. “I wouldn’t say I wasn’t considering it,” she reveals. “I got an offer from an Israeli company that said they’d introduce AI into our studios earlier this year, and I flat-out refused, even though they were offering some stupid amount of dollars to work with them. They said for 20 characters they would only need four voice artistes and the rest would be done through AI, and that’s when we backed out.”
“I GOT A CALL FROM A CORPORATE CLIENT THE OTHER DAY WHO SENT ME A ROUGH CUT (A SAMPLE AUDIO RECORDING) FOR ME TO REFER TO, AND THAT CUT WAS MADE USING AN AI TOOL, WHICH WAS UNHEARD OF EVEN UNTIL A FEW MONTHS AGO”
Aditya Mathur
But her real concern? Those entering the field today who need work to stay afloat. For veterans like Kalra, who now enjoy the privilege of having built a name for themselves, not working with producers who use AI is an option; it may not be one for an aspiring artiste. “It’s them that we need to reassure that work is there so they should not lend their voices to train AI that will ultimately eat into their jobs. They might end up agreeing to such projects, that too for a lot less than what they deserve, which is understandable, but has to be avoided at all costs,” says Rajput.
Much like her colleagues, she is hoping for a miracle: that the juggernaut’s wheels hit a snag, and that no matter how hard the technology is trained, it never matches up to human acumen and sentience.
“If they can train these machines to talk like Kate Winslet in several different languages, why would they need us?” asks Sharma, who, in fact, voiced Winslet as Ronal in Avatar: The Way of Water (2022). She hopes she can do so again, and wouldn’t have to compete with a machine to do the job she loves. “It’s not even a fair or dignified competition anymore,” Sharma says.
Also Read: AI is now narrating audiobooks. Does that bode well for reading?
Also Read: How the fusion of folk and electronic is renewing the tone for dance music in India
Also Read: Tajdar Junaid on why making music—for concerts or Oscar-bound films—is about marrying minds