No, the Pope did not wear a Balenciaga puffy jacket to walk the streets of Rome last May.
Taylor Swift did not participate in the pornographic images that faked her likeness last week, which went viral on X and Telegram. Bad actors, investigators say, swapped her face onto someone else's body.
Donald Trump did not get arrested and bullied by police outside Trump Tower in Manhattan last year. A video was doctored to make it look like he had, and then went viral with the support of his fans.
And no, Hillary Clinton never voiced her support for Ron DeSantis—nor did President Joe Biden voice a recent robocall telling Democratic voters in New Hampshire to avoid casting their ballots in that state’s Jan. 23rd primary. “Voice-cloning” is now cheaply available to even the most cash-strapped adversaries.
Perhaps we should have seen this coming.
Some of you may remember seeing Facebook co-founder Mark Zuckerberg, a few years ago, bragging about having “total control of billions of people’s stolen data.” It may have sounded true at the time, but in reality it was a video made by tech rivals using artificial intelligence—a “deepfake,” an image or video intended to diminish someone’s reputation, twist the truth, or, worse, maliciously disinform and erode public trust.
The earliest deepfakes weren’t taken very seriously. The technology was neither as accessible nor as sophisticated as it is today; lip-syncs were often botched and facial features poorly copied. Others contained obviously questionable content, or were made simply to be fun to watch. Take, for example, the ongoing entertainment series of @deeptomcruise videos shared on TikTok by actor Miles Fisher, a Tom Cruise impersonator with 5.2 million followers. Fisher doesn’t look exactly like Cruise in real life, but his TikTok videos are crafted to sharpen the resemblance and to showcase antics few fans would expect of Cruise. In one video, Fisher uses voice-cloning tech to make “Cruise” speak fluent Spanish. [In real life, Cruise does not.]
But last year changed everything. When ChatGPT-4, DALL-E and other generative AI tools were released to the public a year ago, these more advanced tools offered users new ways to manipulate images. They also encouraged widespread experimentation, often without any guidance to discourage misuse. Pablo Xavier, a 31-year-old Chicago construction worker, was an early user among the millions of ordinary citizens trying out generative AI’s new image tools last year. Xavier’s Midjourney spoof to “make the Pope drip” [street slang for fashionable] went viral. He told BuzzFeed afterwards that he didn’t mean any harm, and was surprised the Vatican saw it differently. “I just thought it would be funny,” Xavier said.
Election-Year Guardrails
Now, at the start of the 2024 election season, generative AI tools are being taken much more seriously, given their potential to be used by bad actors to create fakes that further drive polarization and diminish trust in the election process. More than 40 countries will elect new leaders this year, and the stakes couldn’t be higher.
DeepMedia, a company making tools to detect synthetic media, predicts that some 500,000 election-year deepfakes will be shared globally in the coming months. Darrell West, a senior fellow at the Brookings Institution, recently told Reuters: “It’s going to be very difficult for voters to distinguish the real from the fake, and you could just imagine how either Trump supporters or Biden supporters could use this technology to make the opponent look bad. … And it can all happen so fast, that deepfakes could drop right before the election that nobody will have a chance to take down.”
Ten states—including Virginia and Texas—have criminal laws against deepfakes, but there is currently no federal law in place. Meanwhile, Democrats in Hawaii, South Dakota, Massachusetts, Oklahoma and Nebraska, as well as Republicans in Indiana and Wyoming, have introduced additional legislation that would ban media created with the help of AI within specific time frames before elections if that media doesn’t include disclosure. Arizona Republicans proposed a bill that would allow any candidate for public office who will appear on the ballot, or any Arizona resident, to sue for relief or damages from anyone who publishes a “digital impersonation” of that person.
Some tech companies have also offered safeguards. Just before the Senate Judiciary Committee began holding hearings on AI last fall, Google, and then Meta, announced they would require political ads appearing on their platforms to disclose whether AI was used to create them. Enforcing these new policies, however, will pose unprecedented challenges.
“We are in the middle of a growth of authoritarianism globally, a decline in trust and in mainstream media, (and) pervasive mis- and dis-information,” says Sam Gregory, Executive Director of witness.org, a nonprofit media watchdog. “We need a globally inclusive response to this broader phenomenon of generative AI and synthetic media.”
Steps to Fight Back
So how can we, as tech- and media-savvy citizens, help find fakes—or at least detect many of them—when they cross our radar?
It’s still difficult to spot AI-driven disinformation without a team of tech whizzes. But here’s a short list of some of the more popular AI technologies and uses to watch for in the months ahead:
Voice cloning. This is voice-mimicking software that uses advanced AI algorithms to convincingly imitate the voice of a politician or celebrity in order to create confusion and mislead consumers, voters and the general public. One of the best warnings about this technology was released last September by the Polish startup ElevenLabs. The company posted a short video on YouTube to show how voice cloning can be folded into disinformation schemes. ElevenLabs took the widely shared speech given by actor Leonardo DiCaprio at the UN’s 2014 Climate Summit, then used speech synthesis and voice cloning to make him speak in the voices and pacing of famous people like Joe Rogan, Steve Jobs, Robert Downey, Jr., Bill Gates—and even Kim Kardashian.
Corporate fundraising clones. Someone cloning a company executive’s voice, or that of a political donor, needs just a few seconds of real audio to make a convincing fake. Earlier this month, the U.S. Federal Trade Commission began accepting public submissions for its Voice Cloning Challenge, which aims to promote ideas that protect executives, employees and consumers from the misuse of AI-enabled voice cloning for financial fraud and other harms.
Face fakes. In deepfake videos, a person’s face is swapped with another to make it look like that person did or said something they didn’t—as happened in the Taylor Swift case. Face-fake technologies are also being used to persuade consumers to sign up for expensive products and services online. Last October, actor Tom Hanks warned people that an AI-created video using his likeness was being used to sell dental insurance online. "I have nothing to do with it," Hanks said in an Instagram post. Soon after, CBS Mornings co-anchor Gayle King sounded the alarm over a video purporting to show her touting weight-loss gummies. "Please don't be fooled by these AI videos," she said. One of the more widely shared face-faking videos, This is not Morgan Freeman, was created to warn people about the technology.
Robocalls. The use of AI in elections is becoming more prevalent and poses a serious threat to election outcomes, says The Hill, because there are few, if any, safeguards in place to prevent false information from being disseminated. The recent Joe Biden robocall case was a form of “deepfake disinformation designed to harm Joe Biden, suppress votes and damage our democracy,” said Samir Jain, vice president of policy at the Center for Democracy and Technology in Washington. Robocalls can make a candidate appear to say something he or she did not, and spread false information about voting.
New Detection Tools
Some universities and news organizations have begun offering “fake news detection” guides to help consumers spot deepfakes more easily. MIT’s Detect Fakes is a short quiz that has users compare two videos and decide which is real. Microsoft’s Spot the Deepfake is a 10-question quiz that asks users to detect signs like mismatched shoes or earrings, or eye movements that don’t sync.
Other new detection tools are also emerging, including free versions like Content at Scale and GPTZero, as well as paid versions such as Sensity. These tools are designed to spot embedded markers in AI-generated images and to look for unusual patterns in how the pixels are arranged, including in their sharpness and contrast.
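For readers curious what “looking at sharpness and contrast” can mean in practice, here is a minimal, purely illustrative Python sketch. It is not how Sensity, GPTZero or any other named tool actually works; the statistics and thresholds below are hypothetical placeholders, and real detectors rely on trained models and provenance metadata rather than hand-tuned cutoffs like these.

```python
# Toy illustration only: scores an image on two crude pixel statistics
# (sharpness and contrast) using numpy and Pillow. Any real deepfake
# detector would use a trained classifier, not fixed thresholds.
import numpy as np
from PIL import Image


def sharpness_score(gray: np.ndarray) -> float:
    """Variance of a simple Laplacian filter: a rough proxy for sharpness."""
    lap = (
        -4 * gray[1:-1, 1:-1]
        + gray[:-2, 1:-1] + gray[2:, 1:-1]
        + gray[1:-1, :-2] + gray[1:-1, 2:]
    )
    return float(lap.var())


def contrast_score(gray: np.ndarray) -> float:
    """Standard deviation of pixel intensities: a rough proxy for contrast."""
    return float(gray.std())


def suspicion_report(path: str) -> dict:
    """Return the raw statistics plus a flag based on placeholder cutoffs."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    report = {
        "sharpness": sharpness_score(gray),
        "contrast": contrast_score(gray),
    }
    # Hypothetical cutoffs chosen only to make the example runnable;
    # a real system would learn them from labeled training data.
    report["flag_for_review"] = report["sharpness"] > 500.0 or report["contrast"] < 20.0
    return report


if __name__ == "__main__":
    # Replace "example.jpg" with any local image file to try it out.
    print(suspicion_report("example.jpg"))
```

The point of the sketch is simply that unusual pixel-level statistics can be measured automatically; whether they reliably separate real photos from AI-generated ones is exactly the open question the detection companies are working on.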
But deepfake detection tools are still early in development, and it’s not yet certain which tools to trust, says witness.org’s Sam Gregory, who specializes in deepfake detection. Mya Zepp, with IJNet’s Disarming Disinformation project, says it’s essential that journalists and creatives who are committed to sorting fact from fiction have access to reliable tools so they can more easily spot what’s fake and warn their audiences and stakeholders. Most critical, says Professor Lilian Edwards of Newcastle University in the U.K., a specialist in Internet law, is to “stem the potential chaos” of both real deepfakes and claims of deepfakes that seriously erode public trust.
“The future doesn’t have to be one in which anything can be called a deepfake, anyone can claim something is manipulated, and trust is further corroded,” Edwards recently told Guardian writer Ian Sample. “The problem may not be so much the faked reality as the fact that real reality becomes plausibly deniable.”