By Salma Abdi
Disclaimer: This article provides information on deepfakes and the techniques used to create them, and is intended solely for educational purposes. While deepfakes can be fascinating, they also come with significant ethical and legal considerations. We do not condone or support the misuse of this technology. Please use this information responsibly, be mindful of the potential consequences, and always obtain consent from any individuals you may wish to feature in deepfake content.

Last year in Putnam County, New York, three high school students created deepfake videos of the principal of another nearby school, George Fischer Middle School. In the videos, the man appeared to be yelling racial slurs, going so far as to threaten violence against students of colour. Parents and guardians received a vaguely worded letter stating that some students had “used artificial intelligence to impersonate” the staff member in a video in which he appears to make “inappropriate comments.” Despite assurances that there was no actual threat, many were deeply concerned about potential danger, and some even considered keeping their children home from school.
What Are Deepfakes?
Deepfakes are photos, videos, or audio recordings that have been digitally altered to make a person appear to be someone else, or to say or do things they never did. This technology utilizes artificial intelligence (AI) to create a convincing false identity, often for the purpose of spreading misinformation. While early deepfakes were easy to recognize, more recent results have become so realistic that they can be very difficult to distinguish from recordings of actual events. Deepfakes are starting to become mainstream (although such content is usually quickly debunked and labelled as fake), as even beginners can make this content with access to the necessary software and a few hours of internet research.

Face Replacement
Face replacement software allows the user to manipulate images of faces. Users might alter their own facial appearance to mimic someone else’s, such as that of a celebrity, or create an image in which one person’s face appears on someone else’s body. While this was once a complex task that had to be done manually, advanced AI technology now enables people to manipulate photos and videos easily.
Despite being simple for the user, this kind of software has to perform several steps to fulfill a user’s request. Let’s say you want to put the face of a celebrity, like the painter Bob Ross, over your own face in a photo or video. First, the AI deepfake tool you choose needs enough content to train its face replacement model. Together with a photo or video of yourself—the source material you want Bob Ross’s face to inhabit—you need to upload several photos or videos of Bob Ross in various scenarios. The algorithm will then split the files of your target (in this example, Bob Ross) into hundreds of pictures so it can analyze the face in each one. Depending on the algorithm used by the software, as many as 5,000 images of your target could be required.
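As a back-of-the-envelope sketch of those data requirements—the 30 frames-per-second rate is a common video standard, and the 5,000-image figure is just the number mentioned above—you can estimate how much footage of a target is needed:

```python
def frames_available(clip_seconds: float, fps: int = 30) -> int:
    """How many still frames a clip of this length can be split into."""
    return int(clip_seconds * fps)

def footage_needed(target_frames: int, fps: int = 30) -> float:
    """Seconds of footage required to extract a given number of stills."""
    return target_frames / fps

# A demanding model wanting ~5,000 training images of the target:
seconds = footage_needed(5000, fps=30)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min) of 30 fps footage")
```

In other words, under three minutes of ordinary video can yield thousands of training images, which is part of why public figures with hours of footage online are such easy targets.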

The next step is where the magic, so to speak, happens. The AI uses one algorithm, called a generative model, to create the deepfake photo or video you requested, while a second algorithm (the discriminator model) tries to tell the generated output apart from genuine images of the target. The discriminator model’s feedback is then delivered to the generative model, which responds by altering the deepfake until it is practically indistinguishable from authentic material. This cycle is repeated until the discriminator model can no longer identify differences between the deepfake and the genuine content. This pair of competing models, each pushing the other to improve, is what is known as a generative adversarial network (GAN).
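The adversarial feedback cycle described above can be sketched in miniature. This is not a real GAN—there are no neural networks here—just the shape of the loop: a single number stands in for the “real” footage, a one-parameter “generator” produces fakes, a distance check stands in for the discriminator, and nudging stands in for gradient updates:

```python
import random

random.seed(0)
REAL_MEAN = 4.0  # stand-in for genuine images of the target

def real_sample() -> float:
    """A slightly noisy 'real' example, as if drawn from training footage."""
    return REAL_MEAN + random.uniform(-0.1, 0.1)

g = 0.0    # the generator's single adjustable parameter (its "fake")
LR = 0.05  # how strongly the generator reacts to feedback

for step in range(2000):
    fake = g
    real = real_sample()
    # "Discriminator": measures how distinguishable the fake is from real data.
    gap = real - fake
    # Feedback to the generator: adjust output to shrink the gap.
    g += LR * gap
    if abs(gap) < 0.05:
        # The discriminator can no longer reliably tell fake from real.
        break

print(f"generator output: {g:.2f} (real data centres on {REAL_MEAN})")
```

The fake starts nowhere near the real data, and each round of discriminator feedback moves it closer until the two are practically indistinguishable—the same dynamic, vastly simplified, that drives a real GAN.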
Face Re-Enactment
Face re-enactment refers to the manipulation of the facial expressions of the target. To begin, the algorithm needs to be trained to detect faces and their individual features. It then pinpoints certain places on the faces of the user and the target (Bob Ross in this case), called facial landmarks; these are what allow the user to control the image of the target’s face. Next, the source actor (you!) uses a camera to record the facial expressions you would like to see in the manipulated video of the target. Using that information, the algorithm alters the target’s face in the video so it mimics the new facial expressions you have provided. As in face replacement, the AI tool uses a GAN to improve the quality of the deepfake. Some software even lets you watch the progress in real time.
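The landmark idea can be illustrated with a toy example. Real systems track dozens of landmarks per face; here we use made-up coordinates for just three mouth points, and transfer the source actor’s expression to the target by copying how each landmark moved between a neutral and a smiling frame:

```python
# Toy facial landmarks as (x, y) pixel coordinates; all values illustrative.
target_neutral = {"mouth_left": (40, 80), "mouth_right": (60, 80), "mouth_mid": (50, 82)}
source_neutral = {"mouth_left": (42, 78), "mouth_right": (58, 78), "mouth_mid": (50, 80)}
source_smiling = {"mouth_left": (40, 74), "mouth_right": (60, 74), "mouth_mid": (50, 81)}

def transfer_expression(target, src_neutral, src_expr):
    """Apply the source actor's landmark displacements to the target's face."""
    moved = {}
    for name, (tx, ty) in target.items():
        sx0, sy0 = src_neutral[name]
        sx1, sy1 = src_expr[name]
        # How far this landmark moved on the source actor's face...
        dx, dy = sx1 - sx0, sy1 - sy0
        # ...is how far it should move on the target's face.
        moved[name] = (tx + dx, ty + dy)
    return moved

new_landmarks = transfer_expression(target_neutral, source_neutral, source_smiling)
print(new_landmarks)
```

In a real tool this landmark transfer only steers the edit; the GAN described earlier then renders photorealistic pixels around the repositioned landmarks.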
Voice Cloning
Now you know it’s possible to change an existing video file to place Bob Ross’s face on an image of your body and even make the new image of him smile or frown. But you might have realized that any deepfake video you make will either be silent or feature the sound from the original video. To make a video of Bob Ross in which he appears to say whatever you want him to, you need to make some audio to go along with the image frames. As it happens, adding audio is much easier than the two processes we have already outlined. To start, the AI you select needs at least one audio sample to train its voice cloning model, so you would need to feed audio from an interview with Bob Ross, or an episode of his show, into the system. The model analyzes the sample for distinctive speech patterns, a particular accent, and other individual characteristics, so that the words you want the target to appear to say take on his specific manner of speaking and any of his verbal tics. Once the AI has built this model, input your script and wait until audio that closely resembles the sound of the target talking is produced. This process can take anywhere from a few minutes to a few days, depending on the AI model being used and the complexity and quality of the recording it will deliver.
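That workflow—analyze a voice sample, then synthesize new speech from a script—can be sketched as a two-stage pipeline. Every class, function, and file name below is hypothetical; the stubs only mark where a real voice-cloning model would do the heavy lifting:

```python
from dataclasses import dataclass, field

@dataclass
class VoiceProfile:
    """Distinctive characteristics extracted from the target's audio sample."""
    pitch_hz: float          # average fundamental frequency of the voice
    pace_wpm: float          # speaking rate in words per minute
    quirks: list = field(default_factory=list)  # accent markers, verbal tics

def analyze_sample(audio_path: str) -> VoiceProfile:
    """Stage 1: study the sample for speech patterns (stubbed values here)."""
    # A real model would extract these from the waveform itself.
    return VoiceProfile(pitch_hz=110.0, pace_wpm=140.0, quirks=["soft consonants"])

def synthesize(profile: VoiceProfile, script: str) -> bytes:
    """Stage 2: render the script in the cloned voice (stubbed here)."""
    # A real model would return audio data; this stub just tags the script.
    return f"[{profile.pace_wpm:.0f} wpm] {script}".encode()

profile = analyze_sample("bob_ross_interview.wav")  # hypothetical file
audio = synthesize(profile, "Let's add a happy little tree.")
```

The two stages mirror the process described above: the profile is built once from the sample, and can then be reused to voice any number of scripts.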
The Future
Deepfakes are an increasingly widespread technology that will affect our lives in the near future. A few years ago, at a Florida museum, an AI rendition of Salvador Dalí brought several of the artist’s quotes to life, to the amazement of the audience. Someday, students may be able to listen to the life stories of historical figures coming from their “own mouths,” which could increase both the entertainment value of such lessons and the interest of viewers.
Some companies, including Google, TikTok, and Meta, have recently vowed to combat deepfakes on their platforms by utilizing technology to detect and prevent the spread of this kind of content. This decision came in light of the risk of election interference, as a recent voice call deepfake of President Joe Biden began making the rounds, discouraging “individuals from voting in the New Hampshire primary,” according to ABC News. Despite these tech giants beginning to put countermeasures in place, it is clear that deepfakes aren’t leaving the internet. As important as it is to learn the how and why behind deepfake technology, it’s just as important to learn the how and why behind responsible use and creation. The more we know, the more we can protect ourselves and others.