
Well-Done Tofu

Thoughtfully prepared ideas and releases from and by our team.

How Do Deepfakes Work? Understanding the Technology and Preventing Fraud in April 2026

10 minute read


If you're wondering how deepfakes work, the short answer is two neural networks locked in an arms race until one produces media so convincing the other can't tell what's real. The result is video, audio, and imagery that maps one person's identity onto another with enough precision to fool recruiters during live interviews. This isn't speculative. It's happening right now in your applicant pipeline. The tools that once required a research lab now run on a gaming laptop with a few minutes of source footage. We're breaking down the architecture, the fraud patterns it makes possible, and the detection methods that actually work when you're assessing a candidate in real time.

TL;DR:

  • Deepfakes use GANs to generate synthetic video and audio that maps one person's face or voice onto another in real time
  • People correctly identify high-quality deepfakes only 24.5% of the time. Human intuition alone can't catch them
  • Deepfake fraud caused over $200M in losses in Q1 2025, with hiring fraud a growing vector most orgs aren't watching
  • No single federal law bans all deepfakes. 169 state laws exist but most cover only intimate imagery and election fraud
  • Tofu's DeepDetect analyzes lip sync, eye movement, and voice patterns during Zoom/Teams calls to catch deepfakes before offers go out

How Deepfakes Work: The Technology Behind AI-Generated Media

At the core of every deepfake is a Generative Adversarial Network, or GAN. The architecture pits two neural networks against each other: a generator that creates synthetic media and a discriminator that tries to identify what's fake. Each time the discriminator catches a fake, the generator adjusts; each time a fake slips through, the discriminator adjusts. They train in a loop until the generator's output reliably fools the discriminator.
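That adversarial loop can be made concrete with a toy, self-contained sketch: a two-parameter generator learns to match a 1D Gaussian "real" distribution against a logistic discriminator, with gradients written out by hand. Everything here (the data, the models, the learning rate) is illustrative, nothing like the deep networks behind real deepfakes, but the training structure is the same:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D "real" data: samples from N(3, 0.5). The generator must learn
# to map noise z ~ N(0, 1) into this distribution.
def real_batch(n):
    return rng.normal(3.0, 0.5, n)

a, b = 1.0, 0.0   # generator G(z) = a*z + b
w, c = 0.0, 0.0   # discriminator D(x) = sigmoid(w*x + c), P(x is real)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

lr, batch = 0.05, 128
for step in range(4000):
    # --- Discriminator update: push D(real) -> 1 and D(fake) -> 0 ---
    x_real = real_batch(batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = a * z + b
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    g_real = d_real - 1.0   # BCE gradient w.r.t. score, real labeled 1
    g_fake = d_fake         # BCE gradient w.r.t. score, fake labeled 0
    w -= lr * (np.mean(g_real * x_real) + np.mean(g_fake * x_fake))
    c -= lr * (np.mean(g_real) + np.mean(g_fake))

    # --- Generator update: push D(G(z)) -> 1 (non-saturating loss) ---
    z = rng.normal(0.0, 1.0, batch)
    x_fake = a * z + b
    g = (sigmoid(w * x_fake + c) - 1.0) * w   # dLoss/dG(z)
    a -= lr * np.mean(g * z)
    b -= lr * np.mean(g)

fake_mean = b  # E[G(z)] = b, since E[z] = 0
print(f"generator mean after training: {fake_mean:.2f} (real mean 3.0)")
```

The generator starts producing samples centered at 0 and, driven only by the discriminator's feedback, drifts toward the real distribution's mean of 3. Scale the same loop up to convolutional networks over pixels and you have the core of deepfake synthesis.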

The result is synthetic video, audio, or imagery that maps one person's likeness onto another with frightening precision. Face-swapping deepfakes work by detecting facial landmarks in the source video, then warping and blending a target face into each frame in real time. Audio deepfakes use similar adversarial training on voice data to clone speech patterns, cadence, and tone. Face swapping and voice cloning are both examples of synthetic media that GAN-based models produce at scale.
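The landmark-alignment step at the heart of face swapping can be sketched in a few lines: given matching landmark sets from the source and target faces, a least-squares affine fit recovers the warp that maps one pose onto the other. The landmark coordinates below are made up for illustration; a real pipeline would get them from a facial landmark detector running on each frame:

```python
import numpy as np

# Hypothetical landmarks: eye corners, nose tip, mouth corners (x, y),
# as a facial landmark detector might return them for one frame.
src = np.array([[30., 40.], [70., 40.], [50., 60.], [38., 80.], [62., 80.]])

# Ground-truth transform for this demo: rotate, scale, and translate the
# source landmarks to build a synthetic "target" face pose.
theta, scale, shift = 0.2, 1.1, np.array([12., -5.])
R = scale * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
tgt = src @ R.T + shift

def fit_affine(src_pts, tgt_pts):
    """Least-squares 2D affine map sending src_pts onto tgt_pts."""
    X = np.hstack([src_pts, np.ones((len(src_pts), 1))])  # homogeneous coords
    A, *_ = np.linalg.lstsq(X, tgt_pts, rcond=None)
    return A  # shape (3, 2): linear part stacked on translation

A = fit_affine(src, tgt)
warped = np.hstack([src, np.ones((len(src), 1))]) @ A
err = np.max(np.abs(warped - tgt))
print(f"max landmark alignment error: {err:.2e}")
```

In a full face swap, the same fitted transform is applied to every pixel of the source face before it is blended into the target frame; doing this 30 times a second is what makes real-time swaps possible.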

What makes this tech genuinely dangerous is how accessible it has become for applicant fraud. Generating a convincing video deepfake once required a research lab. Now it requires a consumer GPU and a few minutes of source footage.

The History and Evolution of Deepfake Technology

The origin of deepfakes is academic. Researchers at Bell Labs and MIT were experimenting with facial animation and video synthesis as far back as the 1990s. These were slow, expensive, and largely confined to controlled lab environments.

The word "deepfake" itself traces back to 2017, when a Reddit user of that name began posting AI-generated face-swap videos. That moment cracked the door open. Within months, the technique spread across forums, hobbyist communities, and eventually commercial apps.

What happened next was less an evolution than an acceleration. Deepfake files jumped from 500,000 in 2023 to a projected 8 million in 2025, with annual growth nearing 900%. What once took a research team now takes a laptop. And according to Fortune's 2026 outlook, that pace shows no signs of slowing. The timeline from "research curiosity" to "hiring threat" turned out to be shorter than most anticipated.

Types of Deepfakes and Their Applications

Not all deepfakes are created the same way, or for the same purpose. The format shapes the threat.

By Format

  • Video deepfakes: Real-time face-swapping or full-face generation layered over live or recorded footage, often convincing enough to pass through a standard video interview
  • Audio deepfakes: Voice cloning that replicates speech patterns, tone, and cadence from as little as a few seconds of source audio
  • Image deepfakes: Static synthetic faces used in fake profile photos or fabricated ID documents
  • Text deepfakes: AI-generated writing that mimics a specific person's style, used in phishing and impersonation schemes

By Use Case

Legitimate applications do exist. Film studios use face-swapping for de-aging actors. Accessibility tools use voice cloning for people who've lost speech. Training simulations use synthetic media to generate realistic scenarios without real subjects.

The malicious applications are harder to ignore. Fraud operators clone executive voices to authorize wire transfers. Political actors generate synthetic video to fabricate statements. In hiring, a candidate applies with a real resume, then shows up to a video interview as someone else entirely using a face-swap filter. The job gets offered. The wrong person shows up on day one.

| Deepfake Type | Technical Method | Primary Fraud Application | Detection Difficulty | Key Detection Signals |
| --- | --- | --- | --- | --- |
| Video | Real-time face-swapping using GANs with facial landmark mapping and frame-by-frame warping | Hiring fraud where candidates interview as someone else, bypassing identity verification during live video calls | High (24.5% human detection rate for quality deepfakes) | Lip sync misalignment, unnatural eye movement, facial edge blur during head turns, lighting inconsistencies |
| Audio | Voice cloning via adversarial training on voice samples, replicating speech patterns and tone | Executive impersonation to authorize wire transfers, approval chain fraud, fake voice verification | Medium to high (compression artifacts are often subtle) | Cadence irregularities, lack of natural breath patterns, tonal anomalies, spectral frequency inconsistencies |
| Image | Static synthetic face generation using neural networks trained on facial datasets | Fabricated profile photos, fake ID documents, synthetic identity creation for application fraud | Medium (depends on image quality and context) | Pixel-level artifacts, compression anomalies, inconsistent lighting across facial features, unnatural skin texture |
| Text | AI text generators trained to mimic writing style, vocabulary, and communication patterns | Phishing campaigns, executive impersonation via email or messaging, fabricated credentials and references | Low to medium (style analysis can catch patterns) | Deviation from baseline writing patterns, unnatural phrasing, inconsistent vocabulary complexity, timing anomalies |

How Deepfakes Are Used for Fraud and Financial Crime

Deepfakes have become the preferred tool for impersonation at scale, and the numbers are hard to ignore. Deepfake fraud caused more than $200 million in losses in North America in Q1 2025 alone. That's a single quarter.

The Arup incident became the case study everyone cited. In February 2024, a finance employee was tricked into wiring $25 million after joining a video conference where every other participant, including the CFO, was a deepfake. No one in the call was real.

The fraud patterns break into a few clear categories:

  • CEO and executive impersonation to authorize wire transfers or data access
  • Identity verification bypass using synthetic face video during KYC checks
  • Hiring fraud, where a candidate interviews as one person and shows up as another
  • Voice cloning to impersonate employees in internal approval chains

The hiring vector is the one most organizations aren't watching. A bad actor applies with a real resume, passes application screening, then runs a face-swap filter during the video interview. The recruiter sees a convincing face. The offer goes out to the wrong person entirely.

Are Deepfakes Illegal? Understanding U.S. Federal and State Laws

The short answer is: it depends on what the deepfake does, and where you are. There's no single federal law that makes all deepfakes illegal. What exists is a patchwork of state statutes, targeted federal bills, and a handful of signed laws covering specific categories of harm.

Since 2022, 169 laws have been enacted across the U.S. targeting AI deepfake use. The coverage is uneven. Most states focus on nonconsensual intimate imagery and election interference. California and Texas both have laws targeting deepfakes in political ads and explicit content. New Jersey passed legislation criminalizing nonconsensual deepfake pornography. But workplace fraud, identity impersonation in hiring, and financial crime largely fall outside these statutes.

At the federal level, the TAKE IT DOWN Act, signed in 2025, targets nonconsensual intimate deepfakes and requires removal of flagged content within 48 hours. The DEEPFAKES Accountability Act goes further, proposing mandatory watermarking of synthetic media and criminal penalties for malicious use. As of early 2026, it has not passed.

Where the Gaps Are

Even in states with deepfake laws on the books, enforcement is difficult. Most statutes require proving intent, and attribution is hard when the tools are anonymous and the actors are overseas. DPRK IT workers running deepfake interview fraud, for instance, operate well outside U.S. jurisdiction.

For organizations in hiring, legal recourse is rarely the first line of defense. Detection has to come before litigation.

How Deepfake Detection Technology Works

Detection works by inverting the same logic used to create synthetic media. Generators leave traces. The goal is finding them before a human ever has to.


The main signals detectors look for:

  • Spectral artifacts: compression inconsistencies and frequency anomalies invisible to the naked eye but detectable in pixel-level analysis
  • Liveness checks: micro-expressions, involuntary eye movement, and blinking patterns that AI-generated faces still struggle to replicate accurately
  • Lip sync misalignment: even small timing gaps between audio and mouth movement signal manipulation
  • Voice pattern analysis: cloned audio carries subtle cadence and tonal irregularities that deviate from natural speech
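The lip-sync signal in particular lends itself to a compact sketch: cross-correlate the audio loudness envelope against a mouth-opening measure and look for the lag that maximizes agreement. The synthetic signals below stand in for what a real pipeline would extract from the call (audio energy per frame, lip-landmark distance per frame); a face-swap layer tends to introduce exactly this kind of fixed delay:

```python
import numpy as np

# Synthetic per-frame series: audio loudness envelope and a mouth-opening
# measure, with the mouth signal delayed by 3 frames to simulate the lag
# a face-swap pipeline can introduce.
rng = np.random.default_rng(1)
n, true_lag = 200, 3
audio = np.clip(rng.normal(0, 1, n), 0, None)        # bursty "speech" envelope
mouth = np.roll(audio, true_lag) + rng.normal(0, 0.05, n)

def estimate_lag(a, m, max_lag=10):
    """Lag (in frames) of m relative to a that maximizes correlation."""
    a = (a - a.mean()) / a.std()
    m = (m - m.mean()) / m.std()
    lags = list(range(-max_lag, max_lag + 1))
    scores = [np.dot(a, np.roll(m, -k)) for k in lags]
    return lags[int(np.argmax(scores))]

lag = estimate_lag(audio, mouth)
print(f"estimated audio/mouth lag: {lag} frames")
```

A persistent nonzero lag across a window of frames is the kind of evidence an automated detector can accumulate quietly while the interview proceeds.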

In a hiring context, these signals get harder to catch because the interview environment adds noise. Lighting varies. Connections drop. Recruiters are not forensic analysts. That's why automated detection running behind the scenes matters more than human intuition alone.
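As a rough illustration of the spectral-artifact idea, the sketch below compares the high-frequency energy of a smooth "natural" image patch against the same patch with a periodic grid added, mimicking the upsampling artifacts some generators leave behind. The patches, cutoff, and amplitudes are all illustrative, not a production detector:

```python
import numpy as np

def high_freq_energy_ratio(img, cutoff=0.25):
    """Fraction of spectral energy above cutoff * Nyquist (2D FFT)."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    yy, xx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    radius = np.sqrt((yy / (h / 2)) ** 2 + (xx / (w / 2)) ** 2)
    return spec[radius > cutoff].sum() / spec.sum()

rng = np.random.default_rng(2)
# "Natural" patch: smooth gradient plus mild noise, energy mostly low-freq.
smooth = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
natural = smooth + 0.01 * rng.normal(size=(64, 64))
# "Synthetic" patch: same content plus a periodic high-frequency grid,
# the kind of upsampling artifact some generators leave behind.
grid = 0.1 * np.sign(np.sin(np.arange(64) * np.pi / 2))
synthetic = natural + np.add.outer(grid, grid)

r_nat = high_freq_energy_ratio(natural)
r_syn = high_freq_energy_ratio(synthetic)
print(f"high-freq energy: natural={r_nat:.3f} synthetic={r_syn:.3f}")
```

The synthetic patch carries measurably more energy in the high-frequency band even though the two look nearly identical to the eye, which is the point: the trace lives in the spectrum, not the picture.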

How to Spot a Deepfake: Detection Methods and Warning Signs

People correctly identify high-quality video deepfakes only 24.5% of the time. That's roughly coin-flip odds, and experienced observers don't perform meaningfully better than untrained ones.

Some visual and audio cues are worth knowing:

  • Unnatural eye movement or reduced blinking frequency that feels subtly robotic
  • Facial edges that blur or shimmer during head turns, especially around the hairline
  • Lighting inconsistencies where shadows don't match the environment
  • Lip sync that drifts slightly off, particularly on hard consonants like "p" or "b"
  • Voice that sounds compressed or lacks natural breath patterns between sentences
  • Accessories like glasses or earrings that render inconsistently frame to frame

The problem is that a recruiter assessing a candidate in a 30-minute call is not running frame-by-frame forensic analysis. High-quality deepfakes are built to pass casual scrutiny. Human vigilance helps at the margins, but it was never designed to be the primary defense.
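One of these cues, blinking frequency, is easy to sketch as an automated check: given a per-frame eye-aspect-ratio series (as a landmark tracker might produce), count open-to-closed transitions and compare the rate against the normal human range. The series below is synthetic, built to contain exactly ten blinks over a minute:

```python
import numpy as np

def count_blinks(ear, closed_thresh=0.2):
    """Count blinks as open-to-closed transitions of the eye-aspect-ratio."""
    closed = ear < closed_thresh
    return int(np.sum(closed[1:] & ~closed[:-1]))

fps, seconds = 30, 60
ear = np.full(fps * seconds, 0.3)            # eyes open: EAR around 0.3
for start in range(90, fps * seconds, 180):  # one blink every 6 s
    ear[start:start + 4] = 0.1               # ~4 frames of eye closure

blinks = count_blinks(ear)
rate = blinks / seconds * 60
print(f"{blinks} blinks in {seconds}s ({rate:.0f}/min)")
```

Adults typically blink on the order of 10 to 20 times per minute; a sustained rate far below that over a long window is a flag worth raising, though only in combination with the other signals, since nervous interviewees also blink oddly.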

Protecting Your Hiring Funnel from Deepfakes with Tofu

Detection methods covered earlier work in theory. In a live interview with a real recruiter on a clock, they're nearly impossible to execute manually.

That's the gap Tofu fills. DeepDetect runs behind the scenes during Zoom, Teams, and Google Meet calls, analyzing lip sync, eye movement, facial construction, and voice patterns in real time without interrupting the interview. Recruiters don't need to become forensic analysts. The system flags the manipulation before the offer goes out.

FraudDetect covers the application layer, catching synthetic identities, DPRK IT workers, and location spoofing before a recruiter wastes a single hour on a fraudulent candidate. Together, both products cover the entire funnel: the person who applies is the person who interviews, and the person who interviews is the person you hire. If that chain breaks anywhere, Tofu catches it.

Human intuition was never built for this threat. Automated detection was.

Final Thoughts on Deepfakes and Identity Verification

The real problem with detecting deepfakes is that you're expected to do it while running a normal interview. Nobody has time to analyze lip sync frame-by-frame or run spectral analysis on audio in the middle of a candidate conversation. That's exactly why we built automated checks that flag fraud in real time without slowing your team down. You hire people, we verify they're real.

FAQs

How can recruiters tell if someone is using a deepfake during a video interview?
Recruiters correctly identify high-quality deepfakes only 24.5% of the time—roughly coin-flip odds. Manual detection is unreliable because you're looking for subtle lip sync drift, unnatural eye movement, and lighting inconsistencies while trying to run an actual interview. Automated detection running behind the scenes is the only reliable method.
What types of deepfake fraud are most common in hiring?
Face-swapping deepfakes during video interviews are the biggest threat—a candidate applies with a real resume, then shows up to the call as someone else using real-time synthetic video. Voice cloning is also rising, where bad actors use AI to replicate speech patterns and tone. Both are used to get the wrong person hired.
Are deepfakes illegal in the United States?
It depends on what the deepfake does and where you are. 169 laws have been enacted since 2022, but most target nonconsensual intimate imagery and election interference—not workplace fraud or hiring impersonation. There's no federal law that makes all deepfakes illegal, and enforcement is difficult when actors operate overseas.
How does deepfake detection technology actually work during live interviews?
Detection tools analyze lip sync alignment, eye movement patterns, facial construction consistency, and voice pattern anomalies in real time. They look for spectral artifacts and compression inconsistencies that human observers can't catch. Systems like DeepDetect run behind the scenes during Zoom, Teams, and Google Meet calls without interrupting the interview flow.
Can deepfakes bypass standard identity verification in hiring?
Yes. A bad actor can pass application screening with a real resume and legitimate credentials, then use a face-swap filter during video interviews. The recruiter sees a convincing face, the offer goes out, and the wrong person shows up on day one. That's why continuous identity verification across the entire hiring funnel matters more than single-point checks.
What is a GAN and how does it relate to deepfake creation?
A GAN (Generative Adversarial Network) is the core technology behind deepfakes. It consists of two neural networks—a generator that creates synthetic media and a discriminator that tries to identify fakes—that train against each other in a loop until the output becomes convincing enough to fool even the discriminator itself.
How much training data or source footage is needed to create a deepfake?
Creating a convincing deepfake now requires only a few minutes of source footage and a consumer-grade GPU. What once demanded a research lab and extensive data can now be done on a gaming laptop with minimal source material.
What was the Arup deepfake fraud incident?
In February 2024, a finance employee at Arup was tricked into wiring $25 million after joining a video conference where every participant, including the CFO, was a deepfake. No one in the call was real, making it one of the most cited cases of deepfake financial fraud.
Can deepfake technology be used for legitimate purposes?
Yes. Film studios use face-swapping for de-aging actors, accessibility tools use voice cloning for people who've lost speech, and training simulations use synthetic media to generate realistic scenarios without requiring real subjects.
What is the TAKE IT DOWN Act and what does it cover?
The TAKE IT DOWN Act, signed in 2025, specifically targets nonconsensual intimate deepfakes and requires the removal of flagged content within 48 hours. However, it doesn't cover workplace fraud, hiring impersonation, or most financial crimes involving deepfakes.
How fast is deepfake content growing online?
Deepfake files jumped from 500,000 in 2023 to a projected 8 million in 2025, representing annual growth nearing 900%. The acceleration shows no signs of slowing according to 2026 forecasts.
What are spectral artifacts and how do they help detect deepfakes?
Spectral artifacts are compression inconsistencies and frequency anomalies that are invisible to the naked eye but detectable through pixel-level analysis. These traces are left behind by the generation process and serve as key signals that automated detection systems look for.
Where did the term 'deepfake' originally come from?
The word 'deepfake' traces back to 2017, when a Reddit user of that name began posting AI-generated face-swap videos. Within months, the technique spread across forums, hobbyist communities, and eventually commercial apps.
Why is hiring fraud through deepfakes harder to prosecute than other types?
Most deepfake laws focus on nonconsensual intimate imagery and election interference, not workplace fraud or hiring impersonation. Additionally, enforcement is difficult when actors operate overseas—like DPRK IT workers running interview fraud—well outside U.S. jurisdiction.
What are the most common visual signs that someone might be using a deepfake?
Common visual cues include unnatural eye movement or reduced blinking, facial edges that blur during head turns (especially around the hairline), lighting inconsistencies where shadows don't match the environment, and accessories like glasses or earrings that render inconsistently frame to frame.
