A finance employee at a multinational company joins a video call with what appears to be her CFO and several colleagues. She is instructed to authorize a $25 million wire transfer for a confidential acquisition. Every face on the call is a real-time AI deepfake. She authorizes it. The money is gone within hours. This is not a hypothetical: it happened in Hong Kong in 2024, and it is happening at smaller scales every week.
A deepfake video call scam uses AI-generated facial and voice synthesis to impersonate a real person — typically an executive, family member, or trusted colleague — in a live video call. Unlike pre-recorded deepfakes used in disinformation campaigns, these attacks use real-time video manipulation software that overlays a synthetic face onto the attacker’s live webcam feed, enabling an interactive conversation that appears to show the impersonated person speaking and responding naturally.
The technology crossed a critical threshold of practical accessibility around 2023–2024. Tools that previously required significant computing resources and expertise are now available as consumer applications. Combined with the vast availability of video footage of public figures and business professionals on LinkedIn, YouTube, and company websites, the barrier to mounting a convincing deepfake impersonation has dropped dramatically.
The $25 million Hong Kong case — where a finance employee authorized a wire transfer after a multi-person deepfake video conference — demonstrated that the attack is now viable against sophisticated corporate targets with established security awareness. The attack succeeded not because the employee was naive, but because the visual and procedural authenticity of the interaction was sufficient to override standard skepticism.
Attackers gather video and audio of the person being impersonated: earnings call recordings, conference presentations, LinkedIn video posts, YouTube appearances, or interview footage. For corporate executive impersonation, this material is often publicly available. For family impersonation, social media provides the source. The quality and quantity of available footage largely determine the realism of the resulting deepfake.
The collected video and audio are fed into a deepfake generation tool, which trains a model on the impersonated person’s facial features, expressions, and voice characteristics. Real-time deepfake tools can then overlay the synthetic face onto the attacker’s live webcam feed, mapping the attacker’s own facial movements onto the impersonated person’s appearance as the call happens. Voice cloning runs in parallel, replacing the attacker’s voice with a synthesized version of the impersonated person’s.
For corporate attacks, the pretext is typically urgent and confidential: a surprise acquisition, a regulatory response, or a sensitive transaction requiring immediate action and discretion. The urgency discourages verification, and the confidentiality requirement justifies bypassing standard approval chains. The victim is contacted by email first, often from a spoofed or compromised account, then invited to a video call for “sensitive discussion.”
The attacker joins the call through standard video conferencing software with the deepfake overlay active. In the Hong Kong case, multiple deepfake participants were present simultaneously. The call follows a scripted narrative that moves quickly to the financial request — limiting the time available for the victim to notice visual artifacts or ask probing questions. Once authorization is given, funds move immediately.
End the call. Call back on a known, pre-existing number from your own contacts — not any number provided during the suspicious call. A real executive answers on their known company number. A real family member answers on their saved phone number. A deepfake operation cannot replicate this. No amount of visual AI sophistication can intercept a call you initiate independently to a number you already have. This out-of-band callback is the single most reliable defense against deepfake impersonation at any level of realism.
Finance employees with payment authorization authority are the highest-value targets for deepfake video fraud because they can authorize large, irreversible transfers. The combination of executive authority (the CFO appearing on screen) with procedural legitimacy (a video call is a normal business tool) creates an authorization environment that feels safe. The urgency and confidentiality framing prevents the additional verification steps that would be standard in any other large transfer context.
The Hong Kong case demonstrated a sophisticated evolution: multiple participants in a single video call were simultaneously deepfaked. Seeing several familiar faces — not just one — on a call dramatically reduces the impulse to question the authenticity of any individual participant. The social proof of a normal-seeming meeting overrides individual skepticism. Organizations must treat any video call resulting in a financial request with the same verification requirements regardless of how many familiar faces appear.
The most effective organizational defense is procedural, not technical. No video call — regardless of who appears — should be sufficient sole authorization for any outgoing wire transfer above a defined threshold. A second, independent verification through a different channel (calling back on a known number, a physical in-person confirmation, dual approval from a second employee) must be required. This procedural requirement cannot be circumvented by any level of visual AI sophistication.
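The rule above can be expressed as a simple release check. What follows is a minimal sketch, not a real payment system's API: the channel labels, the `WireRequest` type, the `may_release` function, and the 10,000 USD threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical channel labels; a real organization would define its own.
VIDEO_CALL = "video_call"
CALLBACK_KNOWN_NUMBER = "callback_known_number"
IN_PERSON = "in_person"
SECOND_APPROVER = "second_approver"

# Channels treated as independent of the video call (an assumption
# based on the three examples named in the text).
INDEPENDENT_CHANNELS = {CALLBACK_KNOWN_NUMBER, IN_PERSON, SECOND_APPROVER}


@dataclass
class WireRequest:
    amount_usd: float
    request_channel: str  # where the request arrived; kept for the audit trail
    verifications: set = field(default_factory=set)  # confirmations completed so far


def may_release(request: WireRequest, threshold_usd: float = 10_000.0) -> bool:
    """Return True only if the transfer satisfies the two-channel rule."""
    # Transfers at or below the threshold follow the normal process.
    if request.amount_usd <= threshold_usd:
        return True
    # Above the threshold, at least one independent confirmation is required.
    # A video call never counts, no matter who appeared on it.
    return bool(request.verifications & INDEPENDENT_CHANNELS)


if __name__ == "__main__":
    req = WireRequest(amount_usd=25_000_000, request_channel=VIDEO_CALL)
    print(may_release(req))  # blocked: no independent confirmation yet
    req.verifications.add(CALLBACK_KNOWN_NUMBER)
    print(may_release(req))  # allowed after the callback is recorded
```

The design choice worth noting is that the video call is deliberately absent from the set of accepted confirmations, so the check gives the same answer however convincing the call was.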