Deepfake Voice Technology Iterates on Old Phishing Strategies

As AI and deepfake technology grow more sophisticated, the risk that deepfakes pose to firms and individuals grows with them. Increasingly capable software and algorithms have helped malicious hackers, scammers and cyber criminals stay one step ahead of the authorities, making attacks harder both to prepare for and to defend against.

Most readers are probably familiar with traditional cyber attacks that involve system hacking, viruses and ransomware. However, cyber crime took a leap forward in 2019, when the CEO of a UK-based energy firm fell victim to a scam built upon a phone call that used deepfake audio technology. Believing he was speaking to his boss, the CEO transferred almost $250k on the instructions of an AI-generated voice. In the aftermath, some cybersecurity experts have been left wondering whether deepfake audio technology represents the next major security concern, and the wider world has been left scrambling for ways to spot this looming threat.

Voice Cloning and AI Audio: A New Frontier For Cybercrime

The audio deepfake scam is, without a doubt, one of the more bizarre applications of deepfake technology. However, as we’ve seen, it’s one which can clearly be applied successfully. It was so convincing, in fact, that the CEO who fell victim to the cyberattack stated on the record that he recognized his boss’s voice by its ‘slight German accent’ and ‘melodic lilt.’ Furthermore, by all accounts, the cybercriminals’ tech is becoming more difficult to detect by the month.

Sophisticated technology aside, the process behind the construction of audio deepfakes is a surprisingly simple one. Hackers have adapted machine learning technology to clone an individual’s voice, usually by using spyware and devices that allow the attacker to gather recordings of their victim speaking, ideally several hours’ worth. The more data they are able to collect, and the better the quality of the recordings, the more accurate and potentially harmful the voice clone will be in practice.

Once enough recordings have been gathered, the malicious hacker’s AI gets to work ‘learning’ how to mimic the target. The AI will use what are known as generative adversarial networks (GANs), in which two models compete against one another: one creates a fake and the other attempts to identify its flaws. With each round, the fake gets a little harder to tell apart from the real thing. This process continues until a reliable mimic is achieved, and it often succeeds after analyzing as few as twenty minutes of recordings.

Worryingly for many executives (most notably those at large firms), such recordings are woefully easy to gather. Speeches are recorded and shared online via social media, while phone calls, interviews and everyday conversations are relatively simple to gain access to. With enough data in the bank, the level of accuracy achieved by audio deepfake files is as impressive as it is frightening, and the criminals are able to get the deepfake to say whatever they want it to.

At present, many of the recorded examples of deepfake audio scams have been those which were ultimately unsuccessful in their aims. However, when one considers that the 2019 attempted coup in Gabon is believed to have been triggered in part by suspicions that a presidential video address was a deepfake, it becomes devastatingly clear how impactful this technology can be.
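The contest between a faker and a flaw-spotter described in this section can be illustrated with a deliberately tiny example. The sketch below is not a voice cloner and is not taken from any real attack tool: it assumes a made-up one-dimensional "authentic" signal (numbers centered on `REAL_MEAN`), a generator with a single learnable offset `b`, and a logistic discriminator with parameters `w` and `c`. All of these names and values, including the small `decay` term added to keep training steady, are illustrative assumptions.

```python
import math
import random

# Toy stand-in for "authentic" data: numbers drawn around REAL_MEAN (assumed values).
REAL_MEAN, REAL_STD = 4.0, 0.5


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow on extreme values.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def train_toy_gan(steps=3000, batch=32, lr=0.05, decay=0.1, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    b = 0.0          # generator's only parameter: fake sample = b + noise
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(REAL_MEAN, REAL_STD) for _ in range(batch)]
        fake = [b + rng.gauss(0.0, 1.0) for _ in range(batch)]

        # Discriminator step: raise log D(real) + log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x in real:
            p = sigmoid(w * x + c)
            grad_w += (1.0 - p) * x
            grad_c += (1.0 - p)
        for x in fake:
            p = sigmoid(w * x + c)
            grad_w -= p * x
            grad_c -= p
        w += lr * (grad_w / batch - decay * w)
        c += lr * (grad_c / batch - decay * c)

        # Generator step: raise log D(fake), i.e. try to fool the discriminator.
        grad_b = sum((1.0 - sigmoid(w * x + c)) * w for x in fake) / batch
        b += lr * grad_b

    return b, w, c


learned_offset, disc_w, disc_c = train_toy_gan()
print(f"generator offset after training: {learned_offset:.2f} (real mean {REAL_MEAN})")
```

In this toy, the generator never sees the real data directly; it only receives the discriminator's verdict, which is the essence of the adversarial setup. Real voice-cloning systems work on audio features with far larger neural networks, so this should be read as a sketch of the training loop only.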

Next-Level Phishing Meets Next-Gen Security

Regular, non-deepfake phishing scams remain remarkably popular and successful, with as many as 85% of organizations finding themselves targeted. However, one of the key reasons why voice phishers present such a potent threat to the big-money world of corporate security is that deepfake audio hackers are able to circumvent that most fabled of cybersecurity protections: the corporate VPN. Your computer network can be protected against the majority of sophisticated malware and viruses, and security software is consistently updated to look out for new concerns and virus types. AI-generated phone calls, however, depend solely upon human error, gullibility, and trust, and that’s what makes them potentially so dangerous.

When one considers that even the smartphones we keep perma-clutched in our hands are nowhere near as secure as we believe, it isn’t difficult to see a multitude of ways in which cyber criminals can penetrate our defenses. It stands to reason, therefore, that the answer to defending our privacy and vulnerabilities from deepfake audio may come in the form of AI solutions specifically formulated to root it out. Scientists are working on complex and far-reaching algorithms that have the capacity to learn human speech patterns and peculiarities and that can be used to detect deepfake audio tracks. These algorithms seek out ‘deformities’ in speech and automatically compare recordings with authentic speech files, and they are likely to be built into anti-voice-cloning security devices that become widespread in the coming years. Essentially, the security systems of the very near future will be advanced imitations of the same AI tools which malicious hackers are using in their attempts to defraud their victims.

Experts are also keen to highlight practical steps that we can all take to protect ourselves from deepfake audio scams. One of the easiest, and most effective, ways to identify a deepfake scam is to simply hang up your phone and call the number back. The majority of deepfake scams are carried out with the use of a burner VoIP account, set up to contact targets on the hackers’ behalf. By calling back, victims should be able to figure out straight away whether or not they were talking to a real person.
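For teams that want to turn the hang-up-and-call-back advice into a written procedure, the rule can be sketched as a few lines of code. This is a minimal illustration under stated assumptions rather than a real telephony integration: `PhoneRequest`, `verify_by_callback`, and `fake_phone_network` are hypothetical names, the phone numbers are fictional, and `place_call` simply stands in for a person dialing the number and noting who answers.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PhoneRequest:
    claimed_identity: str  # who the caller says they are
    caller_number: str     # number the inbound call came from


def verify_by_callback(
    request: PhoneRequest,
    place_call: Callable[[str], Optional[str]],
) -> bool:
    """Hang up, call the number back, and approve only if the claimed person answers.

    `place_call` is a stand-in for a human dialing the number. It returns the
    name of whoever picks up, or None if the call does not reach a real person.
    """
    answered_by = place_call(request.caller_number)
    return answered_by is not None and answered_by == request.claimed_identity


# Illustrative stand-in for the phone network (entirely made up).
def fake_phone_network(number: str) -> Optional[str]:
    reachable = {"+44 20 7946 0000": "Parent company CEO"}
    return reachable.get(number)  # a discarded burner number reaches no one


genuine = PhoneRequest("Parent company CEO", "+44 20 7946 0000")
spoofed = PhoneRequest("Parent company CEO", "+44 20 7946 0999")

print(verify_by_callback(genuine, fake_phone_network))
print(verify_by_callback(spoofed, fake_phone_network))
```

The design choice worth noting is that the decision depends only on the outcome of the callback, never on how convincing the original caller sounded, which is exactly the property that makes the check useful against a cloned voice.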

Deepfake Audio Scams: A Very Real Threat on the Horizon

At present, deepfake audio scams are seemingly few and far between, with the technology simply not widespread enough for them to be a far-reaching concern for the majority of professionals and private individuals. This is, of course, likely to change in the near future. AI advances at an eye-watering rate, and the tech which makes deepfaking possible is becoming more accessible and easier to use. While private security systems and international efforts to tackle cybercrime are quickly catching up with malicious hackers, those hackers are a creative bunch who will never stop searching for ways to move one step ahead. With that in mind, the best advice is to remain vigilant and prepared, as deepfake audio scams could very well become the next big issue for cybersecurity to deal with.

About the Author: Bernard Brode (@BernieBrode) is a product researcher at Microscopic Machines and remains eternally curious about where the intersection of AI, cybersecurity, and nanotechnology will eventually take us.

Editor’s Note: The opinions expressed in this guest author article are solely those of the contributor, and do not necessarily reflect those of Tripwire, Inc.
