Back in the spring of 2015, I wrote about five types of social engineering attacks against which users should protect themselves. One of the techniques I discussed is called pretexting. It’s when an attacker creates a pretext, or a fabricated scenario, to trick a target.

Attackers will assume any pretense to achieve their nefarious ends. Some of their shams can be quite complicated. Take business email compromise (BEC) scams, otherwise known as CEO fraud. For an attacker to set up this pretext, they must hack a business executive’s email account. Attackers will usually pull off this preliminary step by subjecting a target to a whaling attack.

If the attack proves successful, they can leverage their target’s email to impersonate the executive, contact the HR department, and request employees’ W-2 forms and other personally identifiable information. Alternatively, they can contact someone in finance and request that they make a fraudulent wire transfer to an account under the attacker’s control. This latter form of fraud has cost individual businesses tens of millions of dollars. According to the FBI, BEC scams have victimized a total of 22,000 companies and caused more than $3 billion in losses since 2013.

CEO fraud assumes an attacker has compromised a target’s email account. Were a computer criminal to pose as the executive outside of a text-based medium, the jig would be up. That’s still the case as of January 2017. But a new technology could change all of that.

Project VoCo: The Excitement and Concern

Meet Project VoCo. It’s short for “Photoshopping Voiceovers,” one of 11 experimental technologies demoed at Adobe MAX 2016. VoCo is a sound engineer’s dream in that it allows a controller to edit or insert words into an audio recording without having to bring the voiceover artist back into the studio. All the software needs is about 20 minutes of a person’s speech to make the process work.

Project VoCo lives up to that promise in Adobe’s demo video.

Clearly, lots of people are excited about the prospect of being able to alter audio recordings. But not everyone is jumping on the bandwagon. Dr. Eddy Borges Rey, a lecturer in media and technology at the University of Stirling, is concerned by the development. He revealed as much to BBC News:

“It seems that Adobe’s programmers were swept along with the excitement of creating something as innovative as a voice manipulator, and ignored the ethical dilemmas brought up by its potential misuse. Inadvertently, in its quest to create software to manipulate digital media, Adobe has [already] drastically changed the way we engage with evidential material such as photographs. This makes it hard for lawyers, journalists, and other professionals who use digital media as evidence. In the same way that Adobe’s Photoshop has faced legal backlash after the continued misuse of the application by advertisers, Voco, if released commercially, will follow its predecessor with similar consequences.”

That’s a good point. If proper safeguards aren’t implemented, Project VoCo could undermine the authenticity of audio recordings. Attackers could then exploit the technology to fool others into thinking someone said something they did not, all toward a nefarious end like CEO fraud. All they would need to do is conduct a bit of research beforehand.

Laura V. explains in Social-Engineer Newsletter how one such attack might proceed:

  1. An attacker performs OSINT and discovers an organization’s CEO will be away on business for a few days or a week.
  2. The bad actor records a fake message from the CEO using VoCo that asks the head of finance to call them back for instructions regarding an upcoming payment. They leave that message as a voicemail for the head of finance.
  3. The head of finance receives the message, thereby establishing the attacker’s pretext.
  4. The attacker receives a call from the head of finance. Using VoCo, the former instructs the latter to deliver funds to an account under their control.

The scenario above doesn’t cover all the attacks that Project VoCo might facilitate. Bad actors could use the technology to target voice-activated assistants like Amazon Echo and Google Home so that they can break into a person’s home. They could also create embarrassing recordings that undermine the public standing of executives and politicians.


It’s unclear when Photoshopping Voiceovers will become publicly available. When it does, it’ll take even more time to determine how easy it is for people to identify an audio recording that someone’s modified using the technology. With that in mind, organizations’ best hope of preventing attacks such as those described above is to train their employees to be on the lookout for vishing and spear-phishing attacks. If an attacker can’t build a pretext, they won’t be able to leverage VoCo to make fraudulent wire transfers or steal sensitive information.