Digital sound-alikes

Revision as of 12:10, 2 April 2020 by Juho Kunsola (rm transclusion of Wikipedia article on w:speech synthesis)

When human testing cannot determine whether a voice is a synthetic fake of some person's voice or an actual recording of that person's real voice, it is a digital sound-alike.


Living people can defend¹ themselves against a digital sound-alike by denying the things it says, if those are presented to them, but dead people cannot. Digital sound-alikes offer criminals new disinformation attack vectors and wreak havoc on provability.

A spectrogram of a male voice saying 'nineteenth century'

Timeline of digital sound-alikes

  • As of 2019, Symantec research knows of 3 cases in which digital sound-alike technology has been used for crimes.[1]

Examples of speech synthesis software not quite able to fool a human yet

There are some other contenders to create digital sound-alikes, though, as of 2019, their speech synthesis in most use scenarios does not yet fool a human, because the results contain telltale signs that give them away as the output of a speech synthesizer.


Documented digital sound-alike attacks


Example of a hypothetical digital sound-alike attack

A very simple example of a digital sound-alike attack is as follows:

Someone uses a digital sound-alike to call somebody's voicemail from an unknown number and to speak, for example, illegal threats. In this example there are at least two victims:

  1. Victim #1 - The person whose voice has been stolen into a covert model, from which a digital sound-alike has been made to frame them for crimes
  2. Victim #2 - The person to whom the illegal threat is presented in recorded form by a digital sound-alike that deceptively sounds like victim #1
  3. Victim #3 - It could also be argued that victim #3 is our law enforcement systems, as they are made to chase after and interrogate the innocent victim #1
  4. Victim #4 - Our judiciary, which prosecutes and possibly convicts the innocent victim #1

Thus it is high time to act and to criminalize the covert modeling of human appearance and voice!


See also in Ban Covert Modeling! wiki


Footnote 1. Whether a suspect can defend against faked synthetic speech that sounds like him/her depends on how up-to-date the judiciary is. If the judiciary has been given no information or instructions about digital sound-alikes, it will likely not believe a defense that denies the recording is of the suspect's voice.