Synthetic human-like fakes
== Digital sound-alikes ==

=== University of Florida published an antidote to synthetic human-like fake voices in 2022 ===
'''2022''' saw a brilliant '''<font color="green">counter-measure</font>''' presented to peers at the 31st [[w:USENIX]] Security Symposium, 10-12 August 2022, by the [[w:University of Florida]]: <u><big>'''[[Detecting deep-fake audio through vocal tract reconstruction]]'''</big></u>.

The university's foundation has applied for a patent; let us hope that they will [[w:copyleft]] it, as this protective method needs to be rolled out to protect humanity.

'''Below transcluded [[Detecting deep-fake audio through vocal tract reconstruction|from the article]]'''

{{#lst:Detecting deep-fake audio through vocal tract reconstruction|what-is-it}}

{{#lst:Detecting deep-fake audio through vocal tract reconstruction|original-reporting}}

'''This new counter-measure needs to be rolled out to humans to protect them against the fake human-like voices.'''

{{#lst:Detecting deep-fake audio through vocal tract reconstruction|embed}}

=== On known history of digital sound-alikes ===
[[File:Helsingin-Sanomat-2012-David-Martin-Howard-of-University-of-York-on-apporaching-digital-sound-alikes.jpg|right|thumb|338px|A picture of a cut-away titled "''Voice-terrorist could mimic a leader''" from a 2012 [[w:Helsingin Sanomat]] warning that the sound-like-anyone machines are approaching. Thank you to [https://pure.york.ac.uk/portal/en/researchers/david-martin-howard(ecfa9e9e-1290-464f-981a-0c70a534609e).html Prof. David Martin Howard] of the [[w:University of York]], UK, and the anonymous editor for the heads-up.]]

The first English-speaking digital sound-alikes were introduced in 2016 by Adobe and DeepMind, but neither was made publicly available.
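The vocal-tract-reconstruction counter-measure mentioned above works by checking whether audio could have been produced by a physically realizable human vocal tract. The toy sketch below is only an assumption-laden illustration of the underlying acoustics, not the University of Florida method (which reconstructs the whole tract shape): it uses the textbook quarter-wavelength tube model, in which the first formant of a neutral vowel satisfies F1 = c / (4L), and the 10-20 cm "human range" bounds are illustrative assumptions.

```python
# Illustrative sketch only -- NOT the University of Florida method.
# A neutral vocal tract behaves roughly like a tube closed at one end,
# so its first resonance (formant) is F1 = c / (4 * L), hence L = c / (4 * F1).

SPEED_OF_SOUND_M_S = 343.0  # speed of sound in warm air, m/s

def implied_tract_length_cm(f1_hz: float) -> float:
    """Vocal-tract length implied by the first formant (quarter-wave model)."""
    return SPEED_OF_SOUND_M_S / (4.0 * f1_hz) * 100.0

def plausibly_human(f1_hz: float, lo_cm: float = 10.0, hi_cm: float = 20.0) -> bool:
    """Flag audio whose implied tract length falls outside a rough human
    range (the 10-20 cm bounds here are illustrative assumptions)."""
    return lo_cm <= implied_tract_length_cm(f1_hz) <= hi_cm

# A ~500 Hz first formant implies a tract of roughly 17 cm: human-plausible.
# A ~280 Hz first formant would imply a ~31 cm tract: flagged as implausible.
print(implied_tract_length_cm(500.0), plausibly_human(500.0), plausibly_human(280.0))
```

A real detector estimates many such physical quantities per frame and flags audio whose implied anatomy is inconsistent or impossible.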
<section begin=GoogleTransferLearning2018 />
Then in '''2018''' at the '''[[w:Conference on Neural Information Processing Systems]]''' (NeurIPS) the work [http://papers.nips.cc/paper/7700-transfer-learning-from-speaker-verification-to-multispeaker-text-to-speech-synthesis 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis'] ([https://arxiv.org/abs/1806.04558 at arXiv.org]) was presented. The pre-trained model is able to steal voices from a sample of only '''5 seconds''' with almost convincing results.

The iframe below is transcluded from [https://google.github.io/tacotron/publications/speaker_adaptation/ '''''Audio samples from "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis"''''' at google.github.io], the audio samples of a sound-like-anyone machine presented at the 2018 [[w:NeurIPS]] conference by Google researchers. Have a listen.

{{#Widget:Iframe - Audio samples from Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis by Google Research}}

Observe how good the "VCTK p240" system is at deceiving the listener into thinking that a real person is doing the talking.
<section end=GoogleTransferLearning2018 />

''' Reporting on the sound-like-anyone-machines '''
* [https://www.forbes.com/sites/bernardmarr/2019/05/06/artificial-intelligence-can-now-copy-your-voice-what-does-that-mean-for-humans/#617f6d872a2a '''"Artificial Intelligence Can Now Copy Your Voice: What Does That Mean For Humans?"''' May 2019 reporting at forbes.com] on [[w:Baidu Research]]'s attempt at the sound-like-anyone-machine demonstrated at the 2018 [[w:NeurIPS]] conference.

The [https://www.youtube.com/watch?v=0sR1rU3gLzQ video 'This AI Clones Your Voice After Listening for 5 Seconds' by '2 minute papers' at YouTube] (to the right) describes the voice thieving machine presented by Google Research at [[w:NeurIPS]] 2018.
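The 5-second voice theft described above rests on a speaker encoder: a network that maps a short voice sample to a fixed-size "d-vector" embedding, which then conditions the synthesizer to speak in that voice. The sketch below is a deliberately naive stand-in (mean-pooled toy spectral frames instead of a trained LSTM encoder, random data instead of audio) meant only to show why a few seconds suffice: the embedding is an average over frames, so it stabilizes quickly.

```python
import numpy as np

def speaker_embedding(frames: np.ndarray) -> np.ndarray:
    """Map a variable-length sequence of spectral frames to a fixed-size
    'd-vector' by mean-pooling and L2-normalising. (A real speaker encoder
    uses a recurrent network trained with a speaker-verification loss.)"""
    v = frames.mean(axis=0)
    return v / np.linalg.norm(v)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two unit-norm embeddings."""
    return float(np.dot(a, b))

rng = np.random.default_rng(0)
# Two toy "speakers": spectral frames scattered around different means.
speaker_a_mean = rng.normal(size=40)
speaker_b_mean = rng.normal(size=40)
enroll_a = speaker_embedding(speaker_a_mean + 0.1 * rng.normal(size=(50, 40)))
test_a = speaker_embedding(speaker_a_mean + 0.1 * rng.normal(size=(50, 40)))
test_b = speaker_embedding(speaker_b_mean + 0.1 * rng.normal(size=(50, 40)))

# Same speaker scores far higher than a different speaker, even from a
# short sample -- the property the TTS system exploits for cloning.
print(similarity(enroll_a, test_a) > similarity(enroll_a, test_b))
```

In the actual system, this embedding is fed alongside the text into a Tacotron-style synthesizer, so the output speech carries the stolen speaker identity.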
{{#ev:youtube|0sR1rU3gLzQ|640px|right|The [https://www.youtube.com/watch?v=0sR1rU3gLzQ video 'This AI Clones Your Voice After Listening for 5 Seconds' by '2 minute papers' at YouTube] describes the voice thieving machine by Google Research at [[w:NeurIPS]] 2018.}}

=== Documented crimes with digital sound-alikes ===
In 2019, reports of crimes being committed with digital sound-alikes started surfacing. As of January 2022, no reports of attack types other than fraud have been found.

==== 2019 digital sound-alike enabled fraud ====
By 2019 digital sound-alike technology had found its way into the hands of criminals. In '''2019''' [[w:NortonLifeLock|Symantec]] researchers knew of 3 cases where digital sound-alike technology had been used for '''[[w:crime]]'''.<ref name="Washington Post reporting on 2019 digital sound-alike fraud" />

Of these crimes the most publicized was a fraud case in March 2019 in which €220,000 was defrauded with the use of a real-time digital sound-alike.<ref name="WSJ original reporting on 2019 digital sound-alike fraud" /> The company that was the victim of this fraud had bought cyberscam insurance from the French insurer [[w:Euler Hermes]], and the case came to light when Mr.
Rüdiger Kirsch of Euler Hermes informed [[w:The Wall Street Journal]] about it.<ref name="Forbes reporting on 2019 digital sound-alike fraud" />

''' Reporting on the 2019 digital sound-alike enabled fraud '''
* [https://www.wsj.com/articles/fraudsters-use-ai-to-mimic-ceos-voice-in-unusual-cybercrime-case-11567157402 '''''Fraudsters Used AI to Mimic CEO’s Voice in Unusual Cybercrime Case''''' at wsj.com] original reporting, updated 2019-08-30<ref name="WSJ original reporting on 2019 digital sound-alike fraud">{{cite web |url=https://www.wsj.com/articles/fraudsters-use-ai-to-mimic-ceos-voice-in-unusual-cybercrime-case-11567157402 |title=Fraudsters Used AI to Mimic CEO’s Voice in Unusual Cybercrime Case |last=Stupp |first=Catherine |date=2019-08-30 |website=[[w:wsj.com]] |publisher=[[w:The Wall Street Journal]] |access-date=2022-01-01 |quote=}}</ref>
* [https://www.bbc.com/news/technology-48908736 '''"Fake voices 'help cyber-crooks steal cash'"''' at bbc.com] July 2019 reporting<ref name="BBC reporting on 2019 digital sound-alike fraud">{{cite web |url=https://www.bbc.com/news/technology-48908736 |title=Fake voices 'help cyber-crooks steal cash' |date=2019-07-08 |website=[[w:bbc.com]] |publisher=[[w:BBC]] |access-date=2020-07-22 |quote=}}</ref>
* [https://www.washingtonpost.com/technology/2019/09/04/an-artificial-intelligence-first-voice-mimicking-software-reportedly-used-major-theft/ '''"An artificial-intelligence first: Voice-mimicking software reportedly used in a major theft"''' at washingtonpost.com] documents a [[w:fraud]] committed with a digital sound-like-anyone-machine, September 2019 reporting.<ref name="Washington Post reporting on 2019 digital sound-alike fraud">{{cite web |url= https://www.washingtonpost.com/technology/2019/09/04/an-artificial-intelligence-first-voice-mimicking-software-reportedly-used-major-theft/ |title= An artificial-intelligence first: Voice-mimicking software reportedly used in a major theft |last=
Harwell |first= Drew |date= 2019-09-04 |website= [[w:washingtonpost.com]] |publisher= [[w:Washington Post]] |access-date= 2020-07-22 |quote=Researchers at the cybersecurity firm Symantec said they have found at least three cases of executives’ voices being mimicked to swindle companies. Symantec declined to name the victim companies or say whether the Euler Hermes case was one of them, but it noted that the losses in one of the cases totaled millions of dollars.}}</ref>
* [https://www.forbes.com/sites/jessedamiani/2019/09/03/a-voice-deepfake-was-used-to-scam-a-ceo-out-of-243000/ '''''A Voice Deepfake Was Used To Scam A CEO Out Of $243,000''''' at forbes.com], 2019-09-03 reporting<ref name="Forbes reporting on 2019 digital sound-alike fraud">{{cite web |url=https://www.forbes.com/sites/jessedamiani/2019/09/03/a-voice-deepfake-was-used-to-scam-a-ceo-out-of-243000/ |title=A Voice Deepfake Was Used To Scam A CEO Out Of $243,000 |last=Damiani |first=Jesse |date=2019-09-03 |website=[[w:Forbes.com]] |publisher=[[w:Forbes]] |access-date=2022-01-01 |quote=According to a new report in The Wall Street Journal, the CEO of an unnamed UK-based energy firm believed he was on the phone with his boss, the chief executive of firm’s the German parent company, when he followed the orders to immediately transfer €220,000 (approx. $243,000) to the bank account of a Hungarian supplier. In fact, the voice belonged to a fraudster using AI voice technology to spoof the German chief executive. Rüdiger Kirsch of Euler Hermes Group SA, the firm’s insurance company, shared the information with WSJ.}}</ref>

==== 2020 digital sound-alike fraud attempt ====
In June 2020, fraud was attempted with a poor-quality pre-recorded digital sound-alike, with voicemail as the delivery method.
([https://soundcloud.com/jason-koebler/redacted-clip '''Listen to a redacted clip''' at soundcloud.com]) The recipient at a tech company did not believe the voicemail to be real and alerted the company, which realized that someone had tried to scam it. The company called in Nisos to investigate the issue. Nisos analyzed the evidence and concluded that it was certainly a fake, one with aspects of a cut-and-paste job to it. Nisos prepared [https://www.nisos.com/blog/synthetic-audio-deepfake/ a report titled '''''"The Rise of Synthetic Audio Deepfakes"''''' at nisos.com] on the issue and shared it with Motherboard, part of [[w:Vice (magazine)]], prior to its release.<ref name="Vice reporting on 2020 digital sound-alike fraud attempt">{{cite web |url=https://www.vice.com/en/article/pkyqvb/deepfake-audio-impersonating-ceo-fraud-attempt |title=Listen to This Deepfake Audio Impersonating a CEO in Brazen Fraud Attempt |last=Franceschi-Bicchierai |first=Lorenzo |date=2020-07-23 |website=[[w:Vice.com]] |publisher=[[w:Vice (magazine)]] |access-date=2022-01-03 |quote=}}</ref>

==== 2021 digital sound-alike enabled fraud ====
<section begin=2021 digital sound-alike enabled fraud />The 2nd publicly known fraud done with a digital sound-alike<ref group="1st seen in" name="2021 digital sound-alike fraud case">https://www.reddit.com/r/VocalSynthesis/</ref> took place on Friday 2021-01-15. A bank in Hong Kong was manipulated into wiring money to numerous bank accounts by using a voice stolen from one of their client company's directors. The fraudsters managed to defraud $35 million of the U.A.E.-based company's money.<ref name="Forbes reporting on 2021 digital sound-alike fraud">https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/</ref> This case came to light when Forbes saw [https://www.documentcloud.org/documents/21085009-hackers-use-deep-voice-tech-in-400k-theft a document] in which the U.A.E.
financial authorities were seeking administrative assistance from the US authorities in order to recover a small portion of the defrauded money that had been sent to bank accounts in the USA.<ref name="Forbes reporting on 2021 digital sound-alike fraud" />

'''Reporting on the 2021 digital sound-alike enabled fraud'''
* [https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/ '''''Fraudsters Cloned Company Director’s Voice In $35 Million Bank Heist, Police Find''''' at forbes.com] 2021-10-14 original reporting
* [https://www.unite.ai/deepfaked-voice-enabled-35-million-bank-heist-in-2020/ '''''Deepfaked Voice Enabled $35 Million Bank Heist in 2020''''' at unite.ai]<ref group="1st seen in" name="2021 digital sound-alike fraud case" /> reporting updated on 2021-10-15
* [https://www.aiaaic.org/aiaaic-repository/ai-and-algorithmic-incidents-and-controversies/usd-35m-voice-cloning-heist '''''USD 35m voice cloning heist''''' at aiaaic.org], October 2021 AIAAIC repository entry
<section end=2021 digital sound-alike enabled fraud />

'''More fraud cases with digital sound-alikes'''
* [https://www.washingtonpost.com/technology/2023/03/05/ai-voice-scam/ '''''They thought loved ones were calling for help. It was an AI scam.''''' at washingtonpost.com], March 2023 reporting

=== Example of a hypothetical 4-victim digital sound-alike attack ===
A very simple example of a digital sound-alike attack is as follows: someone puts a digital sound-alike to call somebody's voicemail from an unknown number and to speak, for example, illegal threats.
In this example there are at least two victims, and arguably four:

# Victim #1 - The person whose voice has been stolen into a covert model, and a digital sound-alike made from it, to frame them for crimes
# Victim #2 - The person to whom the illegal threat is presented in recorded form by a digital sound-alike that deceptively sounds like victim #1
# Victim #3 - It could also be viewed that victim #3 is our law enforcement system, as it is put to chase after and interrogate the innocent victim #1
# Victim #4 - Our judiciary, which prosecutes and possibly convicts the innocent victim #1

=== Examples of speech synthesis software not quite able to fool a human yet ===
Some other contenders to create digital sound-alikes exist, though as of 2019 their speech synthesis in most use scenarios does not yet fool a human, because the results contain telltale signs that give them away as coming from a speech synthesizer.

* '''[https://lyrebird.ai/ Lyrebird.ai]''' [https://www.youtube.com/watch?v=xxDBlZu__Xk (listen)]
* '''[https://candyvoice.com/ CandyVoice.com]''' [https://candyvoice.com/demos/voice-conversion (test with your choice of text)]
* '''[https://cstr-edinburgh.github.io/merlin/ Merlin]''', a [[w:neural network]]-based speech synthesis system by the Centre for Speech Technology Research at the [[w:University of Edinburgh]]
* [https://papers.nips.cc/paper/8206-neural-voice-cloning-with-a-few-samples '''Neural Voice Cloning with a Few Samples''' at papers.nips.cc], [[w:Baidu Research]]'s shot at the sound-like-anyone-machine, did not convince in '''2018'''

=== Temporal limit of digital sound-alikes ===
[[File:Edison_and_phonograph_edit1.jpg|thumb|left|210px|[[w:Thomas Edison]] and his early [[w:phonograph]]. Cropped from [[w:Library of Congress]] copy, ca. 1877 (probably 18 April 1878)]]

The temporal limit of whom, dead or living, the digital sound-alikes can attack is defined by the '''[[w:history of sound recording]]'''.
The article starts by mentioning that the invention of the [[w:phonograph]] by [[w:Thomas Edison]] in '''1877''' is considered the start of sound recording.

The '''phonautograph''' is the earliest known device for recording [[w:sound]]. Previously, tracings had been obtained of the sound-producing vibratory motions of [[w:tuning forks]] and other objects by physical contact with them, but not of actual sound waves as they propagated through air or other media. Invented by Frenchman [[w:Édouard-Léon Scott de Martinville]], it was patented on March 25, '''1857'''.<ref name="NPR-Phonautograph">{{Cite news |url=https://www.npr.org/templates/story/story.php?storyId=89380697 |title=1860 'Phonautograph' Is Earliest Known Recording |last=Flatow |first=Ira |date=April 4, 2008 |work=NPR |access-date=2012-12-09 |language=en}}</ref>

Apparently, it did not occur to anyone before the 1870s that the recordings, called '''phonautograms''', contained enough information about the sound that they could, in theory, be '''used to recreate it'''. Because the phonautogram tracing was an insubstantial two-dimensional line, direct physical playback was impossible in any case. Several phonautograms recorded '''before 1861''' were successfully played back as sound in '''2008''' by optically scanning them and using a computer to process the scans into digital audio files. ([[w:Phonautograph|Wikipedia]])

[[File:Spectrogram-19thC.png|thumb|right|640px|A [[w:spectrogram]] of a male voice saying 'nineteenth century']]

=== What should we do about digital sound-alikes? ===
Living people can defend<ref group="footnote" name="judiciary maybe not aware">Whether a suspect can defend against faked synthetic speech that sounds like him/her depends on how up-to-date the judiciary is.
If no information and instructions about digital sound-alikes have been given to the judiciary, they likely will not believe the defense's denial that the recording is of the suspect's voice.</ref> themselves against digital sound-alikes by denying the things the digital sound-alike says, if those are presented to them, but dead people cannot.

Digital sound-alikes offer criminals new disinformation attack vectors and wreak havoc on provability. For these reasons the bannable '''raw materials''', i.e. covert voice models, '''[[Law proposals to ban covert modeling|should be prohibited by law]]''' in order to protect humans from abuse by criminal parties. It is high time to act and to '''[[Law proposals to ban covert modeling|criminalize the covert modeling of the human voice!]]'''