Synthetic human-like fakes: Difference between revisions

<small>[[:File:Deb-2000-reflectance-separation.png|Original picture]]  by [[w:Paul Debevec]] et al. - Copyright ACM 2000 https://dl.acm.org/citation.cfm?doid=311779.344855</small>]]


In the cinemas we have seen digital look-alikes for over 20 years. These digital look-alikes have "clothing" (a simulation of clothing is not clothing) or "superhero costumes" and "superbaddie costumes", and they don't need to care about the laws of physics, let alone the laws of physiology. It is generally accepted that digital look-alikes made their public debut in the sequels of The Matrix, i.e. [[w:The Matrix Reloaded]] and [[w:The Matrix Revolutions]], released in 2003. It can be considered almost certain that it was not possible to make these before the year 1999, as the final piece of the puzzle to make a (still) digital look-alike that passes human testing, the [[Glossary#Reflectance capture|reflectance capture]] over the human face, was achieved for the first time in 1999 at the [[w:University of Southern California]] and was presented to the crème de la crème
of the computer graphics field in their annual gathering SIGGRAPH 2000.<ref name="Deb2000">
{{cite book
=== University of Florida published an antidote to synthetic human-like fake voices in 2022 ===


'''2022''' saw a brilliant '''<font color="green">counter-measure</font>''' presented to peers at the 31st [[w:USENIX]] Security Symposium, 10-12 August 2022, by [[w:University of Florida]] researchers: <u><big>'''[[Detecting deep-fake audio through vocal tract reconstruction]]'''</big></u>.


The university's foundation has applied for a patent; let us hope that they will [[w:copyleft]] the patent, as this protective method needs to be rolled out to protect humanity.


'''Below transcluded [[Detecting deep-fake audio through vocal tract reconstruction|from the article]]'''


{{#lst:Detecting deep-fake audio through vocal tract reconstruction|what-is-it}} {{#lst:Detecting deep-fake audio through vocal tract reconstruction|original-reporting}}


'''This new counter-measure needs to be rolled out to protect humans against fake human-like voices.'''


{{#lst:Detecting deep-fake audio through vocal tract reconstruction|embed}}
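The core idea of this counter-measure, checking whether audio could plausibly have been produced by a human vocal tract, can be illustrated with a toy sketch. This is not the authors' actual pipeline: the uniform closed-open tube model, the speed-of-sound constant, and the plausibility thresholds below are all simplifying assumptions chosen for illustration only.

```python
# Toy illustration of deepfake-audio detection via vocal tract
# reconstruction. NOT the USENIX 2022 authors' pipeline: it models the
# vocal tract as a single uniform tube closed at the glottis and open
# at the lips, whose resonances are F_n = (2n - 1) * c / (4 * L).
SPEED_OF_SOUND = 343.0  # m/s in air at 20 C (simplifying assumption)

def tract_lengths_from_formants(formants_hz):
    """Vocal tract length (metres) implied by each measured formant,
    inverting F_n = (2n - 1) * c / (4 * L) for the n-th formant."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4.0 * f)
            for n, f in enumerate(formants_hz, start=1)]

def looks_humanly_plausible(formants_hz, lo=0.10, hi=0.25):
    """True when every formant implies a tract length inside a
    human-plausible range (thresholds are illustrative assumptions)."""
    return all(lo <= length <= hi for length in
               tract_lengths_from_formants(formants_hz))

# A neutral adult vowel has formants near 500, 1500, 2500 Hz,
# each implying a tract of about 17 cm; the second set implies
# an anatomically impossible ~6 cm tract.
print(looks_humanly_plausible([500.0, 1500.0, 2500.0]))   # → True
print(looks_humanly_plausible([1400.0, 4200.0, 7000.0]))  # → False
```

A real detector estimates formants from the audio itself and reconstructs a sequence of cross-sectional areas rather than a single length, but the go/no-go anatomical plausibility check is the same in spirit.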


=== On known history of digital sound-alikes ===


{{#ev:youtube|0sR1rU3gLzQ|640px|right|[https://www.youtube.com/watch?v=0sR1rU3gLzQ Video 'This AI Clones Your Voice After Listening for 5 Seconds' by '2 minute papers' at YouTube] describes the voice thieving machine presented by Google Research at [[w:NeurIPS]] 2018.}}
In November 2024, [[w:Nvidia]] researchers announced that they had made and trained a [https://fugatto.github.io/ Foundational Generative Audio Transformer (Opus 1) at fugatto.github.io], or Fugatto for short. The researchers state: ''Fugatto is a versatile audio synthesis and transformation model capable of following free-form text instructions with optional audio inputs.''<ref>https://research.nvidia.com/publication/2024-11_fugatto-1-foundational-generative-audio-transformer-opus-1</ref>


=== Documented crimes with digital sound-alikes ===
==== 2021 digital sound-alike enabled fraud ====


<section begin=2021 digital sound-alike enabled fraud />The 2nd publicly known fraud done with a digital sound-alike<ref group="1st seen in" name="2021 digital sound-alike fraud case">https://www.reddit.com/r/VocalSynthesis/</ref> took place on Friday 2021-01-15. A bank in Hong Kong was manipulated into wiring money to numerous bank accounts by using a voice stolen from one of their client company's directors. The criminals managed to defraud US$35 million of the U.A.E.-based company's money.<ref name="Forbes reporting on 2021 digital sound-alike fraud">https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/</ref> This case came to light when Forbes saw [https://www.documentcloud.org/documents/21085009-hackers-use-deep-voice-tech-in-400k-theft a document] in which the U.A.E. financial authorities were seeking administrative assistance from the US authorities towards recovering a small portion of the defrauded money that had been sent to bank accounts in the USA.<ref name="Forbes reporting on 2021 digital sound-alike fraud" />


'''Reporting on the 2021 digital sound-alike enabled fraud'''
* [https://www.aiaaic.org/aiaaic-repository/ai-and-algorithmic-incidents-and-controversies/usd-35m-voice-cloning-heist '''''USD 35m voice cloning heist''''' at aiaaic.org], October 2021 AIAAIC repository entry
<section end=2021 digital sound-alike enabled fraud />
'''More fraud cases with digital sound-alikes'''
* [https://www.washingtonpost.com/technology/2023/03/05/ai-voice-scam/ '''''They thought loved ones were calling for help. It was an AI scam.''''' at washingtonpost.com], March 2023 reporting


=== Example of a hypothetical 4-victim digital sound-alike attack ===
It is high time to act and to '''[[Law proposals to ban covert modeling|criminalize the covert modeling of the human voice!]]'''


== Digital look-and-sound-alikes ==
=== Real-time digital look-and-sound-alike fraud in 2023 ===
A '''real-time digital look-and-sound-alike''' in a video call was used to defraud a substantial amount of money in 2023.<ref name="Reuters real-time digital look-and-sound-alike crime  2023">
{{cite web
| url = https://www.reuters.com/technology/deepfake-scam-china-fans-worries-over-ai-driven-fraud-2023-05-22/
| title = 'Deepfake' scam in China fans worries over AI-driven fraud
| last =
| first =
| date = 2023-05-22
| website = [[w:Reuters.com]]
| publisher = [[w:Reuters]]
| access-date = 2023-06-05
| quote =
}}
</ref>
=== Real-time digital look-and-sound-alike fraud in 2024 ===
'''Reporting'''
* [https://edition.cnn.com/2024/02/04/asia/deepfake-cfo-scam-hong-kong-intl-hnk/index.html '''''Finance worker pays out $25 million after video call with deepfake "chief financial officer"''''' at edition.cnn.com], February 2024 reporting by Heather Chen and Kathleen Magramo, CNN
----
== Text syntheses ==
[[w:Chatbot]]s and [[w:spamming]] have existed for a long time, but only now, armed with AI, are they becoming more deceptive.
If the handwriting-like synthesis passes human and media forensics testing, it is a '''digital handwrite-alike'''.


Here we find a possible '''risk''' similar to the one that became a reality when '''[[w:speaker recognition]] systems''' turned out to be instrumental in the development of '''[[#Digital sound-alikes|digital sound-alikes]]'''. After the knowledge needed to recognize a speaker was [[w:Transfer learning|transferred]] into a generative task in 2018 by Google researchers, we can no longer effectively determine for English speakers which recording is of human origin and which is of machine origin.
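The transfer described above can be illustrated with a toy sketch. It mirrors the 2018 idea only in spirit, not in implementation: a speaker-verification system compresses a voice into a fixed-size embedding, and that same embedding is exactly the conditioning signal a generative model needs in order to imitate the voice. The feature vectors and the similarity threshold below are invented for illustration.

```python
# Toy sketch of the speaker-recognition-to-generation transfer risk.
# A verification system summarizes a voice as one fixed-size vector
# (the "embedding"); the same vector can condition a voice generator.
import math

def speaker_embedding(frames):
    """Average per-frame feature vectors (invented toy features here)
    into a single unit-length voice 'fingerprint'."""
    dim = len(frames[0])
    mean = [sum(f[i] for f in frames) / len(frames) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]

def same_speaker(emb_a, emb_b, threshold=0.8):
    """Verification step: cosine similarity of unit embeddings
    against an illustrative threshold."""
    return sum(a * b for a, b in zip(emb_a, emb_b)) >= threshold

# Two clips of one (synthetic) speaker, one clip of another.
alice_1 = speaker_embedding([[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]])
alice_2 = speaker_embedding([[1.1, 0.1, 0.1], [1.0, 0.0, 0.0]])
bob     = speaker_embedding([[0.1, 1.0, 0.9], [0.0, 0.8, 1.0]])

print(same_speaker(alice_1, alice_2))  # → True
print(same_speaker(alice_1, bob))      # → False
```

The risk is structural: any recognizer good enough to say "this embedding identifies Alice" has already distilled what is distinctive about Alice's voice into a reusable vector, which is precisely what a sound-alike synthesizer consumes.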


'''Handwriting-like syntheses''':


== 2020's synthetic human-like fakes ==
* '''2024''' | '''<font color="red">text-to-video model</font>''' | '''[[w:Sora (text-to-video model)]]''', a [[w:text-to-video model]] developed by [[w:OpenAI]] with worrying levels of realism, was announced in 2024 and released to subscription-paying users of ChatGPT in December 2024.


* '''2023''' | '''<font color="orange">Real-time digital look-and-sound-alike crime</font>''' | In April a man in northern China was defrauded of 4.3 million yuan by a criminal employing a digital look-and-sound-alike pretending to be his friend on a video call made with a stolen messaging service account.<ref name="Reuters real-time digital look-and-sound-alike crime  2023"/>
* '''2022''' | '''<font color="green">brief report of counter-measures</font>''' | {{#lst:Protecting world leaders against deep fakes using facial, gestural, and vocal mannerisms|what-is-it}} Publication date 2022-11-23.


* '''2022''' | '''<font color="green">counter-measure</font>''' | {{#lst:Detecting deep-fake audio through vocal tract reconstruction|what-is-it}}
:{{#lst:Detecting deep-fake audio through vocal tract reconstruction|original-reporting}}. Presented to peers in August 2022 and to the general public in September 2022.


* '''2022''' | <font color="orange">'''disinformation attack'''</font> | In June 2022 a fake digital look-and-sound-alike in the appearance and voice of [[w:Vitali Klitschko]], mayor of [[w:Kyiv]], held fake video phone calls with several European mayors. German officials determined that the video phone call was fake by contacting Ukrainian officials. This attempt at a covert disinformation attack was originally reported by [[w:Der Spiegel]].<ref>https://www.theguardian.com/world/2022/jun/25/european-leaders-deepfake-video-calls-mayor-of-kyiv-vitali-klitschko</ref><ref>https://www.dw.com/en/vitali-klitschko-fake-tricks-berlin-mayor-in-video-call/a-62257289</ref>


* '''2022''' | science | [[w:DALL-E]] 2, a successor designed to generate more realistic images at higher resolutions that "can combine concepts, attributes, and styles" was published in April 2022.<ref>{{Cite web |title=DALL·E 2 |url=https://openai.com/dall-e-2/ |access-date=2023-04-22 |website=OpenAI |language=en-US}}</ref> ([https://en.wikipedia.org/w/index.php?title=DALL-E&oldid=1151136107 Wikipedia])


== Contact information of organizations ==
Please contact [[Organizations, studies and events against synthetic human-like fakes|these organizations]] and tell them to work harder against the disinformation weapons.
<references group="contact" />


= 1st seen in =