DeepFake Detection: The Importance of Training Data
Preprocessing and Practical Considerations
Dr. Symeon (Akis) Papadopoulos – @sympap
MeVer Team @ Information Technologies Institute (ITI) /
Centre for Research & Technology Hellas (CERTH)
Joint work with Polychronis Charitidis, George Kordopatis-Zilos and
Yiannis Kompatsiaris
AI4Media Workshop on GANs for Media Content Generation, Oct 1, 2020
Media Verification
(MeVer)
DeepFakes
• AI-generated content that seems
authentic to the human eye
• Most common form: generation and
manipulation of human face
Source: https://en.wikipedia.org/wiki/Deepfake
Source: https://www.youtube.com/watch?v=iHv6Q9ychnA
Source: Media Forensics and DeepFakes: an overview
Manipulation types
Facial manipulations can
be categorised into four
main groups:
• Entire face synthesis
• Attribute manipulation
• Identity swap
• Expression swap
Source: DeepFakes and Beyond: A Survey of Face Manipulation and Fake
Detection (Tolosana et al., 2020)
Tolosana, R., et al. (2020). Deepfakes and beyond:
A survey of face manipulation and fake
detection. arXiv preprint arXiv:2001.00179.
Verdoliva, L. (2020). Media forensics and deepfakes:
an overview. arXiv preprint arXiv:2001.06564.
Mirsky, Y., & Lee, W. (2020). The Creation and
Detection of Deepfakes: A Survey. arXiv preprint
arXiv:2004.11138.
WeVerify Project
• WeVerify aims to detect disinformation in social media and expose
misleading and fabricated content
• Partners: Univ. Sheffield, OntoText, ATC, DW, AFP, EU DisinfoLab, CERTH
• A key outcome is a platform for collaborative content verification,
tracking, and debunking
• Currently, we are developing a deepfake detection service for the
WeVerify platform
• Participation in DeepFake Detection Challenge
https://weverify.eu/
DeepFake Detection Challenge
• Goal: detect videos with facial or voice manipulations
• 2,114 teams participated in the challenge
• Log Loss error evaluation on public and private validation sets
• Public evaluation contained videos with similar transformations as the
training set
• Private evaluation contained organic videos and videos with unknown
transformations from the Internet
• Our final standings:
• public leaderboard: 49 (top 3%) with 0.295 Log Loss error
• private leaderboard: 115 (top 5%) with 0.515 Log Loss error
Source: https://www.kaggle.com/c/deepfake-detection-challenge
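To put the Log Loss figures above in context, here is a minimal sketch of how binary log loss is computed. The label encoding (1 = FAKE, 0 = REAL), the clipping constant and the toy predictions are assumptions made for illustration, not values from the challenge.

```python
import math

def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy; labels are 0 (REAL) or 1 (FAKE) by assumption."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

labels = [1, 0, 1, 0]  # toy ground truth
# Always answering 0.5 gives ln(2), roughly 0.693, whatever the labels are.
uninformed = log_loss(labels, [0.5, 0.5, 0.5, 0.5])
# One confident mistake (third video) is penalised heavily.
confident_wrong = log_loss(labels, [0.99, 0.01, 0.01, 0.01])
```

A detector that always outputs 0.5 scores about 0.693, so lower values indicate a useful model, while a single confident error can push the score above that level.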
DeepFake Detection Challenge - dataset
• Dataset of more than 110k videos
• Approx. 20k REAL and the rest are FAKE
• FAKE videos generated from the REAL
• Models used:
• DeepFake AutoEncoder (DFAE)
• Morphable Mask faceswap (MM/NN)
• Neural Talking Heads (NTH)
• FSGAN
• StyleGAN
Dolhansky, B., Bitton, J., Pflaum, B., Lu, J., Howes, R., Wang,
M., & Ferrer, C. C. (2020). The DeepFake Detection Challenge
Dataset. arXiv preprint arXiv:2006.07397.
Dataset preprocessing - Issues
• Face dataset quality depends on face extraction accuracy (Dlib,
mtcnn, facenet-pytorch, Blazeface)
• Generally all face extraction libraries generate a number of false
positive detections
• Manual tuning can improve the quality of the generated dataset
[Figure: processing pipeline: video corpus → frame extraction → face extraction → deep learning model]
Noisy data creeping into the training set
• Faces extracted at 1 fps from the Kaggle DeepFake Detection Challenge dataset
videos using a PyTorch implementation of MTCNN face detection
• Observation: within a video, false detections are fewer than true detections
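As a rough illustration of sampling at 1 fps, the sketch below picks which frame indices to pass to the face detector. The function name `frames_at_rate` and its parameters are hypothetical; the slides do not describe how the sampling was implemented.

```python
def frames_at_rate(total_frames, video_fps, sample_fps=1.0):
    """Return the indices of frames to keep when sampling at sample_fps."""
    if total_frames <= 0 or video_fps <= 0 or sample_fps <= 0:
        return []
    # Number of source frames between two kept frames (never below 1).
    step = max(1.0, video_fps / sample_fps)
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices

# A 10-second clip at 30 fps yields 10 frames for face detection.
sampled = frames_at_rate(300, 30.0)
```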
Our “noise” filtering approach
• Compute face embeddings for each detected face in video
• Similarity calculation between all face embeddings in a video → similarity graph construction
• Nodes represent faces; two faces are connected if their similarity is greater than 0.8 (solid lines)
• Drop components smaller than N/2 (e.g. component 2)
• N is the number of frames that contain face detections (true or false).
Advantages
• Simple and fast procedure
• No need for manual tuning of the face extraction settings
• Clusters of distinct faces in cases of multiple persons in the video
• This information can be utilized in various ways (e.g. predictions per face)
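The filtering steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: cosine similarity is assumed as the similarity measure, the embeddings are toy 2-D vectors, and the names `cosine` and `filter_faces` are hypothetical.

```python
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity (assumed measure; the slides only say 'similarity')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def filter_faces(embeddings, frame_ids, threshold=0.8):
    """Build the similarity graph and keep components with at least N/2 faces.

    embeddings: one vector per detected face
    frame_ids:  index of the frame each detection came from
    Returns a list of components, each a list of detection indices.
    """
    n = len(embeddings)
    parent = list(range(n))  # union-find forest over detections

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Connect two detections when their similarity exceeds the threshold.
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) > threshold:
                parent[find(i)] = find(j)

    components = defaultdict(list)
    for i in range(n):
        components[find(i)].append(i)

    n_frames = len(set(frame_ids))  # N: frames containing any detection
    return [c for c in components.values() if len(c) >= n_frames / 2]

# Four frames with one consistent face, plus one false detection in frame 2.
embeddings = [[1.0, 0.0], [0.98, 0.05], [0.97, 0.1], [0.99, 0.02], [0.0, 1.0]]
frame_ids = [0, 1, 2, 3, 2]
kept = filter_faces(embeddings, frame_ids)
```

In this toy input the four similar detections form one component and are kept, while the isolated false detection forms a component of size 1, below N/2 = 2, and is dropped. Each surviving component corresponds to one distinct face, which is what makes per-face predictions possible.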
[Figure: faces extracted from multiple video frames, grouped into Component 1 and Component 2]
Experiments
• We trained multiple DeepFake detection models on the DFDC dataset
with and without (baseline) our proposed approach
• Evaluation on three datasets: a) Celeb-DF, b) FaceForensics++, c) DFDC subset
• For evaluation we examined two aggregation approaches
• avg: prediction is the average of all face predictions
• face: predictions are averaged per face, and the video prediction is the maximum of these per-face averages
• Results for the EfficientNet-B4 model in terms of Log loss error:
Pre-processing   Celeb-DF         FaceForensics++   DFDC
                 avg     face     avg     face      avg     face
baseline         0.510   0.511    0.563   0.563     0.213   0.198
proposed         0.458   0.456    0.497   0.496     0.195   0.173
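The two aggregation schemes compared in the table can be illustrated with a short sketch. The input layout (a dictionary from face identifier to per-frame fake probabilities), the function names and the scores are assumptions for illustration only.

```python
def aggregate_avg(face_scores):
    """avg: mean over every face prediction in the video."""
    all_scores = [s for scores in face_scores.values() for s in scores]
    return sum(all_scores) / len(all_scores)

def aggregate_face(face_scores):
    """face: mean per face, then the maximum over faces."""
    return max(sum(s) / len(s) for s in face_scores.values())

# Toy video with two people; only person_b looks manipulated.
scores = {
    "person_a": [0.1, 0.2, 0.1, 0.2],
    "person_b": [0.9, 0.8],
}
video_avg = aggregate_avg(scores)    # diluted by person_a's low scores
video_face = aggregate_face(scores)  # driven by the most suspicious face
```

With several people in a video, the plain average is pulled down by the unmanipulated face, whereas the per-face maximum reflects the single face that looks fake.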
Our DFDC Approach - details
• Applied proposed preprocessing approach to clean the generated face dataset
• Face augmentation:
• Horizontal & vertical flip, random crop, rotation, image compression, Gaussian & motion
blurring, brightness, saturation & contrast transformation
• Trained three different models: a) EfficientNet-B3, b) EfficientNet-B4, c) I3D*
• Models trained on face level:
• I3D trained on 10 consecutive face images to exploit temporal information
• EfficientNet models trained on single face images
• Per model:
• Added two dense layers with dropout after the backbone architecture with 256 and 1 units
• Used the sigmoid activation for the last layer
* ignoring the optical flow stream
Our DFDC approach – inference
[Figure: inference pipeline: pre-processing → model inference → post-processing]
Lessons from other DFDC teams
• Most approaches ensemble multiple EfficientNet architectures (B3-B7) and
some of them were trained on different seeds
• ResNeXt was another architecture used by top-performing solutions,
combined with 3D architectures such as I3D, 3D ResNet34, MC3 & R(2+1)D
• Several approaches increased the margin of the detected facial bounding
box to further improve results.
• We used an additional margin of 20% but other works proposed a higher proportion.
• To improve generalization:
• Domain-specific augmentations: a) half face removal horizontally or vertically, b)
landmark (eyes, nose, or mouth) removal
• Mixup augmentations
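Enlarging the detected bounding box, as mentioned above, might look like the sketch below. How the 20% margin is distributed is not stated in the slides; this version assumes the box grows by 20% of its width and height in total, split equally between opposite sides, and clips the result to the frame. The function name `expand_box` is hypothetical.

```python
def expand_box(box, frame_w, frame_h, margin=0.2):
    """Grow an (x1, y1, x2, y2) box by `margin` of its size, clipped to the frame."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * margin / 2  # assumed: half of the margin on each side
    dy = (y2 - y1) * margin / 2
    return (
        max(0, int(round(x1 - dx))),
        max(0, int(round(y1 - dy))),
        min(frame_w, int(round(x2 + dx))),
        min(frame_h, int(round(y2 + dy))),
    )

# A 100x100 face box in a 640x480 frame becomes 120x120.
crop = expand_box((100, 100, 200, 200), 640, 480)
```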
Practical challenges
• Limited generalization
• This observation applies to most submissions. The winning team scored
0.20336 in public validation and only 0.42798 in the private (Log Loss)
• Overfitting
• The best submission in the public leaderboard scored 0.19207, but in the
private evaluation its error was 0.57468, dropping it to 904th position!
• Broad problem scope
• The term DeepFake may refer to every possible manipulation and generation
• Constantly increasing manipulation and generation techniques
• A detector is only trained with a subset of these manipulations
DeepFake Detection in the Wild
• Videos in the wild usually contain multiple scenes
• Only a subset of these scenes may contain DeepFakes
• Detection process might be slow for multi-shot videos (even short ones)
• Low quality videos
• Low quality faces tend to fool classifiers
• Small detected and fast-moving faces
• Usually lead to noisy predictions
• Changes in the environment
• Moving obstacles in front of the faces
• Changes in lighting
DeepFake Detection Service @ WeVerify
https://www.youtube.com/watch?v=cVljNVV5VPw&ab_channel=TheFakening
More details at TTO 2020
Charitidis, P., Kordopatis-Zilos, G., Papadopoulos, S., & Kompatsiaris, Y.
(2020). Investigating the impact of preprocessing and prediction
aggregation on the DeepFake detection task. Proceedings of the
Conference for Truth and Trust Online (TTO) [to appear],
https://arxiv.org/abs/2006.07084
https://truthandtrustonline.com/
Thank you!
Dr. Symeon Papadopoulos
papadop@iti.gr
@sympap
Media Verification (MeVer)
https://mever.iti.gr/
@meverteam https://ai4media.eu/
https://weverify.eu/

Contenu connexe

Tendances

Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?Ian McCarthy
 
Deep Fakes Artificial Intelligence.pptx
Deep Fakes Artificial Intelligence.pptxDeep Fakes Artificial Intelligence.pptx
Deep Fakes Artificial Intelligence.pptxNilayDeshmukh3
 
DeepFake Seminar.pptx
DeepFake Seminar.pptxDeepFake Seminar.pptx
DeepFake Seminar.pptxspchinchole20
 
Detection and recognition of face using neural network
Detection and recognition of face using neural networkDetection and recognition of face using neural network
Detection and recognition of face using neural networkSmriti Tikoo
 
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Lalit Jain
 
DeepFake_Seminar.pptx
DeepFake_Seminar.pptxDeepFake_Seminar.pptx
DeepFake_Seminar.pptxsandeshsb
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learningleopauly
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNNNoura Hussein
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual IntroductionLukas Masuch
 
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION khanam22
 
Face detection and recognition
Face detection and recognitionFace detection and recognition
Face detection and recognitionPankaj Thakur
 
Object detection presentation
Object detection presentationObject detection presentation
Object detection presentationAshwinBicholiya
 

Tendances (20)

Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?
 
Deep fakes and beyond
Deep fakes and beyondDeep fakes and beyond
Deep fakes and beyond
 
Deep fake
Deep fakeDeep fake
Deep fake
 
Final year ppt
Final year pptFinal year ppt
Final year ppt
 
Deep Fakes Artificial Intelligence.pptx
Deep Fakes Artificial Intelligence.pptxDeep Fakes Artificial Intelligence.pptx
Deep Fakes Artificial Intelligence.pptx
 
DeepFake Seminar.pptx
DeepFake Seminar.pptxDeepFake Seminar.pptx
DeepFake Seminar.pptx
 
Detection and recognition of face using neural network
Detection and recognition of face using neural networkDetection and recognition of face using neural network
Detection and recognition of face using neural network
 
1.Introduction to deep learning
1.Introduction to deep learning1.Introduction to deep learning
1.Introduction to deep learning
 
Deepfakes
DeepfakesDeepfakes
Deepfakes
 
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
 
DeepFake_Seminar.pptx
DeepFake_Seminar.pptxDeepFake_Seminar.pptx
DeepFake_Seminar.pptx
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
 
Deep learning
Deep learningDeep learning
Deep learning
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
 
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION
 
Face detection and recognition
Face detection and recognitionFace detection and recognition
Face detection and recognition
 
Object detection presentation
Object detection presentationObject detection presentation
Object detection presentation
 
ESE presentation.pptx
ESE presentation.pptxESE presentation.pptx
ESE presentation.pptx
 

Similaire à Deepfake detection

Face Recognition System for Door Unlocking
Face Recognition System for Door UnlockingFace Recognition System for Door Unlocking
Face Recognition System for Door UnlockingHassan Tariq
 
Biometric Recognition using Deep Learning
Biometric Recognition using Deep LearningBiometric Recognition using Deep Learning
Biometric Recognition using Deep LearningSahithiKotha2
 
VOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradtsVOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradtsvoginip
 
Automated_attendance_system_project.pptx
Automated_attendance_system_project.pptxAutomated_attendance_system_project.pptx
Automated_attendance_system_project.pptxNaveensai51
 
Deep learning on face recognition (use case, development and risk)
Deep learning on face recognition (use case, development and risk)Deep learning on face recognition (use case, development and risk)
Deep learning on face recognition (use case, development and risk)Herman Kurnadi
 
Technical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot AnalysisTechnical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot AnalysisPositive Hack Days
 
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde..."Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...Edge AI and Vision Alliance
 
Attendance System using Facial Recognition
Attendance System using Facial RecognitionAttendance System using Facial Recognition
Attendance System using Facial RecognitionIRJET Journal
 
Human Face Identification
Human Face IdentificationHuman Face Identification
Human Face Identificationbhupesh lahare
 
Presentation of the InVID verification technologies at IPTC 2018
Presentation of the InVID verification technologies at IPTC 2018Presentation of the InVID verification technologies at IPTC 2018
Presentation of the InVID verification technologies at IPTC 2018InVID Project
 
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...Vienna Data Science Group
 
FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FASSOLD Deep learning for semantic analysis and annotation of conventional an...FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FASSOLD Deep learning for semantic analysis and annotation of conventional an...FIAT/IFTA
 
Recent Advances in Face Analysis: database, methods, and software.
Recent Advances in Face Analysis: database, methods, and software.Recent Advances in Face Analysis: database, methods, and software.
Recent Advances in Face Analysis: database, methods, and software.Taowei Huang
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1San Kim
 
IRJET- Real Time Attendance System using Face Recognition
IRJET-  	  Real Time Attendance System using Face RecognitionIRJET-  	  Real Time Attendance System using Face Recognition
IRJET- Real Time Attendance System using Face RecognitionIRJET Journal
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecognIlyas CHAOUA
 
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsBreaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsJason Anderson
 
Face Recognition Home Security System
Face Recognition Home Security SystemFace Recognition Home Security System
Face Recognition Home Security SystemSuman Mia
 
Face Recognition Technology
Face Recognition TechnologyFace Recognition Technology
Face Recognition TechnologyShravan Halankar
 

Similaire à Deepfake detection (20)

Face Recognition System for Door Unlocking
Face Recognition System for Door UnlockingFace Recognition System for Door Unlocking
Face Recognition System for Door Unlocking
 
Biometric Recognition using Deep Learning
Biometric Recognition using Deep LearningBiometric Recognition using Deep Learning
Biometric Recognition using Deep Learning
 
VOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradtsVOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradts
 
A guide to Face Detection in Python.pdf
A guide to Face Detection in Python.pdfA guide to Face Detection in Python.pdf
A guide to Face Detection in Python.pdf
 
Automated_attendance_system_project.pptx
Automated_attendance_system_project.pptxAutomated_attendance_system_project.pptx
Automated_attendance_system_project.pptx
 
Deep learning on face recognition (use case, development and risk)
Deep learning on face recognition (use case, development and risk)Deep learning on face recognition (use case, development and risk)
Deep learning on face recognition (use case, development and risk)
 
Technical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot AnalysisTechnical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot Analysis
 
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde..."Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
 
Attendance System using Facial Recognition
Attendance System using Facial RecognitionAttendance System using Facial Recognition
Attendance System using Facial Recognition
 
Human Face Identification
Human Face IdentificationHuman Face Identification
Human Face Identification
 
Presentation of the InVID verification technologies at IPTC 2018
Presentation of the InVID verification technologies at IPTC 2018Presentation of the InVID verification technologies at IPTC 2018
Presentation of the InVID verification technologies at IPTC 2018
 
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive...
 
FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FASSOLD Deep learning for semantic analysis and annotation of conventional an...FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FASSOLD Deep learning for semantic analysis and annotation of conventional an...
 
Recent Advances in Face Analysis: database, methods, and software.
Recent Advances in Face Analysis: database, methods, and software.Recent Advances in Face Analysis: database, methods, and software.
Recent Advances in Face Analysis: database, methods, and software.
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1
 
IRJET- Real Time Attendance System using Face Recognition
IRJET-  	  Real Time Attendance System using Face RecognitionIRJET-  	  Real Time Attendance System using Face Recognition
IRJET- Real Time Attendance System using Face Recognition
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecogn
 
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsBreaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
 
Face Recognition Home Security System
Face Recognition Home Security SystemFace Recognition Home Security System
Face Recognition Home Security System
 
Face Recognition Technology
Face Recognition TechnologyFace Recognition Technology
Face Recognition Technology
 

Plus de Weverify

Operation wise attention network for tampering localization fusion
Operation wise attention network for tampering localization fusionOperation wise attention network for tampering localization fusion
Operation wise attention network for tampering localization fusionWeverify
 
MeVer tools for disinformation detection
MeVer tools for disinformation detectionMeVer tools for disinformation detection
MeVer tools for disinformation detectionWeverify
 
TTO2021: Cross-Lingual Rumour Stance Classification: a First Study with BERT...
TTO2021: Cross-Lingual Rumour Stance Classification:  a First Study with BERT...TTO2021: Cross-Lingual Rumour Stance Classification:  a First Study with BERT...
TTO2021: Cross-Lingual Rumour Stance Classification: a First Study with BERT...Weverify
 
MeVer tools for disinformation detection
MeVer tools for disinformation detectionMeVer tools for disinformation detection
MeVer tools for disinformation detectionWeverify
 
Operation-wise Attention Network for Tampering Localization Fusion.
Operation-wise Attention Network for Tampering Localization Fusion.Operation-wise Attention Network for Tampering Localization Fusion.
Operation-wise Attention Network for Tampering Localization Fusion.Weverify
 
L tvs disinfo - 24 nov 2020
L tvs disinfo - 24 nov 2020L tvs disinfo - 24 nov 2020
L tvs disinfo - 24 nov 2020Weverify
 
RDSM workshop 13 dec2020
RDSM workshop 13 dec2020RDSM workshop 13 dec2020
RDSM workshop 13 dec2020Weverify
 
We verify balkan disinformation panel
We verify balkan disinformation panelWe verify balkan disinformation panel
We verify balkan disinformation panelWeverify
 
Text analysis for disinformation detection 17 dec 2020
Text analysis for disinformation detection    17 dec 2020Text analysis for disinformation detection    17 dec 2020
Text analysis for disinformation detection 17 dec 2020Weverify
 
20200112 EUDL training for adult learners
20200112 EUDL training for adult learners20200112 EUDL training for adult learners
20200112 EUDL training for adult learnersWeverify
 
#Semiform2020 02 11 2020
#Semiform2020 02 11 2020#Semiform2020 02 11 2020
#Semiform2020 02 11 2020Weverify
 
2nd workshop em data science 08 02 2021
2nd workshop em data science 08 02 20212nd workshop em data science 08 02 2021
2nd workshop em data science 08 02 2021Weverify
 
Stance classification. Presentation at QMUL 11 Nov 2020
Stance classification. Presentation at QMUL 11 Nov 2020Stance classification. Presentation at QMUL 11 Nov 2020
Stance classification. Presentation at QMUL 11 Nov 2020Weverify
 
Stance classification. Uni Cambridge 22 Jan 2021
Stance classification. Uni Cambridge 22 Jan 2021Stance classification. Uni Cambridge 22 Jan 2021
Stance classification. Uni Cambridge 22 Jan 2021Weverify
 
EDMO workshop 17 Feb 2021
EDMO workshop 17 Feb 2021EDMO workshop 17 Feb 2021
EDMO workshop 17 Feb 2021Weverify
 
Edmo research + platforms panel
Edmo research + platforms panel Edmo research + platforms panel
Edmo research + platforms panel Weverify
 
Qurator keynote berlin 2101 2020
Qurator keynote berlin 2101 2020Qurator keynote berlin 2101 2020
Qurator keynote berlin 2101 2020Weverify
 
Aacl / SoBigData 2020 06 12 2020
Aacl / SoBigData 2020 06 12 2020Aacl / SoBigData 2020 06 12 2020
Aacl / SoBigData 2020 06 12 2020Weverify
 
TTO Keynote 08 10 2021
TTO Keynote 08 10 2021TTO Keynote 08 10 2021
TTO Keynote 08 10 2021Weverify
 
We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020Weverify
 

Plus de Weverify (20)

Operation wise attention network for tampering localization fusion
Operation wise attention network for tampering localization fusionOperation wise attention network for tampering localization fusion
Operation wise attention network for tampering localization fusion
 
MeVer tools for disinformation detection
MeVer tools for disinformation detectionMeVer tools for disinformation detection
MeVer tools for disinformation detection
 
TTO2021: Cross-Lingual Rumour Stance Classification: a First Study with BERT...
TTO2021: Cross-Lingual Rumour Stance Classification:  a First Study with BERT...TTO2021: Cross-Lingual Rumour Stance Classification:  a First Study with BERT...
TTO2021: Cross-Lingual Rumour Stance Classification: a First Study with BERT...
 
MeVer tools for disinformation detection
MeVer tools for disinformation detectionMeVer tools for disinformation detection
MeVer tools for disinformation detection
 
Operation-wise Attention Network for Tampering Localization Fusion.
Operation-wise Attention Network for Tampering Localization Fusion.Operation-wise Attention Network for Tampering Localization Fusion.
Operation-wise Attention Network for Tampering Localization Fusion.
 
L tvs disinfo - 24 nov 2020
L tvs disinfo - 24 nov 2020L tvs disinfo - 24 nov 2020
L tvs disinfo - 24 nov 2020
 
RDSM workshop 13 dec2020
RDSM workshop 13 dec2020RDSM workshop 13 dec2020
RDSM workshop 13 dec2020
 
We verify balkan disinformation panel
We verify balkan disinformation panelWe verify balkan disinformation panel
We verify balkan disinformation panel
 
Text analysis for disinformation detection 17 dec 2020
Text analysis for disinformation detection    17 dec 2020Text analysis for disinformation detection    17 dec 2020
Text analysis for disinformation detection 17 dec 2020
 
20200112 EUDL training for adult learners
20200112 EUDL training for adult learners20200112 EUDL training for adult learners
20200112 EUDL training for adult learners
 
#Semiform2020 02 11 2020
#Semiform2020 02 11 2020#Semiform2020 02 11 2020
#Semiform2020 02 11 2020
 
2nd workshop em data science 08 02 2021
2nd workshop em data science 08 02 20212nd workshop em data science 08 02 2021
2nd workshop em data science 08 02 2021
 
Stance classification. Presentation at QMUL 11 Nov 2020
Stance classification. Presentation at QMUL 11 Nov 2020Stance classification. Presentation at QMUL 11 Nov 2020
Stance classification. Presentation at QMUL 11 Nov 2020
 
Stance classification. Uni Cambridge 22 Jan 2021
Stance classification. Uni Cambridge 22 Jan 2021Stance classification. Uni Cambridge 22 Jan 2021
Stance classification. Uni Cambridge 22 Jan 2021
 
EDMO workshop 17 Feb 2021
EDMO workshop 17 Feb 2021EDMO workshop 17 Feb 2021
EDMO workshop 17 Feb 2021
 
Edmo research + platforms panel
Edmo research + platforms panel Edmo research + platforms panel
Edmo research + platforms panel
 
Qurator keynote berlin 2101 2020
Qurator keynote berlin 2101 2020Qurator keynote berlin 2101 2020
Qurator keynote berlin 2101 2020
 
Aacl / SoBigData 2020 06 12 2020
Aacl / SoBigData 2020 06 12 2020Aacl / SoBigData 2020 06 12 2020
Aacl / SoBigData 2020 06 12 2020
 
TTO Keynote 08 10 2021
TTO Keynote 08 10 2021TTO Keynote 08 10 2021
TTO Keynote 08 10 2021
 
We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020
 

Dernier

CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Deepfake detection

  • 1. DeepFake Detection: The Importance of Training Data Preprocessing and Practical Considerations Dr. Symeon (Akis) Papadopoulos – @sympap MeVer Team @ Information Technologies Institute (ITI) / Centre for Research & Technology Hellas (CERTH) Joint work with Polychronis Charitidis, George Kordopatis-Zilos and Yiannis Kompatsiaris AI4Media Workshop on GANs for Media Content Generation, Oct 1, 2020 Media Verification (MeVer)
  • 2. DeepFakes • Content, generated by AI, that seems authentic to the human eye • Most common form: generation and manipulation of the human face Source: https://en.wikipedia.org/wiki/Deepfake Source: https://www.youtube.com/watch?v=iHv6Q9ychnA Source: Media Forensics and DeepFakes: an overview
  • 3. Manipulation types Facial manipulations can be categorised in four main different groups: • Entire face synthesis • Attribute manipulation • Identity swap • Expression swap Source: DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection (Tolosana et al., 2020) Tolosana, R., et al. (2020). Deepfakes and beyond: A survey of face manipulation and fake detection. arXiv preprint arXiv:2001.00179. Verdoliva, L. (2020). Media forensics and deepfakes: an overview. arXiv preprint arXiv:2001.06564. Mirsky, Y., & Lee, W. (2020). The Creation and Detection of Deepfakes: A Survey. arXiv preprint arXiv:2004.11138.
  • 4. WeVerify Project • WeVerify aims at detecting disinformation in social media and exposing misleading and fabricated content • Partners: Univ. Sheffield, OntoText, ATC, DW, AFP, EU DisinfoLab, CERTH • A key outcome is a platform for collaborative content verification, tracking, and debunking • Currently, we are developing a deepfake detection service for the WeVerify platform • Participation in the DeepFake Detection Challenge https://weverify.eu/
  • 5. DeepFake Detection Challenge • Goal: detect videos with facial or voice manipulations • 2,114 teams participated in the challenge • Log Loss error evaluation on public and private validation sets • Public evaluation contained videos with transformations similar to those in the training set • Private evaluation contained organic videos and videos with unknown transformations from the Internet • Our final standings: • public leaderboard: 49 (top 3%) with 0.295 Log Loss error • private leaderboard: 115 (top 5%) with 0.515 Log Loss error Source: https://www.kaggle.com/c/deepfake-detection-challenge
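To make the evaluation metric on this slide concrete, here is a minimal sketch of binary Log Loss. The function name `log_loss`, the clipping constant `eps`, and the sample predictions are illustrative assumptions, not taken from the challenge code.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy; labels are 1 (FAKE) or 0 (REAL)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip probabilities so that log(0) can never occur.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical labels and predictions for four videos.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.9, 0.1]
uninformative = [0.5, 0.5, 0.5, 0.5]
one_confident_mistake = [0.9, 0.1, 0.9, 0.99]

print(log_loss(labels, confident))
print(log_loss(labels, uninformative))
print(log_loss(labels, one_confident_mistake))
```

A predictor that always outputs 0.5 scores ln 2 ≈ 0.693, which is a useful reference point when reading leaderboard values. A single confident mistake can push the average above that reference, which is why the metric penalizes overconfident models so heavily.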
  • 6. DeepFake Detection Challenge - dataset • Dataset of more than 110k videos • Approx. 20k REAL and the rest are FAKE • FAKE videos generated from the REAL • Models used: • DeepFake AutoEncoder (DFAE) • Morphable Mask faceswap (MM/NN) • Neural Talking Heads (NTH) • FSGAN • StyleGAN Dolhansky, B., Bitton, J., Pflaum, B., Lu, J., Howes, R., Wang, M., & Ferrer, C. C. (2020). The DeepFake Detection Challenge Dataset. arXiv preprint arXiv:2006.07397.
  • 7. Dataset preprocessing - Issues • Face dataset quality depends on face extraction accuracy (Dlib, MTCNN, facenet-pytorch, BlazeFace) • Generally, all face extraction libraries generate a number of false positive detections • Manual tuning can improve the quality of the generated dataset • Pipeline (figure): video corpus → frame extraction → face extraction → deep learning model
  • 8. Noisy data creeping into the training set • Faces extracted at 1 fps from the Kaggle DeepFake Detection Challenge dataset videos using a PyTorch implementation of MTCNN face detection • Observation: within a video, false detections are fewer than true detections
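The 1 fps sampling mentioned on this slide can be sketched as a simple frame-index selection. The helper name `frames_to_sample` and its parameters are hypothetical, and the face detector itself is deliberately left out: it would be run on each selected frame.

```python
def frames_to_sample(frame_count, video_fps, target_fps=1.0):
    """Return the indices of the frames to pass to the face detector.

    Assumption: one frame is taken every round(video_fps / target_fps)
    frames, starting from the first frame of the video.
    """
    step = max(int(round(video_fps / target_fps)), 1)
    return list(range(0, frame_count, step))


# A hypothetical 3-second clip recorded at 30 fps yields three frames.
print(frames_to_sample(90, 30.0))
```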
  • 9. Our “noise” filtering approach • Compute a face embedding for each detected face in a video • Calculate the similarity between all pairs of face embeddings in the video → similarity graph construction • Nodes represent faces, and two faces are connected if their similarity is greater than 0.8 (solid lines in the figure) • Drop components smaller than N/2 (e.g. component 2 in the figure) • N is the number of frames that contain face detections (true or false)
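The filtering steps on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: cosine similarity is assumed as the similarity measure (the slide only says "similarity"), the embedding model is left unspecified, and the names `cosine`, `filter_detections`, and the toy vectors are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def filter_detections(embeddings, frame_ids, threshold=0.8):
    """Return the indices of detections kept after dropping small components.

    embeddings: one vector per detected face in the video
    frame_ids:  the frame each detection came from (same order)
    """
    n = len(embeddings)
    parent = list(range(n))  # union-find forest over detections

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Connect two faces when their similarity exceeds the threshold.
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) > threshold:
                parent[find(i)] = find(j)

    components = defaultdict(list)
    for i in range(n):
        components[find(i)].append(i)

    # N: number of frames that contain at least one detection.
    n_frames = len(set(frame_ids))
    kept = [i for comp in components.values()
            if len(comp) >= n_frames / 2 for i in comp]
    return sorted(kept)


# Toy example: one real face seen in four frames, plus one false
# detection in frame 2 whose embedding is unlike the others.
embeddings = [
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.0],
    [1.0, 0.05, 0.0],
    [0.95, 0.0, 0.05],
    [0.0, 1.0, 0.0],
]
frame_ids = [0, 1, 2, 3, 2]
print(filter_detections(embeddings, frame_ids))
```

In the toy example the real face forms a component of size 4 and the false detection a component of size 1. With N = 4 the threshold is 2, so only the singleton is dropped.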
  • 10. Advantages • Simple and fast procedure • No need for manual tuning of the face extraction settings • Yields clusters of distinct faces when multiple persons appear in the video • This information can be utilized in various ways (e.g. predictions per face) • Figure: faces extracted from multiple video frames, grouped into Component 1 and Component 2
  • 11. Experiments • We trained multiple DeepFake detection models on the DFDC dataset with and without (baseline) our proposed approach • Three datasets: a) Celeb-DF, b) FaceForensics++, c) DFDC subset • For evaluation we examined two aggregation approaches • avg: prediction is the average of all face predictions • face: prediction is the max prediction among different avg face predictions • Results for the EfficientNet-B4 model in terms of Log Loss error:
      Preprocessing | Celeb-DF      | FaceForensics++ | DFDC
                    | avg     face  | avg     face    | avg     face
      baseline      | 0.510   0.511 | 0.563   0.563   | 0.213   0.198
      proposed      | 0.458   0.456 | 0.497   0.496   | 0.195   0.173
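The two aggregation approaches named on this slide can be sketched as below. The input layout (a dictionary mapping a face identity to its per-frame fake probabilities), the function names, and the sample values are assumptions made for illustration.

```python
from statistics import mean


def aggregate_avg(face_preds):
    """avg: mean over every face prediction in the video."""
    return mean(p for preds in face_preds.values() for p in preds)


def aggregate_face(face_preds):
    """face: average the predictions of each face, then take the maximum."""
    return max(mean(preds) for preds in face_preds.values())


# Hypothetical video with two persons; only face_b is manipulated.
video = {
    "face_a": [0.1, 0.2, 0.1],
    "face_b": [0.9, 0.8, 0.7],
}
print(aggregate_avg(video))   # diluted by the untouched face
print(aggregate_face(video))  # driven by the most suspicious face
```

When only one of several persons is manipulated, the `avg` score is pulled toward the real faces, whereas the `face` score reflects the most suspicious identity in the video.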
  • 12. Our DFDC Approach - details • Applied the proposed preprocessing approach to clean the generated face dataset • Face augmentation: • Horizontal & vertical flip, random crop, rotation, image compression, Gaussian & motion blurring, brightness, saturation & contrast transformation • Trained three different models: a) EfficientNet-B3, b) EfficientNet-B4, c) I3D* • Models trained on face level: • I3D trained with 10 consecutive face images, exploiting temporal information • EfficientNet models trained on single face images • Per model: • Added two dense layers with dropout after the backbone architecture, with 256 and 1 units • Used the sigmoid activation for the last layer * ignoring the optical flow stream
  • 13. Our DFDC approach – inference • Stages (figure): pre-processing → model inference → post-processing
  • 14. Lessons from other DFDC teams • Most approaches ensemble multiple EfficientNet architectures (B3-B7), some of them trained with different seeds • ResNeXt was another architecture used by top-performing solutions, combined with 3D architectures such as I3D, 3D ResNet34, MC3 & R(2+1)D • Several approaches increased the margin of the detected facial bounding box to further improve results • We used an additional margin of 20% but other works proposed a higher proportion • To improve generalization: • Domain-specific augmentations: a) half face removal horizontally or vertically, b) landmark (eyes, nose, or mouth) removal • Mixup augmentations
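The bounding-box margin mentioned on this slide can be illustrated with a small helper. The slide does not say how the 20% is distributed, so this sketch assumes the box grows by 20% of its width and height in total, split evenly between the two sides, and is clamped to the image. The name `expand_box` and the `(x1, y1, x2, y2)` pixel layout are hypothetical.

```python
def expand_box(box, image_size, margin=0.2):
    """Enlarge a face box by `margin` of its size, clamped to the image.

    box:        (x1, y1, x2, y2) in pixels
    image_size: (width, height) in pixels
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size
    # Assumption: half of the extra margin is added on each side.
    pad_x = (x2 - x1) * margin / 2
    pad_y = (y2 - y1) * margin / 2
    return (
        max(int(round(x1 - pad_x)), 0),
        max(int(round(y1 - pad_y)), 0),
        min(int(round(x2 + pad_x)), img_w),
        min(int(round(y2 + pad_y)), img_h),
    )


# A 100x100 face box inside a 640x480 frame.
print(expand_box((100, 100, 200, 200), (640, 480)))
```

Clamping matters for faces near the frame border, where the enlarged box would otherwise extend outside the image.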
  • 15. Practical challenges • Limited generalization • This observation applies to most submissions. The winning team scored a Log Loss of 0.20336 in the public validation and only 0.42798 in the private one • Overfitting • The best submission in the public leaderboard scored 0.19207, but in the private evaluation the error was 0.57468, leading to the 904th position! • Broad problem scope • The term DeepFake may refer to every possible manipulation and generation • Constantly increasing manipulation and generation techniques • A detector is only trained with a subset of these manipulations
  • 16. DeepFake Detection in the Wild • Videos in the wild usually contain multiple scenes • Only a subset of these scenes may contain DeepFakes • Detection process might be slow for multi-shot videos (even short ones) • Low quality videos • Low quality faces tend to fool classifiers • Small detected and fast-moving faces • Usually lead to noisy predictions • Changes in the environment • Moving obstacles in front of the faces • Changes in lighting
  • 17. DeepFake Detection Service @ WeVerify https://www.youtube.com/watch?v=cVljNVV5VPw&ab_channel=TheFakening
  • 18. More details at TTO 2020 Charitidis, P., Kordopatis-Zilos, G., Papadopoulos, S., & Kompatsiaris, Y. (2020). Investigating the impact of preprocessing and prediction aggregation on the DeepFake detection task. Proceedings of the Conference for Truth and Trust Online (TTO) [to appear], https://arxiv.org/abs/2006.07084 https://truthandtrustonline.com/
  • 19. Thank you! Dr. Symeon Papadopoulos papadop@iti.gr @sympap Media Verification (MeVer) https://mever.iti.gr/ @meverteam https://ai4media.eu/ https://weverify.eu/