My Journey to Detect Known People in a Video
Yonatan Katz
The Journey
● Deep Learning
● Face Detection
● Shot Boundaries Detection
● Face Recognition
● Object Tracking
● Computer Vision
The Problem: When does a specific person appear in a video?
D. Trump: [0:07-1:23, 1:52-2:03]
B. Obama (nickname: Obamush): [0:07-1:23]
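The per-person time ranges above could be produced from a list of frame numbers in which that person was recognized. The sketch below is only an illustration: the function names, the `fps` argument and the `max_gap` tolerance are assumptions, not part of the original talk.

```python
def frames_to_intervals(frame_indices, fps, max_gap=1):
    """Merge frame indices into (start_sec, end_sec) intervals.

    Frames that are at most max_gap apart are treated as one appearance.
    """
    intervals = []
    start = prev = None
    for idx in sorted(frame_indices):
        if start is None:
            start = prev = idx
        elif idx - prev <= max_gap:
            prev = idx
        else:
            intervals.append((start / fps, prev / fps))
            start = prev = idx
    if start is not None:
        intervals.append((start / fps, prev / fps))
    return intervals


def format_timestamp(seconds):
    """Format a number of seconds as m:ss."""
    seconds = int(seconds)
    return "{}:{:02d}".format(seconds // 60, seconds % 60)


def format_intervals(intervals):
    """Render intervals in the [m:ss-m:ss, ...] notation used on the slide."""
    parts = ["{}-{}".format(format_timestamp(s), format_timestamp(e))
             for s, e in intervals]
    return "[" + ", ".join(parts) + "]"
```

A larger `max_gap` would tolerate a few frames in which the face was missed, at the price of possibly gluing two separate appearances together.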
Journey Outline
1. We will parse the video into frames
2. We will detect faces in the frame
3. We will try to recognize the faces
4. We will track the faces back and forth in the video
a. We will split the video into shots
Parsing the video
(or: choosing the technology)
● Why Python?
● OpenCV
● NumPy
● Code example:
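import cv2

# video_path is the path to the input video file
video = cv2.VideoCapture(video_path)
# Query the frame size (OpenCV 3+ property names)
width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
while True:
    ret, frame = video.read()
    if not ret:  # no more frames
        break
    cv2.imshow('video', frame)
    # Stop early when the user presses 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video.release()
cv2.destroyAllWindows()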
Detect Faces
(sounds complicated, it’s not)
Let’s examine the code first
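import dlib
from skimage import io

win = dlib.image_window()
image = io.imread(file_name)  # file_name: path to an image file
# Pretrained frontal face detector that ships with dlib
face_detector = dlib.get_frontal_face_detector()
# The second argument upsamples the image once so smaller faces are found
detected_faces = face_detector(image, 1)
win.set_image(image)
for i, face_rect in enumerate(detected_faces):
    win.add_overlay(face_rect)
dlib.hit_enter_to_continue()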
The MAGIC is here. You don’t need to invent anything.
But how does it really work??
Taken from this great Medium post
1. Convert the image to grayscale
2. Look at every pixel and the pixels surrounding it
3. Find the direction in which the pixels become darker
4. Convert the image to “darker vectors”
ONLY THE “DARKNESS RATIO” MATTERS - works on both dark and bright images!
5. Reduce the size of the vector
6. Compare patterns!
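These steps describe the idea behind Histogram of Oriented Gradients (HOG) features, which dlib's frontal face detector is built on. Below is a rough NumPy sketch of the grayscale, direction and size-reduction steps, not dlib's actual implementation. The function names, the 8-pixel cell size and the 9 orientation bins are assumptions chosen for illustration.

```python
import numpy as np


def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (BT.601 luma weights)."""
    return rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114


def gradient_orientation(gray):
    """Per-pixel gradient strength and direction in degrees, in [0, 180).

    The direction is unsigned, so "towards darker" and "towards brighter"
    along the same line fall into the same bin.
    """
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    return magnitude, angle


def cell_histograms(magnitude, angle, cell=8, bins=9):
    """Shrink the image to one orientation histogram per cell x cell block."""
    rows, cols = magnitude.shape[0] // cell, magnitude.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            block = (slice(r * cell, (r + 1) * cell),
                     slice(c * cell, (c + 1) * cell))
            hist[r, c], _ = np.histogram(angle[block], bins=bins,
                                         range=(0, 180),
                                         weights=magnitude[block])
    return hist
```

Because only differences between neighbouring pixels are used, adding a constant brightness to the whole image leaves the histograms unchanged, which is the "works on both dark and bright images" point.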
Recognize Faces
(Deep learning. Not only a buzzword)
Intro to machine learning
1. Train:
a. Find the data that may affect the end result (“features”)
b. Train a model that takes as an input:
i. List of features
ii. The end result (“label”)
c. Get the weights for each feature
2. Test:
a. Apply the weights to your data
b. Compute the most relevant result
I’m a man, 32 years old. I watched 32 drama movies, 3 comedy movies (on average, I watched only 75% of these boring movies) and no action movies. What will YouTube recommend to me?
1. Borat
2. Hit
3. Titanic
Do you want to be a data scientist?
13 × Feature1 + 5 × Feature2 + … = score
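The weighted sum can be sketched in a few lines. Everything concrete here is made up for illustration: the feature names and the weight values are assumptions, and in practice the weights would come out of the training step rather than being typed in by hand.

```python
def score(features, weights):
    """Weighted sum: weight_1 * feature_1 + weight_2 * feature_2 + ..."""
    return sum(weights.get(name, 0) * value
               for name, value in features.items())


def recommend(viewer, weights_by_movie):
    """Score every candidate movie and return the best one with all scores."""
    scores = {movie: score(viewer, w) for movie, w in weights_by_movie.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical viewer features, loosely following the example on the slide
viewer = {"dramas_watched": 32, "comedies_watched": 3, "action_watched": 0}

# Hypothetical weights; a trained model would supply these
weights_by_movie = {
    "Borat": {"dramas_watched": 1, "comedies_watched": 13, "action_watched": 0},
    "Hit": {"dramas_watched": 1, "comedies_watched": 2, "action_watched": 13},
    "Titanic": {"dramas_watched": 5, "comedies_watched": 1, "action_watched": 0},
}

best, scores = recommend(viewer, weights_by_movie)
```

With these invented numbers the drama-heavy history dominates the sum, so the drama gets the highest score.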
Intro to deep learning
● How does a child learn to ride a bicycle?
● A neural network tries to imitate the human learning process
● Invented by a psychologist - עושים היסטוריה (“Making History”)
Deep Learning in computer vision
● Classic problem: what is this number?
● Do these images represent the same number?
Back to our journey
● The problem: recognize people
Donald Trump, of course!
Like, duh!
I have no idea. But he looks pretty similar to this weird guy:
A moment before we jump into code...
● In order to compare faces, we need to center the face (“apples to apples”)
● In order to do so, we need to find landmarks
Alignment code example can be found here
From the OpenFace website:
OpenFace is a Python and Torch implementation of face recognition with deep neural networks and is based on the CVPR 2015 paper FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google. Torch allows the network to be executed on a CPU or with CUDA.
Nightmare to install :(
Finally - CODE !
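import cv2
import numpy as np
import openface

# args: parsed command-line options (see the full code)
align = openface.AlignDlib(args.dlibFacePredictor)
net = openface.TorchNeuralNet(args.networkModel, args.imgDim)

def getRep(imgPath):
    # OpenCV loads images as BGR; convert to RGB before alignment
    bgrImg = cv2.imread(imgPath)
    rgbImg = cv2.cvtColor(bgrImg, cv2.COLOR_BGR2RGB)
    # Find the largest face, then align it using the eye and nose landmarks
    bb = align.getLargestFaceBoundingBox(rgbImg)
    alignedFace = align.align(args.imgDim, rgbImg, bb,
                              landmarkIndices=openface.AlignDlib.OUTER_EYES_AND_NOSE)
    rep = net.forward(alignedFace)
    return rep

# img1 and img2 are paths to the two images being compared
d = getRep(img1) - getRep(img2)
print("squared L2 distance between representations: {:0.3f}".format(np.dot(d, d)))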
Full code can be found here
Summary
● Assuming we know who will be in the video, we download images of these people
● We go over the video frame by frame:
○ For each frame, search for faces
■ For each face:
● Apply some image manipulation to align the face image
● Get its representation from the neural network (OpenFace)
● Compare the representation with the representations of the pre-downloaded images
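The last step, comparing one face representation against the pre-downloaded images, might look roughly like this. It is a sketch under assumptions: the function names and the `threshold` value are invented here, and the toy 3-number vectors in the usage note stand in for the much longer vectors a real network returns.

```python
import numpy as np


def squared_l2(a, b):
    """Squared Euclidean distance, the same np.dot(d, d) measure as above."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.dot(d, d))


def identify(face_rep, known_reps, threshold=1.0):
    """Return (name, distance) for the closest known person.

    known_reps maps a person's name to a list of representations, one per
    downloaded image. If even the best match is farther than threshold
    (an assumed value that needs tuning), the name is None.
    """
    best_name, best_dist = None, float("inf")
    for name, reps in known_reps.items():
        for rep in reps:
            dist = squared_l2(face_rep, rep)
            if dist < best_dist:
                best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist
```

For example, with `known = {"trump": [[1, 0, 0]], "obama": [[0, 1, 0]]}`, the call `identify([0.9, 0.1, 0], known)` picks "trump", while a vector far from both, such as `[0, 0, 1]`, is reported as unknown.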
Object Tracking
(or: why recognition over video is different from looping an image recognition algorithm)
Problem Definition
● We are good at finding frontal faces, but not profile faces
○ There are some models that support profile pictures as well
● It is problematic to compare profile pictures
○ We need to train a model (is there a data scientist in the room?)
○ We would need too many profile pictures…
● What if our dear president-elect decides to turn around?
Object Tracking
● Dlib has an API for tracking objects
● We need to run forward and backward once we find a face
● Problem: if there is a camera cut in the middle, the tracker doesn’t know.
import cv2
import dlib

video = cv2.VideoCapture(video_path)
# Query the frame size (OpenCV 3+ property names)
width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
tracker = dlib.correlation_tracker()
ret, frame = video.read()
# face_rectangle: dlib.rectangle of a face found by the detector in this frame
tracker.start_track(frame, face_rectangle)
while True:
    ret, frame = video.read()
    if not ret:  # no more frames
        break
    # Move the tracker to the face's position in the new frame
    tracker.update(frame)
    pos = tracker.get_position()
    bl = (int(pos.left()), int(pos.bottom()))
    tr = (int(pos.right()), int(pos.top()))
    cv2.rectangle(frame, bl, tr, color=(153, 255, 204), thickness=3)
    cv2.imshow('video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video.release()
cv2.destroyAllWindows()
From the dlib website:
Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments.
Shot Boundaries Detection
(last known stop in our journey)
Movie Shots
● We need shot boundaries in order to cut the object trackers at each shot change
● Shot transition types:
○ Camera cut
○ Dissolve
○ Wipe
○ Fade-in / fade-out
● Tools that do shot detection:
○ FFmpeg
○ Scene Segmentation
● Not good enough...
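Once shot boundaries are known, "cutting" a tracker could simply mean limiting how far it may run forward and backward from the frame where the face was found. The helper below is a hypothetical sketch; its name and the convention that `shot_starts` lists the first frame of each new shot are assumptions.

```python
from bisect import bisect_right


def tracking_window(found_at, shot_starts, total_frames):
    """Return (first_frame, last_frame) of the shot that contains found_at.

    shot_starts holds the index of the first frame of each new shot.
    A tracker started at found_at should not leave this window.
    """
    starts = sorted(shot_starts)
    i = bisect_right(starts, found_at)
    first = starts[i - 1] if i > 0 else 0
    last = starts[i] - 1 if i < len(starts) else total_frames - 1
    return first, last
```

For a 400-frame video with shots starting at frames 0, 100 and 250, a face found at frame 120 would be tracked backward to frame 100 and forward to frame 249 at most.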
Comparison Metrics
● Color histogram
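A minimal sketch of a color histogram comparison between two frames, assuming H x W x 3 `uint8` frames such as the ones OpenCV returns. The function names, the 8 bins per channel and the choice of half the L1 distance are assumptions; other distance measures are equally common.

```python
import numpy as np


def color_histogram(frame, bins=8):
    """Normalized per-channel histogram of an H x W x 3 uint8 frame."""
    hists = [np.histogram(frame[..., ch], bins=bins, range=(0, 256))[0]
             for ch in range(frame.shape[-1])]
    hist = np.concatenate(hists).astype(float)
    return hist / hist.sum()


def histogram_distance(frame_a, frame_b, bins=8):
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint."""
    diff = color_histogram(frame_a, bins) - color_histogram(frame_b, bins)
    return float(0.5 * np.abs(diff).sum())
```

A large distance between two frames hints at a shot change. The histogram ignores where the colors are in the frame, so a moving object barely changes it.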
Comparison Metrics
● Edge Change Ratio (ECR) - compare the in-pixels (edges that appear) and out-pixels (edges that disappear)
[Figure: Frame #N-1 and Frame #N]
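A rough NumPy-only sketch of the Edge Change Ratio between two grayscale frames. It is not the ECR code referenced in the talk: the gradient-threshold edge detector, the `threshold` and `radius` values and the function names are all assumptions, and a real implementation would more likely use a proper edge detector such as Canny.

```python
import numpy as np


def edge_map(gray, threshold=30.0):
    """Mark pixels whose gradient strength exceeds threshold as edges."""
    gy, gx = np.gradient(gray.astype(float))
    return np.hypot(gx, gy) > threshold


def dilate(mask, radius=1):
    """Binary dilation with a square window, built from shifted copies."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def edge_change_ratio(prev_gray, curr_gray, threshold=30.0, radius=1):
    """max(entering / edges in current frame, exiting / edges in previous)."""
    prev_edges = edge_map(prev_gray, threshold)
    curr_edges = edge_map(curr_gray, threshold)
    n_prev, n_curr = prev_edges.sum(), curr_edges.sum()
    if n_prev == 0 or n_curr == 0:
        # No edges in both frames: no change; edges in only one: full change
        return 0.0 if n_prev == n_curr else 1.0
    # In-pixels: current edges with no previous edge nearby
    entering = (curr_edges & ~dilate(prev_edges, radius)).sum()
    # Out-pixels: previous edges with no current edge nearby
    exiting = (prev_edges & ~dilate(curr_edges, radius)).sum()
    return float(max(entering / n_curr, exiting / n_prev))
```

The dilation gives edges a little room to move between frames, so small motion inside a shot is not counted as a change.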
Considerations (OK, OK, and some code…)
● Thresholds for shot change
● Compare every two consecutive frames, or more distant frames
● Do we prefer more shots (some of them wrong) or fewer shots (and miss some)?
● Check the complete frame, or only the tracked object’s square
● Crop the image before comparison (to avoid noise from subtitles, logos, etc.)
● What will happen if a cat is sitting on a table, and then jumps?
● ECR doesn’t have much effect. But it’s cool!
● ECR code here
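The threshold trade-off can be illustrated with a small helper that turns a series of frame-to-frame distances into cut positions. This is a sketch, not the talk's code: the function name, the `min_shot_length` parameter and the sample numbers in the usage note are assumptions.

```python
def find_cuts(distances, threshold, min_shot_length=1):
    """Return the frame indices at which a new shot is assumed to start.

    distances[i] is the distance between frame i and frame i + 1, from any
    comparison metric. A lower threshold yields more (possibly wrong) shots,
    a higher one yields fewer shots and may miss real cuts.
    """
    cuts = []
    for i, dist in enumerate(distances):
        frame = i + 1  # the cut, if any, starts at the later frame
        if dist <= threshold:
            continue
        # Ignore a cut that would create a shot shorter than min_shot_length
        if cuts and frame - cuts[-1] < min_shot_length:
            continue
        cuts.append(frame)
    return cuts
```

With `distances = [0.1, 0.05, 0.9, 0.1, 0.8, 0.85, 0.1]` and a threshold of 0.5, the default settings report cuts at frames 3, 5 and 6, while `min_shot_length=2` drops the cut at frame 6 as too close to the previous one.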
So Where Do We Stand?
● Problems with the model (= neural network)
○ Grayscale images
○ People of color
● We need third-party validation
○ But not on all frames
● We want to build an image database
● Hardware requirements are very high
○ Maybe we will process only ‘important videos’
Q&A
