Introduction to Computer Vision
Chen Sagiv
Co-founder and co-CEO, SagivTech Ltd.
Talk outline
• Computer vision: what is it and what is it good for?
• A few basics
• Classical computer vision
• The AI revolution
• Getting into the world of computer vision
What is computer vision?
• Enhancing images and extracting information from them using mathematical tools and computer processing
• Goals:
– Optimal presentation of the image to the human viewer
– Computerized analysis of images and extraction of information
Where is computer vision used?
• Medicine
• Autonomous vehicles
• Virtual reality and augmented reality
• Industry: defect detection
• Social networks
• Defense industry, security and law enforcement
• Agriculture
• Fashion and beauty
Taken from: http://www.cvl.isy.liu.se/
What is a digital image?
How is a digital image formed?
• The "optics": image acquisition by a sensor sensitive to the relevant wavelength: visible light, US (ultrasound), infrared
• The sensor produces an electrical signal proportional to the energy received
• Sampling of the analog signal
• DSP: handling color, defective pixels, noise
How is a digital image formed?
The human visual system
Taken from Wikipedia
Spatial sampling
Taken from Digital Image Processing, Gonzalez
The sampling theorem and the Nyquist frequency
• The theorem states that, to reconstruct a sampled signal, the sampling rate must be at least twice the highest frequency present in the signal. This minimum sampling rate is called the Nyquist rate; half of a given sampling rate is that system's Nyquist frequency.
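A minimal sketch of what goes wrong below the Nyquist rate. The tone frequency and sampling rates are illustrative values, not taken from the talk, and `sample_cosine` and `apparent_frequency` are hypothetical helper names. A 3 Hz cosine sampled at only 4 Hz produces exactly the same samples as a 1 Hz cosine, so the two cannot be told apart after sampling.

```python
import numpy as np

def sample_cosine(freq_hz, rate_hz, n_samples):
    """Sample cos(2*pi*f*t) at the instants t = k / rate_hz."""
    t = np.arange(n_samples) / rate_hz
    return np.cos(2 * np.pi * freq_hz * t)

def apparent_frequency(freq_hz, rate_hz):
    """Frequency in [0, rate/2] that the samples of a real tone appear to have."""
    folded = freq_hz % rate_hz
    return min(folded, rate_hz - folded)

# A 3 Hz tone needs a sampling rate of at least 6 Hz.
undersampled = sample_cosine(3.0, 4.0, 16)   # 4 Hz is too slow: aliasing
alias = sample_cosine(1.0, 4.0, 16)          # a 1 Hz tone sampled the same way

print(np.allclose(undersampled, alias))      # the two sample sets coincide
print(apparent_frequency(3.0, 4.0))          # folded down to 1 Hz
print(apparent_frequency(3.0, 8.0))          # 8 Hz is fast enough: stays 3 Hz
```

The same folding happens spatially in images, where fine textures sampled too coarsely show up as moire patterns.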
How many gray levels are needed?
• For display, 8 bits are used
• For image analysis, images of 10-12 bits or more are processed
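As a small illustration of what the bit depths mean, an n-bit image has 2^n gray levels, and a 12-bit sensor image can be reduced to 8 bits for display by dropping its least significant bits. This is only a sketch: the pixel values are made up, and `gray_levels` and `requantize` are hypothetical helper names.

```python
import numpy as np

def gray_levels(bits):
    """Number of distinct gray levels an image with this bit depth can hold."""
    return 2 ** bits

def requantize(image, src_bits, dst_bits):
    """Reduce bit depth by dropping least significant bits (src_bits >= dst_bits)."""
    shift = src_bits - dst_bits
    return (image >> shift).astype(np.uint16)

# Made-up 12-bit pixel values spanning the full range 0..4095.
raw12 = np.array([[0, 15, 16], [2047, 2048, 4095]], dtype=np.uint16)
display8 = requantize(raw12, 12, 8)

print(gray_levels(8), gray_levels(12))
print(display8)
```

Sixteen neighboring 12-bit levels collapse into one 8-bit level, which is acceptable for viewing but loses detail that an analysis algorithm might still use.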
Taken from Digital Image Processing, Gonzalez
Image denoising
Linear Denoising Filters
Speckle Noise Gaussian Noise Salt & Pepper Noise
Median Filter
Speckle Noise Gaussian Noise Salt & Pepper Noise
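The two filter families above can be compared on synthetic salt-and-pepper noise. The sketch below is an illustration only: the 7x7 test image is made up and `filter3x3` is a hypothetical helper. It applies a 3x3 mean filter (a linear filter) and a 3x3 median filter to a flat image with two corrupted pixels.

```python
import numpy as np

def filter3x3(image, reducer):
    """Apply `reducer` (e.g. np.mean or np.median) over each 3x3 neighborhood."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    # Stack the 9 shifted copies of the image, one per neighborhood position.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return reducer(windows, axis=0)

flat = np.full((7, 7), 100.0)
noisy = flat.copy()
noisy[2, 2] = 255.0   # "salt" pixel
noisy[4, 5] = 0.0     # "pepper" pixel

mean_out = filter3x3(noisy, np.mean)
median_out = filter3x3(noisy, np.median)

print(np.array_equal(median_out, flat))  # median removes isolated outliers
print(mean_out[2, 2])                    # mean only smears the outlier around
```

The mean spreads each outlier over its neighbors, while the median discards it as long as outliers are a minority in every window.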
Segmentation and edge detection
What is segmentation based on?
• Searching for the edge, or the points where there is a transition between regions: edge-based methods
• Defining a region whose pixels are "similar" to one another: region-based methods
Gray-level profile
First derivative
Second derivative
Approximations
Sobel:
-1 -2 -1      1  0 -1
 0  0  0      2  0 -2
 1  2  1      1  0 -1
Prewitt:
-1 -1 -1      1  0 -1
 0  0  0      1  0 -1
 1  1  1      1  0 -1
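A minimal sketch of how the Sobel masks above act as first-derivative approximations, applied to a synthetic vertical step edge. The test image and the helper `correlate3x3` are assumptions for illustration. The masks are applied as written (correlation), so the sign of the response depends on the mask orientation.

```python
import numpy as np

SOBEL_X = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over the image (valid region only, no padding)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out

# Vertical step edge: dark (0) on the left, bright (10) on the right.
image = np.zeros((5, 6))
image[:, 3:] = 10.0

gx = correlate3x3(image, SOBEL_X)   # responds to horizontal intensity change
gy = correlate3x3(image, SOBEL_Y)   # responds to vertical intensity change
magnitude = np.hypot(gx, gy)

print(magnitude[0])  # non-zero only in the columns that straddle the edge
```

Thresholding the gradient magnitude is the simplest edge-based criterion; the Prewitt masks can be swapped in without changing the rest of the code.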
Global Processing: The Hough Transform
• Every straight line in the image can be described by an equation.
• Infinitely many lines can pass through each point on the line
• In the Hough transform, each point "votes" for the lines that could pass through it
• The line with the most votes wins!
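The voting idea can be sketched in a few lines. This is an illustration under assumptions: lines are written in the normal form rho = x*cos(theta) + y*sin(theta), a common choice that the slide does not name, and the point set, bin sizes and the name `hough_lines` are made up.

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_max=50):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.deg2rad(np.arange(n_theta))           # 0..179 degrees
    accumulator = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    for x, y in points:
        # One point votes once per theta, for the rho of the line through it.
        rhos = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + rho_max, np.arange(n_theta)] += 1
    return accumulator

rho_max = 50
# Ten points on the horizontal line y = 7, plus three points off the line.
points = [(x, 7) for x in range(0, 50, 5)] + [(10, 20), (30, 35), (40, 15)]
acc = hough_lines(points, rho_max=rho_max)

rho_idx, theta_idx = np.unravel_index(np.argmax(acc), acc.shape)
best_rho, best_theta_deg = rho_idx - rho_max, theta_idx
print(best_rho, best_theta_deg, acc.max())  # the line with the most votes wins
```

The winning cell corresponds to theta = 90 degrees and rho = 7, which is the line y = 7; the off-line points scatter their votes and never build a competing peak.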
The Hough Transform
SIFT
Scale-Invariant Feature Transform
A motivating application
Building a panorama
• We need to match/align/register images
Taken from PPT of M. Brown and D. Lowe, University of British
Columbia
Building a panorama
1. Detect feature points in both images
2. Find corresponding pairs
3. Find a parametric transformation (e.g. homography)
4. Warp (right image to left image)
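Step 3 can be sketched with the direct linear transform (DLT), which solves for the homography from matched point pairs. This is a minimal illustration, assuming exact, outlier-free correspondences; the point coordinates, `true_h`, and the helper names are made up. With real feature matches a robust estimator is normally used instead, for example OpenCV's `cv2.findHomography` with RANSAC.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform: find H with dst ~ H @ src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, points):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.hstack([points, np.ones((len(points), 1))]) @ h.T
    return pts[:, :2] / pts[:, 2:3]

# Made-up ground truth: slight rotation/shear, a shift, and mild perspective.
true_h = np.array([[1.0, 0.1, 30.0], [-0.05, 1.0, 5.0], [1e-4, 0.0, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [50, 40]], dtype=float)
dst = apply_homography(true_h, src)   # stand-in for matched feature locations

est_h = estimate_homography(src, dst)
print(np.allclose(est_h, true_h, atol=1e-6))
```

Step 4 then resamples the right image through the estimated matrix so that it lands in the left image's coordinate frame.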
Scale-Space
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
DoG
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
Taken from David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
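A minimal sketch of a difference-of-Gaussians (DoG) response, D = L(x, y, k*sigma) - L(x, y, sigma), built from separable Gaussian smoothing. The test image (a single bright pixel), the kernel truncation at three sigma, the reflect padding and the helper names are assumptions made for illustration; k = sqrt(2) is just one plausible scale step.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian, truncated at 3 sigma and normalized to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable smoothing: L(x, y, sigma) = G(x, y, sigma) * I(x, y)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image.astype(float), radius, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def difference_of_gaussians(image, sigma, k=np.sqrt(2)):
    """D = L(k * sigma) - L(sigma), an approximation of a band-pass filter."""
    return gaussian_blur(image, k * sigma) - gaussian_blur(image, sigma)

image = np.zeros((21, 21))
image[10, 10] = 1.0                      # one bright blob in the center
dog = difference_of_gaussians(image, sigma=2.0)
center = np.unravel_index(np.argmin(dog), dog.shape)
print(center)                            # strongest (negative) response at the blob
```

Flat regions give a response near zero, so the DoG images highlight blob-like structure at the scale being examined.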
Scale-Space Extrema Detection
• X is selected if it is larger or smaller than all 26
neighbors
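The 26-neighbor test can be written directly: 8 neighbors in the same DoG image plus 9 in the scale above and 9 in the scale below. A minimal sketch, assuming the DoG images are stacked into one array indexed as (scale, row, column) and that the candidate is not on the border; `is_scale_space_extremum` is a hypothetical name.

```python
import numpy as np

def is_scale_space_extremum(dog_stack, s, y, x):
    """True if dog_stack[s, y, x] is strictly above or below all 26 neighbors."""
    cube = dog_stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = cube[1, 1, 1]
    neighbors = np.delete(cube.ravel(), 13)   # index 13 is the center sample
    return bool(np.all(center > neighbors) or np.all(center < neighbors))

stack = np.zeros((3, 5, 5))
stack[1, 2, 2] = 1.0                                # isolated maximum
print(is_scale_space_extremum(stack, 1, 2, 2))      # larger than all 26
print(is_scale_space_extremum(stack, 1, 1, 1))      # has a larger neighbor
```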
Keypoint Localization
Threshold on minimal contrast
Threshold on ratio of
principal curvatures
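Both thresholds can be sketched as one predicate. The default values used here (a contrast threshold of 0.03 for intensities in [0, 1] and a curvature ratio r = 10) are the ones given in the cited Lowe paper; the function name and the sample numbers are assumptions. The ratio of principal curvatures is tested through the 2x2 Hessian of the DoG image, using trace^2 / determinant < (r + 1)^2 / r.

```python
def passes_keypoint_tests(d_value, dxx, dyy, dxy,
                          contrast_threshold=0.03, edge_ratio=10.0):
    """Keep a candidate only if it has enough contrast and is not edge-like."""
    if abs(d_value) < contrast_threshold:
        return False                      # low contrast: sensitive to noise
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False                      # curvatures of opposite sign
    # Large trace^2 / det means one curvature dominates, i.e. an edge.
    return trace * trace / det < (edge_ratio + 1) ** 2 / edge_ratio

print(passes_keypoint_tests(0.10, -1.0, -1.0, 0.0))    # blob-like: kept
print(passes_keypoint_tests(0.10, -10.0, -0.1, 0.0))   # edge-like: rejected
print(passes_keypoint_tests(0.01, -1.0, -1.0, 0.0))    # low contrast: rejected
```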
Orientation assignment
• Create weighted histogram of
local gradient directions
computed at selected scale
• Assign canonical orientation at
peak of smoothed histogram
• Where the histogram has multiple strong peaks, duplicate the keypoint,
one copy per peak orientation
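A simplified sketch of the orientation histogram. It weights each gradient direction by its magnitude and returns every peak within a fraction of the highest one, so that several orientations can be reported. The 36 bins and the 0.8 peak ratio follow the cited Lowe paper; the Gaussian window and histogram smoothing are left out for brevity, and the function name and test patch are assumptions.

```python
import numpy as np

def dominant_orientations(patch, n_bins=36, peak_ratio=0.8):
    """Return bin-center angles (degrees) of the strong histogram peaks."""
    gy, gx = np.gradient(patch.astype(float))       # d/drow, d/dcolumn
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bins = (angle // (360.0 / n_bins)).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    left, right = np.roll(hist, 1), np.roll(hist, -1)
    is_peak = (hist > left) & (hist > right) & (hist >= peak_ratio * hist.max())
    return [float((b + 0.5) * 360.0 / n_bins) for b in np.flatnonzero(is_peak)]

# Intensity ramp that increases along both axes: gradient points at 45 degrees.
ys, xs = np.mgrid[0:9, 0:9]
print(dominant_orientations(xs + ys))
print(dominant_orientations(-(xs + ys)))   # opposite direction: 225 degrees
```

Describing the keypoint relative to this canonical angle is what makes the descriptor invariant to image rotation.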
Keypoint descriptor
https://www.youtube.com/watch?v=FsFC8sCpDSw
• If you’ve been to a concert recently, you’ve probably seen how many people take videos of the
event with mobile phone cameras
• Each user has only one video – taken from one angle and location and of only moderate quality
Mobile Crowdsourcing Video Scene
Reconstruction
Creation of the 3D Video Sequence
The scene is photographed by
several people using their cell
phone camera
The video data is
transmitted via the
cellular network to a
High Performance
Computing server.
Following time
synchronization, resolution
normalization and spatial
registration, the several videos
are merged into a 3-D video
cube.
TIME
Spatial Calibration
Feature detection +
Matching
Fundamental matrix
estimation
Global registration
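The fundamental matrix step can be sketched with the normalized eight-point algorithm. The slides do not say which estimator SagivTech used, so this is only one textbook option, shown on a made-up synthetic scene with exact correspondences; all names, camera parameters and point values are assumptions.

```python
import numpy as np

def eight_point(x1, x2):
    """Estimate F with x2_h^T F x1_h = 0 from >= 8 correspondences (pixels)."""
    def normalize(pts):
        mean = pts.mean(axis=0)
        scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
        t = np.array([[scale, 0, -scale * mean[0]],
                      [0, scale, -scale * mean[1]],
                      [0, 0, 1]])
        homog = np.hstack([pts, np.ones((len(pts), 1))])
        return homog @ t.T, t

    p1, t1 = normalize(x1)
    p2, t2 = normalize(x2)
    # Each correspondence gives one linear equation in the 9 entries of F.
    a = np.einsum("ni,nj->nij", p2, p1).reshape(len(p1), 9)
    _, _, vt = np.linalg.svd(a)
    f = vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    u, s, vt = np.linalg.svd(f)
    f = u @ np.diag([s[0], s[1], 0.0]) @ vt
    return t2.T @ f @ t1                  # undo the normalization

# Synthetic scene: 12 random 3D points seen by two cameras.
rng = np.random.default_rng(0)
pts3d = rng.uniform([-1, -1, 4], [1, 1, 8], size=(12, 3))
k = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
ang = 0.1
r = np.array([[np.cos(ang), 0, np.sin(ang)], [0, 1, 0], [-np.sin(ang), 0, np.cos(ang)]])
t = np.array([1.0, 0.2, 0.1])

def project(points):
    h = points @ k.T
    return h[:, :2] / h[:, 2:3]

x1, x2 = project(pts3d), project(pts3d @ r.T + t)
f = eight_point(x1, x2)
f /= np.linalg.norm(f)
h1 = np.hstack([x1, np.ones((12, 1))])
h2 = np.hstack([x2, np.ones((12, 1))])
residual = np.abs(np.einsum("ni,ij,nj->n", h2, f, h1))
print(residual.max())                     # epipolar constraint holds up to round-off
```

Once F is known, a match for a point only has to be searched along its epipolar line in the other view, which is what makes the epipolar matching mentioned in the talk fast.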
Time & Audio Synchronization
• Precise : epipolar matching is both fast and accurate
• Dense multi-scale description of the images using binary descriptors
3D Model Reconstruction
• Precise : epipolar matching is both fast and accurate
• Empirical probability density check to discard false positives at occlusion
points
Correct match : max peak above other local max
Wrong match : max peak similar to other local max
3D Model Reconstruction
• Robust : works even with a minimal set of inputs
• two viewpoints already sufficient for dense reconstruction
• very few erroneous points
3D reconstruction
3D Model Reconstruction
3D Visualizer for Dynamic Scenes
moving unknown
Is the human superior to the computer?
Kanizsa
The AI revolution
• Image classification revolutionized by DL
– ImageNet: top-5 classification error fell from about 27% to about 5% in three years
What happened in ImageNet 2012 ?
• In this case, our classifier would have a decision boundary
more complex than the simple straight line.
• All the training patterns would be separated perfectly.
Learning Methods - Supervised
• Simpler recognizer → better performance on novel patterns.
• This is one of the central problems in statistical pattern
recognition.
Learning Methods - Supervised
• Feature extraction
– Discriminative features
– Features invariant to translation, rotation and scale
• Classification
– Use the feature vector provided by the feature extractor to assign the
object to a category
Learning Methods - Supervised
• A multi-layered neural network (NN)
– With non-linearity
• The input to the DL NN is presented at the input layer
– Images, sound, language...
• Hidden layers extract increasingly abstract features
• The output layer contains the result
Neural Networks
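The layer structure described above can be sketched as a forward pass. This is a minimal illustration with random, untrained weights: the layer sizes, the ReLU non-linearity, the softmax output and the function names are all assumptions, not details from the talk.

```python
import numpy as np

def relu(z):
    """Non-linearity applied after each hidden layer."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn output scores into class probabilities that sum to 1."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, layers):
    """Run x through (weights, bias) pairs: ReLU on hidden layers, softmax at the end."""
    *hidden, (w_out, b_out) = layers
    for w, b in hidden:
        x = relu(x @ w + b)
    return softmax(x @ w_out + b_out)

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(8, 16)), np.zeros(16)),    # input -> hidden 1
          (rng.normal(size=(16, 16)), np.zeros(16)),   # hidden 1 -> hidden 2
          (rng.normal(size=(16, 3)), np.zeros(3))]     # hidden 2 -> 3 classes
probs = forward(rng.normal(size=(4, 8)), layers)       # batch of 4 inputs
print(probs.shape, probs.sum(axis=1))
```

Training consists of adjusting the weights so that these output probabilities match the labels; without the non-linearity the whole stack would collapse into a single linear map.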
• More complex neural networks with multiple layers and multiple output neurons are
theoretically capable of separation using any continuous surface.
• The straight line depicts the separation achieved by a simple Perceptron and the curve the
separation by a multi-layered network (left), which is in theory able to learn any separating
function.
Learning methods
Supervised
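The classic XOR problem illustrates the difference between the straight line and the curve. In the sketch below, which uses hand-picked weights rather than learned ones, a small grid search finds no single perceptron that reproduces XOR, while a network with one hidden layer of two units does. The grid range and all names are assumptions.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z > 0, else 0."""
    return (z > 0).astype(int)

def perceptron(x, w, b):
    """Single linear unit: separates the plane with one straight line."""
    return step(x @ w + b)

def two_layer_xor(x):
    # Hidden unit 0 computes OR, hidden unit 1 computes AND.
    hidden = step(x @ np.array([[1.0, 1.0], [1.0, 1.0]]) + np.array([-0.5, -1.5]))
    # The output fires for "OR and not AND", which is XOR.
    return step(hidden @ np.array([1.0, -1.0]) - 0.5)

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
xor = np.array([0, 1, 1, 0])

grid = np.linspace(-2, 2, 9)
linear_fits = any(np.array_equal(perceptron(x, np.array([w1, w2]), b), xor)
                  for w1 in grid for w2 in grid for b in grid)

print(linear_fits)                               # no straight line separates XOR
print(np.array_equal(two_layer_xor(x), xor))     # one hidden layer is enough
```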
What can you do with Machine & Deep Learning?
• Train classifiers for specific recognition tasks
• Localization
– fully convolutional networks
– train detection networks, e.g. R-CNN, YOLO
The task: What object category do we see in the image?
Benchmark: 1000 category “ImageNet” dataset.
(from Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep
Convolutional Neural Networks, NIPS, 2012.)
Image Classification
The 2012 AlexNet results jump-started the current neural network wave.
Current networks (with tens or hundreds of layers) beat the reported human error rate on this benchmark!
Image Classification - continued
Task: Find bounding boxes and categories of objects in the image.
Benchmark: PASCAL VOC / Microsoft COCO
In right video: YOLOv2 (You Only Look Once)
Object Detection
https://www.youtube.com/watch?v=VOC3huqHrssg
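Detection benchmarks such as PASCAL VOC decide whether a predicted box counts as correct by its overlap with the ground-truth box, measured as intersection over union (IoU). A minimal sketch follows; the box format (x_min, y_min, x_max, y_max) and the sample boxes are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))   # partial overlap: 1/7
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))   # identical boxes
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))   # disjoint boxes
```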
Task: Assign each image pixel a category label (person, wall, road, dog, ..)
Example Dataset: PASCAL VOC2012 Challenge
Models: SegModel, DeepLabv2, and many more
Image Semantic Segmentation
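Whatever network is used, the last step of semantic segmentation is usually the same: the model outputs one score map per class, and each pixel receives the label of its highest score. A minimal sketch with made-up scores; the class list comes from the slide, while the array layout (classes, height, width) and `label_map` are assumptions.

```python
import numpy as np

CLASSES = ["person", "wall", "road", "dog"]

def label_map(scores):
    """scores: (n_classes, H, W) array -> (H, W) array of class indices."""
    return np.argmax(scores, axis=0)

# Made-up scores for a 2x2 image: each pixel favors a different class.
scores = np.zeros((len(CLASSES), 2, 2))
scores[0, 0, 0] = 5.0   # top-left pixel looks like "person"
scores[1, 0, 1] = 5.0   # top-right pixel looks like "wall"
scores[2, 1, 0] = 5.0   # bottom-left pixel looks like "road"
scores[3, 1, 1] = 5.0   # bottom-right pixel looks like "dog"

labels = label_map(scores)
print([[CLASSES[i] for i in row] for row in labels])
```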
Task: Given an image and a question, answer the question.
Question Answering
Task: Given a dataset of images, generate new artificial images that look real!
State of the art: Generative Adversarial Networks (GANs)
Image Generation
Steering the wheels of self driving cars
Super resolution
Image completion
Saliency detection
Human pose detection
Facial keypoints (nose, eye, ear..)
Image captioning
Activity recognition
And many, many more tasks ….
SagivTech Traffic Lights Detection using DL
With dlib and a few images from Google Street View
https://www.youtube.com/watch?v=jg444J2AmOI
https://www.youtube.com/watch?v=YV4y1iqo_TQ
What can you do to get into this world?
• Theory:
– Get to know basic computer vision and some "classical" algorithms, e.g.
Viola-Jones, SIFT, etc.
– Get to know Deep Learning, e.g. Stanford's CS231n
• Practice:
– Get to know and use OpenCV
– Get hands-on experience with Caffe, TensorFlow, etc.
Technology and Professional Services company
Established in 2009 and headquartered in Israel
SagivTech Snapshot
• What we do:
• Technological
Solutions
• Projects
• Research
• Core domains:
• Computer Vision
• Deep Learning
• Code Optimization
• GPU Computing
Thanks to the following SagivTech team members and
collaborators:
Acknowledgements
• Prof. Peter Maass, University of Bremen
• Prof. Pierre Vandergheynst, EPFL
• Dov Eilot, SagivTech
• Jacob Gildenblat, SagivTech
• Amir Egozi, SceneNet project
Thank You
For more information please contact
Chen Sagiv
chen@sagivtech.com
+972547706089
