2. PROFILE
Igi Ardiyanto
Fields of Interest:
Robotics
Computer Vision
Intelligent Transportation System
Embedded System
Parallel Computing
Deep Learning
More information:
http://te.ugm.ac.id/~igi
4. COMPUTER VISION
Make computers understand images and video
What kind of scene?
Where are the people?
How far is the building?
Where is Waldo?
Like when a human “sees” something…
5. VISION IS REALLY HARD
Vision is an amazing feat of natural intelligence
Visual cortex occupies about 50% of the Macaque brain
More of the human brain is devoted to vision than to anything else
Wait… wait… Is this a toy or food, man?
6. OPTICAL CHARACTER RECOGNITION (OCR)
Digit recognition, AT&T labs
http://www.research.att.com/~yann/
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software
License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
10. MACHINE LEARNING
Machine learning is programming computers to optimize a
performance criterion using example data or past experience.
There is no need to “learn” to calculate payroll
Learning is used when:
Human expertise does not exist (navigating on Mars)
Humans are unable to explain their expertise (speech recognition)
Solution changes in time (routing on a computer network)
Solution needs to be adapted to particular cases (user biometrics)
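To make "optimize a performance criterion using example data" concrete, here is a minimal sketch, not a method from these slides. It assumes the criterion is mean squared error and the model is a straight line; the function name `fit_line`, the learning rate, and the toy data are all illustrative choices.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Example data (assumed for illustration) generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The program is never told the rule behind the data; it recovers the slope and intercept only from the examples.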
11. COMPUTER VISION MEETS MACHINE LEARNING
Example labels: Dog, Cat, Raccoon, Dog
Train: image features + training labels -> training -> learned model
Deploy: image features -> learned model -> prediction
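The train/deploy split above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: images are tiny nested lists of grayscale values, `extract_features` is a made-up two-number feature, and the "learned model" is simply one average feature vector (centroid) per label.

```python
def extract_features(image):
    """Toy feature vector: mean brightness of the top half and bottom half."""
    half = len(image) // 2

    def mean(rows):
        pixels = [p for row in rows for p in row]
        return sum(pixels) / len(pixels)

    return [mean(image[:half]), mean(image[half:])]


def train(images, labels):
    """Train: features + labels -> learned model (one centroid per label)."""
    sums, counts = {}, {}
    for image, label in zip(images, labels):
        feats = extract_features(image)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += f
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, image):
    """Deploy: features -> learned model -> prediction (nearest centroid)."""
    feats = extract_features(image)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))

    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical 2x2 "images": dogs are bright on top, cats bright at the bottom.
training_images = [
    [[200, 210], [20, 30]],
    [[190, 220], [10, 40]],
    [[15, 25], [205, 215]],
    [[30, 20], [180, 230]],
]
training_labels = ["dog", "dog", "cat", "cat"]

model = train(training_images, training_labels)
prediction = predict(model, [[180, 190], [50, 40]])
print(prediction)
```

The same `extract_features` function is used in both phases, which mirrors the "Features" box appearing in both the train and deploy paths.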
12. IMAGE FEATURES ??
Color
Histograms
Shape
…
Slide credit: L. Lazebnik
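As one example of the features listed above, a color histogram can be sketched as follows. This is an illustrative version only: the image is assumed to be a flat list of (r, g, b) tuples with values 0..255, and the function name `color_histogram` and the choice of 4 bins per channel are not from the slides.

```python
def color_histogram(pixels, bins=4):
    """Concatenated per-channel histogram, normalized by the pixel count."""
    hist = [0.0] * (3 * bins)
    bin_width = 256 // bins
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so the top value always lands in the last bin.
            index = min(value // bin_width, bins - 1)
            hist[channel * bins + index] += 1
    total = len(pixels)
    return [count / total for count in hist]


# Four made-up pixels: two reddish, one blue, one greenish.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 200, 30)]
hist = color_histogram(pixels)
print(hist[:4])   # red-channel bins
```

The resulting vector has a fixed length regardless of image size, which is what makes it usable as input to a classifier.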
13. VERY BRIEF TOUR OF SOME CLASSIFIERS
K-nearest neighbor
SVM
Boosted Decision Trees
Neural networks
Naïve Bayes
Bayesian network
Gaussian Logistic regression
Random Forests
RBMs
Etc.
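The first classifier in the list, k-nearest neighbor, is simple enough to sketch in plain Python. This is a minimal illustration assuming squared Euclidean distance and majority voting; the function name `knn_predict` and the toy points are made up.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""

    def sq_dist(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    # Indices of the k training points closest to the query.
    nearest = sorted(range(len(train_points)),
                     key=lambda i: sq_dist(train_points[i]))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (assumed data).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(points, labels, (2, 2))
near_b = knn_predict(points, labels, (7, 9))
print(near_a, near_b)
```

There is no training step here: the "model" is the stored training data, and all the work happens at prediction time.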
17. DEEP LEARNING
1) A host of statistical machine learning techniques
2) Enables the automatic learning of feature hierarchies
3) Generally based on artificial neural networks
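To illustrate what a layered neural network looks like, here is a tiny two-layer forward pass in plain Python. It is only a sketch of the structure: the weights below are hand-picked so that the network computes XOR, whereas in deep learning they would be learned from data. All names (`dense`, `relu`, `forward`, `W1`, `B1`, `W2`, `B2`) are illustrative.

```python
def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# Hand-picked weights (not learned). The hidden layer builds two intermediate
# features from the raw inputs; the output layer combines them.
W1 = [[1.0, 1.0], [1.0, 1.0]]
B1 = [0.0, -1.0]
W2 = [[1.0, -2.0]]
B2 = [0.0]


def forward(x):
    hidden = relu(dense(x, W1, B1))   # first level of features
    return dense(hidden, W2, B2)[0]   # output built on top of them


outputs = [forward([a, b]) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

XOR cannot be computed by a single linear layer, so even this small case shows why stacking layers of intermediate features matters.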
18. BAIDU DEEP SPEECH 2
End-to-end deep learning for English and Mandarin speech recognition
Transition from English to Mandarin made simpler by end-to-end DL
No feature engineering or Mandarin specifics required
More accurate than humans in the reported test
Error rate 3.7% vs. 4% for humans
http://arxiv.org/abs/1512.02595
19. ALPHAGO
First computer program to beat a human Go professional
Training DNNs: 3 weeks, 340 million training steps on 50 GPUs
Play: asynchronous multi-threaded search
Simulations on CPUs, policy and value DNNs in parallel on GPUs
Single machine: 40 search threads, 48 CPUs, and 8 GPUs
Distributed version: 40 search threads, 1202 CPUs, and 176 GPUs
Outcome: beat both European and World Go champions in best-of-5 matches
20. DEEP LEARNING EVERYWHERE
INTERNET & CLOUD
Image Classification
Speech Recognition
Language Translation
Language Processing
Sentiment Analysis
Recommendation
MEDIA & ENTERTAINMENT
Video Captioning
Video Search
Real-Time Translation
AUTONOMOUS MACHINES
Pedestrian Detection
Lane Tracking
Traffic Sign Recognition
SECURITY & DEFENSE
Face Detection
Video Surveillance
Satellite Imagery
MEDICINE & BIOLOGY
Cancer Cell Detection
Diabetic Grading
Drug Discovery
21. So where does Python come in?
Computer Vision, Machine Learning, and Deep Learning with Python
22. WHAT IS PYTHON?
General-purpose interpreted programming language
Widely used by scientists and programmers of all stripes
Supported by many third-party libraries (currently 21,054 on the main Python package website)
Free!
23. WHY IS IT WELL-SUITED TO SCIENCE?
NumPy
Numerical library for Python
Written in C, wrapped by Python
Fast
SciPy
Built on top of NumPy (i.e. also fast!)
Common maths, science, engineering routines
Matplotlib
Hugely flexible plotting library
Similar syntax to Matlab
Produces publication-quality output
24. WHY IS PYTHON BETTER THAN WHAT I USE NOW?
It can do everything
Fast mathematical operations
Easy file manipulation
Format conversion
Plotting
Scripting
Command line
OK, not everything
Write thesis for you
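As a small illustration of the file manipulation and format conversion items above, the sketch below converts CSV text to JSON using only the standard library. The function name `csv_to_json` and the sample data are made up; an in-memory string stands in for a real file.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


# Sample data (assumed); with a real file, read its contents into csv_text.
csv_text = "name,score\nalice,90\nbob,85\n"
json_text = csv_to_json(csv_text)
print(json_text)
```

Note that `csv.DictReader` yields every field as a string, so numeric columns would need an explicit conversion step.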
25. Python has a wide range of deep learning-related libraries available
Low level
High level
(efficient GPU-powered math)
(Theano wrapper, models in Python code, abstracts Theano away)
(wrapper for Theano, YAML, experiment-oriented)
(computer-vision-oriented DL framework, model zoo, prototxt model definitions)
pythonification ongoing!
(Theano extension, models in Python code, Theano not hidden)
and of course: