Deep Learning:
Application landscape
Grigory Sapunov
Private Event / Mar 2018
gs@inten.to
The Context
AI/ML/DL
● Artificial Intelligence (AI) is a broad field of
study dedicated to complex problem solving.
● Machine Learning (ML) is usually considered
a subfield of AI. ML is a data-driven
approach focused on creating algorithms that
can learn from data without
being explicitly programmed.
● Deep Learning (DL) is a subfield of ML focused
on deep neural networks (NN) able to
automatically learn hierarchical
representations.
Different approaches to solving problems
Deep Learning approach
Deep Learning success: why now?
Recent progress
Typical image-related tasks
https://research.facebook.com/blog/learning-to-segment/
The detection task is harder than classification, but both are nearly solved,
and with better-than-human quality.
Human performance is estimated at ~5.1% error rate on this dataset (0.051)
From Lex Fridman slides: https://selfdrivingcars.mit.edu/
Image recognition quality on ImageNet dataset
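A minimal sketch of how such an error rate can be computed, assuming the figures refer to the usual ImageNet top-5 error (the slide does not name the metric). The scores and labels are made up for illustration.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical class scores for 4 samples over 5 classes (illustrative only).
scores = [
    [0.10, 0.50, 0.20, 0.15, 0.05],
    [0.30, 0.05, 0.10, 0.40, 0.15],
    [0.05, 0.10, 0.15, 0.30, 0.40],
    [0.60, 0.20, 0.10, 0.07, 0.03],
]
labels = [1, 0, 2, 4]

print(top_k_error(scores, labels, k=1))
print(top_k_error(scores, labels, k=3))
```

Raising k makes the metric more forgiving, which is why top-1 and top-5 numbers for the same model differ.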
Example: Object Detection
Example: Activity Recognition
Example: Semantic Segmentation
https://stanfordmlgroup.github.io/projects/chexnet/
Example: Radiologist-Level Pneumonia Detection
Example: Image Colorization
Learning Representations for Automatic Colorization https://arxiv.org/abs/1603.06668
Example: Photo-realistic Style Transfer
https://arxiv.org/abs/1703.07511 Deep Photo Style Transfer
Example: Background removal
https://towardsdatascience.com/background-removal-with-deep-learning-c4f2104b3157
Example: Object removal
http://hi.cs.waseda.ac.jp/~iizuka/projects/completion/en/
Example: Image completion
http://hi.cs.waseda.ac.jp/~iizuka/projects/completion/en/
Example: Learning Lip Sync from Audio
http://grail.cs.washington.edu/projects/AudioToObama/
https://www.youtube.com/watch?v=9Yq67CjDqvw
Example: DeepFakes, FakeApp
https://thenextweb.com/artificial-intelligence/2018/02/21/deepfakes-algorithm-nails-donald-trump-in-most-convincing-fake-yet/
New kid on the block: GAN
https://www.technologyreview.com/lists/technologies/2018/
Example: Generating images by GAN
Progressive Growing of GANs for Improved Quality, Stability, and Variation,
https://github.com/tkarras/progressive_growing_of_gans
https://www.youtube.com/watch?v=XOxxPcy5Gr4
GAN rapid evolution
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
https://arxiv.org/abs/1802.07228
Example: Multi-Domain Image-to-Image Translation
https://github.com/yunjey/StarGAN
Example: Unsupervised Image-to-Image Translation
http://research.nvidia.com/publication/2017-12_Unsupervised-Image-to-Image-Translation
https://www.youtube.com/watch?v=nlyXoX2aIek
https://arxiv.org/abs/1703.00848
But...
What’s with the Big Picture?
https://www.engadget.com/2018/01/23/photo-stitch-ai-fail-the-big-picture/
Still some issues exist: Reasoning
Deep learning is mainly about perception, but everyday human reasoning
involves a lot of inference.
● Neural networks lack common sense
● They cannot find information by inference
● They cannot explain their answers
○ Explainability can be a must-have requirement in
some areas, e.g. law, medicine.
○ GDPR is coming
The most fruitful approach is likely to be a hybrid
neural-symbolic system. This is a topic of active research
right now.
Adversarial Examples
https://spectrum.ieee.org/cars-that-think/transportation/sensors/slight-street-sign-modifications-can-fool-machine-learning-algorithms
Robust Adversarial Examples
https://blog.openai.com/robust-adversarial-inputs/
Physical Adversarial Examples
http://www.labsix.org/physical-objects-that-fool-neural-nets/
Adversarial Patch
https://arxiv.org/abs/1712.09665
Computer & Human Adversarial Examples
https://spectrum.ieee.org/the-human-os/robotics/artificial-intelligence/hacking-the-brain-with-adversarial-images
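To show how small an adversarial perturbation can be, here is a sketch of the Fast Gradient Sign Method (FGSM). FGSM is a well-known attack chosen for illustration; it is not necessarily the method used in the linked works. The logistic-regression model, its weights and the input are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, y, eps):
    """Move every input feature by eps in the direction that increases the loss."""
    p = predict(weights, bias, x)
    # For cross-entropy loss, the gradient w.r.t. the input is (p - y) * w.
    grad = [(p - y) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input; the true class of x is 1.
weights = [2.0, -3.0, 1.0]
bias = 0.0
x = [0.2, -0.1, 0.1]
y = 1

x_adv = fgsm(weights, bias, x, y, eps=0.2)
print(predict(weights, bias, x))      # above 0.5: classified as class 1
print(predict(weights, bias, x_adv))  # below 0.5: the prediction flips
```

No feature changes by more than eps, yet the predicted class changes.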
Text Processing / NLP
Deep Learning and NLP
Variety of tasks:
● Classification: language detection, genre and topic detection,
positive/negative sentiment analysis, authorship detection, …
● Fact extraction: people and company names, geography, prices, dates,
product names, …
● Language modeling, part-of-speech tagging
● Key phrase extraction
● Finding synonyms
● Machine translation
● Search (written and spoken)
● Question answering
● Dialog systems
Example: Entity Extraction
https://aws.amazon.com/blogs/aws/amazon-comprehend-continuously-trained-natural-language-processing/
Example: Neural Machine Translation vs. other
https://research.googleblog.com/2016/09/a-neural-network-for-machine.html
Example: Machine Translation Quality Evolution
https://bit.ly/mt_mar2018
Example: Legal document analysis / NDA
https://www.prnewswire.com/news-releases/artificial-intelligence-more-accurate-than-lawyers-for-reviewing-contracts-new-study-reveals-300603781.html
“The highest performing lawyer in the
study achieved 94% accuracy -
matching the AI - while the lowest
performing lawyer achieved an average
67% accuracy. The challenge took the
LawGeex AI 26 seconds to complete,
compared to an average of 92 minutes
for the lawyers. The longest time taken
by a lawyer to complete the test was
156 minutes, and the shortest time was
51 minutes.”
Example: Legal document analysis / Privacy policies
https://www.wired.com/story/polisis-ai-reads-privacy-policies-so-you-dont-have-to/
“In about 30 seconds, Polisis can read
a privacy policy it's never seen before
and extract a readable summary,
displayed in a graphic flow chart, of
what kind of data a service collects,
where that data could be sent, and
whether a user can opt out of that
collection or sharing.”
https://research.googleblog.com/2017/05/efficient-smart-reply-now-for-gmail.html
Example: Text generation / Smart Reply
https://arxiv.org/abs/1708.08151 Automated Crowdturfing Attacks and Defenses in Online Review Systems
Example: Review generation (Human-like!)
Example: Seq2SQL
https://arxiv.org/abs/1709.00103 Seq2SQL: Generating Structured Queries from Natural Language ...
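The Seq2SQL paper splits a query into an aggregation operator, a SELECT column and WHERE conditions. The sketch below only shows how those three predicted parts could be assembled into SQL text. The class, the table name and the example question are hypothetical, and the neural model that predicts the parts is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class QuerySketch:
    """The three parts a Seq2SQL-style model predicts for one question."""
    select_column: str
    aggregation: Optional[str] = None  # e.g. "COUNT", "MAX", or None
    conditions: List[Tuple[str, str, str]] = field(default_factory=list)


def to_sql(query, table):
    """Assemble SQL text. Illustration only: values are not escaped."""
    column = query.select_column
    if query.aggregation:
        column = f"{query.aggregation}({column})"
    sql = f"SELECT {column} FROM {table}"
    if query.conditions:
        where = " AND ".join(f"{col} {op} '{val}'" for col, op, val in query.conditions)
        sql += f" WHERE {where}"
    return sql


# Hypothetical question: "How many players are from Spain?"
sketch = QuerySketch("player", "COUNT", [("country", "=", "Spain")])
print(to_sql(sketch, "players"))
```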
Example: Question Answering
SQuAD: 100,000+ Questions for Machine Comprehension of Text, https://arxiv.org/abs/1606.05250
https://rajpurkar.github.io/SQuAD-explorer/
http://u.cs.biu.ac.il/~yogo/squad-vs-human.pdf
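SQuAD answers are scored with exact match and token-level F1. Below is a simplified sketch of both. The official evaluation script also strips punctuation and articles, which is omitted here, and the example strings are made up.

```python
from collections import Counter


def exact_match(prediction, truth):
    """True if both answers have the same lowercased tokens in the same order."""
    return prediction.lower().split() == truth.lower().split()


def token_f1(prediction, truth):
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    # Multiset intersection counts each shared token at most as often as it occurs in both.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Denver", "denver"))
print(token_f1("in the city of Denver", "Denver"))
```

F1 gives partial credit to an answer that contains the right span plus extra words, while exact match gives none.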
https://blog.drift.com/chatbots-report/
Still many problems with chatbots
http://www.eweek.com/big-data-and-analytics/state-of-chatbots-in-2018-rapidly-moving-into-the-mainstream
Key PointSource findings include:
● When AI is present, about half (49 percent) of consumers are already willing to
shop more frequently, 34 percent will spend more money and 38 percent will
share their experiences with friends and family.
● 51 percent of consumers still anticipate frustrations around chatbots not
understanding what they’re looking for; 44 percent question the accuracy
of the information chatbots provide.
● More than half (54 percent) of consumers would still prefer to talk to a
customer service representative.
● If a customer is on hold with a customer service rep, 34 percent of customers
want to switch to a chatbot after 5 minutes have passed. However, 59
percent get frustrated if a chatbot doesn’t resolve their inquiry in that same
time.
Text + Image / Multimodal learning
DL/Multi-modal Learning
Deep Learning models are becoming multi-modal: they use two or more modalities
simultaneously, e.g.:
● Image caption generation: images + text
● Web search by image: images + text
● Video description: the same, with an added time dimension
● Visual question answering: images + text
● Speech recognition: audio + video (lip motion)
● Image classification and navigation: RGB-D (color + depth)
It will become possible to match different modalities easily.
Example: Caption Generation (text by image)
http://arxiv.org/abs/1411.4555 “Show and Tell: A Neural Image Caption Generator”
Example: NeuralTalk and Walk
Ingredients:
● https://github.com/karpathy/neuraltalk2
Project for learning Multimodal Recurrent Neural Networks that describe
images with sentences
● Webcam/notebook
Result:
● https://vimeo.com/146492001
More hacking: NeuralTalk and Walk
Example: Video description (text by video)
https://vsubhashini.github.io/s2vt.html
Example: Image generation by text
AttnGAN: Fine-Grained Text to Image Generation with
Attentional Generative Adversarial Networks, https://arxiv.org/abs/1711.10485
Example: Code generation by image
pix2code: Generating Code from a Graphical User Interface Screenshot,
https://arxiv.org/abs/1705.07962
SketchCode: Go from idea to HTML in 5 seconds
Automated front-end development using deep learning
https://blog.insightdatascience.com/automated-front-end-development-using-deep-learning-3169dd086e82
Speech
Speech Recognition: Word Error Rate (WER) [2017]
“Google’s speech recognition technology now has a 4.9% word error rate” (2017)
https://venturebeat.com/2017/05/17/googles-speech-recognition-technology-now-has-a-4-9-word-error-rate/
Microsoft “It can now transcribe human speech with a 5.1% error rate”
http://uk.businessinsider.com/microsofts-speech-recognition-5-1-error-rate-human-level-accuracy-2017-8
IBM. “The company has reached a 5.5 percent word error rate that's nearly on par
with humans.”
https://www.engadget.com/2017/03/10/ibm-speech-recognition-accuracy-record/
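For reference, WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. A minimal sketch with made-up sentences:

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits turning the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost  # match or substitution
            )
    return dp[-1][-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors in 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A WER of about 5% therefore means roughly one word-level error per 20 reference words.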
Speech Recognition: Lip Reading
“This lip reading performance beats a professional lip reader on videos from BBC
television, and we also demonstrate that visual information helps to improve
speech recognition performance even when the audio is available.”
Lip Reading Sentences in the Wild, https://arxiv.org/abs/1611.05358
“To the best of our knowledge, LipNet is the first end-to-end sentence-level
lipreading model that simultaneously learns spatiotemporal visual features and a
sequence model. On the GRID corpus, LipNet achieves 95.2% accuracy in
sentence-level, overlapped speaker split task, outperforming experienced human
lipreaders and the previous 86.4% word-level state-of-the-art accuracy.“
LipNet: End-to-End Sentence-level Lipreading, https://arxiv.org/abs/1611.01599
Case: Amazon Echo
Amazon Alexa is in more than 20 million devices. The vast majority of these are in the
Amazon Echo portfolio.
https://www.voicebot.ai/2017/10/27/bezos-says-20-million-amazon-alexa-devices-sold/
Case: Skype Live Translation
Translating voice calls and video calls in 8 languages and instant messages in over 50.
https://www.skype.com/en/features/skype-translator/
Case: Google Pixel Buds
Google packed its headphones (in combination with the Pixel 2) with the power to
translate between 40 languages, literally in real-time. The company has finally done
what science fiction and countless Kickstarters have been promising us, but failing
to deliver on, for years. This technology could fundamentally change how we
communicate across the global community.
https://www.engadget.com/2017/10/04/google-pixel-buds-translation-change-the-world/
● “Our approach does not use complex linguistic and acoustic features as input. Instead, we generate
human-like speech from text using neural networks trained using only speech examples and
corresponding text transcripts.”
Speech Synthesis: Tacotron 2 (Google, 2017)
https://research.googleblog.com/2017/12/tacotron-2-generating-human-like-speech.html
● “Deep Voice 3 introduces a completely novel neural network architecture for speech synthesis. This
novel architecture trains an order of magnitude faster, allowing us to scale over 800 hours of
training data and synthesize speech from over 2,400 voices, which is more than any other
previously published text-to-speech model.”
Speech Synthesis: Deep Voice 3 (Baidu, 2017)
http://research.baidu.com/deep-voice-3-2000-speaker-neural-text-speech/
But the same problem with adversarial examples...
Did you hear that? Adversarial Examples Against Automatic Speech Recognition
https://arxiv.org/abs/1801.00554
[Robotic] Control
Drone control
http://www.digitaltrends.com/cool-tech/swiss-drone-ai-follows-trails/
This drone can automatically follow forest
trails to track down lost hikers
Car control
Meet the 26-Year-Old Hacker Who Built a
Self-Driving Car... in His Garage
https://www.youtube.com/watch?v=KTrgRYa2wbI
Car driving
https://www.youtube.com/watch?v=YuyT2SDcYrU
“Actually a “Perception to Action” system. The visual perception and control
system is a Deep learning architecture trained end to end to transform pixels
from the cameras into steering angles. And this car uses regular color cameras,
not LIDARS like the Google cars. It is watching the driver and learns.”
Example: Sensorimotor Deep Learning
“In this project we aim to develop deep learning techniques that can be deployed
on a robot to allow it to learn directly from trial-and-error, where the only
information provided by the teacher is the degree to which it is succeeding at the
current task.”
http://rll.berkeley.edu/deeplearningrobotics/
Games
https://blog.openai.com/dota-2/
https://blog.openai.com/more-on-dota-2/
AlphaGo Lee: Computer-Human 4:1
AlphaGo Zero
AlphaZero
Poker: Libratus
http://www.dailymail.co.uk/sciencetech/article-4177262/AI-beats-professional-poker-players-Pittsburgh.html
https://fr.pokernews.com/news/2017/01/ai-bot-libratus-poker-no-limit-wins-science-32312.htm
“The research has implications for situations where information is incomplete and
misinformation can be given, such as business negotiations, military strategy,
cybersecurity and planning of medical treatments.”
ML for Systems
ML in datacenters
“We’ve managed to reduce the amount of energy we use for cooling by up to 40 percent.”
https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/
Device Placement with Reinforcement Learning
Device Placement Optimization with Reinforcement Learning
https://arxiv.org/abs/1706.04972
Neural Architecture Search
Efficient Neural Architecture Search via Parameter Sharing
https://arxiv.org/abs/1802.03268
Examples
- Improving ML algorithms: Device placement, Architecture search, Optimizer
search, Ensembling, ...
- Optimizing indexes in DB (The Case for Learned Index Structures,
https://arxiv.org/abs/1712.01208)
- Improving datacenter efficiency: optimize cooling, optimize virtual machine
placement, ...
- …
Computer systems are filled with heuristics that work well “in the general case”, but
they generally don’t adapt to the actual pattern of usage and don’t take into account
the available context.
We can use ML anywhere we’re using heuristics to make a decision!
See Jeff Dean talk at NIPS 2017
http://learningsys.org/nips17/assets/slides/dean-nips17.pdf
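As one illustration of replacing a heuristic with a model, here is a toy learned index. A linear model predicts where a key sits in a sorted array, and a search bounded by the model's worst-case error finds the exact position. This is a simplified sketch assuming unique numeric keys, not the recursive model index from the learned-index paper, and the class name and keys are hypothetical.

```python
import bisect


class LearnedIndex:
    """Linear position model over sorted unique keys, with bounded local search."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit: position ~ slope * key + intercept.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Worst-case prediction error defines the search window.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key, or None if it is absent."""
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), len(self.keys))
        hi = max(lo, min(len(self.keys), guess + self.max_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


keys = [3, 8, 15, 21, 34, 55, 89, 144]
index = LearnedIndex(keys)
print(index.lookup(34), index.lookup(4))
```

The better the model fits the key distribution, the smaller the window that has to be searched.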
Examples
Compilers: instruction scheduling, register allocation, loop nest parallelization
strategies, …
Networking: TCP window size decisions, backoff for retransmits, data
compression, ...
Operating systems: process scheduling, buffer cache insertion/replacement, file
system prefetching, …
Job scheduling systems: which tasks/VMs to co-locate on same machine, which
tasks to pre-empt, ...
ASIC design: physical circuit layout, test case selection, …
See Jeff Dean talk at NIPS 2017
http://learningsys.org/nips17/assets/slides/dean-nips17.pdf
Data & Models
No dataset — no deep learning
Deep learning requires a lot of data (otherwise simple models could be better).
But sometimes you have no dataset…
Nonetheless, several approaches are available:
● Transfer learning
● Data augmentation
● Mechanical Turk
● Unsupervised pre-training
● moving towards one-shot and zero-shot learning
● …
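Data augmentation from the list above can be sketched in a few lines. Each label-preserving transform turns one training image into several. Images are plain nested lists here to keep the example dependency-free, and the function names are hypothetical.

```python
def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus three transformed copies."""
    return [image, horizontal_flip(image), vertical_flip(image), rotate_90(image)]


# A tiny 2x2 "image" of pixel intensities (illustrative only).
image = [[1, 2],
         [3, 4]]
for variant in augment(image):
    print(variant)
```

Which transforms are safe depends on the task: flipping a digit or a street sign may change its meaning.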
The data scale versus the model performance
http://www.spacemachine.net/views/2016/3/datasets-over-algorithms
Importance of Datasets
Data & Models vs. Code
Nearly the same state-of-the-art code is available to almost everyone on the market.
Currently the real differentiator is data or trained models (a derivative of the
data). With publicly available code/algorithms and unique data it is possible to
create a better-quality model than with highly specialized code and public
data.
There is room for a new type of infrastructure
● Data and algorithm marketplaces
● Model marketplaces and model repositories
● AutoML (already appearing)
● Model management
● Model quality evaluation
● ...
Hardware
Still some issues exist: Computing power
DL requires a lot of computation. Without a cluster or GPU machines,
much more time is required.
● Currently GPUs (mostly NVIDIA) are the only choice
● FPGAs/ASICs are coming into this field (Google TPU gen. 2, Bitmain Sophon,
Intel 2018+). The situation resembles the path of Bitcoin mining
● Neuromorphic computing is on the rise (IBM TrueNorth, Intel, memristors, etc.)
● Quantum computing can benefit machine learning as well (but it probably won’t be
a desktop or in-house server solution)
NVIDIA slides: http://www.nvidia.com/content/events/geoInt2015/LBrown_DL.pdf
Computing power grows
https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664
Distributed training is a commodity now
Image from: https://github.com/uber/horovod
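The core idea of data-parallel training: each worker computes a gradient on its own shard of data, the gradients are averaged across workers (an allreduce), and every worker applies the same update. The sketch below simulates this in a single process with a one-parameter model. It does not use the Horovod API, and the data and function names are hypothetical.

```python
def local_gradient(w, shard):
    """Gradient of the mean squared error of y ~ w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an allreduce: every worker ends up with the average."""
    return sum(values) / len(values)


def train(shards, w=0.0, lr=0.01, steps=200):
    for _ in range(steps):
        # In a real cluster each gradient is computed on a different machine.
        grads = [local_gradient(w, shard) for shard in shards]
        # With equal-sized shards the average equals the full-batch gradient.
        w -= lr * allreduce_mean(grads)
    return w


# Two simulated workers, each holding half of the data for y = 3x.
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]
w = train(shards)
print(w)  # converges towards 3.0
```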
Case: AlphaGo Zero
https://deepmind.com/blog/alphago-zero-learning-scratch/
Trends: Supercomputer performance (GFLOPS FP64)
https://en.wikipedia.org/wiki/TOP500
Personal Supercomputers
● NVIDIA DGX-1 Server ($149,000)
Performance: 1000 TFLOPS FP16, 125 TFLOPS FP32
* NVIDIA DGX-2 (16 Tesla V100, 2 PFLOPS FP16) has just been announced
● DeepLearning11 ($16,500, contains 10x NVIDIA GeForce GTX 1080 Ti)
Performance: 100 TFLOPS FP32
● NVIDIA Titan V gaming card ($3000): 6.9 TFLOPS FP64 (note: this is FP64, not the
usually reported FP16 performance)
○ Corresponds to the best supercomputer in the world in 2001–2002 (IBM ASCI
White with 7.226 TFLOPS peak speed) and to the supercomputer in 500th place (still
a cool supercomputer) of the TOP500 list in November 2007 (the entry level to the
list was 5.9 TFLOPS)
● For comparison: Huawei Mate 10 smartphone with the Kirin 970 Neural Network
Processing Unit, 1.92 TFLOPS FP16
○ The top-performing supercomputer of 1997 had similar performance (but in FP64)
https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664
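A rough price/performance comparison using only the prices and FP32 figures quoted on this slide. The numbers are peak values as listed; real training throughput depends on much more (memory, interconnect, precision).

```python
# Prices (USD) and peak FP32 performance (TFLOPS) as quoted on the slide.
systems = {
    "NVIDIA DGX-1": {"price_usd": 149_000, "tflops_fp32": 125},
    "DeepLearning11": {"price_usd": 16_500, "tflops_fp32": 100},
}


def tflops_per_1000_usd(price_usd, tflops):
    """Peak TFLOPS bought by every 1000 USD of list price."""
    return tflops / price_usd * 1000


per_k = {
    name: tflops_per_1000_usd(spec["price_usd"], spec["tflops_fp32"])
    for name, spec in systems.items()
}
for name, value in per_k.items():
    print(f"{name}: {value:.2f} TFLOPS FP32 per $1000")
```

On these figures the consumer-GPU box delivers roughly seven times more peak FP32 per dollar, which ignores the DGX-1's FP16 advantage.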
AI at the edge
● NVIDIA Jetson TK1/TX1/TX2
○ 192/256/256 CUDA Cores
○ 64/64/128-bit 4/4/6-Core ARM CPU, 2/4/8 GB Mem
○ Xavier is coming
● Tablets, Smartphones
○ Qualcomm Snapdragon 845
○ Apple A11 Bionic
○ Huawei Kirin 970
● Raspberry Pi 3 (1.2 GHz 4-core)
● Movidius Neural Compute Stick
References:
Hardware for Deep Learning series of posts:
https://blog.inten.to/hardware-for-deep-learning-current-state-and-trends-51c01ebbb6dc
● Part 1: Introduction and Executive summary
● Part 2: CPU
● Part 3: GPU
● Part 4: FPGA
● Part 5: ASIC
● Part 6: Mobile AI
● Part 7: Neuromorphic computing
● Part 8: Quantum computing
Security
https://blog.openai.com/preparing-for-malicious-uses-of-ai/
AI changes the landscape of threats
● Expansion of existing threats
○ The costs of attacks are lowered
■ Set of actors who can carry out attacks expands
■ The rate and scale of attacks can increase
■ The set of potential targets can expand
● Introduction of new threats
○ AI systems can complete tasks that would otherwise be impractical for
humans
○ Exploiting vulnerabilities of AI systems
● Change to the typical character of threats
○ Attacks can be especially effective
○ Finely targeted
○ Difficult to attribute
Many other issues exist as well
● Unintentional forms of AI misuse like algorithmic bias
● Indirect threats: mass unemployment, or other second- or third-order effects
from the deployment of AI technology
● System-level threats that would come from the dynamic interaction between
non-malicious actors, e.g. “race to the bottom” on AI safety
● Existential risks from human-level AI
● Unclear regulation
On the good side
https://ru.linkedin.com/in/grigorysapunov
gs@inten.to
Thanks!

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 


  • 1. Deep Learning: Application landscape Grigory Sapunov Private Event / Mar 2018 gs@inten.to
  • 3. AI/ML/DL ● Artificial Intelligence (AI) is a broad field of study dedicated to complex problem solving. ● Machine Learning (ML) is usually considered a subfield of AI. ML is a data-driven approach focused on creating algorithms that can learn from data without being explicitly programmed. ● Deep Learning (DL) is a subfield of ML focused on deep neural networks (NN) able to automatically learn hierarchical representations.
  • 4. Different approaches to solving problems
  • 8. Typical image-related tasks https://research.facebook.com/blog/learning-to-segment/ The detection task is harder than classification, but both are almost solved, and with better-than-human quality.
  • 9. Human quality is estimated as ~5.1% error rate on this dataset (0.051) From Lex Fridman slides: https://selfdrivingcars.mit.edu/ Image recognition quality on ImageNet dataset
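A minimal sketch of how an error rate like the ~5.1% on slide 9 is computed. It assumes the top-5 convention commonly used for ImageNet (a prediction counts as correct if the true class is among the five highest-scoring classes); the function name `top_k_error` and the toy scores are illustrative only and are not taken from the slides.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k best-scored classes.

    scores: list of per-sample lists, one score per class (higher = more likely)
    labels: list of true class indices, same length as scores
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data (hypothetical): 3 samples, 3 classes.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [2, 2, 0]
print(top_k_error(toy_scores, toy_labels, k=2))
```

With k=1 the same function gives the plain classification (top-1) error.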
  • 14. Example: Image Colorization Learning Representations for Automatic Colorization https://arxiv.org/abs/1603.06668
  • 15. Example: Photo-realistic Style Transfer https://arxiv.org/abs/1703.07511 Deep Photo Style Transfer
  • 19. Example: Learning Lip Sync from Audio http://grail.cs.washington.edu/projects/AudioToObama/ https://www.youtube.com/watch?v=9Yq67CjDqvw
  • 21. New kid on the block: GAN https://www.technologyreview.com/lists/technologies/2018/
  • 22. Example: Generating images by GAN Progressive Growing of GANs for Improved Quality, Stability, and Variation, https://github.com/tkarras/progressive_growing_of_gans https://www.youtube.com/watch?v=XOxxPcy5Gr4
  • 23. GAN rapid evolution The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation https://arxiv.org/abs/1802.07228
  • 24. Example: Multi-Domain Image-to-Image Translation https://github.com/yunjey/StarGAN
  • 25. Example: Unsupervised Image-to-Image Translation http://research.nvidia.com/publication/2017-12_Unsupervised-Image-to-Image-Translation https://www.youtube.com/watch?v=nlyXoX2aIek https://arxiv.org/abs/1703.00848
  • 28. What’s with the Big Picture? https://www.engadget.com/2018/01/23/photo-stitch-ai-fail-the-big-picture/
  • 30. Still some issues exist: Reasoning Deep learning is mainly about perception, but there is a lot of inference involved in everyday human reasoning. ● Neural networks lack common sense ● Cannot find information by inference ● Cannot explain the answer ○ This can be a must-have requirement in some areas, e.g. law and medicine. ○ GDPR is coming The most fruitful approach is likely to be a hybrid neural-symbolic system. This is a topic of active research right now.
  • 36. Computer & Human Adversarial Examples https://spectrum.ieee.org/the-human-os/robotics/artificial-intelligence/hacking-the-brain-with-adversarial-images
  • 38. Deep Learning and NLP Variety of tasks: ● Classification: language detection, genre and topic detection, positive/negative sentiment analysis, authorship detection, … ● Fact extraction: people and company names, geography, prices, dates, product names, … ● Language modeling, part-of-speech tagging ● Key phrase extraction ● Finding synonyms ● Machine translation ● Search (written and spoken) ● Question answering ● Dialog systems
  • 40. Example: Neural Machine Translation vs. other https://research.googleblog.com/2016/09/a-neural-network-for-machine.html
  • 41. Example: Machine Translation Quality Evolution https://bit.ly/mt_mar2018
  • 42. Example: Legal document analysis / NDA https://www.prnewswire.com/news-releases/artificial-intelligence-more-accurate-than-lawyers-for-reviewing-contracts-new-study-reveals-300603781.html “The highest performing lawyer in the study achieved 94% accuracy - matching the AI - while the lowest performing lawyer achieved an average 67% accuracy. The challenge took the LawGeex AI 26 seconds to complete, compared to an average of 92 minutes for the lawyers. The longest time taken by a lawyer to complete the test was 156 minutes, and the shortest time was 51 minutes.”
  • 43. Example: Legal document analysis / Privacy policies https://www.wired.com/story/polisis-ai-reads-privacy-policies-so-you-dont-have-to/ “In about 30 seconds, Polisis can read a privacy policy it's never seen before and extract a readable summary, displayed in a graphic flow chart, of what kind of data a service collects, where that data could be sent, and whether a user can opt out of that collection or sharing.”
  • 45. Example: Review generation (Human-like!) Automated Crowdturfing Attacks and Defenses in Online Review Systems, https://arxiv.org/abs/1708.08151
  • 46. Example: Seq2SQL https://arxiv.org/abs/1709.00103 Seq2SQL: Generating Structured Queries from Natural Language ...
  • 47. Example: Question Answering SQuAD: 100,000+ Questions for Machine Comprehension of Text, https://arxiv.org/abs/1606.05250 https://rajpurkar.github.io/SQuAD-explorer/ http://u.cs.biu.ac.il/~yogo/squad-vs-human.pdf
  • 49. Still many problems with chatbots http://www.eweek.com/big-data-and-analytics/state-of-chatbots-in-2018-rapidly-moving-into-the-mainstream Key PointSource findings include: ● When AI is present, half of (49 percent) consumers are already willing to shop more frequently, 34 percent will spend more money and 38 percent will share their experiences with friends and family. ● 51 percent of consumers still anticipate frustrations around chatbots not understanding what they’re looking for; 44 percent question the accuracy of the information chatbots provide. ● More than half (54 percent) of consumers would still prefer to talk to a customer service representative. ● If a customer is on hold with a customer service rep, 34 percent of customers want to switch to a chatbot after 5 minutes have passed. However, 59 percent get frustrated if a chatbot doesn’t resolve their inquiry in that same time.
  • 50. Text + Image / Multimodal learning
  • 51. DL/Multi-modal Learning Deep Learning models become multi-modal: they use 2+ modalities simultaneously, e.g.: ● Image caption generation: images + text ● Search Web by an image: images + text ● Video description: the same, with an added time dimension ● Visual question answering: images + text ● Speech recognition: audio + video (lip motion) ● Image classification and navigation: RGB-D (color + depth) It will be possible to match different modalities easily.
  • 52. Example: Caption Generation (text by image) http://arxiv.org/abs/1411.4555 “Show and Tell: A Neural Image Caption Generator”
  • 53. Example: NeuralTalk and Walk Ingredients: ● https://github.com/karpathy/neuraltalk2 Project for learning Multimodal Recurrent Neural Networks that describe images with sentences ● Webcam/notebook Result: ● https://vimeo.com/146492001
  • 55. Example: Video description (text by video) https://vsubhashini.github.io/s2vt.html
  • 56. Example: Image generation by text AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks, https://arxiv.org/abs/1711.10485
  • 57. Example: Code generation by image pix2code: Generating Code from a Graphical User Interface Screenshot, https://arxiv.org/abs/1705.07962
  • 58. SketchCode: Go from idea to HTML in 5 seconds Automated front-end development using deep learning https://blog.insightdatascience.com/automated-front-end-development-using-deep-learning-3169dd086e82
  • 60. Speech Recognition: Word Error Rate (WER) [2017] “Google’s speech recognition technology now has a 4.9% word error rate” (2017) https://venturebeat.com/2017/05/17/googles-speech-recognition-technology-now-has-a-4-9-word-error-rate/ Microsoft “It can now transcribe human speech with a 5.1% error rate” http://uk.businessinsider.com/microsofts-speech-recognition-5-1-error-rate-human-level-accuracy-2017-8 IBM. “The company has reached a 5.5 percent word error rate that's nearly on par with humans.” https://www.engadget.com/2017/03/10/ibm-speech-recognition-accuracy-record/
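The word error rates quoted on slide 60 are conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. A minimal sketch under that standard definition follows; the function name `word_error_rate` and the sample sentences are illustrative, and real evaluations normally apply text normalization first, which is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words processed so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing in hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.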
  • 61. Speech Recognition: Lip Reading “This lip reading performance beats a professional lip reader on videos from BBC television, and we also demonstrate that visual information helps to improve speech recognition performance even when the audio is available.” Lip Reading Sentences in the Wild, https://arxiv.org/abs/1611.05358 “To the best of our knowledge, LipNet is the first end-to-end sentence-level lipreading model that simultaneously learns spatiotemporal visual features and a sequence model. On the GRID corpus, LipNet achieves 95.2% accuracy in sentence-level, overlapped speaker split task, outperforming experienced human lipreaders and the previous 86.4% word-level state-of-the-art accuracy.” LipNet: End-to-End Sentence-level Lipreading, https://arxiv.org/abs/1611.01599
  • 62. Case: Amazon Echo Amazon Alexa is in more than 20 million devices. The vast majority of these are in the Amazon Echo portfolio. https://www.voicebot.ai/2017/10/27/bezos-says-20-million-amazon-alexa-devices-sold/
  • 63. Case: Skype Live Translation Translating voice calls and video calls in 8 languages and instant messages in over 50. https://www.skype.com/en/features/skype-translator/
  • 64. Case: Google Pixel Buds Google packed its headphones (in combination with the Pixel 2) with the power to translate between 40 languages, literally in real-time. The company has finally done what science fiction and countless Kickstarters have been promising us, but failing to deliver on, for years. This technology could fundamentally change how we communicate across the global community. https://www.engadget.com/2017/10/04/google-pixel-buds-translation-change-the-world/
  • 65. Speech Synthesis: Tacotron 2 (Google, 2017) ● “Our approach does not use complex linguistic and acoustic features as input. Instead, we generate human-like speech from text using neural networks trained using only speech examples and corresponding text transcripts.” https://research.googleblog.com/2017/12/tacotron-2-generating-human-like-speech.html
  • 66. Speech Synthesis: Deep Voice 3 (Baidu, 2017) ● “Deep Voice 3 introduces a completely novel neural network architecture for speech synthesis. This novel architecture trains an order of magnitude faster, allowing us to scale over 800 hours of training data and synthesize speech from over 2,400 voices, which is more than any other previously published text-to-speech model.” http://research.baidu.com/deep-voice-3-2000-speaker-neural-text-speech/
  • 67. But the same problem with adversarial examples... Did you hear that? Adversarial Examples Against Automatic Speech Recognition https://arxiv.org/abs/1801.00554
  • 68. Did you hear that? Adversarial Examples Against Automatic Speech Recognition https://arxiv.org/abs/1801.00554
  • 70. Drone control http://www.digitaltrends.com/cool-tech/swiss-drone-ai-follows-trails/ This drone can automatically follow forest trails to track down lost hikers
  • 71. Car control Meet the 26-Year-Old Hacker Who Built a Self-Driving Car... in His Garage https://www.youtube.com/watch?v=KTrgRYa2wbI
  • 72. Car driving https://www.youtube.com/watch?v=YuyT2SDcYrU “Actually a “Perception to Action” system. The visual perception and control system is a Deep learning architecture trained end to end to transform pixels from the cameras into steering angles. And this car uses regular color cameras, not LIDARS like the Google cars. It is watching the driver and learns.”
  • 73. Example: Sensorimotor Deep Learning “In this project we aim to develop deep learning techniques that can be deployed on a robot to allow it to learn directly from trial-and-error, where the only information provided by the teacher is the degree to which it is succeeding at the current task.” http://rll.berkeley.edu/deeplearningrobotics/
  • 78. Poker: Libratus http://www.dailymail.co.uk/sciencetech/article-4177262/AI-beats-professional-poker-players-Pittsburgh.html https://fr.pokernews.com/news/2017/01/ai-bot-libratus-poker-no-limit-wins-science-32312.htm “The research has implications for situations where information is incomplete and misinformation can be given, such as business negotiations, military strategy, cybersecurity and planning of medical treatments.”
  • 80. ML in datacenters “We’ve managed to reduce the amount of energy we use for cooling by up to 40 percent.” https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/
  • 81. Device Placement with Reinforcement Learning Device Placement Optimization with Reinforcement Learning https://arxiv.org/abs/1706.04972
  • 82. Neural Architecture Search Efficient Neural Architecture Search via Parameter Sharing https://arxiv.org/abs/1802.03268
  • 83. Examples - Improving ML algorithms: Device placement, Architecture search, Optimizer search, Ensembling, ... - Optimizing indexes in DB (The Case for Learned Index Structures, https://arxiv.org/abs/1712.01208) - Improving datacenter efficiency: optimize cooling, optimize virtual machine placement, ... - … Computer systems are filled with heuristics that work well “in the general case”, but they generally don’t adapt to the actual pattern of usage and don’t take the available context into account. We can use ML anywhere we’re using heuristics to make a decision! See Jeff Dean talk at NIPS 2017 http://learningsys.org/nips17/assets/slides/dean-nips17.pdf
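To make the "replace a heuristic with a learned model" idea on slide 83 concrete, here is a toy sketch in the spirit of a learned index: a linear model predicts where a key sits in a sorted array, and a bounded binary search corrects the prediction. This is a simplified illustration, not the method of the cited paper; the class `LearnedIndex`, the single linear model, and the assumption of unique numeric keys are all choices made for this example.

```python
import bisect


class LearnedIndex:
    """Toy learned index over unique, sorted numeric keys (illustrative only)."""

    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        n = len(self.keys)
        if n == 0:
            raise ValueError("need at least one key")
        # Least-squares fit: position ~ slope * key + intercept.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (p - mean_pos) for p, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # The worst prediction error on the stored keys bounds the search window.
        self.max_err = max(abs(p - self._predict(k)) for p, k in enumerate(self.keys))

    def _predict(self, key):
        # Predicted position, clamped to a valid array index.
        guess = int(round(self.slope * key + self.intercept))
        return min(max(guess, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        # Binary search only inside the window the model's error bound allows.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


demo_keys = [3, 8, 15, 21, 40, 41, 77, 100]
index = LearnedIndex(demo_keys)
print(index.lookup(40), index.lookup(50))
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less searching is needed; with a poor fit the window simply grows toward a full binary search.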
  • 84. Examples Compilers: instruction scheduling, register allocation, loop nest parallelization strategies, … Networking: TCP window size decisions, backoff for retransmits, data compression, ... Operating systems: process scheduling, buffer cache insertion/replacement, file system prefetching, … Job scheduling systems: which tasks/VMs to co-locate on same machine, which tasks to pre-empt, ... ASIC design: physical circuit layout, test case selection, … See Jeff Dean talk at NIPS 2017 http://learningsys.org/nips17/assets/slides/dean-nips17.pdf
  • 85. See Jeff Dean talk at NIPS 2017 http://learningsys.org/nips17/assets/slides/dean-nips17.pdf
  • 87. No dataset, no deep learning Deep learning requires a lot of data (otherwise simple models could be better). But sometimes you have no dataset… Nonetheless, several workarounds are available: ● Transfer learning ● Data augmentation ● Mechanical Turk ● Unsupervised pre-training ● Moving towards one-shot and zero-shot learning ● …
  • 88. The data scale versus the model performance
  • 90. Data & Models vs. Code Almost the same state-of-the-art code is available to everyone on the market. Currently the real differentiator is data or trained models (a derivative of the data). Using publicly available code/algorithms with unique data, it is possible to create a better-quality model than with highly specialized code and public data. There is room for a new type of infrastructure ● Data and algorithm marketplaces ● Model marketplaces and model repositories ● AutoML (already appearing) ● Model management ● Model quality evaluation ● ...
  • 92. Still some issues exist: Computing power DL requires a lot of computation. Without a cluster or GPU machines, training takes much more time. ● Currently GPUs (mostly NVIDIA) are the only choice ● FPGA/ASIC are coming into this field (Google TPU gen.2, Bitmain Sophon, Intel 2018+). The situation resembles the path of Bitcoin mining ● Neuromorphic computing is on the rise (IBM TrueNorth, Intel, memristors, etc) ● Quantum computing can benefit machine learning as well (but it probably won’t be a desktop or in-house server solution)
  • 95. Distributed training is a commodity now Image from: https://github.com/uber/horovod
  • 97. Trends: Supercomputer performance (GFLOPS FP64) https://en.wikipedia.org/wiki/TOP500
  • 98. Personal Supercomputers ● NVIDIA DGX-1 Server ($149,000) Performance: 1000 TFLOPS FP16, 125 TFLOPS FP32 * NVIDIA DGX-2 (16 Tesla V100, 2 PFLOPS FP16) has just been announced ● DeepLearning11 ($16,500, contains 10x NVIDIA GeForce GTX 1080 Ti) Performance: 100 TFLOPS FP32 ● NVIDIA Titan V gaming card ($3000) 6.9 TFLOPS FP64 (note: this is FP64, not the usually reported FP16 performance) ○ Corresponds to the best supercomputer in the world in 2001–2002 (IBM ASCI White with 7.226 TFLOPS peak speed) and to the supercomputer in 500th place (still a cool supercomputer) on the TOP500 list in November 2007 (the entry level to the list was 5.9 TFlop/s) ● For comparison: Huawei Mate 10 smartphone with Kirin 970 Neural Network Processing Unit, 1.92 TFLOPS FP16 ○ The top-performing supercomputer of 1997 had similar performance (but in FP64) https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664
  • 99. AI at the edge ● NVIDIA Jetson TK1/TX1/TX2 ○ 192/256/256 CUDA Cores ○ 64/64/128-bit 4/4/6-Core ARM CPU, 2/4/8 GB memory ○ Xavier is coming ● Tablets, Smartphones ○ Qualcomm Snapdragon 845 ○ Apple A11 Bionic ○ Huawei Kirin 970 ● Raspberry Pi 3 (1.2 GHz 4-core) ● Movidius Neural Compute Stick
  • 100. References: Hardware for Deep Learning series of posts: https://blog.inten.to/hardware-for-deep-learning-current-state-and-trends-51c01ebbb6dc ● Part 1: Introduction and Executive summary ● Part 2: CPU ● Part 3: GPU ● Part 4: FPGA ● Part 5: ASIC ● Part 6: Mobile AI ● Part 7: Neuromorphic computing ● Part 8: Quantum computing
  • 103. AI changes the landscape of threats ● Expansion of existing threats ○ The costs of attacks are lowered ■ Set of actors who can carry out attacks expands ■ The rate and scale of attacks can increase ■ The set of potential targets can expand ● Introduction of new threats ○ AI systems can complete tasks that would be otherwise impractical for humans ○ Exploiting vulnerabilities of AI systems ● Change to the typical character of threats ○ Attacks can be especially effective ○ Finely targeted ○ Difficult to attribute
  • 104. Many other issues exist as well ● Unintentional forms of AI misuse like algorithmic bias ● Indirect threats: mass unemployment, or other second- or third-order effects from the deployment of AI technology ● System-level threats that would come from the dynamic interaction between non-malicious actors, e.g. “race to the bottom” on AI safety ● Existential risks from human-level AI ● Unclear regulation
  • 105. On the good side