3. Internet of Things Group 3
https://software.intel.com/en-us/openvino-toolkit
4. Artificial Intelligence, Machine Learning & Deep Learning
Artificial Intelligence: the ability of machines to learn from experience, without explicit programming, in order to perform cognitive functions associated with the human mind.
Machine Learning: algorithms whose performance improves as they are exposed to more data over time.
Deep Learning: a subset of machine learning in which multi-layered neural networks learn from vast amounts of data.
5. Deep Learning Basics
Training: forward and backward passes over lots of labeled data (e.g. images a human has labeled "bicycle" or "strawberry"). The error between the forward-pass prediction and the label is propagated backward to update the model weights.
Inference: a forward pass only; the trained model weights are used to answer "Bicycle?" for new, unlabeled data.
(Chart: accuracy grows with data set size.)
Did you know? Training with a large data set AND a deep (many-layered) neural network often leads to the highest-accuracy inference.
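The training-versus-inference split above can be sketched in a few lines of plain Python (a hypothetical one-weight model, not OpenVINO code): training runs forward and backward passes to fit the weight; inference is only the forward pass.

```python
def forward(w, x):
    # Forward pass of a one-weight "model": prediction = w * x
    return w * x

# Training: labeled data plus forward + backward passes.
# The label rule here is y = 2x, so training should drive w toward 2.0.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
for _ in range(50):                   # epochs
    for x, y in data:
        pred = forward(w, x)          # forward pass
        grad = 2 * (pred - y) * x     # backward pass: gradient of squared error
        w -= 0.05 * grad              # update the model weight
print(round(w, 2))                    # learned weight, ~2.0

# Inference: a forward pass only, on new, unlabeled input
print(forward(w, 5.0))
```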
7. OpenVINO™ Toolkit: Cross-Platform Tool to Accelerate Computer Vision & Deep Learning Inference Performance
software.intel.com/openvino-toolkit
Intel® Deep Learning Deployment Toolkit: Model Optimizer (convert & optimize) -> IR -> Inference Engine (optimized inference). IR = Intermediate Representation file. For Intel® CPU & CPU with integrated graphics.
Traditional Computer Vision Tools & Libraries: OpenCV*, OpenVX*, Photography Vision - optimized libraries.
Increase media/video/graphics performance: Intel® Media SDK (open source version), OpenCL™ drivers & runtimes. For CPU with integrated graphics.
Optimize Intel® FPGA: FPGA RunTime Environment (from Intel® FPGA SDK for OpenCL™), bitstreams. FPGA - Linux* only.
Code samples & 10 pre-trained models.
OS support: CentOS* 7.4 (64 bit), Ubuntu* 16.04.3 LTS (64 bit), Microsoft Windows* 10 (64 bit), Yocto Project* version Poky Jethro v2.0.3 (64 bit). Intel® Architecture-based platforms support.
OpenVX and the OpenVX logo are trademarks of the Khronos Group Inc. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.
8. Discover Intel® OpenVINO™ Toolkit Capabilities
10. Intel Architecture: Multicore / Multithreading / Vectorization
https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions
13. Intel® FPGA
Intel® Arria® 10 GX FPGA
• High-performance, multi-gigabit SERDES transceivers, up to 15 Gbps
• 1,150K logic elements available (-2L speed grade)
• 53 Mb of embedded memory
On-board memory
• 8 GB DDR4 memory banks with error correction code (ECC) (2 banks)
• 1 Gb (128 MB) flash
Interfaces
• PCIe x8 Gen3 electrical, x16 mechanical
• USB 2.0 interface for debug and programming of FPGA and flash memory
• 1X QSFP+ with 4X 10GbE or 40GbE support
• Standard height, 1/2 length
• Low-profile option upon request
17. Starting with OpenCV* & OpenVX*
OpenCV*: a well-established, open source computer vision library with a wide variety of algorithms and functions available. Includes the Intel® Photography Vision Library and Intel-optimized functions for faster performance on Intel hardware - basic building blocks to speed performance, cut development time & allow customization, in an all-in-one package.
OpenVX*: targeted at real-time, low-power applications; graph-based representation, optimization & execution; 11 samples included.
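The graph-based execution idea can be illustrated with a toy sketch (hypothetical stand-in kernels, not the OpenVX* API): processing nodes are declared once, then the whole pipeline is executed - and, in a real runtime, optimized and scheduled - as a graph.

```python
# Hypothetical stand-in kernels; a real OpenVX graph holds vision operations.
graph = [("blur", lambda x: x + 1),
         ("gradient", lambda x: x * 2),
         ("threshold", lambda x: x > 10)]

def execute(graph, value):
    # A real runtime could fuse nodes or schedule them across devices;
    # here we simply run them in declaration order.
    for name, kernel in graph:
        value = kernel(value)
    return value

print(execute(graph, 9))    # ((9 + 1) * 2) > 10 -> True
```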
18. OpenCV* Samples
• PeopleDetect: demonstrates the use of the HoG descriptor.
• Background foreground segmentation: demonstrates background segmentation.
• Dense Optical Flow: demonstrates the use of dense optical flow algorithms.
• PVL Face Detection and Recognition: demonstrates Intel® PVL face detection and recognition algorithms.
• Colorization: demonstrates recoloring grayscale images with a DNN.
• OpenCL Custom Kernel: demonstrates running custom OpenCL kernels by means of the OpenCV T-API interface.
19. OpenVX™ Samples
• GStreamer Interoperability
• Video Stabilization
• Hetero Basic
• Camera Tampering
• Census Transform
• Custom OpenCL™ Kernel
• Face Detection
• Auto-Contrast
• Lane Detection
• Motion Detection
• Color Copy Pipeline
21. Intel® Deep Learning Deployment Toolkit (DLDT): Take Full Advantage of the Power of Intel® Architecture for Deep Learning
Trained models from Caffe*, TensorFlow* or MxNet* are converted & optimized by the Model Optimizer to fit all targets, producing IR files (IR = Intermediate Representation format). The Inference Engine loads the IR and infers on it through hardware plugins:
• CPU Plugin - extensibility via C++
• GPU Plugin - extensibility via OpenCL™
• FPGA Plugin - extensibility via OpenCL/TBD
• Myriad Plugin - extensibility TBD
Model Optimizer
What it is: a preparation step -> imports trained models.
Why important: optimizes for performance/space with conservative topology transformations; the biggest boost is from conversion to data types matching hardware.
Inference Engine
What it is: a high-level inference API - a common API (C++) for optimized cross-platform inference.
Why important: the interface is implemented as dynamically loaded plugins for each hardware type, delivering the best performance for each type without requiring users to implement and maintain multiple code pathways.
GPU = Intel CPU with integrated graphics processing unit / Intel® Processor Graphics.
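One way to see why "conversion to data types matching hardware" saves space: FP16 weights take half the bytes of FP32, at the cost of some precision. A small sketch using Python's struct module (an illustration of the storage trade-off, not the Model Optimizer's actual implementation):

```python
import struct

weights = [0.1, -1.5, 2.25, 0.0]
fp32 = struct.pack('<%df' % len(weights), *weights)   # 4 bytes per weight
fp16 = struct.pack('<%de' % len(weights), *weights)   # 2 bytes per weight
print(len(fp32), len(fp16))                           # 16 8

# FP16 is lossy: 0.1 is not exactly representable in half precision
roundtrip = struct.unpack('<%de' % len(weights), fp16)
print(abs(roundtrip[0] - 0.1) < 1e-3)                 # True (small rounding error)
```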
22. Deep Learning Inference Workflow
Steps: 1. Train -> 2. Prepare model -> 3. Inference
Inference targets and math libraries: CPU (Xeon/Core/Atom) via MKL-DNN; GPU via clDNN; FPGA via DLA; VPU via Myriad.
23. Public Models
Image Classification
Object Detection
Semantic Segmentation
Neural Style Transfer
25. Run Model Optimizer
python mo.py --input_model alexnet.caffemodel --input data --input_shape [1,3,227,227] --data_type FP32 --log_level DEBUG
Output: alexnet.xml + alexnet.bin
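For reference, --input_shape [1,3,227,227] follows the NCHW layout the Caffe* AlexNet input uses: batch size 1, 3 color channels, 227x227 pixels. A quick sketch of the blob that shape describes:

```python
# NCHW: batch (N), channels (C), height (H), width (W)
n, c, h, w = 1, 3, 227, 227
num_values = n * c * h * w
print(num_values)          # 154587 input values per inference

# A nested-list stand-in for one input blob of that shape:
blob = [[[[0.0] * w for _ in range(h)] for _ in range(c)] for _ in range(n)]
print(len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))  # 1 3 227 227
```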
26. TensorFlow* Inception V3 Model
python mo.py --input_model inception_v3_frozen.pb --input_shape [1,299,299,3] --mean_values [127.5,127.5,127.5] --scale_values [127.5,127.5,127.5]
Model Optimizer arguments:
Common parameters:
- Path to the Input Model: TF_inception_v3/inception_v3_frozen.pb
- Path for generated IR: TF_inception_v3/FP32
- IR output name: inception_v3_frozen
- Log level: ERROR
- Batch: Not specified, inherited from the model
- Input layers: Not specified, inherited from the model
- Output layers: Not specified, inherited from the model
- Input shapes: [1,299,299,3]
- Mean values: [127.5,127.5,127.5]
- Scale values: [127.5,127.5,127.5]
- Scale factor: Not specified
- Precision of IR: FP32
- Enable fusing: True
- Enable grouped convolutions fusing: True
- Move mean values to preprocess section: False
- Reverse input channels: False
TensorFlow specific parameters:
- Input model in text protobuf format: False
- Offload unsupported operations: False
- Path to model dump for TensorBoard: None
- Update the configuration file with input/output node names: None
- Operations to offload: None
- Patterns to offload: None
- Use the config file: None
Model Optimizer version: 1.2.110.59f62983
[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: TF_inception_v3/FP32/inception_v3_frozen.xml
[ SUCCESS ] BIN file: TF_inception_v3/FP32/inception_v3_frozen.bin
[ SUCCESS ] Total execution time: 4.97 seconds.
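The --mean_values/--scale_values arguments above fold a per-pixel normalization, out = (pixel - mean) / scale, into the generated IR; with 127.5 for both, 0..255 pixel values map onto the -1..1 range this model was trained with. A minimal sketch:

```python
mean, scale = 127.5, 127.5

def normalize(pixel):
    # The preprocessing the Model Optimizer folds into the IR
    return (pixel - mean) / scale

print(normalize(0), normalize(127.5), normalize(255))   # -1.0 0.0 1.0
```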
Run the classification sample on the generated IR:
classification_sample.exe -m inception_v3_frozen.xml -i .democar.png
[ INFO ] InferenceEngine:
API version ............ 1.1
Build .................. 11653
[ INFO ] Parsing input parameters
[ INFO ] Loading plugin
API version ............ 1.1
Build .................. win_20180511
Description ....... MKLDNNPlugin
[ INFO ] Loading network files:
inception_v3_frozen.xml
inception_v3_frozen.bin
[ INFO ] Preparing input blobs
[ WARNING ] Image is resized from (787, 259) to (224, 224)
[ INFO ] Batch size is 1
[ INFO ] Preparing output blobs
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference (1 iterations)
[ INFO ] Average running time of one iteration: 91.8117 ms
[ INFO ] Processing output blobs
Top 10 results:
Image democar.png
752 0.2250938 label racer, race car, racing car
818 0.1749035 label sports car, sport car
512 0.1381157 label convertible
628 0.0980638 label limousine, limo
437 0.0751940 label beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon
480 0.0682043 label car wheel
582 0.0508757 label grille, radiator grille
706 0.0157829 label passenger car, coach, carriage
469 0.0073989 label cab, hack, taxi, taxicab
865 0.0056826 label tow truck, tow car, wrecker
[ INFO ] Execution successful
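The "Top 10 results" listing is just the output blob sorted by probability. A sketch using the first few class scores from the log above:

```python
# class_id -> probability, taken from the sample output above
probs = {752: 0.2250938, 818: 0.1749035, 512: 0.1381157, 628: 0.0980638}

# Sort by probability, highest first, and keep the top entries
top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
for class_id, p in top[:3]:
    print(class_id, p)
```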
27. Pretrained Models
28. person-detection-retail-0001
This is a pedestrian detector for retail scenarios. The model is based on a backbone with hyper-feature + R-FCN.
AP: 80.91%
Pose coverage: standing upright, parallel to the image plane
Support of occluded pedestrians: yes
Occlusion coverage: <50%
Min pedestrian height: 80 pixels (on 1080p)
Max objects to detect: 200
GFlops: 12.584487
MParams: 3.243982
Source framework: Caffe*
Performance (FPS) by runtime:
• Caffe* CPU (Intel® MKL): 3.80
• Inference Engine (Intel® MKL-DNN): 17.45
• Inference Engine (clDNN, FP16): 15.25
• Inference Engine (clDNN, FP32): 10.29
• OpenCV* CPU: 9.27
Average Precision (AP) is defined as the area under the precision/recall curve. The validation dataset consists of about 50,000 images from about 100 scenes.
System configuration: Ubuntu* 16.04, Intel® Core™ i5-6500 CPU @ 2.90GHz (fixed), GPU GT2 @ 1.00GHz (fixed), DDR4 PC17000/2133MHz.
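Average Precision as "area under the precision/recall curve" can be made concrete with a small numeric sketch (the curve points below are made up for illustration; they are not this model's data):

```python
# Hypothetical precision/recall curve points, sorted by increasing recall
recall    = [0.0, 0.2, 0.5, 0.8, 1.0]
precision = [1.0, 0.95, 0.90, 0.80, 0.60]

# Area under the curve via the trapezoidal rule
ap = 0.0
for i in range(1, len(recall)):
    ap += (recall[i] - recall[i-1]) * (precision[i] + precision[i-1]) / 2
print(ap)
```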
29. Using the Inference Engine API
Workflow: read the network description + weights (IR) -> create an engine instance with a hardware plugin -> load the network -> load input blobs + infer.
CNNNetworkReader network_reader;
network_reader.ReadNetwork("Model.xml");
network_reader.ReadWeights("Model.bin");

InferenceEngine::InferenceEnginePluginPtr engine_ptr =
    InferenceEngine::PluginDispatcher(pluginDirs).getSuitablePlugin(TargetDevice::eGPU);
InferencePlugin plugin(engine_ptr);

auto network = network_reader.getNetwork();
auto executable_network = plugin.LoadNetwork(network, {});
auto infer_request = executable_network.CreateInferRequest();

auto inputInfo = network.getInputsInfo();
for (auto & item : inputInfo) {
    auto input_name = item.first;
    auto input = infer_request.GetBlob(input_name);
    // ... fill the input blob with image data ...
}
infer_request.Infer();
30. Deep Learning Object Classification
The output is an array of category probabilities.
Example top result: 928 - 0.543935 - ice cream, icecream
// Read network, create engine and load network (as on the previous slide)
Mat frame, frame2;
for (;;) {
    cap >> frame;                        // OpenCV video capture
    // Resize to the input size the model expects (in the IR .xml)
    resize(frame, frame2, Size(227, 227));
    // Run inference
    long unsigned framesize = frame2.rows * frame2.step1();
    ConvertImageToInput(frame2.data, framesize, *input);
    sts = _plugin->Infer(*input, *output, &dsc);
    // Get top classifier label: argmax over the output blob
    int blobsize = output->size();
    float *data = output->data();
    float max = 0;
    int maxidx = 0;
    for (int i1 = 0; i1 < blobsize; i1++) {
        if (data[i1] > max) {
            max = data[i1];
            maxidx = i1;
        }
    }
    // Do something with the classification data
    imshow("frame", frame2);
    if (waitKey(30) >= 0) break;
}
31. Asynchronous and Heterogeneous
Load input image(s), then chain three models:
Run Inference 1: model vehicle-license-plate-detection-barrier-0007 - detects vehicles.
Run Inference 2: model vehicle-attributes-recognition-barrier-0010 - classifies vehicle attributes.
Run Inference 3: model license-plate-recognition-barrier-0001 - detects license plates.
Display results.
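The slide's three-stage flow can be sketched as a chained pipeline. The functions below are hypothetical stand-ins for the three pretrained models named above; the point is the data flow (detect first, then run the per-vehicle models on each detection), not the real API.

```python
def detect_vehicles(image):                  # stand-in for vehicle-license-plate-detection-barrier-0007
    return [{"box": (10, 10, 100, 80)}]     # hypothetical detection result

def classify_attributes(crop):               # stand-in for vehicle-attributes-recognition-barrier-0010
    return {"type": "car", "color": "white"}

def read_plate(crop):                        # stand-in for license-plate-recognition-barrier-0001
    return "ABC123"

def run_pipeline(image):
    results = []
    for vehicle in detect_vehicles(image):               # inference 1
        crop = vehicle["box"]                            # crop the detected region
        vehicle["attributes"] = classify_attributes(crop)  # inference 2
        vehicle["plate"] = read_plate(crop)                # inference 3
        results.append(vehicle)
    return results

print(run_pipeline("frame.png"))
```

In the real toolkit the second and third models can run asynchronously and on different devices (heterogeneous execution), which is what the slide title refers to.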