3. Internet of Things Group 3
https://software.intel.com/en-us/openvino-toolkit
4. Artificial Intelligence, Machine Learning & Deep Learning
Artificial Intelligence: the ability of machines to learn from experience, without explicit programming, in order to perform cognitive functions associated with the human mind.
Machine Learning: algorithms whose performance improves as they are exposed to more data over time.
Deep Learning: a subset of machine learning in which multi-layered neural networks learn from vast amounts of data.
5. Deep Learning Basics
Training: forward and backward passes over lots of labeled data (e.g. images a human has labeled "bicycle" or "strawberry"). The error between the forward-pass prediction and the label is propagated backward to update the model weights.
Inference: a forward pass only; the trained model weights are used to answer "Bicycle?" for new, unlabeled data.
(Chart: accuracy grows with data set size.)
Did you know? Training with a large data set AND a deep (many-layered) neural network often leads to the highest-accuracy inference.
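The training-versus-inference split above can be sketched in a few lines of plain Python (a hypothetical one-weight model, not OpenVINO code): training runs forward and backward passes to fit the weight; inference is only the forward pass.

```python
def forward(w, x):
    # Forward pass of a one-weight "model": prediction = w * x
    return w * x

# Training: labeled data plus forward + backward passes.
# The label rule here is y = 2x, so training should drive w toward 2.0.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
for _ in range(50):                   # epochs
    for x, y in data:
        pred = forward(w, x)          # forward pass
        grad = 2 * (pred - y) * x     # backward pass: gradient of squared error
        w -= 0.05 * grad              # update the model weight
print(round(w, 2))                    # learned weight, ~2.0

# Inference: a forward pass only, on new, unlabeled input
print(forward(w, 5.0))
```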
7. OpenVINO™ Toolkit: Cross-Platform Tool to Accelerate Computer Vision & Deep Learning Inference Performance
software.intel.com/openvino-toolkit
Intel® Deep Learning Deployment Toolkit: Model Optimizer (convert & optimize) -> IR -> Inference Engine (optimized inference). IR = Intermediate Representation file. For Intel® CPU & CPU with integrated graphics.
Traditional Computer Vision Tools & Libraries: OpenCV*, OpenVX*, Photography Vision - optimized libraries.
Increase media/video/graphics performance: Intel® Media SDK (open source version), OpenCL™ drivers & runtimes. For CPU with integrated graphics.
Optimize Intel® FPGA: FPGA RunTime Environment (from Intel® FPGA SDK for OpenCL™), bitstreams. FPGA - Linux* only.
Code samples & 10 pre-trained models.
OS support: CentOS* 7.4 (64 bit), Ubuntu* 16.04.3 LTS (64 bit), Microsoft Windows* 10 (64 bit), Yocto Project* version Poky Jethro v2.0.3 (64 bit). Intel® Architecture-based platforms support.
OpenVX and the OpenVX logo are trademarks of the Khronos Group Inc. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.
8. Discover Intel® OpenVINO™ Toolkit Capabilities
10. Intel Architecture: Multicore / Multithreading / Vectorization
https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions
13. Intel® FPGA
Intel® Arria® 10 GX FPGA
• High-performance, multi-gigabit SERDES transceivers, up to 15 Gbps
• 1,150K logic elements available (-2L speed grade)
• 53 Mb of embedded memory
On-board memory
• 8 GB DDR4 memory banks with error correction code (ECC) (2 banks)
• 1 Gb (128 MB) flash
Interfaces
• PCIe x8 Gen3 electrical, x16 mechanical
• USB 2.0 interface for debug and programming of FPGA and flash memory
• 1X QSFP+ with 4X 10GbE or 40GbE support
• Standard height, 1/2 length
• Low-profile option upon request
17. Starting with OpenCV* & OpenVX*
OpenCV*: a well-established, open source computer vision library with a wide variety of algorithms and functions available. Includes the Intel® Photography Vision Library and Intel-optimized functions for faster performance on Intel hardware - basic building blocks to speed performance, cut development time & allow customization, in an all-in-one package.
OpenVX*: targeted at real-time, low-power applications; graph-based representation, optimization & execution; 11 samples included.
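The graph-based execution idea can be illustrated with a toy sketch (hypothetical stand-in kernels, not the OpenVX* API): processing nodes are declared once, then the whole pipeline is executed - and, in a real runtime, optimized and scheduled - as a graph.

```python
# Hypothetical stand-in kernels; a real OpenVX graph holds vision operations.
graph = [("blur", lambda x: x + 1),
         ("gradient", lambda x: x * 2),
         ("threshold", lambda x: x > 10)]

def execute(graph, value):
    # A real runtime could fuse nodes or schedule them across devices;
    # here we simply run them in declaration order.
    for name, kernel in graph:
        value = kernel(value)
    return value

print(execute(graph, 9))    # ((9 + 1) * 2) > 10 -> True
```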
18. OpenCV* Samples
• PeopleDetect: demonstrates the use of the HoG descriptor.
• Background foreground segmentation: demonstrates background segmentation.
• Dense Optical Flow: demonstrates the use of dense optical flow algorithms.
• PVL Face Detection and Recognition: demonstrates Intel® PVL face detection and recognition algorithms.
• Colorization: demonstrates recoloring grayscale images with a DNN.
• OpenCL Custom Kernel: demonstrates running custom OpenCL kernels by means of the OpenCV T-API interface.
19. OpenVX™ Samples
• GStreamer Interoperability
• Video Stabilization
• Hetero Basic
• Camera Tampering
• Census Transform
• Custom OpenCL™ Kernel
• Face Detection
• Auto-Contrast
• Lane Detection
• Motion Detection
• Color Copy Pipeline
21. Intel® Deep Learning Deployment Toolkit (DLDT): Take Full Advantage of the Power of Intel® Architecture for Deep Learning
Trained models from Caffe*, TensorFlow* or MxNet* are converted & optimized by the Model Optimizer to fit all targets, producing IR files (IR = Intermediate Representation format). The Inference Engine loads the IR and infers on it through hardware plugins:
• CPU Plugin - extensibility via C++
• GPU Plugin - extensibility via OpenCL™
• FPGA Plugin - extensibility via OpenCL/TBD
• Myriad Plugin - extensibility TBD
Model Optimizer
What it is: a preparation step -> imports trained models.
Why important: optimizes for performance/space with conservative topology transformations; the biggest boost is from conversion to data types matching hardware.
Inference Engine
What it is: a high-level inference API - a common API (C++) for optimized cross-platform inference.
Why important: the interface is implemented as dynamically loaded plugins for each hardware type, delivering the best performance for each type without requiring users to implement and maintain multiple code pathways.
GPU = Intel CPU with integrated graphics processing unit / Intel® Processor Graphics.
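One way to see why "conversion to data types matching hardware" saves space: FP16 weights take half the bytes of FP32, at the cost of some precision. A small sketch using Python's struct module (an illustration of the storage trade-off, not the Model Optimizer's actual implementation):

```python
import struct

weights = [0.1, -1.5, 2.25, 0.0]
fp32 = struct.pack('<%df' % len(weights), *weights)   # 4 bytes per weight
fp16 = struct.pack('<%de' % len(weights), *weights)   # 2 bytes per weight
print(len(fp32), len(fp16))                           # 16 8

# FP16 is lossy: 0.1 is not exactly representable in half precision
roundtrip = struct.unpack('<%de' % len(weights), fp16)
print(abs(roundtrip[0] - 0.1) < 1e-3)                 # True (small rounding error)
```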
22. Deep Learning Inference Workflow
Steps: 1. Train -> 2. Prepare model -> 3. Inference
Inference targets and math libraries: CPU (Xeon/Core/Atom) via MKL-DNN; GPU via clDNN; FPGA via DLA; VPU via Myriad.
23. Public Models
Image Classification
Object Detection
Semantic Segmentation
Neural Style Transfer
25. Run Model Optimizer
python mo.py --input_model alexnet.caffemodel --input data --input_shape [1,3,227,227] --data_type FP32 --log_level DEBUG
Output: alexnet.xml + alexnet.bin
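For reference, --input_shape [1,3,227,227] follows the NCHW layout the Caffe* AlexNet input uses: batch size 1, 3 color channels, 227x227 pixels. A quick sketch of the blob that shape describes:

```python
# NCHW: batch (N), channels (C), height (H), width (W)
n, c, h, w = 1, 3, 227, 227
num_values = n * c * h * w
print(num_values)          # 154587 input values per inference

# A nested-list stand-in for one input blob of that shape:
blob = [[[[0.0] * w for _ in range(h)] for _ in range(c)] for _ in range(n)]
print(len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))  # 1 3 227 227
```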
26. TensorFlow* Inception V3 Model
python mo.py --input_model inception_v3_frozen.pb --input_shape [1,299,299,3] --mean_values [127.5,127.5,127.5] --scale_values [127.5,127.5,127.5]
Model Optimizer arguments:
Common parameters:
- Path to the Input Model: TF_inception_v3/inception_v3_frozen.pb
- Path for generated IR: TF_inception_v3/FP32
- IR output name: inception_v3_frozen
- Log level: ERROR
- Batch: Not specified, inherited from the model
- Input layers: Not specified, inherited from the model
- Output layers: Not specified, inherited from the model
- Input shapes: [1,299,299,3]
- Mean values: [127.5,127.5,127.5]
- Scale values: [127.5,127.5,127.5]
- Scale factor: Not specified
- Precision of IR: FP32
- Enable fusing: True
- Enable grouped convolutions fusing: True
- Move mean values to preprocess section: False
- Reverse input channels: False
TensorFlow specific parameters:
- Input model in text protobuf format: False
- Offload unsupported operations: False
- Path to model dump for TensorBoard: None
- Update the configuration file with input/output node names: None
- Operations to offload: None
- Patterns to offload: None
- Use the config file: None
Model Optimizer version: 1.2.110.59f62983
[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: TF_inception_v3/FP32/inception_v3_frozen.xml
[ SUCCESS ] BIN file: TF_inception_v3/FP32/inception_v3_frozen.bin
[ SUCCESS ] Total execution time: 4.97 seconds.
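The --mean_values/--scale_values arguments above fold a per-pixel normalization, out = (pixel - mean) / scale, into the generated IR; with 127.5 for both, 0..255 pixel values map onto the -1..1 range this model was trained with. A minimal sketch:

```python
mean, scale = 127.5, 127.5

def normalize(pixel):
    # The preprocessing the Model Optimizer folds into the IR
    return (pixel - mean) / scale

print(normalize(0), normalize(127.5), normalize(255))   # -1.0 0.0 1.0
```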
Run the classification sample on the generated IR:
classification_sample.exe -m inception_v3_frozen.xml -i .democar.png
[ INFO ] InferenceEngine:
API version ............ 1.1
Build .................. 11653
[ INFO ] Parsing input parameters
[ INFO ] Loading plugin
API version ............ 1.1
Build .................. win_20180511
Description ....... MKLDNNPlugin
[ INFO ] Loading network files:
inception_v3_frozen.xml
inception_v3_frozen.bin
[ INFO ] Preparing input blobs
[ WARNING ] Image is resized from (787, 259) to (224, 224)
[ INFO ] Batch size is 1
[ INFO ] Preparing output blobs
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference (1 iterations)
[ INFO ] Average running time of one iteration: 91.8117 ms
[ INFO ] Processing output blobs
Top 10 results:
Image democar.png
752 0.2250938 label racer, race car, racing car
818 0.1749035 label sports car, sport car
512 0.1381157 label convertible
628 0.0980638 label limousine, limo
437 0.0751940 label beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon
480 0.0682043 label car wheel
582 0.0508757 label grille, radiator grille
706 0.0157829 label passenger car, coach, carriage
469 0.0073989 label cab, hack, taxi, taxicab
865 0.0056826 label tow truck, tow car, wrecker
[ INFO ] Execution successful
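The "Top 10 results" listing is just the output blob sorted by probability. A sketch using the first few class scores from the log above:

```python
# class_id -> probability, taken from the sample output above
probs = {752: 0.2250938, 818: 0.1749035, 512: 0.1381157, 628: 0.0980638}

# Sort by probability, highest first, and keep the top entries
top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
for class_id, p in top[:3]:
    print(class_id, p)
```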
27. Pretrained Models
28. person-detection-retail-0001
This is a pedestrian detector for retail scenarios. The model is based on a backbone with hyper-feature + R-FCN.
AP: 80.91%
Pose coverage: standing upright, parallel to the image plane
Support of occluded pedestrians: yes
Occlusion coverage: <50%
Min pedestrian height: 80 pixels (on 1080p)
Max objects to detect: 200
GFlops: 12.584487
MParams: 3.243982
Source framework: Caffe*
Performance (FPS) by runtime:
• Caffe* CPU (Intel® MKL): 3.80
• Inference Engine (Intel® MKL-DNN): 17.45
• Inference Engine (clDNN, FP16): 15.25
• Inference Engine (clDNN, FP32): 10.29
• OpenCV* CPU: 9.27
Average Precision (AP) is defined as the area under the precision/recall curve. The validation dataset consists of about 50,000 images from about 100 scenes.
System configuration: Ubuntu* 16.04, Intel® Core™ i5-6500 CPU @ 2.90GHz (fixed), GPU GT2 @ 1.00GHz (fixed), DDR4 PC17000/2133MHz.
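Average Precision as "area under the precision/recall curve" can be made concrete with a small numeric sketch (the curve points below are made up for illustration; they are not this model's data):

```python
# Hypothetical precision/recall curve points, sorted by increasing recall
recall    = [0.0, 0.2, 0.5, 0.8, 1.0]
precision = [1.0, 0.95, 0.90, 0.80, 0.60]

# Area under the curve via the trapezoidal rule
ap = 0.0
for i in range(1, len(recall)):
    ap += (recall[i] - recall[i-1]) * (precision[i] + precision[i-1]) / 2
print(ap)
```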
29. Using the Inference Engine API
Workflow: read the network description + weights (IR) -> create an engine instance with a hardware plugin -> load the network -> load input blobs + infer.
CNNNetworkReader network_reader;
network_reader.ReadNetwork("Model.xml");
network_reader.ReadWeights("Model.bin");

InferenceEngine::InferenceEnginePluginPtr engine_ptr =
    InferenceEngine::PluginDispatcher(pluginDirs).getSuitablePlugin(TargetDevice::eGPU);
InferencePlugin plugin(engine_ptr);

auto network = network_reader.getNetwork();
auto executable_network = plugin.LoadNetwork(network, {});
auto infer_request = executable_network.CreateInferRequest();

auto inputInfo = network.getInputsInfo();
for (auto & item : inputInfo) {
    auto input_name = item.first;
    auto input = infer_request.GetBlob(input_name);
    // ... fill the input blob with image data ...
}
infer_request.Infer();
30. Deep Learning Object Classification
The output is an array of category probabilities.
Example top result: 928 - 0.543935 - ice cream, icecream
// Read network, create engine and load network (as on the previous slide)
Mat frame, frame2;
for (;;) {
    cap >> frame;                        // OpenCV video capture
    // Resize to the input size the model expects (in the IR .xml)
    resize(frame, frame2, Size(227, 227));
    // Run inference
    long unsigned framesize = frame2.rows * frame2.step1();
    ConvertImageToInput(frame2.data, framesize, *input);
    sts = _plugin->Infer(*input, *output, &dsc);
    // Get top classifier label: argmax over the output blob
    int blobsize = output->size();
    float *data = output->data();
    float max = 0;
    int maxidx = 0;
    for (int i1 = 0; i1 < blobsize; i1++) {
        if (data[i1] > max) {
            max = data[i1];
            maxidx = i1;
        }
    }
    // Do something with the classification data
    imshow("frame", frame2);
    if (waitKey(30) >= 0) break;
}
31. Asynchronous and Heterogeneous
Load input image(s), then chain three models:
Run Inference 1: model vehicle-license-plate-detection-barrier-0007 - detects vehicles.
Run Inference 2: model vehicle-attributes-recognition-barrier-0010 - classifies vehicle attributes.
Run Inference 3: model license-plate-recognition-barrier-0001 - detects license plates.
Display results.
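The slide's three-stage flow can be sketched as a chained pipeline. The functions below are hypothetical stand-ins for the three pretrained models named above; the point is the data flow (detect first, then run the per-vehicle models on each detection), not the real API.

```python
def detect_vehicles(image):                  # stand-in for vehicle-license-plate-detection-barrier-0007
    return [{"box": (10, 10, 100, 80)}]     # hypothetical detection result

def classify_attributes(crop):               # stand-in for vehicle-attributes-recognition-barrier-0010
    return {"type": "car", "color": "white"}

def read_plate(crop):                        # stand-in for license-plate-recognition-barrier-0001
    return "ABC123"

def run_pipeline(image):
    results = []
    for vehicle in detect_vehicles(image):               # inference 1
        crop = vehicle["box"]                            # crop the detected region
        vehicle["attributes"] = classify_attributes(crop)  # inference 2
        vehicle["plate"] = read_plate(crop)                # inference 3
        results.append(vehicle)
    return results

print(run_pipeline("frame.png"))
```

In the real toolkit the second and third models can run asynchronously and on different devices (heterogeneous execution), which is what the slide title refers to.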