Discussions about the future of data and artificial intelligence, from the Internet of Things (IoT) to predictive analytics, generally focus on their potential to improve how we live and how products perform, and to predict cataclysmic failures or business problems far faster than ever imagined. But all this potential will remain theoretical if enterprises do not have the processing power to analyze the data at their disposal. When trying to integrate real-time or streaming data into their BI and AI platforms, many organizations experience crashes because they hit the limits of their current processing capabilities. In other words, the expansion in data must be accompanied by an expansion of the capacity to process it.
To extract timely meaning from data in the future, many companies will have to change, adapt, or move on from their current technologies. GPU databases are one of the best ways for enterprises to get full utility from streaming data in real time and to converge big data analytics with machine learning and AI workloads in a single platform.
Webinar Slides: Explaining GPUs to Your CEO
1. Explaining GPUs to Your CEO
The Power of Productization
Dan Woods, CTO and Editor, CITO Research | Amit Vij, CEO and Co-Founder, Kinetica
November 9, 2017
2. Dan Woods
CTO and Editor, CITO Research
Dan Woods is CTO and Editor of CITO Research, a firm that focuses on the membrane between the
world of IT and the hotbeds of advanced technology around the world. Dan’s goal is to help CIOs and
CTOs who lead the application of technology in organizations of all sizes become better leaders. Dan
helps vendors of technology understand how to adapt their technology to the world of IT.
Dan has written more than 25 books, most recently APIs: A Strategy Guide, published by O’Reilly
Media, and is a regular contributor to Forbes.
Amit Vij
CEO and Co-Founder, Kinetica
Amit is responsible for the company’s vision and for its administrative and executive decisions. With a
background in computer engineering, he has over a decade of software development experience in
the commercial and federal space, with an emphasis on analyzing and visualizing big data, and helped
architect Kinetica.
Amit served as the chief GEOINT technical architect as a contractor for a major classified cloud
initiative between the US Army, NSA, and the DIA. Prior to Kinetica, Amit had been chief architect and
a subject matter expert on geospatial intelligence for several Department of Defense and Department
of Homeland Security contracts. Amit received a B.S. in Computer Engineering from the University
of Maryland with concentrations in Computer Science, Electrical Engineering, and Mathematics.
Presenter Bios
3. Why is hardware hot again?
1995: PC, Internet
2005: Mobile, Cloud
2015: AI and IoT
Pervasive use of GPUs
Massively parallel computing needed to tackle the data tsunami and advanced analytics needs on a single database platform
4. Act 1
The End of Moore’s Law
Amit Vij | CEO | Kinetica
5. Life After Moore’s Law
[Figure: 40 Years of Microprocessor Trend Data. Log-scale plot, 1980 to 2020, of transistors (thousands) and single-threaded performance, with growth-rate annotations of 1.5X per year and 1.1X per year.]
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
6. Rise of GPU Computing
[Figure: the same 1980 to 2020 log-scale plot with a GPU-computing performance curve added. Annotations: single-threaded performance, 1.5X per year and 1.1X per year; GPU-computing performance, 1.5X per year, 1000X by 2025.]
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
7. GPU Acceleration Overcomes Processing Bottleneck
5,000+ cores per device versus ~16 cores per typical CPU
High-performance computing is trending toward using GPUs to solve massive processing challenges
GPU acceleration brings high-performance compute to commodity hardware
Parallel processing is ideal for scanning entire datasets and brute-force compute
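The brute-force scan idea on this slide can be sketched in a few lines of Python. This is only an illustration of the decomposition, not how any GPU database is implemented: the names `count_matches` and `parallel_scan`, the worker count, and the sample readings are all hypothetical, and Python threads stand in for GPU cores without delivering a comparable speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matches(chunk, threshold):
    """Brute-force scan of one chunk: test every value, no index involved."""
    return sum(1 for value in chunk if value > threshold)


def parallel_scan(values, threshold, workers=4):
    """Split the dataset into chunks and scan each chunk independently."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is scanned on its own; only the small partial
        # counts are combined at the end.
        partial_counts = pool.map(count_matches, chunks, [threshold] * len(chunks))
        return sum(partial_counts)


readings = list(range(1_000))  # hypothetical sensor readings
matches = parallel_scan(readings, threshold=899)
```

Because no chunk depends on another, the same pattern scales from four workers to thousands of cores, which is the property the slide attributes to GPUs.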
8. GPU-Accelerated, Distributed, Scale-out Architecture
On Demand Scale-out
• CPUs geared more for sequential processing
• GPUs geared more for parallel processing
• CPUs and GPUs are paired together for the best overall
optimization
10. Act 2: What the Heck are GPUs Anyway?
GPUs were developed for making displays more powerful.
Each pixel of a display was assigned a simple processor, not as complex an instruction set as a CPU, but enough to control all the pixels in a massively parallel form.
We have this architecture to thank for sharp and responsive displays.
NVIDIA created an API, the CUDA API, to allow GPUs to be used for other types of computing.
This opened up a powerful new source of computing for particular types of applications.
12. Act 3: Vectors Heart GPUs
It turns out that AI and machine learning algorithms love GPUs.
Many deep learning and ML algorithms involve processing vectors and matrices of numbers.
[Illustration: sample numbers 3, 5, 10, 4]
GPUs add and multiply vectors faster than any other method, a lot faster.
Many of the victories in AI, such as AlphaGo and ImageNet, were achieved through algorithms powered by GPUs.
The victories were also powered by open source, the availability of massive pools of on-demand computing, and the availability of data.
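To make the vector arithmetic concrete, here is a minimal plain-Python sketch of the operations that ML workloads repeat at scale. The function names and the sample weights are made up for illustration, and the code runs sequentially on a CPU; the point is that every output element is computed from its own inputs, so a GPU can assign each one to a separate core.

```python
def add(a, b):
    """Element-wise sum: output[i] depends only on a[i] and b[i]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


def dot(a, b):
    """Multiply matching elements, then sum the products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    """Matrix-vector product: one independent dot product per row."""
    return [dot(row, vector) for row in matrix]


inputs = [3, 5, 10, 4]      # illustrative values
bias = [1, 1, 1, 1]         # illustrative values
weights = [[1, 0, 2, 0],    # hypothetical 2x4 weight matrix
           [0, 1, 0, 1]]

shifted = add(inputs, bias)
layer_output = matvec(weights, inputs)
```

A neural network layer is essentially `matvec` followed by a simple per-element function, repeated over very large matrices, which is why this kind of hardware parallelism matters.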
14. Act 4: The Power of Integrated GPUs
The power of GPUs has been well understood in high tech circles for a while.
Even before the recent wave of AI victories, companies have been figuring out how to harness GPUs for broader enterprise workloads, not just for AI.
There is a fascinating evolution going on right now as different ways of packaging up the power of GPUs are being developed.
The opportunity for the enterprise is to harness the power of GPUs to accelerate analysis.
The challenge vendors face is how to use GPUs to make traditional workloads faster and power AI workloads.
In most companies using big data, the focus has been on batch use because processing has been bound by CPUs.
15. GPU-Based Systems
Make processing and analysis of big data an interactive process that leads to a much faster cycle to create insights.
Package and make AI models widely available.
Allow big data analytics and AI insights to be available in time to matter for real-time business processes.
Absorb and make use of streaming data faster than other methods.
16. The Big Reveal: Implications for CEOs
Find the Bottleneck: CEOs should understand when their use of data and analytics is running into bottlenecks.
Understand the Value: CEOs should understand the value of breaking through those bottlenecks.
Attempt Experiments: CEOs can easily determine if GPUs matter by attempting experiments and POCs.
18. Accelerated Business Intelligence
Tableau + Kinetica
Kinetica combines GPU’s brute-force compute with the
simplicity of a relational database for millisecond query
response on massive data sets without extensive tuning.
• Incredibly fast query performance.
• Distributed design - ideal for large and streaming datasets.
• SQL-92 compliant relational database – without limits.
• More power means less need for tuning, indexing, and
administration of the database.
• No need to do pre-aggregation or build out cubes.
• Reduce reliance on specialized skills to prep and set up data.
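The "no pre-aggregation or cubes" point can be illustrated with an ad hoc aggregate computed directly over raw rows at query time. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in relational engine; it is not Kinetica, and the `sales` table and its contents are hypothetical.

```python
import sqlite3

# In-memory database with raw, un-aggregated rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "widget", 10.0),
        ("east", "gadget", 5.0),
        ("west", "widget", 7.5),
        ("west", "widget", 2.5),
    ],
)

# The aggregate is computed when the question is asked, straight from
# the raw rows, rather than read from a cube built in advance.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales WHERE product = ? "
    "GROUP BY region ORDER BY region",
    ("widget",),
).fetchall()
conn.close()
```

With a pre-built cube, a new filter or grouping typically means rebuilding the cube first. The slide's claim is that enough compute makes this query-time approach fast enough on large tables to skip that step.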
19. Distributed Location-Based Analytics
NATIVE VISUALIZATION IS DESIGNED FOR FAST MOVING, LOCATION-BASED DATA
Native Geospatial Object Types
• Points, Shapes, Tracks, Labels
Native Geospatial Functions
• Filters (by area, by series, by geometry, etc.)
• Aggregation (histograms)
• Geofencing - triggers
• Video generation (based on dates/times)
Generate Map Overlay Imagery (via WMS)
• Rasterize points
• Style based on attributes (class-break)
• Heat maps
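As a rough illustration of the geofencing triggers listed above, the following Python sketch raises an event each time a track crosses a circular fence. It assumes a circular fence and great-circle (haversine) distance. The function names, coordinates, and radius are hypothetical and do not reflect Kinetica's geospatial API.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geofence_events(track, center, radius_km):
    """Return ('enter' | 'exit', point) for each crossing of the fence."""
    events = []
    inside = False
    for lat, lon in track:
        now_inside = haversine_km(lat, lon, center[0], center[1]) <= radius_km
        if now_inside != inside:
            events.append(("enter" if now_inside else "exit", (lat, lon)))
        inside = now_inside
    return events


# Hypothetical vehicle track moving north through a 5 km fence.
fence_center = (38.90, -77.00)
track = [(38.80, -77.00), (38.88, -77.00), (38.90, -77.00), (39.00, -77.00)]
events = geofence_events(track, fence_center, radius_km=5.0)
```

Each point is tested independently against the fence, so this check parallelizes well across many moving objects.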
20. In-Database Machine Learning
[Diagram: Kinetica architecture. Inputs: streaming data and ERP/CRM/transactional data, arriving through ETL/stream processing and parallel ingest. Core: on-demand scale-out (1TB memory / 2 GPU cards) with in-database processing via UDFs (custom logic, BIDMach, ML libs). Interfaces: SQL, native APIs, geospatial WMS, custom connectors. Consumers: BI dashboards, BI/GIS/apps, custom apps and geospatial, Kinetica ‘Reveal’.]
22. ENTERTAINMENT | Customer 360
CASE STUDY : BI ACCELERATION
BUSINESS OBJECTIVE
• Accelerate Tableau dashboards for faster customer 360 analytics
NEW CAPABILITIES DELIVERED
• 24X faster dashboard loads
• 3.5X faster slice and dice, drilldowns, filters
SOLUTION OVERVIEW
• Tableau Server and Kinetica running on Google Cloud Platform
• Kinetica accelerates EDW workload
• Simply point to Kinetica using Tableau’s replace data source feature
[Chart: Load Dashboard and Update Customer Filters response times, Teradata versus Kinetica. Values shown: 120s, 15s, 5s, 4s.]
23. LIFE SCIENCES : GENOMICS RESEARCH
CASE STUDY : ADVANCED IN-DATABASE ANALYTICS
"One of the things I like about Kinetica is it gives us more of a general-purpose use of the technology. There has been a lot of software created to answer certain questions [but] highly specialized tools have limited functionality and are tuned to do a certain workload."
Mark Ramsey, Chief Data Officer at GSK
BUSINESS OBJECTIVE
• Faster processing of transcriptomics to run simulations of
chemical reactions for drug discovery, research, and
development
NEW CAPABILITIES DELIVERED
• In-database processing to develop models, leveraging GPU
acceleration for performance, and direct access to CUDA APIs
via UDFs deployed within Kinetica
• Seek out signals from massive collection of drug targets
combined from external data, historical data from
experiments, and clinical trials
SOLUTION OVERVIEW
• Kinetica running on-premises on a cluster of 7 HPE DL 380
servers
• Familiar relational database with GPU acceleration
24. LOGISTICS | Workforce optimization
BUSINESS OBJECTIVE
• Deliver better business services, optimize operations, and save
costs across 600,000 employees, 215,000 delivery vehicles, and
500 million pieces of mail delivered daily
NEW CAPABILITIES DELIVERED
• Real-time delivery and pickup notifications, shipment routing,
just-in-time supplies
• Real-time route optimization - route planning, rerouting
• Geospatial analytics to uncover overlapping coverage areas,
uncovered areas, and distribution bottlenecks
• Advanced workload optimization for last minute route changes
SOLUTION OVERVIEW
• Kinetica collects, processes, and analyzes 200,000 messages per
minute for real-time streaming analytics, serving 15,000 daily
sessions with five-nines (99.999%) uptime
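As a back-of-the-envelope check on what these figures imply, the short Python calculation below converts them to per-second, per-day, and per-year terms. It assumes a steady message rate around the clock and a 365-day year, neither of which is stated on the slide.

```python
MESSAGES_PER_MINUTE = 200_000  # figure quoted on the slide
AVAILABILITY = 0.99999         # "five nines"

# Assumption: the rate is constant around the clock.
messages_per_second = MESSAGES_PER_MINUTE / 60
messages_per_day = MESSAGES_PER_MINUTE * 60 * 24

# Assumption: a 365-day year.
minutes_per_year = 365 * 24 * 60
downtime_minutes_per_year = minutes_per_year * (1 - AVAILABILITY)

print(f"{messages_per_second:,.0f} messages per second")
print(f"{messages_per_day:,} messages per day")
print(f"{downtime_minutes_per_year:.2f} minutes of downtime per year")
```

Under those assumptions the system handles roughly 3,300 messages per second, or 288 million per day, with a downtime budget of a little over five minutes per year.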
25. Download the O’Reilly Book
Kinetica.com/ebook
info@kinetica.com
Contact: dwoods@EvolvedMedia.com
Follow Dan: @danwoodsearly
Work with Dan:
www.CITOResearch.com: Finding
Technology for Early Adopters (Research
and IT Consulting)
www.EvolvedMedia.com, Helping B2B Tech
vendors find customers through content
marketing.
Editor's Notes
Mark to intro and read bio slide ~2min – Dan to set stage with slide 3 ~2 min
Dan: ~1-2 min
Amit leads this section; Dan w/additional commentary - ~8 min
Amit this section - ~5
GPUs are perfect for Risk Modeling
GPUs are composed of thousands of small, efficient cores that are well suited to performing repeated similar instructions in parallel. This makes them well-suited to compute-intensive workloads on big data.
All Dan ~2-3 min
Dan - ~ 5min
All Dan ~2-3min
Dan ~
All Dan on 14&15 – On 16 Dan will lead with commentary from Amit - ~10min
Dan lead with Amit commentary - Amit – We were doing it before it was “cool”
More to the power than just ML and AI
Dan lead Amit with optional commentary
Dan lead with Amit chiming in ~8 min
Why does sr. mgmt care?
Where are the bottlenecks?
Amit – CEO should never settle for batch answers – everything needs to be real-time – horsepower of the GPUs enables new use cases that were not possible with previous tech – fuse multiple data feeds together for rich analytics
All Amit 8min
Amit to pick 3 use case slides - Dan will ask questions during this section~ 10 min