2. About Me
• Engineering Background - AppDev
• Open source Contributor
• Hadoop – 10 years
• HWX Principal Solution Engineer
• Director, Solutions Engineering @ Kinetica
• Kinetica Local Contact Information
• Sunile Manjee, Director Solutions Engineering, smanjee@kinetica.com
• Phil Zacharia, Director Central Region, pzacharia@kinetica.com
3. What is Kinetica?
Patented, In-Memory, Columnar, Distributed, GPU-Accelerated Database
4. Developed to Identify Terroristic Threats in Real-Time
• Kinetica incubated as a massively parallel computational engine for US Army INSCOM
• Ingests 200+ sources of streaming data – mobile devices, drones, social media, cyber data
• 200B new records per hour
• Incorporates geospatial and temporal data
• Real-time, actionable threat intelligence
• First high-performance database leveraging GPUs
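For a sense of scale, the 200B records per hour figure can be converted to a per-second rate. The sketch below is plain arithmetic on the slide's number. The per-source average assumes exactly 200 sources, which simplifies the "200+" on the slide, and the variable names are illustrative only.

```python
# Convert the ingest figure quoted on the slide into per-second rates.
RECORDS_PER_HOUR = 200_000_000_000  # "200B new records per hour" (from the slide)
SOURCES = 200                       # assumption: exactly 200 sources ("200+" on the slide)

records_per_second = RECORDS_PER_HOUR / 3600
records_per_second_per_source = records_per_second / SOURCES

print(f"{records_per_second:,.0f} records/s overall")
print(f"{records_per_second_per_source:,.0f} records/s per source (average)")
```

That is roughly 55.6 million records per second overall, which is the kind of sustained rate the rest of the deck is arguing about.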
5. Who is Kinetica?
• 2009: 'HPC Research Project' incubated by US military
• 2010
• 2011: Patent # US8373710 B1 issued to GPUdb
• 2012: US Army deploys GPUdb
• 2013: GPUdb commercially available
• 2014: IDC HPC innovation excellence award (Army); GPUdb goes into production at USPS
• 2015: Iron Net selects GPUdb for Cyber Defense
• 2015: PG&E selects GPUdb for electric grid analysis; IDC HPC innovation excellence award (USPS)
• 2016: Rebrand to Kinetica
Confidential Information
6. Current Data Architectures Can’t Keep Up | Complex, Rigid, Agility
Challenges
• Infrastructure complexity, costs – stitch together multiple tools – separate tools for BI, ML, OLAP cubes, databases
• High Latency – can’t handle big data’s volume, variety, velocity
• Data needs to be pre-aggregated and transformed to cubes
• Processing is batch and not real-time
• Rigid – can’t handle changing requirements, changing data
• Dashboard slowness pains
• Datamarts in Tableau, caching, very complex queries
• Difficult to simultaneously ingest and analyze at scale
• Limited Agility – admin overhead, resources, skills
Diagram: DATA sources (ERP, CRM, SFA, Databases, Flat files, 3rd party) | Data Integration (INFA, Talend), NiFi, Kafka | EDW (Teradata, Oracle; star schema – facts & dimensions), Hadoop (Horton, Cloudera) | Data marts, OLAP cubes, indices, summary tables | Tableau, MSTR, SAS, Others
7. Kinetica Database | Real-Time, Flexible, Simple Data and Analytics
Diagram: DATA sources (ERP, CRM, SFA, Databases, Flat files, 3rd party) | Data Integration (INFA, Talend), NiFi, Kafka | EDW (Teradata, Oracle), Hadoop (HDP, CDH, MapR) | Kinetica | Tableau, MSTR, SAS, Others
Solution
• Low Latency – millisecond response time
• Real-time at scale – simultaneously ingest and analyze
• Full data provisioning – ingest, manage, analyze, visualize
• Flexible – handle changing requirements, changing data, minimize aggregates, indexes, cubes
• Simplicity – minimize admin overhead, resources, skills
Plus
• Converge AI and BI
• Location-based Analytics
• Deploy on commodity hardware on-prem, cloud
8. Kinetica: Unique Strengths & Capabilities
Fast, Distributed, In-Memory Analytics Engine for Fast Moving, Large Scale Data
Kinetica is designed to take advantage of the parallel processing nature of the GPU. It delivers low-latency, high-performance analytics on large data sets, and makes streaming data available for query in real time.
• OLAP Performance, Scalability, Stability
• Geospatial Processing & Visualization
• API for GPU Powered Data & Compute Orchestration
Converged AI and BI
User Defined Functions (UDFs) and orchestration of data in a distributed manner enable Kinetica to offer low-level customizations for machine learning and AI workloads.
Native Geospatial & Visualization Pipeline
Native visualization pipeline makes it easier to work with large geospatial data sets. Ideal for IoT use cases and powering geospatial applications.
• Sonic Layer (Fast/True Real-Time Analytics)
• Historic and Predictive Insights
• Interactive Location-Based Analytics
9. CUDA Abstraction, SAXPY Example
CUDA
Make/Build
https://devblogs.nvidia.com/parallelforall/easy-introduction-cuda-c-and-c/
SQL
SELECT a*x+y FROM TABLE
Python
import gpudb
h_db = gpudb.GPUdb(encoding = 'BINARY', host = '127.0.0.1', port = '9191')
response = h_db.get_records_by_column('TABLE', ["(a*x+y)"], 0, 10, 'json', {})
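SAXPY computes one multiply-add per element, a*x + y, and the SQL and Python calls on this slide ask the database to evaluate that same expression for each row. As a minimal CPU-only sketch of what the expression produces (not Kinetica or CUDA code; the `saxpy` helper and the sample values are made up for illustration):

```python
from typing import List

def saxpy(a: float, x: List[float], y: List[float]) -> List[float]:
    """Return a*x[i] + y[i] for every i (the SAXPY operation)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]

# Made-up sample values, for illustration only.
result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)  # [12.0, 24.0, 36.0]
```

Each output element depends only on the inputs at the same index, which is why the operation parallelizes cleanly across GPU threads.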
12. Core Design & Architecture
Diagram: Logical Nodes, each shown with a GPU and a CPU Socket, contain SHARDs; each SHARD contains Chunks. Chunks sit in System Memory (RAM), hold columnar data (Table:Column:Data), and Map to Persist.
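The shard and chunk layout on this slide can be sketched in a few lines. The slide does not say how records are assigned to shards or how large a chunk is, so the hash-based assignment, `NUM_SHARDS`, `CHUNK_SIZE`, and the helper names below are all assumptions made for illustration, not Kinetica's actual policy.

```python
from collections import defaultdict
import zlib

NUM_SHARDS = 4   # hypothetical; the slide shows several shards per logical node
CHUNK_SIZE = 3   # hypothetical rows per chunk, kept tiny for illustration

def shard_for(key: str) -> int:
    """Assign a record to a shard by hashing its key (an assumed policy)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS

def build_shards(records):
    """Group records by shard, then split each shard into columnar chunks."""
    rows_by_shard = defaultdict(list)
    for rec in records:
        rows_by_shard[shard_for(rec["id"])].append(rec)

    shards = {}
    for shard_id, rows in rows_by_shard.items():
        chunks = []
        for start in range(0, len(rows), CHUNK_SIZE):
            block = rows[start:start + CHUNK_SIZE]
            # Columnar layout: one list per column instead of one dict per row.
            chunks.append({col: [r[col] for r in block] for col in block[0]})
        shards[shard_id] = chunks
    return shards

records = [{"id": f"dev-{i}", "x": float(i), "y": 1.0} for i in range(10)]
shards = build_shards(records)
for shard_id in sorted(shards):
    print(shard_id, [len(chunk["id"]) for chunk in shards[shard_id]])
```

Storing each chunk column by column means an expression such as a*x+y only has to read the columns it names.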
13. Kinetica UDF
Diagram: Logical Nodes, each shown with a GPU and a CPU Socket, contain SHARDs; each SHARD contains Chunks held in System Memory (RAM).
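The idea of running user-defined code next to the data in each shard can be sketched as below. This is not the Kinetica UDF API: the `udf` and `run_distributed` names, the sample shards, and the use of a thread pool in place of GPU-backed logical nodes are all assumptions made to show the shape of the pattern.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards, each holding columnar data: column name -> values.
shards = [
    {"x": [1.0, 2.0], "y": [10.0, 20.0]},
    {"x": [3.0], "y": [30.0]},
    {"x": [4.0, 5.0, 6.0], "y": [40.0, 50.0, 60.0]},
]

def udf(shard, a=2.0):
    """User-defined function: runs on one shard and sees only that shard's data."""
    return [a * xi + yi for xi, yi in zip(shard["x"], shard["y"])]

def run_distributed(func, all_shards):
    """Apply func to every shard in parallel; results come back in shard order."""
    with ThreadPoolExecutor(max_workers=len(all_shards)) as pool:
        return list(pool.map(func, all_shards))

per_shard = run_distributed(udf, shards)
print(per_shard)
```

Because each call touches only its own shard, no data has to move between nodes until the per-shard results are collected.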
14. CPU Bound "Real Time" Architectures
Data Stream
Buy/Add More Nodes
Concurrent Ingest & Analytics
15. Kinetica Real Time Analytics Architecture
Data Stream
Concurrent Ingest & Analytics
GPU
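"Concurrent ingest and analytics" means queries keep returning consistent answers while new records are still arriving. A minimal single-process sketch of that property follows. The `Table` class, the lock-based snapshot, and the thread setup are illustrative assumptions and say nothing about how Kinetica implements it internally.

```python
import threading

class Table:
    """Tiny in-memory table supporting ingest while queries run (illustrative only)."""
    def __init__(self):
        self._values = []
        self._lock = threading.Lock()

    def ingest(self, value):
        with self._lock:
            self._values.append(value)

    def query_count_and_sum(self):
        # Take a consistent snapshot so count and sum describe the same rows.
        with self._lock:
            return len(self._values), sum(self._values)

table = Table()
N = 10_000
snapshots = []

def producer():
    for _ in range(N):
        table.ingest(1)  # each record contributes 1, so sum must equal count

def analyst():
    for _ in range(100):
        snapshots.append(table.query_count_and_sum())

threads = [threading.Thread(target=producer), threading.Thread(target=analyst)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count, final_sum = table.query_count_and_sum()
print(final_count, final_sum)
```

Every snapshot taken mid-stream has a sum equal to its count, so the analyst never observes a half-applied ingest.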
16. Demo