February, 2018
Andy Ellicott, Crate.io
SQL for Machine Data?
Logistics…
• Submit questions at any time via the questions panel

• Slides & recording will be shared via email after the event
Agenda
–
• Machine data - the next big wave?

• Machine data use cases

• Machine data management options - Splunk, ELK, Time Series, 

• Reinventing SQL for machine data

• SQL examples

• Questions & answers
I like databases
25 years in DBMS & software development companies

IMHO…the coolest ways software is changing what’s
possible in life and business…is usually due to some
database changing what’s possible with software.
The next wave of big data
will come from machines
“Things Data”
The next wave…“Things Data”
–
By 2020, 50% of new software systems are IoT related
Putting Machine Data to Work
—
• Definitive record of all activity and behavior

- What happened, when, where, by whom

• Tells us how to optimize: 

- Customer experience

- Safety

- Production

- Profitability

• Where things are going right vs. wrong

• Fingerprints of fraud
Customer: 

–
“CrateDB’s real-time SQL
performance, simple scaling, and
high availability make it a key
element of our stack”
Sekhar Sarukkai

Co-founder
Use case: Cyber Security - Campbell, CA
• Leading Cloud Access Security Broker (CASB) 

• SaaS system monitors internet traffic for security risks

- 700 customers, 40% of F500

Data Challenges

• Original MySQL-ElasticSearch platform grew too costly to run &
too hard to maintain

- Duplicate data storage, DB syncing code

CrateDB Results
• Replaced MySQL/ElasticSearch with CrateDB in 2015

• ~100TB data, billions of network messages per day

• Real-time queries for 1000s of concurrent users

• 20x faster, 75% lower AWS costs
Customer:

–
Use case: Industrial IoT - Atlanta, GA
• $4B producer of bottles for Coca-Cola, P&G, Unilever

• 2016 initiative: Use real-time IoT data to optimize overall equipment
effectiveness across 170 factories

Data Challenges
• Diversity - 900 different sensor types per production line

• MS SQL Server too slow and inflexible

- 900 tables (1 per sensor type)

- 3 - 5 minute query response times

CrateDB Results
• Easier development - 1 table vs. 900 in SQL Server

• Faster dashboards - 20ms vs. 4,000ms

• Central cloud + edge deployment = insight on factory floor and in
central “Mission Control”

• Lower labor costs and greater overall equipment effectiveness (OEE)
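The "1 table vs. 900" point can be illustrated with a minimal sketch. It uses Python's built-in SQLite purely as a stand-in (it is not CrateDB), and the table name, column names, and sample readings are all hypothetical. The idea shown is only that readings from different sensor types can share one table when the type-specific fields travel in a JSON payload.

```python
import json
import sqlite3

# Hypothetical readings from two sensor types; all field names are invented.
readings = [
    {"sensor_type": "temperature", "line_id": 7, "ts": 1000, "payload": {"celsius": 181.5}},
    {"sensor_type": "pressure", "line_id": 7, "ts": 1001, "payload": {"bar": 38.2, "valve": "A"}},
    {"sensor_type": "temperature", "line_id": 8, "ts": 1002, "payload": {"celsius": 179.0}},
]

# SQLite is only a stand-in here; the type-specific fields are stored as JSON text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_readings "
    "(sensor_type TEXT, line_id INTEGER, ts INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?, ?, ?)",
    [(r["sensor_type"], r["line_id"], r["ts"], json.dumps(r["payload"])) for r in readings],
)


def payloads_for(sensor_type):
    """Return (line_id, decoded payload) pairs for one sensor type, oldest first."""
    rows = conn.execute(
        "SELECT line_id, payload FROM sensor_readings WHERE sensor_type = ? ORDER BY ts",
        (sensor_type,),
    ).fetchall()
    return [(line_id, json.loads(payload)) for line_id, payload in rows]


temperature_rows = payloads_for("temperature")
pressure_rows = payloads_for("pressure")
```

Adding a new sensor type in this sketch means inserting rows with a new `sensor_type` value, not creating another table.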
“Thousands of sensors generate data
along our production lines, and CrateDB
allows us to analyze that firehose of data
24 hours a day to make real-time
improvements to factory efficiency.”
Philipp Lehner, 

CEO Alpla, USA
Customer: 

–
Use case: Smart Lighting - Los Angeles, CA
• $2B global leader in IoT-enabled industrial lighting 

• Lighting Burj Khalifa, OfficeMax & Sainsbury’s chains

• Software to control & monitor complex network of lighting, plus
presence, energy, & WiFi sensors

Data Challenges
• MySQL could not scale to support new initiatives:
- Shift to SaaS - central cloud portal

- Real-time reporting

- Time series analysis of operational metrics

CrateDB Results
• Easy migration from MySQL, in weeks

• Simple scaling with CrateDB on Docker

• Real-time data - concurrent SaaS users and API for application
partners

• 40x better DBMS price-performance vs. MySQL
Customer:
–
“We need to process massive amounts of
data our customers’ vehicles generate, in
real time. CrateDB offered the best
performance, scalability, and ease-of-use of
any SQL or NoSQL DBMS we tried.”
Mark Sutheran, 

Founder, Clickdrive
Use case: Vehicle Fleet Management - Singapore
• Internet-enabled vehicle fleet monitoring system

• Used by Singapore taxis, insurance vehicle fleets

• Real-time monitoring of vehicle location & health, improves fleet
utilization, safety, driver behavior, profitability 

Data Challenges
• Real-time vehicle status & location, while ingesting 1,500 data points per
second per car, 24x7

• Data science - query 10s of terabytes of vehicle system data to develop
predictive maintenance algorithms

• MySQL can’t scale, Cassandra required too much tuning
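To put the ingest rate in perspective, here is a small back-of-the-envelope calculation. The 1,500 data points per second per car comes from the slide; the fleet size of 100 cars is an assumption made only for illustration.

```python
POINTS_PER_SECOND_PER_CAR = 1_500  # figure from the slide
SECONDS_PER_DAY = 24 * 60 * 60     # 24x7 operation


def points_per_day(cars):
    """Data points ingested per day for a fleet of the given size."""
    return POINTS_PER_SECOND_PER_CAR * SECONDS_PER_DAY * cars


per_car_per_day = points_per_day(1)

# Hypothetical fleet size, not a figure from the presentation.
ASSUMED_FLEET_SIZE = 100
fleet_per_day = points_per_day(ASSUMED_FLEET_SIZE)
fleet_per_second = POINTS_PER_SECOND_PER_CAR * ASSUMED_FLEET_SIZE
```

A single car works out to 129.6 million data points per day, so even a modest fleet reaches billions of rows daily.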

CrateDB Results
• Revealed hidden maintenance issues with 50% of vehicles

• Reduced repair costs 20% by predicting problems earlier

• Data processing speed enabling development of 3D accident recreation
within minutes
The Next Wave of Big Data
–
“IoT is creating unparalleled information
management and analytics challenges.”
- Jim Hare, Gartner
Every Step
Every Lightbulb
Every Message
Every Bottle
• Firehose of data
• Complex data
• Real-time
• Edge + Cloud
Millions of data points per second
Instantly actionable - current & large historic data sets
Run anywhere. Cloud. On-premises. Containers. Small footprint or large clusters with 100+ nodes.
Joins, Time Series, Geospatial, JSON, Text search, AI, Blobs
Your Machine Data Management Options
-
But More Likely …
-
First…
• Log search, analytics
• Full stack - forwarders, indexers, search heads, visualization
Then…
• Open source
• Log search, analytics
• Full stack - Elasticsearch, Logstash, Kibana
Lately…
• Time Series, IT metrics
Requirement                      Traditional SQL   Splunk, et al
Firehose of data                 ❌                ✅
Complex queries & dynamic data   ❌                ✅
Fast (Real-time) Queries         ❌                ✴
Why Not SQL?

–
SQL Mainstream Must be Enabled to Achieve IoT Growth

–
45:1
Ratio of SQL to NoSQL
developers 

(Source: LinkedIn)
By 2020, 50% of new systems are IoT related
Reinventing SQL
for Machine Data
The Newest Generation of SQL
–
SQL NOSQL
Crate Components (Crate, Elasticsearch, other Open Source)
The CrateDB Open Source Stack
–
1 file to download & install

Benefits of NoSQL with
SQL ease of use
CrateDB - the key inventions

–
Distributed SQL with search, time
series, geospatial, aggregations
Cloud-native architecture
easy scaling via Containers
NoSQL storage & clustering for
horizontal scaling & dynamic schema
Columnar caches for real-time, in-memory SQL query performance
shared-nothing architecture
If you know SQL, you know CrateDB
–
Simple install

Zero-configuration, auto-join
Compatible

ANSI SQL via the PostgreSQL wire protocol, JDBC, REST
Real-time performance

Distributed SQL query engine
Dynamic schema

all data (structured + JSON), time
series, geospatial
Distributed SQL query versatility

Aggregations, time series, search,
geospatial…
Simpler scalability

Shared nothing, horizontal scale out

Always on

High availability, replication, self-healing
Flexible

No lock-in, runs on any cloud and on-premises
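As a rough illustration of "aggregations, time series" in plain SQL, the sketch below groups readings into one-minute buckets per device. It runs against Python's built-in SQLite as a stand-in, so it shows only generic SQL, not CrateDB-specific functions; the `metrics` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (device_id TEXT, ts INTEGER, value REAL)")

# Hypothetical readings; ts is seconds since an arbitrary start.
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [
        ("dev-1", 0, 10.0),
        ("dev-1", 30, 20.0),
        ("dev-1", 60, 30.0),
        ("dev-2", 10, 5.0),
        ("dev-2", 70, 15.0),
    ],
)

# Integer division truncates ts to the start of its 60-second bucket.
rows = conn.execute(
    """
    SELECT device_id,
           (ts / 60) * 60 AS minute_start,
           AVG(value)     AS avg_value,
           COUNT(*)       AS readings
    FROM metrics
    GROUP BY device_id, minute_start
    ORDER BY device_id, minute_start
    """
).fetchall()
```

The query is ordinary GROUP BY SQL, which is the point the slide makes: no new query language is needed for time-bucketed aggregates.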
Requirement             CrateDB   Traditional SQL   NoSQL
Firehose of data        ✅        ✴                 ✅
Complex, dynamic data   ✅        ❌                ✅
Real-Time Queries       ✅        ❌                ✴
SQL                     ✅        ✅                ❌
New DBMS Required for “Things Data” Era?

–
Performance?
–
• CrateDB linear scalability

- Performance rises linearly with cluster
size

• CrateDB vs. PostgreSQL

- Complex queries run 29x faster in
CrateDB on 30% lower hardware cost

• CrateDB vs. InfluxDB (time series)

- 7x more query throughput under concurrent user load - better for multi-user time series apps (SaaS)
Apps
DB
Input
CrateDB Open Machine Data Stack - build your own with SQL
—
‣ Integrates easily
‣ Low learning curve
‣ Greatest flexibility
‣ No lock in
Custom SQL Apps
Built for the Open Machine Data Stack
—
A database rarely exists independently. Instead, it is usually part of an ecosystem of tools and
other products, with each covering a different need in a data pipeline.
1. Trackers
2. Collectors
3. Enrich
4. Storage
5. Data Modeling
6. Analytics
If You’re Doing Distributed…
–
[Diagram: devices (servers, sensors, actuators, machines, wearables, cars, etc.) connect through a gateway to a gateway & DB at the edge, and on to applications & platforms in a public/hybrid/private cloud; shared-nothing architecture]
CrateDB enables use-cases at the “edge” and in the cloud, with SQL, horizontal scaling, high availability, and multi-model data
structures. With CrateDB, customers can extract value from realtime data, enabling applications & services not possible before.
MQTT Broker & Ingestion Framework
–
• Message queues were invented to compensate for
DBMS weaknesses

- Downtime

- Slow ingestion

• New databases like CrateDB don’t have those
pitfalls

• Embedding MQTT broker in CrateDB 

- Define “Ingestion rules” in CrateDB

• MQTT topic —> Target table for storage

- Stores messages in tables

- Eliminates the need for extra middleware

• Lowers hosting costs, complexity, development time
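The "MQTT topic to target table" idea can be sketched as follows. This is not CrateDB's ingestion rule syntax, which the slides do not show; the rule list, table names, and helper functions are hypothetical. Topic matching follows the standard MQTT wildcards (`+` for one level, `#` for all remaining levels) and ignores the special handling of topics that start with `$`.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a filter with + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches everything that remains
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


# Hypothetical ingestion rules: first matching filter decides the target table.
INGESTION_RULES = [
    ("factory/+/temperature", "temperature_readings"),
    ("factory/#", "factory_events"),
]

# In-memory stand-in for database tables.
tables = {}


def target_table(topic):
    """Look up the target table for a topic, or None if no rule matches."""
    for topic_filter, table in INGESTION_RULES:
        if topic_matches(topic_filter, topic):
            return table
    return None


def ingest(topic, payload):
    """Store a message in its target table; return False if no rule applies."""
    table = target_table(topic)
    if table is None:
        return False
    tables.setdefault(table, []).append({"topic": topic, "payload": payload})
    return True


stored = ingest("factory/line7/temperature", {"celsius": 181.5})
dropped = ingest("office/light", {"on": True})
```

Because the rules are evaluated in order, the more specific filter is listed before the catch-all `factory/#`.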
[Diagram: devices send MQTT messages to a message queue (MQTT broker), then an MQTT consumer/writer, then a DBMS with slow ingest & DB downtime, versus devices sending MQTT messages straight to an embedded MQTT broker with fast ingestion and an always-on architecture]
CrateDB Output Plugin for Telegraf
–
• Telegraf is a plugin-driven server for
collecting metrics, usually connecting
to InfluxDB

• New Telegraf plug-in writes to
CrateDB via the PostgreSQL protocol

• More turnkey integration with popular
time series data sources

• Makes it easy to migrate existing time
series data workloads to CrateDB

- For more complex data & queries

- SQL access

- Larger data / time windows

- More concurrent users
[Diagram: Telegraf collects from system stats, DBs, networks, message queues, and apps, and feeds CrateDB (shared-nothing architecture), which applications & platforms query with SQL]
Connect CrateDB to dozens of data sources
Prometheus Integration
–
• Prometheus is a standard time series store
for monitoring IT infrastructure

- Simple, standard systems monitoring data
endpoint e.g. Docker

• Prometheus Remote Adapter for CrateDB
- Developed by RobustPerception.io

- Standard way for Prometheus to pass read/write requests to other back-end databases

• Docker & other IT software can use CrateDB
for larger, more complex time series analysis
[Diagram: IT software sends systems monitoring event data to Prometheus (local storage); Prometheus uses the remote read/write protocol and the CrateDB adapter to reach CrateDB (unlimited storage, unlimited data & query complexity)]
SQL for Machine Data at ALPLA
Customer - ALPLA

–
•172 factories in 45 countries

•18,000 employees

•Global manufacturer

- Innovation leader

- Cost leader

•Plastic packaging products

- Bottles, caps, …

• e.g. every Coca-Cola bottle in the USA
Use Case
–
•Through real-time monitoring:

- Increase equipment efficiency (OEE)

- Decrease resource utilization

- Simplify labor management

•Complexity:

- 1500 production lines

- 900 different sensor types

- 160M bottles/day to be measured
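Two quick calculations help size this use case. The 160M bottles/day figure is from the slide. OEE is computed with its standard definition (availability × performance × quality), which the slides do not spell out, and the three input values are invented for illustration.

```python
BOTTLES_PER_DAY = 160_000_000  # figure from the slide
SECONDS_PER_DAY = 24 * 60 * 60

# Average measurement rate implied by the daily volume.
bottles_per_second = BOTTLES_PER_DAY / SECONDS_PER_DAY


def oee(availability, performance, quality):
    """Overall equipment effectiveness as the product of three ratios in [0, 1]."""
    for name, value in (
        ("availability", availability),
        ("performance", performance),
        ("quality", quality),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality


# Hypothetical line: 90% uptime, 95% of ideal speed, 99% good bottles.
example_oee = oee(0.90, 0.95, 0.99)
```

The daily volume averages out to roughly 1,850 bottles per second across all factories, which is the rate the monitoring pipeline has to keep up with.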
Data collection
–
Production machine data is collected at the edge (Docker, CrateDB)

JSON messages sent over internet to cloud

Central data storage for realtime dashboards, monitoring, alerting, prediction, machine learning
Solution
–
24x7 central Mission Control for all factories
• Scale to all production lines, connect all feeds, collect all raw data

• Aggregate, monitor, predict things from huge data volumes

• Take action from data immediately through tablets, Hololens, etc.
Docker in the cloud
–
• RabbitMQ receiving data

• CrateDB as storage for raw data

• Enrichment of data

• CrateDB as storage for enriched
data

• API

• Realtime management system

• Dashboards

• API for Hololens
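The flow listed above (receive, store raw, enrich, store enriched, serve) can be sketched end to end. Everything here is a stand-in: `queue.Queue` plays the role of RabbitMQ, SQLite plays the role of CrateDB, and the message fields, factory labels, and function names are hypothetical.

```python
import json
import queue
import sqlite3

incoming = queue.Queue()            # stand-in for the RabbitMQ feed
conn = sqlite3.connect(":memory:")  # stand-in for the database
conn.execute("CREATE TABLE raw_messages (body TEXT)")
conn.execute(
    "CREATE TABLE enriched_messages (line_id INTEGER, factory TEXT, bottles INTEGER)"
)

# Hypothetical reference data used to enrich each message.
LINE_TO_FACTORY = {1: "factory-A", 2: "factory-A", 3: "factory-B"}


def process_all():
    """Drain the queue: store each raw message, enrich it, store the result."""
    processed = 0
    while not incoming.empty():
        body = incoming.get()
        conn.execute("INSERT INTO raw_messages VALUES (?)", (body,))
        message = json.loads(body)
        factory = LINE_TO_FACTORY.get(message["line_id"], "unknown")
        conn.execute(
            "INSERT INTO enriched_messages VALUES (?, ?, ?)",
            (message["line_id"], factory, message["bottles"]),
        )
        processed += 1
    return processed


def bottles_by_factory():
    """The kind of aggregate an API or dashboard might request."""
    rows = conn.execute(
        "SELECT factory, SUM(bottles) FROM enriched_messages GROUP BY factory"
    ).fetchall()
    return dict(rows)


for sample in (
    {"line_id": 1, "bottles": 120},
    {"line_id": 2, "bottles": 80},
    {"line_id": 3, "bottles": 50},
):
    incoming.put(json.dumps(sample))

processed_count = process_all()
totals = bottles_by_factory()
```

Keeping the raw messages alongside the enriched rows mirrors the slide's two storage steps, so enrichment logic can be changed and replayed later.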

[Diagram: RabbitMQ, CrateDB, Enrichment, API, Dashboards, Hololens, …]
In Summary…
-
• New machine data requirements

- Firehose

- Complex

- Real time

• SQL coming [back] to the rescue

- New DBMS architecture

- Same scale, performance, dynamic data as NoSQL

- Easier learning curve & integration (more choices)

- Better economics

• Splunk & ELK stack a good choice when

- You need turnkey Security Analytics / SIEM
Thank You!
-
• CrateDB

- https://crate.io

• Slides & recording of this will be sent to you shortly, via email

• Ping me any time

- Andy Ellicott

- andy@crate.io

Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 

Dernier (20)

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 

Webinar: SQL for Machine Data?

  • 1. February, 2018 Andy Ellicott, Crate.io SQL for Machine Data?
  • 2. Logistics… • Submit questions at any time via the questions panel • Slides & recording will be shared via email after the event
  • 3. Agenda – • Machine data - the next big wave? • Machine data use cases • Machine data management options - Splunk, ELK, Time Series • Reinventing SQL for machine data • SQL examples • Questions & answers
  • 4. I like databases 25 years in DBMS & software development companies IMHO…the coolest ways software is changing what’s possible in life and business…is usually due to some database changing what’s possible with software.
  • 5. The next wave of big data will come from machines “Things Data”
  • 6. The next wave…“Things Data” – By 2020, 50% of new software systems are IoT related IoT
  • 7. Putting Machine Data to Work — • Definitive record of all activity and behavior - What happened, when, where, by whom • Tells us how to optimize: - Customer experience - Safety - Production - Profitability • Where things are going right vs. wrong • Fingerprints of fraud
  • 8. Customer: – “CrateDB’s real-time SQL performance, simple scaling, and high availability make it a key element of our stack” Sekhar Sarukkai Co-founder Use case: Cyber Security - Campbell, CA • Leading Cloud Access Security Broker (CASB) • SaaS system monitors internet traffic for security risks - 700 customers, 40% of F500
 Data Challenges • Original MySQL-ElasticSearch platform grew too costly to run & too hard to maintain - Duplicate data storage, DB syncing code
 CrateDB Results • Replaced MySQL/ElasticSearch with CrateDB in 2015 • ~100TB data, billions of network messages per day • Real-time queries for 1000s of concurrent users • 20x faster, 75% lower AWS costs
  • 9. Customer: – Use case: Industrial IoT - Atlanta, GA • $4B producer of bottles for Coca Cola, P&G, Unilever • 2016 initiative: Use real-time IoT data to optimize overall equipment effectiveness across 170 factories
 Data Challenges • Diversity - 900 different sensor types per production line • MS SQL Server too slow and inflexible - 900 tables (1 per sensor type) - 3 - 5 minute query response times
 CrateDB Results • Easier development - 1 table vs. 900 in SQL Server • Faster dashboards - 20ms vs. 4,000ms • Central cloud + edge deployment = insight on factory floor and in central “Mission Control” • Lower labor costs and greater overall equipment effectiveness (OEE) “Thousands of sensors generate data along our production lines, and CrateDB allows us to analyze that firehose of data 24 hours a day to make real-time improvements to factory efficiency.” Philipp Lehner, CEO Alpla, USA
  • 10. Customer: – Use case: Smart Lighting - Los Angeles, CA • $2B global leader in IoT-enabled industrial lighting • Lighting Burj Khalifa, OfficeMax & Sainsbury’s chains • Software to control & monitor complex network of lighting, plus presence, energy, & WiFi sensors
 Data Challenges • MySQL could not scale to support new initiatives: - Shift to SaaS - central cloud portal - Real-time reporting - Time series analysis of operational metrics
 CrateDB Results • Easy migration from MySQL, in weeks • Simple scaling with CrateDB on Docker • Real-time data - concurrent SaaS users and API for application partners • 40x better DBMS price-performance vs. MySQL
  • 11. Customer: – “We need to process massive amounts of data our customers’ vehicles generate, in real time. CrateDB offered the best performance, scalability, and ease-of-use of any SQL or NoSQL DBMS we tried.” Mark Sutheran, Founder, Clickdrive Use case: Vehicle Fleet Management - Singapore • Internet-enabled vehicle fleet monitoring system • Used by Singapore taxis, insurance vehicle fleets • Real-time monitoring of vehicle location & health, improves fleet utilization, safety, driver behavior, profitability Data Challenges • Real-time vehicle status & location, while ingesting 1,500 data points per second per car, 24x7 • Data science - query 10s of terabytes of vehicle system data to develop predictive maintenance algorithms • MySQL can’t scale, Cassandra required too much tuning
 CrateDB Results • Revealed hidden maintenance issues with 50% of vehicles • Reduced repair costs 20% by predicting problems earlier • Data processing speed enabling development of 3D accident recreation within minutes
  • 12. The Next Wave of Big Data – “IoT is creating unparalleled information management and analytics challenges.” - Jim Hare, Gartner Every Step Every Lightbulb Every Message Every Bottle •Firehose of data •Complex data •Real-time •Edge + Cloud Millions of data points per second Instantly actionable - current & large historic data sets Run anywhere. Cloud. On-premises Containers. Small footprint or large clusters with 100+ nodes. Joins, Time Series, Geospatial, JSON, Text search, AI, Blobs
  • 13. Your Machine Data Management Options -
  • 14. But More Likely … - First… log search, analytics; full stack - forwarders, indexers, search heads, visualization • Then… open source log search, analytics; full stack - Elasticsearch, Logstash, Kibana • Lately… time series, IT metrics
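  • 15. Why Not SQL? – (Traditional SQL vs. Splunk, et al) • Firehose of data: Traditional SQL ❌, Splunk et al ✅ • Complex queries & dynamic data: Traditional SQL ❌, Splunk et al ✅ • Fast (real-time) queries: Traditional SQL ❌, Splunk et al ✴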
  • 16. SQL Mainstream Must be Enabled to Achieve IoT Growth – 45:1 Ratio of SQL to NoSQL developers 
 (Source: LinkedIn) By 2020, 50% of new systems are IoT related IoT
  • 18. The Newest Generation of SQL – SQL NOSQL
  • 19. The CrateDB Open Source Stack – Components: Crate, Elasticsearch, other open source • 1 file to download & install • Benefits of NoSQL with SQL ease of use
  • 20. CrateDB - the key inventions – • Distributed SQL with search, time series, geospatial, aggregations • Cloud-native architecture, easy scaling via containers • NoSQL storage & clustering for horizontal scaling & dynamic schema • Columnar caches for real-time, in-memory SQL query performance • Shared-nothing architecture
  • 21. If you know SQL, you know CrateDB – • Simple install: zero-configuration, auto-join • Compatible: ANSI SQL via Postgres wire protocol, JDBC, REST • Real-time performance: distributed SQL query engine • Dynamic schema: all data (structured + JSON), time series, geospatial • Distributed SQL query versatility: aggregations, time series, search, geospatial… • Simpler scalability: shared nothing, horizontal scale-out • Always on: high availability, replication, self-healing • Flexible: no lock-in, runs on any cloud and on-premises
  • 22. New DBMS Required for “Things Data” Era? – (CrateDB / Traditional SQL / NoSQL) • Firehose of data: ✅ / ✴ / ✅ • Complex, dynamic data: ✅ / ❌ / ✅ • Real-time queries: ✅ / ❌ / ✴ • SQL: ✅ / ✅ / ❌
  • 23. Performance? – • CrateDB linear scalability - Performance rises linearly with cluster size • CrateDB vs. PostgreSQL - Complex queries run 29x faster in CrateDB on 30% lower hardware cost • InfluxDB (time series) - 7x more query throughput under concurrent user load - better for multi-user time series apps (SaaS)
  • 24. CrateDB Open Machine Data Stack - build your own with SQL – ‣ Integrates easily ‣ Low learning curve ‣ Greatest flexibility ‣ No lock-in (Diagram: Input, DB, and Apps layers, including custom SQL apps)
  • 25. Built for the Open Machine Data Stack — A database rarely exists independently. Instead, it is usually part of an ecosystem of tools and other products, with each covering a different need in a data pipeline. 1. Trackers 2. Collectors 3. Enrich 4. Storage 5. Data Modeling 6. Analytics
  • 26. If You’re Doing Distributed… – CrateDB enables use cases at the “edge” and in the cloud, with SQL, horizontal scaling, high availability, and multi-model data structures. With CrateDB, customers can extract value from real-time data, enabling applications & services not possible before. (Diagram labels: Devices - servers, sensors, actuators, machines, wearables, cars, etc.; Gateway; Gateway & DB; Edge; Applications & Platforms; Public/Hybrid/Private; shared-nothing architecture)
  • 27. MQTT Broker & Ingestion Framework – • Message queues were invented to compensate for DBMS weaknesses - Downtime - Slow ingestion • New databases like CrateDB don’t have those pitfalls • Embedding MQTT broker in CrateDB - Define “ingestion rules” in CrateDB • MQTT topic -> target table for storage - Stores messages in tables - Eliminates the need for extra middleware • Lowers hosting costs, complexity, development time (Diagram: devices sending MQTT messages through a separate MQTT broker, message queue, and consumer/writer to a DBMS with slow ingest & DB downtime, versus devices sending MQTT messages to an embedded MQTT broker with fast ingestion and an always-on architecture)
  • 28. CrateDB Output Plugin for Telegraf – • Telegraf is a plugin-driven server for collecting metrics, usually connecting to InfluxDB • New Telegraf plug-in writes to CrateDB via the PostgreSQL protocol • More turnkey integration with popular time series data sources • Makes it easy to migrate existing time series data workloads to CrateDB - For more complex data & queries - SQL access - Larger data / time windows - More concurrent users • Connect CrateDB to dozens of data sources (Diagram labels: System Stats, DBs, Networks, Message Queues, Apps; Telegraf; SQL; Applications & Platforms; shared-nothing architecture)
  • 29. Prometheus Integration – • Prometheus is a standard time series store for monitoring IT infrastructure - Simple, standard systems monitoring data endpoint, e.g. Docker • Prometheus Remote Adapter for CrateDB - Developed by RobustPerception.io - Standard way for Prometheus to pass read/write requests to other back-end databases • Docker & other IT software can use CrateDB for larger, more complex time series analysis (Diagram labels: IT Software; Systems monitoring event data; Prometheus, local storage; Remote read/write protocol; CrateDB Adapter; CrateDB, unlimited storage, unlimited data & query complexity)
  • 31. Customer - ALPLA – • 172 factories in 45 countries • 18,000 employees • Global manufacturer - Innovation leader - Cost leader • Plastic packaging products - Bottles, caps, … • e.g. every Coca-Cola bottle in USA
  • 32. Use Case – •Through real-time monitoring: - Increase equipment efficiency (OEE) - Decrease resource utilization - Simplify labor management •Complexity: - 1500 production lines - 900 different sensor types - 160M bottles/day to be measured
  • 33. Data collection – Production machine data is collected at the edge (Docker, CrateDB) JSON messages sent over internet to cloud Central data storage for realtime dashboards, monitoring, alerting, prediction, machine learning
  • 34. Solution – 24x7 central
 Mission Control for all factories • Scale to all production lines, connect all feeds, collect all raw data • Aggregate, monitor, predict things from huge data volumes • Take action from data immediately through tablets, Hololens, etc.
  • 35. Docker in the cloud – • RabbitMQ receiving data • CrateDB as storage for raw data • Enrichment of data • CrateDB as storage for enriched data • API • Realtime management system • Dashboards • API for Hololens RabbitMQ CrateDB Enrichment API Dashboards Hololens …
  • 36. In Summary… - • New machine data requirements - Firehose - Complex - Real time • SQL coming [back] to the rescue - New DBMS architecture - Same scale, performance, dynamic data as NoSQL - Easier learning curve & integration (more choices) - Better economics • Splunk & ELK stack a good choice when - You need turnkey Security Analytics / SIEM
  • 37. Thank You! - • CrateDB - https://crate.io • Slides & recording of this will be sent to you shortly, via email • Ping me any time - Andy Ellicott - andy@crate.io
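
The "ingestion rules" idea from slide 27 (MQTT topic -> target table) can be sketched roughly as follows. This is only an illustration, not CrateDB's actual rule syntax or API: the rule list, the table names, and the functions `topic_matches`, `target_table`, and `ingest` are all hypothetical, and Python's built-in `sqlite3` stands in for the database so the sketch runs anywhere. The topic matching follows the usual MQTT wildcard conventions (`+` for one level, `#` for all remaining levels).

```python
import json
import sqlite3

# Hypothetical ingestion rules: MQTT topic filter -> target table.
# First matching rule wins. Table names here are trusted constants,
# which is why they are interpolated into SQL below.
RULES = [
    ("factory/+/temperature", "temperature_readings"),
    ("factory/#", "raw_messages"),
]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT-style match: '+' is exactly one level, '#' is all remaining levels."""
    f_parts = topic_filter.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":
            return True
        if i >= len(t_parts):
            return False
        if f != "+" and f != t_parts[i]:
            return False
    return len(f_parts) == len(t_parts)


def target_table(topic: str):
    """Return the table for the first rule whose filter matches, else None."""
    for topic_filter, table in RULES:
        if topic_matches(topic_filter, topic):
            return table
    return None


def ingest(conn: sqlite3.Connection, topic: str, payload: dict):
    """Store one message in the table chosen by the rules; skip unmatched topics."""
    table = target_table(topic)
    if table is None:
        return None
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (topic TEXT, payload TEXT)"
    )
    conn.execute(
        f"INSERT INTO {table} (topic, payload) VALUES (?, ?)",
        (topic, json.dumps(payload)),
    )
    return table


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(ingest(conn, "factory/line1/temperature", {"celsius": 71.5}))
    print(ingest(conn, "factory/line1/pressure", {"bar": 2.1}))
```

The point of the design is that routing lives in data (the rule list) rather than in a separate consumer/writer service, which is the middleware the slide says an embedded broker removes.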