SlideShare une entreprise Scribd logo
1  sur  43
introduction
20 years of enterprise architecture experience
CRM, EDRM, ERP, EIP, Digital Services, Security, BI,
Analytics, and MDM. TOGAF certified and a BA (hons)
degree in Theology (!) and Computer Studies from the
University of Birmingham in the United Kingdom.
Peter Marshall
EMEA Field Engineering
peter.marshall@imply.io
https://www.linkedin.com/in/amillionbytes
+44 781 361 8895
Agenda
What inspired Druid?
How is it built?
What does it do to the data?
The Druid Story
A Druid Cluster
Data in Druid
Q&A
Agenda
Why was Druid developed?
Who developed it?
The Druid Story
A Druid Cluster
Data in Druid
Q&A
unscalable poor query and ingestion scalability
disjointed event data and batch data separately
limiting expensive, product-constraining precalculation
slow high latencies from production to presentation
high performance low-latency, distributed query execution and high
throughput ingestion
real-time
event data (clickstream, network flows, user
behavior, programmatic advertising, server
metrics, IoT…)
analytics counting, ranking, statistics...
database
highly-available, time-sharded, partitioned,
columnar, indexed, compressed, versioned
materialized view
2012 Druid open sourced
Druid 0.5 - Query prioritization, spatial indexing, group limits and ordering, indexing improvements
PyDruid promoted
Druid 0.6 - Union, topN, alpha sorting, smarter brokers
Apache 2.0 License applied
Druid 0.7.1 - Case sensitivity, pluggable metadata DB, LZ4 compression, improved balance, GROUP BY hashing
Druid 0.7.2
Community Project formed
Druid 0.8.3 - Broker performance improvements, Azure Blob storage support, TopN improvements, IPv6, more metrics
Druid 0.9.0 - Extensions overhaul, doubleMax / Min aggregators, Avro support, Graphite emitter, regex search, dimension ordering
Druid 0.9.1 - Query-time lookups, native Kafka indexing, statsD emitter, authorization, cost-based distribution, filter optimization
Druid 0.9.2 - New GROUP BY engine, roll-up made optional, Long filtering and encoding, datasketches, ORC support
Druid 010.0 - Apache Calcite SQL, Kerberos, LIKE, 2GB+ dims, GROUP BY improved, Ambari emitter, more DB support
First Powered By list includes Yahoo!, Walmart, TripleLift, PayPal, Netflix, Hulu, eBay, and AirBnB
Druid 0.10.1
Druid 0.11.0 - Double columns, TLS encryption, auth extensions, Redis cache, GROUP BY improvements, MORE SQL
Druid becomes an incubating Apache project
Druid 0.12.0 - Incremental Kafka hand-off, compaction task, Quantiles sketch, basic auth, query queue improvements, MORE SQL
Druid 0.12.1 - Kerberos and Kafka improvements
Druid 0.12.2 - Ingestion improvements and better query caching
Druid 0.12.3 - Even more ingestion and query improvements
Druid 0.13 - Parallel batch, auto-compaction, SYSTEM, HyperLogLog, NULLs, blooms, new aggregators, OpenTSDB emitter
Druid 0.14 - New web console, Kinesis indexer, Parquet support, DogStatsD emitter, improved data balance
Druid 0.14.2 - Improved datasketch support
Druid 0.15 - Dataloader UI, moving averages, Moments datasketch, Orc core extensions, MORE SQL
Druid 0.16 - Server / SQL / Data UI, vectorization, minor compaction, GROUP BY arrays, Batch Shuffle, Docker imageexcel
Druid 0.17 - Superbatch, parallel auto-compaction, optimistic result merging, SQL NULLs, LDAP, Tasks sys table, lazy loading
Apache Druid Summit, San Francisco
2013
2014
2015
2016
2017
2018
2019
2020
“It can do very large, OLAP-style processing
on the fly in hundreds of milliseconds instead
of precalculating everything…
“The performance is great ... some of the
tables that we have internally in Druid have
billions and billions of events in them, and
we’re scanning them in under a second.”
https://www.infoworld.com/article/2949168/hadoop/yahoo-struts-its-hadoop-stuff.html
TIM TULLY
VP Engineering
Message Buses & Queues
Relational Databases
NoSql Databases
Applications & APIs
Specialised Collectors
Consolidation
Enrichment
Transformation & Stripping
Verification & Validation
Filtering & Sorting
Logical Reasoning
Feature & Structure Discovery
Segmentation & Classification
Recommendation
Forecasting & Prediction
Anomaly Detection
Automated Decision Making
Statistical Calculations
Real-time Analytics
BI Reporting & Dashboards
Search
Buses, Queues & Databases
Applications & APIs
Data Procurement Data Preparation Data Processing Data Presentation
Data Pipeline
Complete Scalable Focused
parallelised ingestion and
distributed storage
broad, detailed, current and
historical data sets
sub-second ad-hoc
statistical calculation
Complete Scalable Focused
parallelised ingestion and
distributed storage
broad, detailed, current and
historical data sets
sub-second ad-hoc
statistical calculation
search
real-time ingest, flexible
schema, text search, combined
view
columnar
efficient storage,
fast analytic queries
timeseries
low latency, time-based
datasets and functions
Agenda
The Druid Story
A Druid Cluster
Data in Druid
Q&A
Agenda
What are the key processes?
What do they do?
How scalable is it?
How do I keep an eye on health?
The Druid Story
A Druid Cluster
Data in Druid
Q&A
Zookeeper
Data
INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES
Query
LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS
Deep Store
Master
ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE
Data Distribution Task Co-ordination
Segment Tracking
Retention, Load Balancing, Health Goals
Ingestion Task State Recording
Ingestion Task Management
overlordcoordinator
Metadata DB
Data
INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES
Zookeeper
Query
LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS
Deep Store
Master
ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE
EVENT INGEST
BATCH INGEST
CloudStorage
Web
Disk
Hadoop
EventHubs
middlemanager
Stream & Batch Data Ingestion Tasks
Data Segment Creation
Deep Storage Writing
Fresh Data Querying (Memory / Disk Cache)
Data Compaction Processing
REGISTER
Data Distribution Task Co-ordination
Segment Tracking
Retention, Load Balancing, Health Goals
Ingestion Task State Recording
Ingestion Task Management
overlordcoordinator
Metadata DB
Zookeeper
Data
INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES
Query
LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS
Deep Store
Master
ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE
middlemanager
EVENT INGEST
BATCH INGEST
CloudStorage
Web
Disk
Hadoop
EventHubs
Stream & Batch Data Ingestion Tasks
Data Segment Creation
Deep Storage Writing
Fresh Data Querying (Memory / Disk Cache)
Data Compaction Processing
REGISTER
Data Distribution Task Co-ordination
Segment Tracking
Retention, Load Balancing, Health Goals
Ingestion Task State Recording
Ingestion Task Management
overlordcoordinator
Metadata DB
historical
historical
LOAD
historical
historical
historical
historical
Deep Data Storage & Retrieval
Deep Data Disposal
Deep Data Replication & Transfer
Master Data Lookups
historical
Zookeeper
Data
INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES
Deep Store
historical
historical
historical
historical
historical
historical
Master
ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE
Data Distribution Task Co-ordination
Segment Tracking
Retention, Load Balancing, Health Goals
Ingestion Task State Recording
Ingestion Task Management
overlordcoordinator
Metadata DB
Query
LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS
middlemanager
middlemanager
middlemanager
middlemanager
middlemanager
middlemanager
Zookeeper
Query
LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS
Master
ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE
Data
INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES
In Memory Disk
Query Cache
Deep Store
8091
Last Known Good Query Distribution
Results Consolidation & Aggregation
Druid Metadata Views
Apache Calcite SQL-to-Druid JSON
SQL, REST, JDBC interfaces
broker
API
HISTORICAL RESULTS
Data Distribution Task Co-ordination
Segment Tracking
Retention, Load Balancing, Health Goals
Ingestion Task State Recording
Ingestion Task Management
overlordcoordinator
Metadata DB
REAL-TIME RESULTS
In Memory
historical
historical
historical
historical
historical
historical
middlemanager
middlemanager
middlemanager
middlemanager
middlemanager
middlemanager
LOCATION
Zookeeper
Data
INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES
Query
LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS
Last Known Good Query Distribution
Results Consolidation & Aggregation
Druid Metadata Views
Apache Calcite SQL-to-Druid JSON
SQL, REST, JDBC interfaces
broker
Deep Data Storage & Retrieval
Deep Data Disposal
Deep Data Replication & Transfer
Master Data Lookups
historical
Stream & Batch Data Ingestion Tasks
Data Segment Creation
Deep Storage Writing
Fresh Data Querying (Memory / Disk Cache)
Data Compaction Processing
Deep Store
Master
ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE
Data Distribution Task Co-ordination
Segment Tracking
Retention, Load Balancing, Health Goals
Ingestion Task State Recording
Ingestion Task Management
overlordcoordinator
Metadata DB
Query Cache
In Memory Disk
In Memory
middlemanager
8082
8083
8081
8090
8091
Agenda
The Druid Story
A Druid Cluster
Data in Druid
Q&A
Agenda
What are Druid’s data optimisations?
Why are those optimisations so cool?
What types of data work best?
The Druid Story
A Druid Cluster
Data in Druid
Q&A
Time Series
Columnar
Search
Source Format
Segment Period
Dimensions
Row Filters
Transforms
Build your ingestion spec in a GUI... Monitor tasks, check segments, run SQL...
Columnar
Dictionary Encoded
Bitmap Indexed
Compressed
Source Format
Segment Period
Dimensions
Row Filters
Transforms
when who what stockChange
12:30 peter purchased 5
12:32 paul purchased 6
12:33 paul purchased 9
12:35 paul purchased 2
12:40 peter returned -2
12:42 peter purchased 3
12:45 paul purchased 8
12:50 ahmed purchased 2
12:52 paul purchased 4
12:55 peter returned -5
satisfaction
5
3
4
2
1
2
3
4
2
1
when who what stockChange
12:30 peter purchased 5
12:32 paul purchased 6
12:33 paul purchased 9
12:35 paul purchased 2
12:40 peter returned -2
12:42 peter purchased 3
12:45 paul purchased 8
12:50 ahmed purchased 2
12:52 paul purchased 4
12:55 peter returned -5
satisfaction
5
3
4
2
1
2
3
4
2
1
when who what stockChange
12:30 peter purchased 5
12:32 paul purchased 6
12:33 paul purchased 9
12:35 paul purchased 2
12:40 peter returned -2
12:42 peter purchased 3
12:45 paul purchased 8
12:50 ahmed purchased 2
12:52 paul purchased 4
12:55 peter returned -5
satisfaction
5
3
4
2
1
2
3
4
2
1
when what stockChange
12:30 1 5
12:32 6
12:33 9
12:35 2
12:40 2 -2
12:42 3
12:45 8
12:50 2
12:52 4
12:55 -5
satisfactionwho
1
2
1
3
1
1
2
3
peter
paul
ahmed
1
2
purchased
returned
2
2
2
5
3
4
2
1
2
3
4
2
1
2
2
1
1
1
1 1
1
1
1
when what stockChange
12:30
1
5
12:32 6
12:33 9
12:35 2
12:40 2 -2
12:42
1
3
12:45 8
12:50 2
12:52 4
12:55 -5
satisfactionwho
1
2
1
3
1
1
2
3
peter
paul
ahmed
1
2
purchased
returned
2
2
2
5
3
4
2
1
2
3
4
2
1
1
2
3
1000110001
0111001010
0000000100
1
2
1111011110
0000100001
when what stockChange
12:30
1
5
12:32 6
12:33 9
12:35 2
12:40 2 -2
12:42
1
3
12:45 8
12:50 2
12:52 4
12:55 -5
satisfactionwho
1
2
1
3
1 2
2
2
5
3
4
2
1
2
3
4
2
1
Dimensions
Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing
Windowed Aggregates
Metrics ~ Sets ~ Approx Count / Quantiles
when what
1
2
1
who
1
2
1
3
1 2
2
2
stockChange
5
6
9
2
-2
3
8
2
4
-5
satisfaction
5
3
4
2
1
2
3
4
2
1
12:30
12:40
12:50
Dimensions
Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing
Windowed Aggregates
Metrics ~ Sets ~ Approx Count / Quantiles
satisfaction
5
3
4
2
1
2
3
4
2
1
when
12:30
12:40
12:50
what
1
2
1
2
stockChange
5
6
9
2
-2
3
8
2
4
-5
Dimensions
Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing
sumStkChg
22
9
1
maxSat
5
3
4
Windowed Aggregates
Metrics ~ Sets ~ Approx Count / Quantiles
who
1
2
1
3
1
2
2
when
12:30
12:40
12:50
what
1
2
1
who
1
2
1
3
1 2
2
2
Dimensions
Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing
whoHLL
fgndnbe
bsdsfdb
bdfbdsf
Windowed Aggregates
Metrics ~ Sets ~ Approx Count / Quantiles
sumStkChg
22
9
1
maxSat
5
3
4
1
2
3
peter
paul
ahmed
1
2
purchased
returned
wn
12:30
12:40
12:50
wt
1
2
1
2
ssc
22
9
1
ms
5
3
4
wH
fgndnbe
bsdsfdb
bdfbdsf
when who what stockChange
12:30 peter purchased 5
12:32 paul purchased 6
12:33 paul purchased 9
12:35 paul purchased 2
12:40 peter returned -2
12:42 peter purchased 3
12:45 paul purchased 8
12:50 ahmed purchased 2
12:52 paul purchased 4
12:55 peter returned -5
satisfaction
5
3
4
2
1
2
3
4
2
1
1
2
3
peter
paul
ahmed
1
2
purchased
returned
wn
12:30
12:40
12:50
wt
1
2
1
2
ssc
22
9
1
ms
5
3
4
wH
fgndnbe
bsdsfdb
bdfbdsf
https://www.researchgate.net/publication/333831332_Challenging_SQL-on-Hadoop_Performance_with_Apache_Druid
Time Series
Columnar
Search
Agenda
The Druid Story
A Druid Cluster
Data in Druid
Q&A
http://druid.apache.org/
Druid Forum
https://groups.google.com/forum/#!forum/druid-user
Apache Distribution
https://github.com/apache/druid
Druid Community
https://druid.apache.org/community/
Imply Distribution
https://imply.io/get-started
Agenda
The Druid Story
A Druid Cluster
Data in Druid
Q&A

Contenu connexe

Tendances

Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsVMware Tanzu
 
Webinar - Risky Business: How to Balance Innovation & Risk in Big Data
Webinar - Risky Business: How to Balance Innovation & Risk in Big DataWebinar - Risky Business: How to Balance Innovation & Risk in Big Data
Webinar - Risky Business: How to Balance Innovation & Risk in Big DataZaloni
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedDunn Solutions Group
 
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...DataWorks Summit
 
Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...
Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...
Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...Edureka!
 
Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe? Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe? Zaloni
 
Teradata Aster Discovery Platform
Teradata Aster Discovery PlatformTeradata Aster Discovery Platform
Teradata Aster Discovery PlatformScott Antony
 
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution StrategyEnterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution StrategyInside Analysis
 
Worst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignWorst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignKent Graziano
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data LakeCaserta
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which DataWorks Summit
 
What is Hadoop? Oct 17 2013
What is Hadoop? Oct 17 2013What is Hadoop? Oct 17 2013
What is Hadoop? Oct 17 2013Adam Muise
 
Using Hadoop as a platform for Master Data Management
Using Hadoop as a platform for Master Data ManagementUsing Hadoop as a platform for Master Data Management
Using Hadoop as a platform for Master Data ManagementDataWorks Summit
 
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017Lviv Startup Club
 
Dataware house Introduction By Quontra Solutions
Dataware house Introduction By Quontra SolutionsDataware house Introduction By Quontra Solutions
Dataware house Introduction By Quontra SolutionsQuontra Solutions
 
Data Vault Automation at the Bijenkorf
Data Vault Automation at the BijenkorfData Vault Automation at the Bijenkorf
Data Vault Automation at the BijenkorfRob Winters
 
Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?DATAVERSITY
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefitsRicky Barron
 

Tendances (20)

Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
 
Webinar - Risky Business: How to Balance Innovation & Risk in Big Data
Webinar - Risky Business: How to Balance Innovation & Risk in Big DataWebinar - Risky Business: How to Balance Innovation & Risk in Big Data
Webinar - Risky Business: How to Balance Innovation & Risk in Big Data
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
 
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
Verizon: Finance Data Lake implementation as a Self Service Discovery Big Dat...
 
Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...
Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...
Big Data Analytics Tutorial | Big Data Analytics for Beginners | Hadoop Tutor...
 
Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe? Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe?
 
Teradata Aster Discovery Platform
Teradata Aster Discovery PlatformTeradata Aster Discovery Platform
Teradata Aster Discovery Platform
 
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution StrategyEnterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
 
Worst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignWorst Practices in Data Warehouse Design
Worst Practices in Data Warehouse Design
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data Lake
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
 
What is Hadoop? Oct 17 2013
What is Hadoop? Oct 17 2013What is Hadoop? Oct 17 2013
What is Hadoop? Oct 17 2013
 
Using Hadoop as a platform for Master Data Management
Using Hadoop as a platform for Master Data ManagementUsing Hadoop as a platform for Master Data Management
Using Hadoop as a platform for Master Data Management
 
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
 
Big data with java
Big data with javaBig data with java
Big data with java
 
Dataware house Introduction By Quontra Solutions
Dataware house Introduction By Quontra SolutionsDataware house Introduction By Quontra Solutions
Dataware house Introduction By Quontra Solutions
 
Aster getting started
Aster getting startedAster getting started
Aster getting started
 
Data Vault Automation at the Bijenkorf
Data Vault Automation at the BijenkorfData Vault Automation at the Bijenkorf
Data Vault Automation at the Bijenkorf
 
Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 

Similaire à 20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, real-time, analytical datastore

August meetup - All about Apache Druid
August meetup - All about Apache Druid August meetup - All about Apache Druid
August meetup - All about Apache Druid Imply
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Imply
 
Teradata - Presentation at Hortonworks Booth - Strata 2014
Teradata - Presentation at Hortonworks Booth - Strata 2014Teradata - Presentation at Hortonworks Booth - Strata 2014
Teradata - Presentation at Hortonworks Booth - Strata 2014Hortonworks
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Hortonworks
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarioskcmallu
 
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411Mark Tabladillo
 
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcaderoIasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcaderoCodecamp Romania
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Lace Lofranco
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerDataWorks Summit
 
Hadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopHadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopAdam Muise
 
Data Vault 2.0: Big Data Meets Data Warehousing
Data Vault 2.0: Big Data Meets Data WarehousingData Vault 2.0: Big Data Meets Data Warehousing
Data Vault 2.0: Big Data Meets Data WarehousingAll Things Open
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisNetAppUK
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Denodo
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouseElena Lopez
 
Database Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory WebcastDatabase Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory WebcastEric Kavanagh
 
A Tale of Two BI Standards
A Tale of Two BI StandardsA Tale of Two BI Standards
A Tale of Two BI StandardsArcadia Data
 
5 Years of Progress in Active Data Warehousing
5 Years of Progress in Active Data Warehousing5 Years of Progress in Active Data Warehousing
5 Years of Progress in Active Data WarehousingTeradata
 
Getting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesGetting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesDenodo
 
Understanding Big Data And Hadoop
Understanding Big Data And HadoopUnderstanding Big Data And Hadoop
Understanding Big Data And HadoopEdureka!
 

Similaire à 20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, real-time, analytical datastore (20)

August meetup - All about Apache Druid
August meetup - All about Apache Druid August meetup - All about Apache Druid
August meetup - All about Apache Druid
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)
 
Teradata - Presentation at Hortonworks Booth - Strata 2014
Teradata - Presentation at Hortonworks Booth - Strata 2014Teradata - Presentation at Hortonworks Booth - Strata 2014
Teradata - Presentation at Hortonworks Booth - Strata 2014
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
 
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcaderoIasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
 
Hadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopHadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of Hadoop
 
Data Vault 2.0: Big Data Meets Data Warehousing
Data Vault 2.0: Big Data Meets Data WarehousingData Vault 2.0: Big Data Meets Data Warehousing
Data Vault 2.0: Big Data Meets Data Warehousing
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis Kapsalis
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Database Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory WebcastDatabase Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory Webcast
 
datavault2.pptx
datavault2.pptxdatavault2.pptx
datavault2.pptx
 
A Tale of Two BI Standards
A Tale of Two BI StandardsA Tale of Two BI Standards
A Tale of Two BI Standards
 
5 Years of Progress in Active Data Warehousing
5 Years of Progress in Active Data Warehousing5 Years of Progress in Active Data Warehousing
5 Years of Progress in Active Data Warehousing
 
Getting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesGetting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solves
 
Understanding Big Data And Hadoop
Understanding Big Data And HadoopUnderstanding Big Data And Hadoop
Understanding Big Data And Hadoop
 

Plus de Athens Big Data

22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...Athens Big Data
 
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage systemAthens Big Data
 
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...Athens Big Data
 
21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution
21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution
21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query executionAthens Big Data
 
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...Athens Big Data
 
20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers
20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers
20th Athens Big Data Meetup - 2nd Talk - Druid: under the coversAthens Big Data
 
20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti
20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti
20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: VeltiAthens Big Data
 
19th Athens Big Data Meetup - 1st Talk - NLP understanding
19th Athens Big Data Meetup - 1st Talk - NLP understanding19th Athens Big Data Meetup - 1st Talk - NLP understanding
19th Athens Big Data Meetup - 1st Talk - NLP understandingAthens Big Data
 
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on KubernetesAthens Big Data
 
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a ServiceAthens Big Data
 
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...Athens Big Data
 
17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...
17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...
17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...Athens Big Data
 
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...Athens Big Data
 
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...Athens Big Data
 
15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos
15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos
15th Athens Big Data Meetup - 1st Talk - Running Spark On MesosAthens Big Data
 
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...Athens Big Data
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...Athens Big Data
 
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...Athens Big Data
 
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...Athens Big Data
 
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And GradingAthens Big Data
 

Plus de Athens Big Data (20)

22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
 
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage system
 
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...
 
21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution
21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution
21st Athens Big Data Meetup - 3rd Talk - Dive into ClickHouse query execution
 
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
 
20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers
20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers
20th Athens Big Data Meetup - 2nd Talk - Druid: under the covers
 
20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti
20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti
20th Athens Big Data Meetup - 3rd Talk - Message from our sponsor: Velti
 
19th Athens Big Data Meetup - 1st Talk - NLP understanding
19th Athens Big Data Meetup - 1st Talk - NLP understanding19th Athens Big Data Meetup - 1st Talk - NLP understanding
19th Athens Big Data Meetup - 1st Talk - NLP understanding
 
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
 
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a Service
 
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...
 
17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...
17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...
17th Athens Big Data Meetup - 1st Talk - Speedup Machine Application Learning...
 
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...
 
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...
 
15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos
15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos
15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos
 
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...
 
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
 
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
 

Dernier

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 

Dernier (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, real-time, analytical datastore

  • 2. 20 years of enterprise architecture experience CRM, EDRM, ERP, EIP, Digital Services, Security, BI, Analytics, and MDM. TOGAF certified and a BA (hons) degree in Theology (!) and Computer Studies from the University of Birmingham in the United Kingdom. Peter Marshall EMEA Field Engineering peter.marshall@imply.io https://www.linkedin.com/in/amillionbytes +44 781 361 8895
  • 3. Agenda What inspired Druid? How is it built? What does it do to the data? The Druid Story A Druid Cluster Data in Druid Q&A
  • 4. Agenda Why was Druid developed? Who developed it? The Druid Story A Druid Cluster Data in Druid Q&A
  • 5. unscalable poor query and ingestion scalability disjointed event data and batch data separately limiting expensive, product-constraining precalculation slow high latencies from production to presentation
  • 6. high performance low-latency, distributed query execution and high throughput ingestion real-time event data (clickstream, network flows, user behavior, programmatic advertising, server metrics, IoT…) analytics counting, ranking, statistics... database highly-available, time-sharded, partitioned, columnar, indexed, compressed, versioned materialized view
  • 7. 2012 Druid open sourced Druid 0.5 - Query prioritization, spatial indexing, group limits and ordering, indexing improvements PyDruid promoted Druid 0.6 - Union, topN, alpha sorting, smarter brokers Apache 2.0 License applied Druid 0.7.1 - Case sensitivity, pluggable metadata DB, LZ4 compression, improved balance, GROUP BY hashing Druid 0.7.2 Community Project formed Druid 0.8.3 - Broker performance improvements, Azure Blob storage support, TopN improvements, IPv6, more metrics Druid 0.9.0 - Extensions overhaul, doubleMax / Min aggregators, Avro support, Graphite emitter, regex search, dimension ordering Druid 0.9.1 - Query-time lookups, native Kafka indexing, statsD emitter, authorization, cost-based distribution, filter optimization Druid 0.9.2 - New GROUP BY engine, roll-up made optional, Long filtering and encoding, datasketches, ORC support Druid 010.0 - Apache Calcite SQL, Kerberos, LIKE, 2GB+ dims, GROUP BY improved, Ambari emitter, more DB support First Powered By list includes Yahoo!, Walmart, TripleLift, PayPal, Netflix, Hulu, eBay, and AirBnB Druid 0.10.1 Druid 0.11.0 - Double columns, TLS encryption, auth extensions, Redis cache, GROUP BY improvements, MORE SQL Druid becomes an incubating Apache project Druid 0.12.0 - Incremental Kafka hand-off, compaction task, Quantiles sketch, basic auth, query queue improvements, MORE SQL Druid 0.12.1 - Kerberos and Kafka improvements Druid 0.12.2 - Ingestion improvements and better query caching Druid 0.12.3 - Even more ingestion and query improvements Druid 0.13 - Parallel batch, auto-compaction, SYSTEM, HyperLogLog, NULLs, blooms, new aggregators, OpenTSDB emitter Druid 0.14 - New web console, Kinesis indexer, Parquet support, DogStatsD emitter, improved data balance Druid 0.14.2 - Improved datasketch support Druid 0.15 - Dataloader UI, moving averages, Moments datasketch, Orc core extensions, MORE SQL Druid 0.16 - Server / SQL / Data UI, vectorization, minor compaction, GROUP BY arrays, Batch Shuffle, Docker imageexcel Druid 0.17 - Superbatch, parallel auto-compaction, optimistic result merging, SQL NULLs, LDAP, Tasks sys table, lazy loading Apache Druid Summit, San Francisco 2013 2014 2015 2016 2017 2018 2019 2020
  • 8. “It can do very large, OLAP-style processing on the fly in hundreds of milliseconds instead of precalculating everything… “The performance is great ... some of the tables that we have internally in Druid have billions and billions of events in them, and we’re scanning them in under a second.” https://www.infoworld.com/article/2949168/hadoop/yahoo-struts-its-hadoop-stuff.html TIM TULLY VP Engineering
  • 9. Message Buses & Queues Relational Databases NoSql Databases Applications & APIs Specialised Collectors Consolidation Enrichment Transformation & Stripping Verification & Validation Filtering & Sorting Logical Reasoning Feature & Structure Discovery Segmentation & Classification Recommendation Forecasting & Prediction Anomaly Detection Automated Decision Making Statistical Calculations Real-time Analytics BI Reporting & Dashboards Search Buses, Queues & Databases Applications & APIs Data Procurement Data Preparation Data Processing Data Presentation Data Pipeline
  • 10. Complete Scalable Focused parallelised ingestion and distributed storage broad, detailed, current and historical data sets sub-second ad-hoc statistical calculation
  • 11. Complete Scalable Focused parallelised ingestion and distributed storage broad, detailed, current and historical data sets sub-second ad-hoc statistical calculation search real-time ingest, flexible schema, text search, combined view columnar efficient storage, fast analytic queries timeseries low latency, time-based datasets and functions
  • 12. Agenda The Druid Story A Druid Cluster Data in Druid Q&A
  • 13. Agenda What are the key processes? What do they do? How scalable is it? How do I keep an eye on health? The Druid Story A Druid Cluster Data in Druid Q&A
  • 14. Zookeeper Data INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES Query LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS Deep Store Master ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE Data Distribution Task Co-ordination Segment Tracking Retention, Load Balancing, Health Goals Ingestion Task State Recording Ingestion Task Management overlordcoordinator Metadata DB
  • 15. Data INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES Zookeeper Query LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS Deep Store Master ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE EVENT INGEST BATCH INGEST CloudStorage Web Disk Hadoop EventHubs middlemanager Stream & Batch Data Ingestion Tasks Data Segment Creation Deep Storage Writing Fresh Data Querying (Memory / Disk Cache) Data Compaction Processing REGISTER Data Distribution Task Co-ordination Segment Tracking Retention, Load Balancing, Health Goals Ingestion Task State Recording Ingestion Task Management overlordcoordinator Metadata DB
  • 16. Zookeeper Data INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES Query LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS Deep Store Master ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE middlemanager EVENT INGEST BATCH INGEST CloudStorage Web Disk Hadoop EventHubs Stream & Batch Data Ingestion Tasks Data Segment Creation Deep Storage Writing Fresh Data Querying (Memory / Disk Cache) Data Compaction Processing REGISTER Data Distribution Task Co-ordination Segment Tracking Retention, Load Balancing, Health Goals Ingestion Task State Recording Ingestion Task Management overlordcoordinator Metadata DB historical historical LOAD historical historical historical historical Deep Data Storage & Retrieval Deep Data Disposal Deep Data Replication & Transfer Master Data Lookups historical
  • 17. Zookeeper Data INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES Deep Store historical historical historical historical historical historical Master ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE Data Distribution Task Co-ordination Segment Tracking Retention, Load Balancing, Health Goals Ingestion Task State Recording Ingestion Task Management overlordcoordinator Metadata DB Query LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS middlemanager middlemanager middlemanager middlemanager middlemanager middlemanager
  • 18. Zookeeper Query LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS Master ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE Data INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES In Memory Disk Query Cache Deep Store 8091 Last Known Good Query Distribution Results Consolidation & Aggregation Druid Metadata Views Apache Calcite SQL-to-Druid JSON SQL, REST, JDBC interfaces broker API HISTORICAL RESULTS Data Distribution Task Co-ordination Segment Tracking Retention, Load Balancing, Health Goals Ingestion Task State Recording Ingestion Task Management overlordcoordinator Metadata DB REAL-TIME RESULTS In Memory historical historical historical historical historical historical middlemanager middlemanager middlemanager middlemanager middlemanager middlemanager LOCATION
  • 19. Zookeeper Data INGEST DATA ⬩ STORE DATA ⬩ RESPOND TO QUERIES Query LOCATE DATA ⬩ ISSUE QUERY ⬩ MERGE RESULTS Last Known Good Query Distribution Results Consolidation & Aggregation Druid Metadata Views Apache Calcite SQL-to-Druid JSON SQL, REST, JDBC interfaces broker Deep Data Storage & Retrieval Deep Data Disposal Deep Data Replication & Transfer Master Data Lookups historical Stream & Batch Data Ingestion Tasks Data Segment Creation Deep Storage Writing Fresh Data Querying (Memory / Disk Cache) Data Compaction Processing Deep Store Master ISSUE TASKS ⬩ CATALOGUE DATA ⬩ MONITOR STATE Data Distribution Task Co-ordination Segment Tracking Retention, Load Balancing, Health Goals Ingestion Task State Recording Ingestion Task Management overlordcoordinator Metadata DB Query Cache In Memory Disk In Memory middlemanager 8082 8083 8081 8090 8091
  • 20. Agenda The Druid Story A Druid Cluster Data in Druid Q&A
  • 21. Agenda What are Druid’s data optimisations? Why are those optimisations so cool? What types of data work best? The Druid Story A Druid Cluster Data in Druid Q&A
  • 24. Build your ingestion spec in a GUI... Monitor tasks, check segments, run SQL...
  • 25. Columnar Dictionary Encoded Bitmap Indexed Compressed Source Format Segment Period Dimensions Row Filters Transforms
  • 26. when who what stockChange 12:30 peter purchased 5 12:32 paul purchased 6 12:33 paul purchased 9 12:35 paul purchased 2 12:40 peter returned -2 12:42 peter purchased 3 12:45 paul purchased 8 12:50 ahmed purchased 2 12:52 paul purchased 4 12:55 peter returned -5 satisfaction 5 3 4 2 1 2 3 4 2 1
  • 27. when who what stockChange 12:30 peter purchased 5 12:32 paul purchased 6 12:33 paul purchased 9 12:35 paul purchased 2 12:40 peter returned -2 12:42 peter purchased 3 12:45 paul purchased 8 12:50 ahmed purchased 2 12:52 paul purchased 4 12:55 peter returned -5 satisfaction 5 3 4 2 1 2 3 4 2 1
  • 28.
  • 29. when who what stockChange 12:30 peter purchased 5 12:32 paul purchased 6 12:33 paul purchased 9 12:35 paul purchased 2 12:40 peter returned -2 12:42 peter purchased 3 12:45 paul purchased 8 12:50 ahmed purchased 2 12:52 paul purchased 4 12:55 peter returned -5 satisfaction 5 3 4 2 1 2 3 4 2 1
  • 30. when what stockChange 12:30 1 5 12:32 6 12:33 9 12:35 2 12:40 2 -2 12:42 3 12:45 8 12:50 2 12:52 4 12:55 -5 satisfactionwho 1 2 1 3 1 1 2 3 peter paul ahmed 1 2 purchased returned 2 2 2 5 3 4 2 1 2 3 4 2 1 2 2 1 1 1 1 1 1 1 1
  • 31. when what stockChange 12:30 1 5 12:32 6 12:33 9 12:35 2 12:40 2 -2 12:42 1 3 12:45 8 12:50 2 12:52 4 12:55 -5 satisfactionwho 1 2 1 3 1 1 2 3 peter paul ahmed 1 2 purchased returned 2 2 2 5 3 4 2 1 2 3 4 2 1 1 2 3 1000110001 0111001010 0000000100 1 2 1111011110 0000100001
  • 32. when what stockChange 12:30 1 5 12:32 6 12:33 9 12:35 2 12:40 2 -2 12:42 1 3 12:45 8 12:50 2 12:52 4 12:55 -5 satisfactionwho 1 2 1 3 1 2 2 2 5 3 4 2 1 2 3 4 2 1 Dimensions Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing Windowed Aggregates Metrics ~ Sets ~ Approx Count / Quantiles
  • 33. when what 1 2 1 who 1 2 1 3 1 2 2 2 stockChange 5 6 9 2 -2 3 8 2 4 -5 satisfaction 5 3 4 2 1 2 3 4 2 1 12:30 12:40 12:50 Dimensions Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing Windowed Aggregates Metrics ~ Sets ~ Approx Count / Quantiles
  • 34. satisfaction 5 3 4 2 1 2 3 4 2 1 when 12:30 12:40 12:50 what 1 2 1 2 stockChange 5 6 9 2 -2 3 8 2 4 -5 Dimensions Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing sumStkChg 22 9 1 maxSat 5 3 4 Windowed Aggregates Metrics ~ Sets ~ Approx Count / Quantiles who 1 2 1 3 1 2 2
  • 35. when 12:30 12:40 12:50 what 1 2 1 who 1 2 1 3 1 2 2 2 Dimensions Filters ~ Sorting ~ Functions ~ Aggregate ~ Top N ~ Group By ~ Bucketing whoHLL fgndnbe bsdsfdb bdfbdsf Windowed Aggregates Metrics ~ Sets ~ Approx Count / Quantiles sumStkChg 22 9 1 maxSat 5 3 4
  • 37. when who what stockChange 12:30 peter purchased 5 12:32 paul purchased 6 12:33 paul purchased 9 12:35 paul purchased 2 12:40 peter returned -2 12:42 peter purchased 3 12:45 paul purchased 8 12:50 ahmed purchased 2 12:52 paul purchased 4 12:55 peter returned -5 satisfaction 5 3 4 2 1 2 3 4 2 1 1 2 3 peter paul ahmed 1 2 purchased returned wn 12:30 12:40 12:50 wt 1 2 1 2 ssc 22 9 1 ms 5 3 4 wH fgndnbe bsdsfdb bdfbdsf
  • 40. Agenda The Druid Story A Druid Cluster Data in Druid Q&A
  • 41.
  • 42. http://druid.apache.org/ Druid Forum https://groups.google.com/forum/#!forum/druid-user Apache Distribution https://github.com/apache/druid Druid Community https://druid.apache.org/community/ Imply Distribution https://imply.io/get-started
  • 43. Agenda The Druid Story A Druid Cluster Data in Druid Q&A