State of the Druid, Dec 2016


gianmerlino

From https://www.meetup.com/druidio/events/235468465/


0.9.2
• new groupBy engine (2–5x performance boost)
• ability to disable rollup at ingestion time
• ability to filter on longs
• new encoding options for long-typed columns
• performance improvements for HyperUnique (19–30%) and DataSketches (up to 80%)
• query cache implementation based on Caffeine
• new lookup extension exposing fine-grained caching strategies
• support for reading ORC files
• new aggregators for variance and standard deviation
• download at http://druid.io/downloads.html
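Two of the 0.9.2 items are switched on through JSON configuration. The sketch below is a rough illustration, not a complete spec: it builds a granularitySpec fragment with rollup disabled and a native groupBy query that asks for the new engine through the query context. The datasource `events`, the dimension `country`, the metric `bytes`, and the interval are all hypothetical, and the field names (`rollup`, `groupByStrategy`) should be checked against the documentation for the version in use.

```python
import json

# Ingestion-side fragment: "rollup": False asks Druid to store rows as-is
# instead of pre-aggregating them at ingestion time.
granularity_spec = {
    "type": "uniform",
    "segmentGranularity": "DAY",
    "queryGranularity": "NONE",
    "rollup": False,
}

# Query-side: a native groupBy query. The context key selects the groupBy
# engine; "v2" is assumed here to be the new engine mentioned on the slide.
group_by_query = {
    "queryType": "groupBy",
    "dataSource": "events",                 # hypothetical datasource
    "granularity": "all",
    "dimensions": ["country"],              # hypothetical dimension
    "aggregations": [
        {"type": "longSum", "name": "total_bytes", "fieldName": "bytes"}
    ],
    "intervals": ["2016-12-01/2016-12-07"],  # hypothetical interval
    "context": {"groupByStrategy": "v2"},
}

if __name__ == "__main__":
    print(json.dumps(granularity_spec, indent=2))
    print(json.dumps(group_by_query, indent=2))
```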

0.9.3 (so far)
• 100+ commits since 0.9.2
• target release: mid-Q1 2017
• built-in SQL
• grouping over integer columns (9x improvement on 600M row TPC-H dataset)
• expression language for aggregations: AVG(ln(x))
• “like” filter optimized for prefixes: foo LIKE "bar%"
• faster indexing with less memory
• Kafka indexing service improvements
• many more changes expected by release
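Why a prefix pattern such as "bar%" is cheap to evaluate can be sketched with a sorted list of distinct column values: every value sharing a prefix sits in one contiguous run, so a binary search finds where the run starts. This is an illustration of the general idea only, not Druid's actual implementation; the function name `prefix_range` and the sample values are made up for the example.

```python
from bisect import bisect_left


def prefix_range(sorted_values, prefix):
    """Return (start, end) such that sorted_values[start:end] are exactly
    the values that begin with prefix. sorted_values must be sorted."""
    # Binary search for the first value >= prefix; any value starting with
    # the prefix cannot appear before this position.
    start = bisect_left(sorted_values, prefix)
    # Matching values are contiguous in sorted order, so walk forward
    # until the prefix stops matching.
    end = start
    while end < len(sorted_values) and sorted_values[end].startswith(prefix):
        end += 1
    return start, end


if __name__ == "__main__":
    values = sorted(["ba", "bar", "barn", "bart", "baz", "foo"])
    start, end = prefix_range(values, "bar")
    print(values[start:end])
```

A pattern with a leading wildcard, such as "%bar", gets no help from the sort order and would have to test every value.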

Built-in SQL
• Apache Calcite based parser and planner
• handled by the broker
• time series: GROUP BY FLOOR(__time TO HOUR)
• metadata: SELECT * FROM metadata.COLUMNS
• explain: EXPLAIN PLAN FOR …
• distinct count with HyperLogLog
• JDBC driver
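Since SQL is handled by the broker, a client could in principle submit a query over HTTP. The following is a minimal sketch under several assumptions: the broker address, the `/druid/v2/sql/` path, and the `{"query": ...}` JSON body are not stated on the slide and may differ by version, and the datasource `wikipedia` is hypothetical. The SQL itself uses the time-flooring form shown above. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed broker location and SQL endpoint; adjust for a real deployment.
BROKER_SQL_URL = "http://localhost:8082/druid/v2/sql/"

# Hourly row counts, using the FLOOR(__time TO HOUR) form from the slide.
# "wikipedia" is a hypothetical datasource name.
HOURLY_COUNTS = (
    "SELECT FLOOR(__time TO HOUR) AS hour_bucket, COUNT(*) AS cnt "
    "FROM wikipedia "
    "GROUP BY FLOOR(__time TO HOUR)"
)


def build_sql_request(sql, url=BROKER_SQL_URL):
    """Wrap a SQL string in a JSON POST request without sending it."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_sql_request(HOURLY_COUNTS)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running broker; prefixing the statement with EXPLAIN PLAN FOR should return the plan instead of results.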


1. State of the Druid December 6, 2016
