Dremio, the missing link in modern data?
22 November 2017
About me
Alexis Gendronneau
OVH, worldwide cloud provider
Data Convergence Tech Lead
• Design customer data solutions
@bru_gere
https://www.linkedin.com/in/alexis-gendronneau-36066174/
Dremio
Open source (Apache License 2.0) since July 2017
Founded by:
• Jacques Nadeau (Apache Drill, MapR)
• Tomer Shiran (MapR, Microsoft, IBM)
Team (part of):
• Ajay Singh (Hortonworks)
• Collin Weitzman (Mesosphere, MapR, Oracle)
• Kelly Stirman (MongoDB)
Slogan:
“The missing link in modern data”
How to use data quickly and easily?
SQL?
@vincentterrasi
Data is a massive engineering project today
Data Staging
• Custom ETL
• Fragile transforms
• Slow moving
Data Warehouse
• High overhead
• DBA experts
Cubes, BI Extracts & Aggregation Tables
• Data sprawl
• Governance issues
• Slow to update
A New Tier In Data Analytics: Data Fabric
Data Virtualization
• RDBMS, MongoDB, Elasticsearch, Hadoop, NAS, Excel, JSON
Data Acceleration
• OLAP and ad hoc queries at interactive speed, without cubes or BI extracts
Data Curation
• Wrangle, prepare, enrich any source without making copies of your data
Data Catalog
• Interactive data discovery, enterprise and personal data assets
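To make "curation without making copies" more concrete, here is a minimal sketch that uses a plain SQL view as a stand-in for a virtual dataset. It runs on Python's built-in sqlite3 module, not on Dremio, and the table and column names (raw_orders, curated_orders, amount_cents) are hypothetical.

```python
import sqlite3

# Stand-in for a physical source; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "alice", 1250), (2, "bob", 400), (3, "alice", 830)],
)

# The "virtual dataset": a view that renames, converts and filters,
# but stores no rows of its own.
conn.execute(
    """
    CREATE VIEW curated_orders AS
    SELECT id AS order_id, customer, amount_cents / 100.0 AS amount_eur
    FROM raw_orders
    WHERE amount_cents >= 500
    """
)

curated = conn.execute(
    "SELECT order_id, customer, amount_eur FROM curated_orders ORDER BY order_id"
).fetchall()

# A new source row is visible through the view immediately: nothing was copied.
conn.execute("INSERT INTO raw_orders VALUES (4, 'carol', 9900)")
curated_after = conn.execute("SELECT COUNT(*) FROM curated_orders").fetchone()[0]
```

The view only records how to derive the curated rows, so the source stays the single copy of the data. A real data fabric would additionally have to push such a query down to remote sources, which this sketch does not attempt.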
A production-ready architecture
Native Push-Downs
• Optimized query semantics for each data source: relational, NoSQL, HDFS and more.
Universal Relational Algebra
• Query planner automatically substitutes plans to make optimal use of cache fragments.
Scalable
• From 1 to 1000+ nodes, run on dedicated infrastructure or in your Hadoop cluster, via YARN.
Dremio Reflections™
• Optimized physical data structures for row and aggregation operations.
Dremio optimizer
Accelerator cache (local disks, HDFS, S3, …)
Query plan
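The idea of a planner that substitutes a cached structure for the original source can be sketched in a few lines. This is an illustrative simplification and not Dremio's actual planning algorithm: every class and function name below is hypothetical, and it assumes measures such as sums and counts that can be re-aggregated from a finer-grained aggregate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggregationReflection:
    """A precomputed aggregate (hypothetical model, not a Dremio class)."""
    name: str
    dimensions: frozenset
    measures: frozenset


@dataclass(frozen=True)
class AggregateQuery:
    """A GROUP BY query reduced to its grouping columns and measures."""
    group_by: frozenset
    measures: frozenset


def covers(reflection, query):
    # The reflection can answer the query if it keeps every grouping
    # column and every measure the query asks for.
    return (query.group_by <= reflection.dimensions
            and query.measures <= reflection.measures)


def choose_source(query, reflections, base_table):
    candidates = [r for r in reflections if covers(r, query)]
    if not candidates:
        return base_table  # no usable cache fragment: read the source
    # Assumption: fewer dimensions means a smaller, cheaper structure.
    return min(candidates, key=lambda r: len(r.dimensions)).name


reflections = [
    AggregationReflection("by_country_day",
                          frozenset({"country", "day"}),
                          frozenset({"sum_amount", "count"})),
    AggregationReflection("by_country",
                          frozenset({"country"}),
                          frozenset({"sum_amount"})),
]

q_sum = AggregateQuery(frozenset({"country"}), frozenset({"sum_amount"}))
q_count = AggregateQuery(frozenset({"country"}), frozenset({"count"}))
q_user = AggregateQuery(frozenset({"user"}), frozenset({"sum_amount"}))

picked_sum = choose_source(q_sum, reflections, "events")
picked_count = choose_source(q_count, reflections, "events")
picked_user = choose_source(q_user, reflections, "events")
```

The end user keeps querying the original dataset; only the plan changes. A real optimizer works on relational algebra trees and cost estimates, not on sets of column names.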
Relying on standard open source projects
Apache Drill (forked)
• Distributed data exploration service
Apache Calcite
• SQL parser & optimizer
Apache Arrow
• In-memory columnar data processing library
Apache Parquet
• Columnar data storage format
Dremio approach
Reflection design UI
Source storage layer
Cache
Persistence
Refresh system
Change detection
Relational pattern
End-user queries
Query planner
Data processing
Dremio security architecture
Impersonation | Trusted Context* | Passthr*
Data Source Access Control
LDAP
Kerberos*
Virtual Dataset Access Control
ODBC | JDBC | REST
SSL / TLS*
• Keep data where it is, even with your usual tools
Dremio powers analyst collaboration
Discover
● Self-service access to all sources
● First-class SQL support
● Extends your LDAP and Kerberos
Curate
● Rename columns, filter results
● Extract and transform values
● Join with other data sets
Accelerate
● Make queries 1000x faster
● Works with any data source
● Automatically adapts to you
Share
● Collaborate with your team
● Extends your permissions
● Google Docs for your data
Deploy on Hadoop
• Data locality
• Use YARN containers
Deploy on cloud
• Workers on compute layer
• Parquet on storage
Demo
Host
• OVH PCI b2_120 (16 vCores, 120 GB RAM, 400 GB SSD)
Sources
• Samples from Dremio (local files)
• Elasticsearch cloud
Tests
• Create a dataset
• Split a column
• Join datasets
• Tableau view
Dataset Creation
You need a data source
• Elasticsearch
• MongoDB
• HDFS
• RDBMS (PostgreSQL, MySQL, MariaDB)
• File (CSV, JSON, …)
Or an existing dataset
• Use search to find the right one
Data curation/preparation
On a dataset you can apply several changes:
• Modify a column (split, delete, …)
• Modify rows (filter by column value, …)
• Join with other datasets (type does not matter)
If needed, revert to a previous step
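One way to picture these curation steps, including the revert, is as an ordered list of transformations replayed over an untouched source. The sketch below is an assumption about how such a history could be modelled, not a description of Dremio's internals; VirtualDataset, split_column and filter_rows are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


def split_column(column: str, sep: str, into: Tuple[str, str]) -> Step:
    """Build a step that splits one column into two on the first separator."""
    def step(rows: List[Row]) -> List[Row]:
        out = []
        for row in rows:
            parts = row[column].split(sep, 1)
            new_row = {k: v for k, v in row.items() if k != column}
            new_row[into[0]] = parts[0]
            new_row[into[1]] = parts[1] if len(parts) > 1 else ""
            out.append(new_row)
        return out
    return step


def filter_rows(column: str, value: str) -> Step:
    """Build a step that keeps only rows where column equals value."""
    def step(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r[column] == value]
    return step


@dataclass
class VirtualDataset:
    source: List[Row]
    steps: List[Step] = field(default_factory=list)

    def apply(self, step: Step) -> "VirtualDataset":
        self.steps.append(step)
        return self

    def revert(self, keep: int) -> "VirtualDataset":
        # Keep only the first `keep` steps; the source is never modified.
        del self.steps[keep:]
        return self

    def rows(self) -> List[Row]:
        rows = self.source
        for step in self.steps:
            rows = step(rows)
        return rows


source = [
    {"name": "Ada Lovelace", "city": "London"},
    {"name": "Alan Turing", "city": "London"},
    {"name": "Grace Hopper", "city": "New York"},
]

dataset = VirtualDataset(source)
dataset.apply(split_column("name", " ", ("first", "last")))
dataset.apply(filter_rows("city", "London"))
curated = dataset.rows()

# Revert to the state after the first step (drop the filter).
reverted = dataset.revert(1).rows()
```

Because each step returns new rows and the source list is left alone, reverting is just a matter of forgetting the later steps.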
Data query enhancement
Define reflections on data to make queries faster
• Raw reflection for slow or heavily loaded backends
• Aggregation reflection for computed data
/!\ Make sure you know what you are doing with reflections
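The warning is easier to appreciate with a toy example. The sketch below imitates an aggregation reflection with a precomputed table in SQLite (Python's built-in sqlite3 module); it is not Dremio code, and the table names events and events_by_country are hypothetical. It shows that the precomputed answer matches the source right after a refresh, and goes stale when the source changes before the next refresh.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (country TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("FR", 10), ("FR", 5), ("DE", 7)],
)


def refresh_reflection(conn):
    # Rebuild the precomputed aggregate from the source table.
    conn.execute("DROP TABLE IF EXISTS events_by_country")
    conn.execute(
        "CREATE TABLE events_by_country AS "
        "SELECT country, SUM(amount) AS total FROM events GROUP BY country"
    )


def totals_from_source(conn):
    return dict(conn.execute(
        "SELECT country, SUM(amount) FROM events GROUP BY country"
    ))


def totals_from_reflection(conn):
    return dict(conn.execute("SELECT country, total FROM events_by_country"))


refresh_reflection(conn)
fresh_matches = totals_from_reflection(conn) == totals_from_source(conn)

# The source changes, but the reflection is not refreshed yet.
conn.execute("INSERT INTO events VALUES ('DE', 3)")
stale_total = totals_from_reflection(conn)["DE"]
source_total = totals_from_source(conn)["DE"]

refresh_reflection(conn)
refreshed_total = totals_from_reflection(conn)["DE"]
```

Between two refreshes the reflection answers faster but may lag behind the source, so the refresh policy has to match how fresh the results need to be.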
Management
• How much it is used
• Where it comes from
• How it is built (Enterprise)
• Manage reflection request creation (Enterprise)
• Resource creation
Dremio: what's next?
• Open API for queries (data serving)
• New data source integrations
• Your requests! (community)
https://fr.slideshare.net/HadoopSummit/the-heterogeneous-data-lake
https://info.dataengconf.com/hubfs/slides-nyc17/jacques-dataengconf-slides.pdf?submissionGuid=c3e64832-56bc-47bd-95ee-2afebde38540
https://fr.slideshare.net/VincentTerrasi/how-to-boost-your-datamanagement-with-dremio-80190071
This presentation was made using
(namely) :
Thank you!
