08-May-20 7:12 AM
Azure Synapse Analytics
Limitless analytics service with unmatched time to insight
Azure Synapse is the evolution of Azure SQL Data Warehouse, combining big data, data warehousing, and data integration in a single service for end-to-end analytics at cloud scale.
Modern data warehouse
INGEST, PREPARE, TRANSFORM & ENRICH, SERVE, STORE, VISUALIZE
On-premises data, Cloud data, SaaS data
Integrated data platform for BI, AI and continuous intelligence
Platform
Azure Data Lake Storage
Common Data Model, Enterprise Security, Optimized for Analytics
METASTORE, SECURITY, MANAGEMENT, MONITORING, DATA INTEGRATION
Analytics Runtimes
Form Factors: PROVISIONED, ON-DEMAND
Languages: SQL, Python, .NET, Java, Scala, R
Experience: Synapse Analytics Studio
Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence
Connected services
Azure Data Catalog
Azure Data Lake Storage
Azure Data Share
Azure Databricks
Azure HDInsight
Azure Machine Learning
Power BI
3rd Party Integration
Elastic architectures
Hybrid
Analyze all data
Workload-optimized compute
Governed self-service
No data silos
Time Cost Risk
Platform: Performance
• Azure Synapse leverages the Azure ecosystem and core improvements to the SQL Server engine to deliver massive performance gains.
• These benefits require no customer configuration and are provided out of the box for every data warehouse.
• Gen2 adaptive caching: uses non-volatile memory solid-state drives (NVMe) to increase the I/O bandwidth available to queries.
• Azure FPGA-accelerated networking enhancements: move data at rates of up to 1 GB/s per node to improve query performance.
• Instant data movement: leverages multi-core parallelism in the underlying SQL Server instances to move data efficiently between compute nodes.
• Query optimization: distributed query optimization.
Synapse SQL MPP architectural components
Hash-distributed tables
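The idea behind a hash-distributed table can be sketched in a few lines of Python: each row is assigned to one of a fixed number of distributions by hashing its distribution-key value, so equal keys always land in the same place. This is only an illustration. A dedicated SQL pool spreads a table over 60 distributions, but the MD5-based `distribution_for` function and the sample rows below are assumptions, not the service's internal hash function or real data.

```python
import hashlib

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def distribution_for(key: str) -> int:
    """Map a distribution-key value to a distribution number (0..59).

    MD5 is a stand-in chosen for determinism; the real hash function
    used by the service is internal and is not shown in the slides.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def distribute(rows, key_column):
    """Group rows by the distribution that their key value hashes to."""
    placement = {}
    for row in rows:
        target = distribution_for(str(row[key_column]))
        placement.setdefault(target, []).append(row)
    return placement


# Hypothetical sample rows keyed by a zip-code-like column.
rows = [{"sales_zip": str(10000 + i), "item_sk": i} for i in range(6000)]
placement = distribute(rows, "sales_zip")
largest = max(len(bucket) for bucket in placement.values())
print(len(placement), "distributions used; the largest holds", largest, "rows")
```

A key with many distinct, evenly spread values keeps the buckets similar in size; a skewed key would concentrate rows in a few distributions.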
Replicated tables
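Workload management
Scale-In Isolation
Predictable cost
Online elasticity
Efficient for unpredictable workloads
Intra Cluster Workload Isolation (Scale In)
CREATE WORKLOAD GROUP Sales
WITH
(
    MIN_PERCENTAGE_RESOURCE = 60,   -- share of resources reserved for the group
    CAP_PERCENTAGE_RESOURCE = 100,  -- upper limit the group may consume
    MAX_CONCURRENCY = 6             -- as on the slide; the released syntax may require REQUEST_MIN_RESOURCE_GRANT_PERCENT instead
);
[Diagram: 1000c DWU of compute divided between Marketing (40%) and Sales (60%, able to grow to 100%)]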
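The arithmetic behind the workload group above can be sketched as follows. The `WorkloadGroup` class and both helper functions are hypothetical, and the Marketing settings (40% minimum, 100% cap) are an assumption read off the diagram; only the Sales values come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadGroup:
    name: str
    min_percentage: int  # mirrors MIN_PERCENTAGE_RESOURCE
    cap_percentage: int  # mirrors CAP_PERCENTAGE_RESOURCE


def shared_pool_percentage(groups):
    """Validate the groups and return the percentage left unreserved."""
    for group in groups:
        if not 0 <= group.min_percentage <= group.cap_percentage <= 100:
            raise ValueError(f"{group.name}: need 0 <= min <= cap <= 100")
    reserved = sum(group.min_percentage for group in groups)
    if reserved > 100:
        raise ValueError("reserved minimums exceed 100% of the pool")
    return 100 - reserved


def reserved_dwu(group, total_dwu):
    """Compute units guaranteed to a group at a given pool size."""
    return total_dwu * group.min_percentage // 100


sales = WorkloadGroup("Sales", min_percentage=60, cap_percentage=100)
marketing = WorkloadGroup("Marketing", min_percentage=40, cap_percentage=100)  # assumed settings

print(reserved_dwu(sales, 1000), reserved_dwu(marketing, 1000),
      shared_pool_percentage([sales, marketing]))
```

With these numbers the two minimums reserve the whole pool, so no shared capacity remains; lowering either minimum would leave a shared remainder that both groups can use up to their caps.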
Comprehensive security
Category: Features
Data Protection: Data in Transit; Data Encryption at Rest; Data Discovery and Classification
Access Control: Object Level Security (Tables/Views); Row Level Security; Column Level Security; Dynamic Data Masking
Authentication: SQL Login; Azure Active Directory; Multi-Factor Authentication
Network Security: Virtual Networks; Firewall; Azure ExpressRoute
Threat Protection: Threat Detection; Auditing; Vulnerability Assessment
Data Integration Data Warehouse Reporting
Synapse data integration
More than 90 ready-to-use connectors
Serverless, no infrastructure to manage
Sustained ingestion of 4 GB/s
Support for CSV, AVRO, ORC, Parquet, and JSON
Synapse data integration
Code First
Code Free
GUI based
+ many more
Power BI Azure Machine Learning
Azure Data Share Ecosystem
Azure Synapse Analytics
Data Integration Data Warehouse Reporting
Performance-optimized storage
Elastic Architecture, Columnar Storage, Columnar Ordering, Table Partitioning,
Nonclustered Indexes, Hash Distribution, Materialized Views, Resultset Cache
Migrating database tables
CREATE TABLE StoreSales (
[sales_city] varchar(60),
[sales_year] int,
[sales_state] char(2),
[item_sk] int,
[sales_zip] char(10),
[sales_date] date,
[customer_sk] int)
WITH(
CLUSTERED COLUMNSTORE INDEX ORDER ([customer_sk]),
DISTRIBUTION = HASH([sales_zip],[item_sk]),
PARTITION ([sales_year] RANGE RIGHT FOR VALUES (1998,1999,2000,2001,2002,2003)))
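How `RANGE RIGHT` assigns rows to partitions can be checked with a small Python sketch. With `RANGE RIGHT`, each boundary value belongs to the partition on its right, so the six boundaries in the table definition above yield seven partitions. The `partition_number` helper is hypothetical and only mirrors that rule.

```python
from bisect import bisect_right

# Boundary values copied from the PARTITION clause of StoreSales.
BOUNDARIES = [1998, 1999, 2000, 2001, 2002, 2003]


def partition_number(sales_year: int) -> int:
    """Return the 1-based partition for a year under RANGE RIGHT.

    bisect_right counts the boundaries that are <= sales_year, which is
    how many partitions lie to the left of the row's partition.
    """
    return bisect_right(BOUNDARIES, sales_year) + 1


for year in (1997, 1998, 2000, 2003, 2010):
    print(year, "-> partition", partition_number(year))
```

Years before 1998 share the first partition and years from 2003 onward share the last one, so adding new boundaries over time keeps the final partition from growing without bound.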
Migrating database views
Materialized Views
Views
Migrating database views
Feature                           View   Materialized view
Abstracts structure from users    Yes    Yes
Requires an explicit reference    Yes    No
Improves performance              No     Yes
Requires additional storage       No     Yes
Securable                         Yes    Yes
Full SQL support                  Yes    No
Migrating a database view
CREATE VIEW vw_TopSalesState
AS
SELECT
SubQ.StateAbbrev,
SubQ.FirstSoldDate,
(SubQ.SalesPrice / sum(SubQ.SalesPrice) OVER (order by (select null)))*100 AS PctOfTotalSales, -- alias added (illustrative name); every view column needs a name
(1- (SalesPrice/ListPrice))*100 AS Discount,
RANK() OVER (order by (1- (SalesPrice/ListPrice))) AS StateDiscRank
FROM (
SELECT
s_state AS StateAbbrev,
MIN(d_date) AS FirstSoldDate,
SUM([ss_list_price]) AS ListPrice,
SUM([ss_sales_price]) AS SalesPrice
FROM [tpcds10TB].[store_sales2] ss
INNER JOIN [tpcds10TB].store s on s.[s_store_sk] = ss.[ss_store_sk]
INNER JOIN [tpcds10TB].[date_dim] d on d.[d_date_sk] = ss.ss_sold_date_sk
GROUP BY
s_state) AS SubQ
Migrating a database materialized view
CREATE MATERIALIZED VIEW [dbo].[mvw_StoreSalesSummary]
WITH (DISTRIBUTION = HASH(ss_store_sk))
AS
SELECT
s_state,
c_birth_country,
ss_store_sk AS ss_store_sk,
ss_sold_date_sk AS ss_sold_date_sk,
SUM([ss_list_price]) AS [ss_list_price],
SUM([ss_sales_price]) AS [ss_sales_price],
count_big(*) AS cb
FROM [tpcds10TB].[store_sales2] ss
INNER JOIN [tpcds10TB].customer c ON c.[c_customer_sk] = ss.[ss_customer_sk]
INNER JOIN [tpcds10TB].store s on s.[s_store_sk] = ss.[ss_store_sk]
GROUP BY
s_state,c_birth_country,ss_store_sk, ss_sold_date_sk
Customer: 65 million rows
Store: 1,500 rows
Store Sales: 26 billion rows
Materialized View: 287 million rows
Data Integration Data Warehouse Reporting
Synapse Connected Service: Power BI
Integrated Power BI authoring experience
Publish to Power BI
Scaling to petabytes
Materialized Views
Transactionally consistent with data modification
Automatic query optimizer matching
CREATE MATERIALIZED VIEW vw_ProductSales
WITH (DISTRIBUTION = HASH(ProductKey))
AS
SELECT
ProductName,
ProductKey,
SUM(Amount) AS TotalSales
FROM
FactSales fs
INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
GROUP BY
ProductName,
ProductKey
ProductName ProductKey TotalSales
Product A 5453 784,943.00
Product B 763 48,723.00
… … …
FactSales Table
10B Records
DimProduct Table
1,000 Records
[Diagram: tables FactSales, DimProduct and FactInventory; materialized view mvw_ProductSales, 1,000 records]
SELECT
ProductName,
ProductKey,
SUM(Amount) AS TotalSales
FROM
FactSales fs
INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
GROUP BY
ProductName,
ProductKey
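Why a materialized view helps here can be illustrated with a toy model in Python: the aggregate is stored once and adjusted as rows arrive, so a query with the same grouping reads about as many rows as the dimension table holds instead of rescanning every fact row. The `MaterializedView` class and the sample amounts are made up for illustration, and the sketch says nothing about how the engine maintains views internally.

```python
def full_scan(facts, products):
    """Aggregate from the base tables, as a query without the view would."""
    totals = {}
    for prodkey, amount in facts:
        key = (products[prodkey], prodkey)
        totals[key] = totals.get(key, 0.0) + amount
    return totals


class MaterializedView:
    """Toy pre-aggregated result that stays in step with inserts."""

    def __init__(self, facts, products):
        self._products = products
        self.rows = {}
        for prodkey, amount in facts:
            self.apply_insert(prodkey, amount)

    def apply_insert(self, prodkey, amount):
        """Fold one new fact row into the stored totals."""
        key = (self._products[prodkey], prodkey)
        self.rows[key] = self.rows.get(key, 0.0) + amount


# Product keys follow the slide's sample output; the amounts are invented.
dim_product = {5453: "Product A", 763: "Product B"}
fact_sales = [(5453, 100.0), (763, 40.0), (5453, 25.5), (763, 10.0)]

view = MaterializedView(fact_sales, dim_product)
fact_sales.append((763, 5.0))  # a new sale arrives in the fact table...
view.apply_insert(763, 5.0)    # ...and the view reflects it immediately

print(view.rows == full_scan(fact_sales, dim_product))
```

The final comparison shows the stored totals agreeing with a fresh scan of the base data, which is the property the slide calls transactional consistency with data modification.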
Scaling to petabytes
Result-set cache
Automatic query matching
Implicitly created from query activity
Resilient to cluster elasticity
Execution 1: cache miss, regular execution
Execution 2: cache hit, ~0.2 seconds
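The miss-then-hit behaviour can be sketched with a small cache keyed by the query text. The `ResultSetCache` class is hypothetical: it assumes that an entry is reused only when the query text matches exactly and that entries are dropped when the underlying data changes, and it does not model timing or storage.

```python
class ResultSetCache:
    """Toy result-set cache keyed by the exact query text."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def execute(self, query_text, run_query):
        """Return the cached result, running the query only on a miss."""
        if query_text in self._store:
            self.hits += 1
            return self._store[query_text]
        self.misses += 1
        result = run_query()
        self._store[query_text] = result
        return result

    def invalidate(self):
        """Drop all entries, for example after the base data changes."""
        self._store.clear()


executions = []


def expensive_query():
    """Stand-in for a long-running distributed query."""
    executions.append("ran")
    return [("Product A", 5453, 784943.00)]


sql = "SELECT ProductName, ProductKey, SUM(Amount) FROM FactSales GROUP BY ProductName, ProductKey"
cache = ResultSetCache()
first = cache.execute(sql, expensive_query)   # execution 1: cache miss
second = cache.execute(sql, expensive_query)  # execution 2: cache hit
print(cache.misses, cache.hits, len(executions))
```

Only the first call reaches `expensive_query`; the second is served from the stored result, which is why the slide's second execution returns in a fraction of a second.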
SELECT
c_customerkey,
c_nationkey,
SUM(l_quantity),
SUM(l_extendedprice)
FROM [dbo].[lineitem_MonthPartition] l
INNER JOIN [dbo].[orders] o on o.o_orderkey = l.l_orderkey
INNER JOIN [dbo].[customer] c on c.c_customerkey = o.o_customerkey
GROUP BY
c_customerkey,
c_nationkey
Table Distributions
[dbo].[lineitem_MonthPartition] HASH(l_orderkey)
[dbo].[orders] HASH(o_orderkey)
[dbo].[customer] HASH(c_customerkey)
LineItem and Orders: collocated join (distribution aligned)
Customer: non-collocated join (shuffle required)
FROM [dbo].[lineitem_MonthPartition] l
INNER JOIN [dbo].[orders] o on o.o_orderkey = l.l_orderkey
INNER JOIN [dbo].[customer] c on c.c_customerkey = o.o_customerkey
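Whether each join above needs data movement can be worked out from the distribution keys alone. In this Python sketch the `is_collocated` helper is hypothetical and simplified: it treats a join as collocated when both sides are hash-distributed on the columns being joined, or when either side is a replicated table, and it ignores other conditions such as matching data types.

```python
# Distribution keys taken from the "Table Distributions" listing.
DISTRIBUTION_KEYS = {
    "lineitem_MonthPartition": "l_orderkey",
    "orders": "o_orderkey",
    "customer": "c_customerkey",
}


def is_collocated(left_table, left_column, right_table, right_column,
                  replicated=frozenset()):
    """Return True when the join can run without shuffling rows."""
    if left_table in replicated or right_table in replicated:
        return True  # every node already holds a full copy of one side
    return (DISTRIBUTION_KEYS[left_table] == left_column
            and DISTRIBUTION_KEYS[right_table] == right_column)


# lineitem joins orders on the key both tables are distributed by.
print(is_collocated("lineitem_MonthPartition", "l_orderkey",
                    "orders", "o_orderkey"))
# orders joins customer on o_customerkey, but orders is distributed by o_orderkey.
print(is_collocated("orders", "o_customerkey",
                    "customer", "c_customerkey"))
```

The second join is the one that forces a shuffle: the rows produced by the first join have to be redistributed by customer key before they can meet the matching customer rows.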
[Diagram: Stage 1: LineItem and Orders, collocated join (distribution aligned), followed by a shuffle into #temp (Orders + Lineitem). Stage 2: #temp, Customer and Nation, using a collocated join (distribution aligned) and a collocated join (replicate aligned).]
CREATE MATERIALIZED VIEW mvw_CustomerSales
WITH (DISTRIBUTION = HASH(o_custkey))
AS
SELECT
o_custkey,
l_shipdate,
SUM(l_quantity) AS l_quantity,
SUM(l_extendedprice) AS l_extendedprice
FROM [dbo].[lineitem_MonthPartition] l
INNER JOIN [dbo].[orders] o on o.o_orderkey = l.l_orderkey
WHERE
l_shipdate >= CONVERT(DATETIME, '1998-11-01', 23) -- style 23 is yyyy-mm-dd; the slide used 103 (dd/mm/yyyy), which does not match this literal
GROUP BY
o_custkey,
l_shipdate
[Diagram: mvw_CustomerSales, Customer and Nation (replicated table), joined by a collocated join (distribution aligned) and a collocated join (replicate aligned)]
Query Execution Time (seconds)
No materialized view: 275
With materialized view: 5
Power BI
Materialized Views
Tables
Scaling to petabytes
Power BI
DirectQuery
Composite Models
Aggregation Tables

How to convert PDF to text with Nanonetsnaman860154
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Dernier (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Data warehouse with Azure Synapse Analytics

  • 1. 08-May-20 7:12 AM. Azure Synapse Analytics: a limitless analytics service with unmatched time to insight. Azure Synapse is the evolution of Azure SQL Data Warehouse, combining big data, data warehousing, and data integration in a single service for end-to-end analytics at cloud scale.
  • 2. Modern data warehouse. Stages: INGEST, PREPARE, TRANSFORM & ENRICH, SERVE, STORE, VISUALIZE. Sources: on-premises data, cloud data, SaaS data. Integrated data platform for BI, AI and continuous intelligence. Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics. Shared services: METASTORE, SECURITY, MANAGEMENT, MONITORING, DATA INTEGRATION. Analytics runtimes in two form factors: PROVISIONED and ON-DEMAND. Languages: SQL, Python, .NET, Java, Scala, R. Experience: Synapse Analytics Studio. Workloads: Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence.
  • 3. Integrated data platform for BI, AI and continuous intelligence. Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics. Shared services: METASTORE, SECURITY, MANAGEMENT, MONITORING, DATA INTEGRATION. Analytics runtimes in two form factors: PROVISIONED and ON-DEMAND. Languages: SQL, Python, .NET, Java, Scala, R. Experience: Synapse Analytics Studio. Workloads: Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence.
       Connected services: Azure Data Catalog, Azure Data Lake Storage, Azure Data Share, Azure Databricks, Azure HDInsight, Azure Machine Learning, Power BI, 3rd Party Integration.
       Elastic architectures; hybrid; analyze all data; workload-optimized compute; governed self-service; no data silos.
  • 4. Time, Cost, Risk. Platform: Performance
       • Azure Synapse builds on the Azure ecosystem and core SQL Server engine improvements to deliver massive performance gains.
       • These benefits require no customer configuration and are provided out of the box for every data warehouse.
       • Gen2 adaptive caching: uses non-volatile memory (NVMe) solid-state drives to increase the I/O bandwidth available to queries.
       • Azure FPGA-accelerated networking enhancements: move data at rates of up to 1 GB/s per node to improve queries.
       • Instant data movement: leverages multi-core parallelism in the underlying SQL Server instances to move data efficiently between compute nodes.
       • Query optimization: optimization of distributed queries.
  • 5. Synapse SQL MPP architectural components. Hash-distributed tables.
  • 7. Workload management. Scale-in isolation: predictable cost, online elasticity, efficient for unpredictable workloads. Intra-cluster workload isolation (scale in): on a 1000c DWU compute pool, Sales is guaranteed 60% and may use up to 100%; Marketing shares the remaining 40%.
       CREATE WORKLOAD GROUP Sales
       WITH (
           [ MIN_PERCENTAGE_RESOURCE = 60 ]
           [ CAP_PERCENTAGE_RESOURCE = 100 ]
           [ MAX_CONCURRENCY = 6 ] )
       Comprehensive security (Category: Features)
       Data Protection: Data in Transit; Data Encryption at Rest; Data Discovery and Classification
       Access Control: Object Level Security (Tables/Views); Row Level Security; Column Level Security; Dynamic Data Masking
       Authentication: SQL Login; Azure Active Directory; Multi-Factor Authentication
       Network Security: Virtual Networks; Firewall; Azure ExpressRoute
       Threat Protection: Threat Detection; Auditing; Vulnerability Assessment
  • 8. Data Integration, Data Warehouse, Reporting. Synapse data integration: more than 90 ready-to-use connectors; serverless, with no infrastructure to manage; sustained ingestion of 4 GB/s; support for CSV, AVRO, ORC, Parquet, and JSON.
  • 9. Synapse data integration: Code First; Code Free (GUI based); + many more. Ecosystem: Power BI, Azure Machine Learning, Azure Data Share, Azure Synapse Analytics.
  • 10. Data Integration, Data Warehouse, Reporting. Performance-optimized storage: Elastic Architecture, Columnar Storage, Columnar Ordering, Table Partitioning, Nonclustered Indexes, Hash Distribution, Materialized Views, Resultset Cache.
  • 11. Database table migration
       CREATE TABLE StoreSales (
           [sales_city]  varchar(60),
           [sales_year]  int,
           [sales_state] char(2),
           [item_sk]     int,
           [sales_zip]   char(10),
           [sales_date]  date,
           [customer_sk] int)
       WITH (
           CLUSTERED COLUMNSTORE INDEX ORDER ([customer_sk]),
           DISTRIBUTION = HASH([sales_zip], [item_sk]),
           PARTITION ([sales_year] RANGE RIGHT FOR VALUES (1998, 1999, 2000, 2001, 2002, 2003)))
       Database view migration: Views, Materialized Views.
  • 12. Database view migration
       Property                               View   Materialized view
       Abstracts structure from users         Yes    Yes
       Requires an explicit reference         Yes    No
       Improves performance                   No     Yes
       Requires additional storage            No     Yes
       Securable                              Yes    Yes
       Full SQL support                       Yes    No
       Database view migration
       CREATE VIEW vw_TopSalesState
       AS
       SELECT SubQ.StateAbbrev,
              SubQ.FirstSoldDate,
              (SubQ.SalesPrice / SUM(SubQ.SalesPrice) OVER (ORDER BY (SELECT NULL))) * 100,
              (1 - (SalesPrice / ListPrice)) * 100 AS Discount,
              RANK() OVER (ORDER BY (1 - (SalesPrice / ListPrice))) AS StateDiscRank
       FROM (
           SELECT s_state AS StateAbbrev,
                  MIN(d_date) AS FirstSoldDate,
                  SUM([ss_list_price]) AS ListPrice,
                  SUM([ss_sales_price]) AS SalesPrice
           FROM [tpcds10TB].[store_sales2] ss
           INNER JOIN [tpcds10TB].store s ON s.[s_store_sk] = ss.[ss_store_sk]
           INNER JOIN [tpcds10TB].[date_dim] d ON d.[d_date_sk] = ss.ss_sold_date_sk
           GROUP BY s_state) AS SubQ
  • 13. Database materialized view migration
       CREATE MATERIALIZED VIEW [dbo].[mvw_StoreSalesSummary]
       WITH (DISTRIBUTION = HASH(ss_store_sk))
       AS
       SELECT s_state,
              c_birth_country,
              ss_store_sk AS ss_store_sk,
              ss_sold_date_sk AS ss_sold_date_sk,
              SUM([ss_list_price]) AS [ss_list_price],
              SUM([ss_sales_price]) AS [ss_sales_price],
              COUNT_BIG(*) AS cb
       FROM [tpcds10TB].[store_sales2] ss
       INNER JOIN [tpcds10TB].customer c ON c.[c_customer_sk] = ss.[ss_customer_sk]
       INNER JOIN [tpcds10TB].store s ON s.[s_store_sk] = ss.[ss_store_sk]
       GROUP BY s_state, c_birth_country, ss_store_sk, ss_sold_date_sk
       Row counts: Customer 65 million rows; Store 1,500 rows; Store Sales 26 billion rows; materialized view 287 million rows.
       Data Integration, Data Warehouse, Reporting.
  • 14. Synapse connected service: Power BI. Integrated Power BI authoring experience; publish to Power BI.
       Scaling to petabytes: materialized views. Transactionally consistent with data modifications; automatic query optimizer matching.
       CREATE MATERIALIZED VIEW vw_ProductSales
       WITH (DISTRIBUTION = HASH(ProductKey))
       AS
       SELECT ProductName,
              ProductKey,
              SUM(Amount) AS TotalSales
       FROM FactSales fs
       INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
       GROUP BY ProductName, ProductKey
  • 15. Scaling to petabytes: materialized views. Transactionally consistent with data modifications; automatic query optimizer matching.
       ProductName   ProductKey   TotalSales
       Product A     5453         784,943.00
       Product B     763          48,723.00
       …             …            …
       Tables: FactSales (10B records), DimProduct (1,000 records), FactInventory, mvw_ProductSales (1,000 records).
       SELECT ProductName,
              ProductKey,
              SUM(Amount) AS TotalSales
       FROM FactSales fs
       INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
       GROUP BY ProductName, ProductKey
       Scaling to petabytes: result set cache. Automatic query matching; implicit creation from query activity; resilient to cluster elasticity. Execution 1: cache miss, regular execution. Execution 2: cache hit, ~0.2 seconds.
  • 16. Scaling to petabytes: materialized views. Transactionally consistent with data modifications; automatic query optimizer matching.
       CREATE MATERIALIZED VIEW vw_ProductSales
       WITH (DISTRIBUTION = HASH(ProductKey))
       AS
       SELECT ProductName,
              ProductKey,
              SUM(Amount) AS TotalSales
       FROM FactSales fs
       INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
       GROUP BY ProductName, ProductKey
       ProductName   ProductKey   TotalSales
       Product A     5453         784,943.00
       Product B     763          48,723.00
       …             …            …
       Tables: FactSales (10B records), DimProduct (1,000 records), FactInventory, mvw_ProductSales (1,000 records).
       SELECT ProductName,
              ProductKey,
              SUM(Amount) AS TotalSales
       FROM FactSales fs
       INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
       GROUP BY ProductName, ProductKey
  • 17. Scaling to petabytes: materialized views. Transactionally consistent with data modifications; automatic query optimizer matching.
       SELECT c_customerkey,
              c_nationkey,
              SUM(l_quantity),
              SUM(l_extendedprice)
       FROM [dbo].[lineitem_MonthPartition] l
       INNER JOIN [dbo].[orders] o ON o.o_orderkey = l.l_orderkey
       INNER JOIN [dbo].[customer] c ON c.c_customerkey = o.o_customerkey
       GROUP BY c_customerkey, c_nationkey
       Table distributions:
       [dbo].[lineitem_MonthPartition]   HASH(l_orderkey)
       [dbo].[orders]                    HASH(o_orderkey)
       [dbo].[customer]                  HASH(c_customerkey)
       Join diagram: LineItem to Orders is a collocated join (distribution aligned); joining Customer is a non-collocated join (shuffle required).
       FROM [dbo].[lineitem_MonthPartition] l
       INNER JOIN [dbo].[orders] o ON o.o_orderkey = l.l_orderkey
       INNER JOIN [dbo].[customer] c ON c.c_customerkey = o.o_customerkey
  • 18. Scaling to petabytes: materialized views. Transactionally consistent with data modifications; automatic query optimizer matching.
       Execution diagram. Stage 1: LineItem and Orders, collocated join (distribution aligned); shuffle required. Stage 2: #temp (Orders + Lineitem), Customer, and Nation; collocated join (distribution aligned) and collocated join (replicate aligned).
       CREATE MATERIALIZED VIEW mvw_CustomerSales
       WITH (DISTRIBUTION = HASH(o_custkey))
       AS
       SELECT o_custkey,
              l_shipdate,
              SUM(l_quantity) AS l_quantity,
              SUM(l_extendedprice) AS l_extendedprice
       FROM [dbo].[lineitem_MonthPartition] l
       INNER JOIN [dbo].[orders] o ON o.o_orderkey = l.l_orderkey
       WHERE l_shipdate >= CONVERT(DATETIME, '1998-11-01', 103)
       GROUP BY o_custkey, l_shipdate
  • 19. Scaling to petabytes: materialized views. Transactionally consistent with data modifications; automatic query optimizer matching.
       Join diagram: mvw_CustomerSales, Nation, Customer. Legend: <replicated table>; collocated join (distribution aligned); collocated join (replicate aligned).
       Chart: query execution time in seconds: 275 without the materialized view, 5 with the materialized view.
  • 20. Power BI: Materialized Views, Tables. Scaling to petabytes: Power BI DirectQuery, Composite Models, Aggregation Tables.
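
The collocated versus non-collocated joins on slide 17 can be illustrated with a small simulation. This is a minimal sketch, not Synapse code: the modulo hash, the sample rows, and the helper names `distribution_of`, `distribute`, and `is_collocated` are assumptions made for illustration. It uses the slide's column names and the 60 distributions of a dedicated SQL pool.

```python
from collections import defaultdict

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def distribution_of(key: int) -> int:
    """Toy stand-in for the engine's hash function (an assumption, not the real one)."""
    return key % NUM_DISTRIBUTIONS


def distribute(rows, key_column):
    """Place each row on the distribution chosen by hashing its key column."""
    placement = defaultdict(list)
    for row in rows:
        placement[distribution_of(row[key_column])].append(row)
    return placement


def is_collocated(left, left_key, right, right_key):
    """True if every matching pair of rows already sits on the same distribution."""
    right_locations = defaultdict(set)
    for dist, rows in right.items():
        for row in rows:
            right_locations[row[right_key]].add(dist)
    for dist, rows in left.items():
        for row in rows:
            # Any matching right-hand row on a different distribution forces data movement.
            if right_locations[row[left_key]] - {dist}:
                return False
    return True


# Hypothetical sample data using the column names from slide 17.
orders = [{"o_orderkey": k, "o_customerkey": (k * 7) % 50} for k in range(1, 201)]
customers = [{"c_customerkey": c} for c in range(50)]
lineitems = [{"l_orderkey": k, "l_quantity": q} for k in range(1, 201) for q in (1, 2)]

lineitem_by_order = distribute(lineitems, "l_orderkey")      # HASH(l_orderkey)
orders_by_order = distribute(orders, "o_orderkey")           # HASH(o_orderkey)
customer_by_key = distribute(customers, "c_customerkey")     # HASH(c_customerkey)

# lineitem JOIN orders ON orderkey: both sides hashed on the join key.
print(is_collocated(lineitem_by_order, "l_orderkey", orders_by_order, "o_orderkey"))  # True

# orders JOIN customer ON customerkey: orders is hashed on o_orderkey, so rows must move.
print(is_collocated(orders_by_order, "o_customerkey", customer_by_key, "c_customerkey"))  # False

# Shuffling orders onto the join key makes the second join collocated.
orders_shuffled = distribute(orders, "o_customerkey")
print(is_collocated(orders_shuffled, "o_customerkey", customer_by_key, "c_customerkey"))  # True
```

The last step mirrors what the materialized view on slide 18 does ahead of time: by storing the pre-joined rows with `DISTRIBUTION = HASH(o_custkey)`, the data is already placed on the customer key, so the query no longer needs a shuffle at run time.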