SlideShare une entreprise Scribd logo
1  sur  35
Télécharger pour lire hors ligne
Designing a
Modern Data Warehouse
in Azure
Antonios
Chatzipavlis
Data Solutions
Consultant & Trainer
1988 Beginning of my professional career
1996 I started working with SQL Server 6.0
1998 Certified as MCSD (3rd in Greece)
1999 Became an MCT
2010 Microsoft MVP on Data Platform
Created www.sqlschool.gr
2012 Became MCT Regional Lead by Microsoft Learning
2013 Certified as MCSE : Data Platform and
MCSE : Business Intelligence
2016 Certified as MCSE: Data Management & Analytics
2018 Certified as MCSA : Machine Learning
Recertified as MCSE: Data Management & Analytics
• Articles
• SQL Server in Greek
• SQL Nights
• Webcasts
• SQL Server News
• Downloads
• Resources
What we are doing Follow us
fb/sqlschoolgr
fb/groups/sqlschool
@antoniosch
@sqlschool
yt/c/SqlschoolGr
SQLschool.gr Group
A community for
Greek professionals
who use the
Microsoft
Data Platform
Ask your question at help@sqlschool.gr
Explore
everything
PASS has
to offer
Free Online Resources
Newsletters
PASS.org
Get involved
Free online
webinar
events
Local user groups
around the world
Free 1-day local
training events
Online special
interest user
groups
Business analytics
training
bit.ly/AAB2019Evaluation
A data warehouse is a subject-oriented,
integrated, time-variant and
non-volatile collection
of data in support of management’s
decision making process.
WHAT IS A DATA WAREHOUSE?
TRADITIONAL DATA WAREHOUSE
SELF-SERVICE DATA WAREHOUSE
TRADITIONAL DW LIMITATIONS
Data
sources
User
Competition
Scaling up
Data
platforms
CURRENT DW CHALLENGES
Timeliness Flexibility Quality Findability
RECENT RESEARCH SURVEYS
of responders reports that
they will replace their
primary DW platform and
analytics tools within 3
years
+50%
The data tsunami
WHY YOU NEED A
MODERN DATA
WAREHOUSE
Customer
experience
Quality
assurance
Operational
efficiency
Innovation
THE CRITERIA
FOR SELECTING A
MODERN DW
Meets Current
and Future
Needs
ON-PREMISES
VS.
CLOUD DW
• Evaluating Time to Value
• Accounting for Storage and Computing Costs
• Sizing, Balancing and Tuning
• Considering Data Preparation and ETL Costs
• Cost of Specialized Business Analytic Tools
• Scaling and Elasticity
• Delays and Downtime
• Cost of Security Breaches
• Data Protection and Recovery
STEPS TO
GETTING STARTED
WITH CLOUD DW
• Evaluate your data warehousing needs.
• Migrate or start fresh.
• Establish success criteria.
• Evaluate cloud data warehouse solutions.
• Calculate your total cost of ownership.
• Set up a proof of concept (POC).
Azure Modern Data Warehouse
MODERN DATA WAREHOUSE
MODERN DATA WAREHOUSE IN AZURE
ADVANCED ANALYTICS ON BIG DATA
REAL-TIME ANALYTICS
SQL SERVER 2019 BIG DATA CLUSTERS
INGEST DATA
ADF
• PaaS
• Mapping Data Flow transform data (ETL)
• Copy Data tool easily copy from source
to destination
• Templates
• Any new project
• Converting SSIS packages
• Row by row ETL can be slower
• Data needs to be moved to Databricks –
limited by compute size
• Mapping Data flow takes time to startup
SSIS
• SSDT – Visual Studio
• Very popular product
• Used for on-prem ETL for may year
• Too big of an effort to migrate existing
packages
• Skillset staying on-prem
• Change to IR in ADF
• Row by row ETL can be slower
• Data need to moved to IR
• Limited by node size/number of SSIS IR
STORE DATA
ADLS Gen 2
• PaaS
• Best features of blob
storage
• Not all features are
available yet
• Some products not support
yet
• 5TB file size limit
Blob Storage
• PaaS
• Original storage
• Most popular
• Don’t use for new projects
• Account limit 2 PB for US
and Europe
• 4,75TB file size limit
SQL Server 2019 Big Data
Cluster
• IaaS
• Combines SQL Server
database engine, Spark,
HDFS (ADLS Gen2) into a
unified data platform
• Deployed as containers on
Kubernetes
• Polybase
• Hybrid cloud
• Data virtualization
• AI Platform
PREP DATA
Azure Databricks
• PaaS
• Processing massive
amounts of data
• Training & deploy
models
• Manage workflows
• Spark & notebooks
• Integration with
ADLS, SQL DW, PBI
• Writing Code
• High learning curve
Azure HDInsight
• PaaS
• Deploys &
provisions Apache
Hadoop clusters
• No integration with
SQL DW
• Always running and
incurring cost
• Hortonworks
merged with
Cloudera
Polybase & Stored
Procedures in SQL
DW
• IaaS
• T-SQL queries via
external tables
• Tuning queries
• Increase storage
space
PowerBI Dataflow
• PowerBI service
• Power Query
• Self-service data
prep
• Individual solution
• Small workloads
• Don’t use this to
replace a DW or
ADF
MODEL & SERVE DATA
Azure SQL DW
• PaaS
• Fully managed
petabyte scale
cloud DW
• Can scale compute
and storage
independently
• Can be paused
• MPP
Azure Analysis
Services
• PaaS
• Tabular model
• Fast queries
• High concurrency
• Semantic layer
• Vertical scale-out
• High availability
• Advanced time-
calculations
• Time to process
the cube
Azure SQL
Database
• PaaS
• Suitable for small
DW
• Size limits/tier
• Optimized for
OLTP
SQL Server in
VM
• IaaS
• MDX models
Cosmos DB
• PaaS
• Globally
distributed
• Multi-model
database service
• Spark to Cosmos
DB connector for
DW aggregations
ETL vs ELT
ETL ELT
Time – Load Uses staging area and system, extra time to load data All in one system, load only once
Time – Transformation
Need to wait, especially for big data sizes - as data grows,
transformation time increases
All in one system, speed is not dependent on data size
Time – Maintenance
High maintenance - choice of data to load and transform and
must do it again if deleted or want to enhance the main data
repository
Low maintenance - all data is always available
Implementation complexity At early stage, requires less space and result is clean
Requires in-depth knowledge of tools and expert design of the
main large repository
Analysis & Processing style
Based on multiple scripts to create the views - deleting view
means deleting data
Creating adhoc views - low cost for building and maintaining
Data limitation or restriction By presuming and choosing data a priori By HW (none) and data retention policy
DW Support
Prevalent legacy model used for on-premises and relational,
structured data
Tailored to using in scalable cloud infrastructure to support
structured, unstructured such
big data sources
Data Lake Support Not part of approach Enables use of lake with unstructured data supported
Usability Fixed tables, Fixed timeline, Used mainly by IT
Ad Hoc, Agility, Flexibility, Usable by everyone from developer to
citizen integrator
Cost-effective Not cost-effective for small and medium businesses
Scalable and available to all business sizes using online SaaS
solutions
LAMBDA ARCHITECTURE
LAMBDA
ARCHITECTURE IN
AZURE
COMMON
DATA MODEL
Antonios
Chatzipavlis
Data Solutions
Consultant & Trainer
./sqlschoolgr - ./groups/sqlschool
@antoniosch - @sqlschool
yt/c/SqlschoolGr
SQLschool.gr Group
Thank you!
A community for Greek professionals who use the Microsoft Data Platform
Copyright © 2018 SQLschool.gr. All right reserved. PRESENTER MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION

Contenu connexe

Tendances

Speed up data preparation for ML pipelines on AWS
Speed up data preparation for ML pipelines on AWSSpeed up data preparation for ML pipelines on AWS
Speed up data preparation for ML pipelines on AWSData Science Milan
 
Building a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - WebinarBuilding a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - WebinarAmazon Web Services
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data FactorySlava Kokaev
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionJames Serra
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseJames Serra
 
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)Cathrine Wilhelmsen
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's includedJames Serra
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Rajesh Kumar
 
Azure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsAzure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsThomas Sykes
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Introducing Azure SQL Database
Introducing Azure SQL DatabaseIntroducing Azure SQL Database
Introducing Azure SQL DatabaseJames Serra
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseSnowflake Computing
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2inovex GmbH
 
Snowflake + Power BI: Cloud Analytics for Everyone
Snowflake + Power BI: Cloud Analytics for EveryoneSnowflake + Power BI: Cloud Analytics for Everyone
Snowflake + Power BI: Cloud Analytics for EveryoneAngel Abundez
 

Tendances (20)

Speed up data preparation for ML pipelines on AWS
Speed up data preparation for ML pipelines on AWSSpeed up data preparation for ML pipelines on AWS
Speed up data preparation for ML pipelines on AWS
 
Building a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - WebinarBuilding a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - Webinar
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture
 
Azure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsAzure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data Flows
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Introducing Azure SQL Database
Introducing Azure SQL DatabaseIntroducing Azure SQL Database
Introducing Azure SQL Database
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Implementing a Data Lake
Implementing a Data LakeImplementing a Data Lake
Implementing a Data Lake
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
 
Snowflake + Power BI: Cloud Analytics for Everyone
Snowflake + Power BI: Cloud Analytics for EveryoneSnowflake + Power BI: Cloud Analytics for Everyone
Snowflake + Power BI: Cloud Analytics for Everyone
 

Similaire à Designing a modern data warehouse in azure

Should I move my database to the cloud?
Should I move my database to the cloud?Should I move my database to the cloud?
Should I move my database to the cloud?James Serra
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Kent Graziano
 
SQL Server 2019 hotlap - WARDY IT Solutions
SQL Server 2019 hotlap - WARDY IT SolutionsSQL Server 2019 hotlap - WARDY IT Solutions
SQL Server 2019 hotlap - WARDY IT SolutionsMichaela Murray
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Michael Rys
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL DatabaseModern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL DatabaseEric Bragas
 
How To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQLHow To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQLDataStax
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed InstanceJames Serra
 
A Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationA Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationDenodo
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life EasierWebinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life EasierDataStax
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsCloudera, Inc.
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 
Big Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace ImagesBig Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace ImagesMark Kromer
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2Raul Chong
 

Similaire à Designing a modern data warehouse in azure (20)

Should I move my database to the cloud?
Should I move my database to the cloud?Should I move my database to the cloud?
Should I move my database to the cloud?
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
SQL Server 2019 hotlap - WARDY IT Solutions
SQL Server 2019 hotlap - WARDY IT SolutionsSQL Server 2019 hotlap - WARDY IT Solutions
SQL Server 2019 hotlap - WARDY IT Solutions
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL DatabaseModern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
 
How To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQLHow To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQL
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed Instance
 
A Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationA Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data Virtualization
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life EasierWebinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Big Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace ImagesBig Data in the Cloud with Azure Marketplace Images
Big Data in the Cloud with Azure Marketplace Images
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Exploring sql server 2016
Exploring sql server 2016Exploring sql server 2016
Exploring sql server 2016
 
IBM - Introduction to Cloudant
IBM - Introduction to CloudantIBM - Introduction to Cloudant
IBM - Introduction to Cloudant
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 

Plus de Antonios Chatzipavlis

Workload Management in SQL Server 2019
Workload Management in SQL Server 2019Workload Management in SQL Server 2019
Workload Management in SQL Server 2019Antonios Chatzipavlis
 
Loading Data into Azure SQL DW (Synapse Analytics)
Loading Data into Azure SQL DW (Synapse Analytics)Loading Data into Azure SQL DW (Synapse Analytics)
Loading Data into Azure SQL DW (Synapse Analytics)Antonios Chatzipavlis
 
Building diagnostic queries using DMVs and DMFs
Building diagnostic queries using DMVs and DMFs Building diagnostic queries using DMVs and DMFs
Building diagnostic queries using DMVs and DMFs Antonios Chatzipavlis
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure Antonios Chatzipavlis
 
Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019Antonios Chatzipavlis
 
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018 Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018 Antonios Chatzipavlis
 
Introduction to Machine Learning on Azure
Introduction to Machine Learning on AzureIntroduction to Machine Learning on Azure
Introduction to Machine Learning on AzureAntonios Chatzipavlis
 

Plus de Antonios Chatzipavlis (20)

Data virtualization using polybase
Data virtualization using polybaseData virtualization using polybase
Data virtualization using polybase
 
SQL server Backup Restore Revealed
SQL server Backup Restore RevealedSQL server Backup Restore Revealed
SQL server Backup Restore Revealed
 
Migrate SQL Workloads to Azure
Migrate SQL Workloads to AzureMigrate SQL Workloads to Azure
Migrate SQL Workloads to Azure
 
Machine Learning in SQL Server 2019
Machine Learning in SQL Server 2019Machine Learning in SQL Server 2019
Machine Learning in SQL Server 2019
 
Workload Management in SQL Server 2019
Workload Management in SQL Server 2019Workload Management in SQL Server 2019
Workload Management in SQL Server 2019
 
Loading Data into Azure SQL DW (Synapse Analytics)
Loading Data into Azure SQL DW (Synapse Analytics)Loading Data into Azure SQL DW (Synapse Analytics)
Loading Data into Azure SQL DW (Synapse Analytics)
 
Introduction to DAX Language
Introduction to DAX LanguageIntroduction to DAX Language
Introduction to DAX Language
 
Building diagnostic queries using DMVs and DMFs
Building diagnostic queries using DMVs and DMFs Building diagnostic queries using DMVs and DMFs
Building diagnostic queries using DMVs and DMFs
 
Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
 
Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019
 
SQLServer Database Structures
SQLServer Database Structures SQLServer Database Structures
SQLServer Database Structures
 
Sqlschool 2017 recap - 2018 plans
Sqlschool 2017 recap - 2018 plansSqlschool 2017 recap - 2018 plans
Sqlschool 2017 recap - 2018 plans
 
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018 Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
 
Microsoft SQL Family and GDPR
Microsoft SQL Family and GDPRMicrosoft SQL Family and GDPR
Microsoft SQL Family and GDPR
 
Statistics and Indexes Internals
Statistics and Indexes InternalsStatistics and Indexes Internals
Statistics and Indexes Internals
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Azure SQL Data Warehouse
Azure SQL Data Warehouse Azure SQL Data Warehouse
Azure SQL Data Warehouse
 
Introduction to azure document db
Introduction to azure document dbIntroduction to azure document db
Introduction to azure document db
 
Introduction to Machine Learning on Azure
Introduction to Machine Learning on AzureIntroduction to Machine Learning on Azure
Introduction to Machine Learning on Azure
 

Dernier

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 

Dernier (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 

Designing a modern data warehouse in azure

  • 1.
  • 2. Designing a Modern Data Warehouse in Azure
  • 3. Antonios Chatzipavlis Data Solutions Consultant & Trainer 1988 Beginning of my professional career 1996 I started working with SQL Server 6.0 1998 Certified as MCSD (3rd in Greece) 1999 Became an MCT 2010 Microsoft MVP on Data Platform Created www.sqlschool.gr 2012 Became MCT Regional Lead by Microsoft Learning 2013 Certified as MCSE : Data Platform and MCSE : Business Intelligence 2016 Certified as MCSE: Data Management & Analytics 2018 Certified as MCSA : Machine Learning Recertified as MCSE: Data Management & Analytics
  • 4. • Articles • SQL Server in Greek • SQL Nights • Webcasts • SQL Server News • Downloads • Resources What we are doing Follow us fb/sqlschoolgr fb/groups/sqlschool @antoniosch @sqlschool yt/c/SqlschoolGr SQLschool.gr Group A community for Greek professionals who use the Microsoft Data Platform Ask your question at help@sqlschool.gr
  • 5. Explore everything PASS has to offer Free Online Resources Newsletters PASS.org Get involved Free online webinar events Local user groups around the world Free 1-day local training events Online special interest user groups Business analytics training
  • 7. A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process. WHAT IS A DATA WAREHOUSE?
  • 11. CURRENT DW CHALLENGES Timeliness Flexibility Quality Findability
  • 12. RECENT RESEARCH SURVEYS of responders reports that they will replace their primary DW platform and analytics tools within 3 years +50%
  • 13.
  • 15. WHY YOU NEED A MODERN DATA WAREHOUSE Customer experience Quality assurance Operational efficiency Innovation
  • 16. THE CRITERIA FOR SELECTING A MODERN DW Meets Current and Future Needs
  • 17. ON-PREMISES VS. CLOUD DW • Evaluating Time to Value • Accounting for Storage and Computing Costs • Sizing, Balancing and Tuning • Considering Data Preparation and ETL Costs • Cost of Specialized Business Analytic Tools • Scaling and Elasticity • Delays and Downtime • Cost of Security Breaches • Data Protection and Recovery
  • 18. STEPS TO GETTING STARTED WITH CLOUD DW • Evaluate your data warehousing needs. • Migrate or start fresh. • Establish success criteria. • Evaluate cloud data warehouse solutions. • Calculate your total cost of ownership. • Set up a proof of concept (POC).
  • 19. Azure Modern Data Warehouse
  • 24. SQL SERVER 2019 BIG DATA CLUSTERS
  • 25. INGEST DATA ADF • PaaS • Mapping Data Flow transform data (ETL) • Copy Data tool easily copy from source to destination • Templates • Any new project • Converting SSIS packages • Row by row ETL can be slower • Data needs to be moved to Databricks – limited by compute size • Mapping Data flow takes time to startup SSIS • SSDT – Visual Studio • Very popular product • Used for on-prem ETL for may year • Too big of an effort to migrate existing packages • Skillset staying on-prem • Change to IR in ADF • Row by row ETL can be slower • Data need to moved to IR • Limited by node size/number of SSIS IR
  • 26. STORE DATA ADLS Gen 2 • PaaS • Best features of blob storage • Not all features are available yet • Some products not support yet • 5TB file size limit Blob Storage • PaaS • Original storage • Most popular • Don’t use for new projects • Account limit 2 PB for US and Europe • 4,75TB file size limit SQL Server 2019 Big Data Cluster • IaaS • Combines SQL Server database engine, Spark, HDFS (ADLS Gen2) into a unified data platform • Deployed as containers on Kubernetes • Polybase • Hybrid cloud • Data virtualization • AI Platform
  • 27. PREP DATA Azure Databricks • PaaS • Processing massive amounts of data • Training & deploy models • Manage workflows • Spark & notebooks • Integration with ADLS, SQL DW, PBI • Writing Code • High learning curve Azure HDInsight • PaaS • Deploys & provisions Apache Hadoop clusters • No integration with SQL DW • Always running and incurring cost • Hortonworks merged with Cloudera Polybase & Stored Procedures in SQL DW • IaaS • T-SQL queries via external tables • Tuning queries • Increase storage space PowerBI Dataflow • PowerBI service • Power Query • Self-service data prep • Individual solution • Small workloads • Don’t use this to replace a DW or ADF
  • 28. MODEL & SERVE DATA Azure SQL DW • PaaS • Fully managed petabyte scale cloud DW • Can scale compute and storage independently • Can be paused • MPP Azure Analysis Services • PaaS • Tabular model • Fast queries • High concurrency • Semantic layer • Vertical scale-out • High availability • Advanced time- calculations • Time to process the cube Azure SQL Database • PaaS • Suitable for small DW • Size limits/tier • Optimized for OLTP SQL Server in VM • IaaS • MDX models Cosmos DB • PaaS • Globally distributed • Multi-model database service • Spark to Cosmos DB connector for DW aggregations
  • 29.
  • 30. ETL vs ELT ETL ELT Time – Load Uses staging area and system, extra time to load data All in one system, load only once Time – Transformation Need to wait, especially for big data sizes - as data grows, transformation time increases All in one system, speed is not dependent on data size Time – Maintenance High maintenance - choice of data to load and transform and must do it again if deleted or want to enhance the main data repository Low maintenance - all data is always available Implementation complexity At early stage, requires less space and result is clean Requires in-depth knowledge of tools and expert design of the main large repository Analysis & Processing style Based on multiple scripts to create the views - deleting view means deleting data Creating adhoc views - low cost for building and maintaining Data limitation or restriction By presuming and choosing data a priori By HW (none) and data retention policy DW Support Prevalent legacy model used for on-premises and relational, structured data Tailored to using in scalable cloud infrastructure to support structured, unstructured such big data sources Data Lake Support Not part of approach Enables use of lake with unstructured data supported Usability Fixed tables, Fixed timeline, Used mainly by IT Ad Hoc, Agility, Flexibility, Usable by everyone from developer to citizen integrator Cost-effective Not cost-effective for small and medium businesses Scalable and available to all business sizes using online SaaS solutions
  • 34. Antonios Chatzipavlis Data Solutions Consultant & Trainer ./sqlschoolgr - ./groups/sqlschool @antoniosch - @sqlschool yt/c/SqlschoolGr SQLschool.gr Group Thank you!
  • 35. A community for Greek professionals who use the Microsoft Data Platform Copyright © 2018 SQLschool.gr. All right reserved. PRESENTER MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION