Modern Analytics Academy
Modeling
https://aka.ms/maa
Agenda
• What is Modern Analytics Academy?
• The Team
• Modern Analytics in Azure
  - Model & Serve
  - Data Lake Structure
  - Synapse as solution
  - Demo
  - Review
Modern Analytics Academy Team
• Alex Karasek, Senior Cloud Solution Architect
• Chris Mitchell, Principal Cloud Solution Architect
• Jason Virtue, Principal Cloud Solution Architect
• Annie Xu, Senior Cloud Solution Architect
• Brian Hitney, Senior Cloud Solution Architect
Academy Sessions
Acquisition & Storage
• Azure Data Lake
• Azure Cosmos DB
• Azure Event Hubs
• Synapse Link
Modeling
• Data Lake Structure
• Synapse Spark Pools
• Synapse SQL Pools
• Synapse Serverless SQL
Pipelines
• Azure Data Factory
• Synapse Pipelines
• Power BI Dataflows
• Azure Stream Analytics
Security & Governance
• Auditing
• Security
• Azure Purview
Visualization
• Power BI
• Paginated Reports
• Power BI Embedded
Modern Analytics in Azure
[Architecture diagram. Data sources: Social, LOB, Graph, IoT, Image, CRM. Stages: Ingest, Store, Explore, Prep & Train, Model & Serve. Components: data orchestration and monitoring, big data store, query all data, analytics engines, data warehouse. Outputs: Advanced Analytics, BI + Reporting, Real Time Analytics. Cross-cutting: Metadata Management & Governance.]
Role of Data in Modern Analytics
Data Lake
• Experimentation
• Fast exploration
• Semi-structured data
• Big Data
OR
Data Warehouse
• Proven security & privacy
• Dependable performance
• Operational data
• Relational Data
Role of Data in Modern Analytics
Data warehousing & big data analytics—all in one service
Azure Synapse Analytics
Introducing Azure Synapse Analytics
A limitless analytics service with unmatched time to insight that delivers insights from all your data, across data warehouses and big data analytics systems, with blazing speed.
Simply put, Azure Synapse is Azure SQL Data Warehouse evolved.
We have taken the same industry-leading data warehouse and elevated it to a whole new level of performance and capabilities.
Azure Synapse Analytics
Key Terms
Data Lake
A data lake is a storage repository that holds a large amount of data in its native, raw format. Data lake
stores are optimized for scaling to terabytes and petabytes of data. The data typically comes from
multiple heterogeneous sources, and may be structured, semi-structured, or unstructured. The idea with
a data lake is to store everything in its original, untransformed state. This approach differs from a
traditional data warehouse, which transforms and processes the data at the time of ingestion.
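The difference between storing data untransformed and transforming it at ingestion can be sketched in a few lines of Python. This is only an illustration: the sample records, the field names, and the `read_with_schema` helper are hypothetical and do not come from the deck.

```python
import json

# Hypothetical raw sensor records, kept exactly as the sources emitted them.
# Types and units are inconsistent; a data lake stores them untouched.
RAW_EVENTS = [
    '{"device": "a1", "temp": "21.5", "unit": "C"}',
    '{"device": "b2", "temp": 70, "unit": "F", "firmware": "1.2"}',
]

def read_with_schema(lines):
    """Schema-on-read: parse and normalize raw records only when queried."""
    for line in lines:
        record = json.loads(line)
        temp = float(record["temp"])
        if record.get("unit") == "F":
            temp = (temp - 32) * 5 / 9  # convert Fahrenheit to Celsius
        yield {"device": record["device"], "temp_c": round(temp, 1)}

# Lake style: RAW_EVENTS stays as-is and is shaped per query.
# Warehouse style: the same transformation would run once at load time,
# and only the cleaned rows would be stored.
rows = list(read_with_schema(RAW_EVENTS))
print(rows)
```

Because the raw records are never overwritten, fields that no query uses today (such as `firmware` here) remain available for later analysis.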
Azure Data Lake (ADLS Gen 2)
Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on
Azure
Logical Data Warehouse (LDW)
A relational layer built on top of Azure data sources such as Azure Data Lake Storage (ADLS Gen 2),
Azure Cosmos DB analytical storage, or Azure Blob Storage
Data Warehouse (DW)
A data warehouse is a centralized repository of integrated data from one or more disparate sources.
Data warehouses store current and historical data and are used for reporting and analysis of the data.
ADLS Gen 2
Azure Synapse Analytics
Rich surface area
• T-SQL language for data analytics
• Support for a large number of languages and tools
• Enterprise-grade security
Dedicated SQL pool
• Modern Data Warehouse
• Indexing and caching
• Import and query external data
• Workload management
Serverless SQL pool
• Querying external data
• Model raw files as virtual tables and views
• Easy data transformation
Azure Synapse Analytics
Dedicated SQL Pools (Formerly SQL DW)
• Best-in-class price per performance: up to 94% less expensive than competitors
• Developer productivity: use preferred tooling for SQL data warehouse development
• Intelligent workload management: prioritize resources for the most valuable workloads
• Data flexibility: ingest a variety of data sources to derive the maximum benefit
• Industry-leading security: defense-in-depth security and a 99.9% financially backed availability SLA
Azure Synapse Analytics
Dedicated SQL Pools
T-SQL Querying
• Windowing aggregates
• Approximate execution (HyperLogLog)
• JSON data support
• Score machine learning models in ONNX format
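To make the approximate execution bullet concrete, here is a minimal textbook HyperLogLog sketch in Python. It is not the engine's implementation: the register count, the use of SHA-256 as the hash, and the class name are assumptions chosen for illustration. In T-SQL, this kind of estimate is exposed through `APPROX_COUNT_DISTINCT`.

```python
import hashlib
import math

class HyperLogLog:
    """Textbook HyperLogLog distinct-count estimator (illustrative only)."""

    def __init__(self, p: int = 12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m

    def add(self, value) -> None:
        # 64-bit hash taken from SHA-256 (an assumption; any good hash works).
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)                  # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)       # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw

hll = HyperLogLog()
for i in range(50_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates do not change the estimate
print(round(hll.count()))
```

The sketch keeps only 4,096 small registers regardless of how many values it sees, which is the trade that makes approximate distinct counts cheap on very large tables.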
Advanced storage system
• Columnstore Indexes
• Table partitions
• Distributed tables
• Isolation
• Materialized Views
• Nonclustered Indexes
• Result-set caching
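The idea behind hash-distributed tables can be sketched as follows. A dedicated SQL pool spreads table data over 60 distributions; the hash function the engine uses is internal, so SHA-256 below is a stand-in, and the function names and sample keys are hypothetical.

```python
import hashlib
from collections import Counter

DISTRIBUTIONS = 60  # a dedicated SQL pool spreads table data over 60 distributions

def distribution_for(key) -> int:
    """Map a distribution-column value to a distribution (stand-in hash)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS

def rows_per_distribution(keys) -> Counter:
    """Count how many rows land in each distribution for a list of key values."""
    return Counter(distribution_for(k) for k in keys)

# A high-cardinality key spreads rows across many distributions ...
even = rows_per_distribution(f"customer-{i}" for i in range(6000))
# ... while a low-cardinality key can use at most as many distributions
# as it has distinct values, leaving the rest idle (data skew).
skewed = rows_per_distribution(["US", "DE", "FR"] * 2000)

print(len(even), "distributions used by customer id")
print(len(skewed), "distributions used by country code")
```

This is why the choice of distribution column matters: the same value always lands in the same distribution, so a column with few distinct values concentrates the work on a few nodes.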
Complete SQL object model
• Tables
• Views
• Stored procedures
• Functions
Azure Synapse Analytics
Serverless SQL Pools
Quick data exploration
• Easily explore schema and data in files on Azure storage
• Supports various file formats (Parquet, CSV, JSON)
• Direct connector to Azure storage for large BI ecosystem
Logical Data Warehouse
• Model raw files as virtual tables and views
• Use any tool that works with SQL to analyze files
• Use enterprise-grade security model
Easy data transformation
• Transform CSV to Parquet format
• Move data between containers and accounts
• Save the results of queries on external storage
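Exploring files and modeling them as views in a serverless SQL pool is typically done with `OPENROWSET`. The Python helpers below only render the T-SQL text so the shape of the statements is visible; the helper names, storage account, container, folder, and view name are all hypothetical, and only the Parquet case is covered.

```python
def openrowset_parquet(url: str) -> str:
    """Render a T-SQL rowset expression that reads Parquet files at `url`."""
    escaped = url.replace("'", "''")  # escape quotes for a T-SQL string literal
    return f"OPENROWSET(BULK '{escaped}', FORMAT = 'PARQUET') AS [result]"

def explore_query(url: str, top: int = 10) -> str:
    """Quick data exploration: peek at the first rows of the files."""
    return f"SELECT TOP {int(top)} * FROM {openrowset_parquet(url)};"

def view_ddl(view_name: str, url: str) -> str:
    """Logical data warehouse: expose the files as a view for any SQL tool."""
    return f"CREATE VIEW {view_name} AS SELECT * FROM {openrowset_parquet(url)};"

# Hypothetical storage account, container, and folder.
ORDERS_URL = "https://examplelake.dfs.core.windows.net/curated/orders/*.parquet"

print(explore_query(ORDERS_URL))
print(view_ddl("dbo.orders", ORDERS_URL))
```

Once such a view exists in a user database, reporting tools can query `dbo.orders` like a table while the data stays in the lake.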
Azure Synapse Analytics
Apache Spark Pools
A unified, open source, parallel data processing framework for Big Data Analytics
Spark Unifies:
• Batch Processing
• Interactive SQL
• Real-time processing
• Machine Learning
• Deep Learning
• Graph Processing
[Diagram: Spark Core Engine with Spark SQL (batch processing), Spark MLlib (machine learning), Spark Streaming (stream processing), and GraphX (graph computation); YARN is also shown.]
Demo:
A tour of Azure Synapse
Demo Architecture
Thank you


Editor's notes

  1. The agenda for today's session will include a review of the Modern Analytics Academy framework, the team behind it, an overview of a modern analytics architecture in Azure, and finally a deep dive into ways to model and serve data in Azure. Specific topics covered will include data lake structures for analytical use cases, how Synapse fits as a solution, and a demo of these concepts.
  2. The MAA was built to show the breadth and depth of capabilities in Azure that can be used to implement a modern analytics solution in the cloud. This is the team behind the program; we are all Cloud Solution Architects on Microsoft's Global Partner Solutions team.
  3. Let's take a quick look at the detailed sessions available.
     • Data Acquisition and Storage: we looked at the nuances of how source systems provide data and the various mechanisms for getting that data into Azure for long-term retention and analysis.
     • Data Modeling: we'll be looking at how to structure and store your data for the purposes of serving data to applications like reporting tools.
     • Data Pipelines: we'll be looking at how to move your data through a data engineering process and prepare the data for serving.
     • Data Security and Governance: we'll examine how to deal with topics like controlling access to your data, understanding the lineage of your data, and addressing policy compliance on your data.
     • Data Visualization: we'll examine how visualization tools like Power BI can be used to enable users to get timely business value out of your data.
  4. The stages of a modern analytics architecture:
     • Ingestion: we need to be able to connect to the data no matter where it comes from (line-of-business applications, cloud services, social networks, sensor networks) and to support the different cadences with which that data arrives.
     • Storage: highly scalable and cost-effective storage is critical to these solutions. With theoretically limitless amounts of data required to support them, choosing the right storage method is always critical.
     • Exploration: data processing engines and services are used for ad hoc exploration of data.
     • Prep & Train: new analytics and data processing engines such as Spark are being leveraged to prepare data and train models over large data volumes.
     • Model & Serve: without giving users access to the data to do ad hoc analysis, an analytics solution is useless. We must surface the data in a usable way through modern tools.
     • Metadata management & governance: with all this data, security and regulatory concerns haven't gone away. Data must be discoverable, compliant, and trusted.
  5. Data is fundamental to the types of modern analytics solutions that we are discussing in this series. When it comes to data at this scale, there are generally two ways to store, model, and serve it to consumers. Data lakes are generally ideal for experimentation, exploration, or ad hoc queries over semi-structured data at any scale. For more traditional analytics and reporting scenarios, a DW is often more suitable for serving data via a relational data model with proven security and dependable performance over aggregated operational data sets.
  6. By being able to support both scenarios, Synapse Analytics becomes the natural hub for all of your modern analytical solutions.
  7. Designed for analytics workloads at any scale.
     • Interfaces: Synapse Studio, a web-based portal for object exploration, data engineering, development, and orchestration, all through a single pane of glass
     • SaaS developer experiences for code-free and code-first development
     • Multiple languages suited to different analytics workloads
     • Integrated analytics runtimes, available provisioned and serverless on-demand
     • SQL Analytics offering T-SQL for batch, streaming, and interactive processing
     • Spark for big data processing with Python, Scala, R, and .NET
     • Integrated platform services for management, security, monitoring, and metastore
     • Data lake integrated and Common Data Model aware
  8. The data lake is organized into three layers:
     • Raw data: This is data as it comes from the source systems. It is typically ingested in raw format via automated streaming or batch-oriented pipelines, and is consumed by an analytics engine such as Spark, or by managed integration pipelines in ADF or Synapse, to perform the cleansing and enrichment operations that generate the enriched and curated data. Common formats include CSV or JSON, which are optimized for efficient write operations.
     • Enriched data: In this layer, raw data (as is or aggregated) has a defined schema, has been cleansed and enriched with other sources, and is available to analytics engines to extract high-value data. Data engineers generate these datasets and go on to extract high-value, curated data from them. Data in this zone might be stored in a format such as Parquet, which is compressed and includes the schema, making it much more suitable for read-heavy and analytical workloads.
     • Curated data: This layer contains the high-value information that is served to the consumers of the data: the BI analysts and the data scientists. This data has structure and can be served either as is, in its native format, or through a more traditional data warehouse. Data assets in this layer are usually highly governed and well documented. Data in this zone might be stored in a format like Delta, an enhanced version of Parquet that uses a transaction log to enable advanced capabilities such as merge operations, time travel, and significantly better performance for analytical queries.
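The flow through the raw, enriched, and curated layers can be sketched in plain Python. Everything specific here is an assumption made for illustration: the folder convention in `zone_path`, the sample CSV, and the `enrich` and `curate` helpers. A real pipeline would run in Spark, ADF, or Synapse and write Parquet or Delta rather than in-memory structures.

```python
import csv
import io
from datetime import date

# Hypothetical zone names mirroring the raw / enriched / curated layers.
ZONES = ("raw", "enriched", "curated")

def zone_path(zone: str, source: str, dataset: str, day: date, ext: str) -> str:
    """Build a lake path such as raw/crm/orders/2024/01/31/orders.csv."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return f"{zone}/{source}/{dataset}/{day:%Y/%m/%d}/{dataset}.{ext}"

# Raw layer: data exactly as the source system delivered it.
RAW_CSV = "order_id,customer,amount\n1,alice,10.50\n2,bob,not-a-number\n3,alice,4.50\n"

def enrich(raw_text: str) -> list:
    """Raw -> enriched: apply a schema and drop rows that fail cleansing."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_text)):
        try:
            rows.append({
                "order_id": int(rec["order_id"]),
                "customer": rec["customer"],
                "amount": float(rec["amount"]),
            })
        except ValueError:
            continue  # a real pipeline would quarantine bad rows, not discard them
    return rows

def curate(rows: list) -> dict:
    """Enriched -> curated: aggregate into a high-value, consumer-ready dataset."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals

day = date(2024, 1, 31)
print(zone_path("raw", "crm", "orders", day, "csv"))
print(zone_path("curated", "crm", "orders", day, "parquet"))
print(curate(enrich(RAW_CSV)))
```

Keeping the zone as the top-level folder makes it straightforward to apply different access controls and retention rules to each layer.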
  9. An interactive query service that enables you to use standard T-SQL queries over files in Azure storage.
  10. Having Spark integrated directly inside Synapse lets users pick the appropriate query engine for every scenario without leaving the Synapse workspace. Because the available Hive metastore is automatically synced with the serverless SQL engine, data can be integrated directly across platforms, and because Spark is embedded in Synapse, all of the associated benefits, such as security, administration, and monitoring, apply. Supported use cases made easier by this embedded engine include:
     • Batch processing
     • Interactive SQL
     • Real-time processing
     • Machine learning
     • Deep learning
     • Graph processing