Lakehouse in Azure

In this session, Sergio covered the Lakehouse concept and how companies implement it, from data ingestion to insight. He showed how you can use Azure Data Services to speed up your analytics project, from ingesting and modelling data to delivering insights to end users.

  1. Lakehouse in Azure
     Sergio Zenatti Filho, Sr Cloud Solution Architect - Data & Analytics @Microsoft
  2. Sergio Zenatti Filho, Senior Cloud Solution Architect at Microsoft
     Sergio has over 20 years of experience designing and delivering Data and Analytics Solutions. He has extensive experience in the Microsoft Data and Analytics Platform, both in the cloud and on-premises. Sergio is passionate about learning new technology and helping customers to define the best solution for their business.
  3. Agenda
     • Lakehouse
     • Delta Lake
     • Ingestion and Transformation
     • Architecture
     • Power BI
     • Next Steps
     • Q&A
     © Microsoft Corporation, Azure
  4. Data Warehouse and Data Lake
     Data Warehouse:
     • Have powered BI for over 30 years
     • Purpose-built for BI and reporting
     • Limited support for semi-structured and unstructured data
     • Limited support for streaming
     Data Lake:
     • Powered by technological advances in data storage
     • Cheap to store any data
     • Support machine learning use cases
     • Poor BI support
     • Complex to set up
     • Hard to append data
     [Diagram labels: External Data, Operational Data, ETL, Data Warehouses, BI, Reports; Structured, Semi-Structured and Unstructured Data, Data Lake, Data Prep and Validation, Real-Time Database, Data Science, Machine Learning]
  5. Lakehouse
     A Lakehouse platform combines the best elements of data lakes and data warehouses to deliver the reliability, strong governance and performance of data warehouses with the openness, flexibility and machine learning support of data lakes.
     Key features:
     • Transaction support
     • Schema enforcement and governance
     • Data reliability and consistency
     • Low query latency and high reliability for BI and advanced analytics
     • Optimized for machine learning and data science
     • Enables end-to-end streaming
     [Diagram labels: Data Warehouse, Data Lake, Streaming Analytics, BI, Data Science, Machine Learning; Structured, Semi-Structured and Unstructured Data]
  6. Delta Lake
     Delta Lake is an open source project that enables building a Lakehouse architecture on top of data lakes.
     Key features:
     • ACID Transactions
     • Scalable Metadata
     • Unified Streaming and Batch
     • Schema Evolution / Enforcement
     • Time Travel
     • Upserts and deletes
  7. Demo
     • Delta Lake
     • Data Ingestion and Transformation
     • Power BI
  8. Data Ingestion
     Azure Synapse Pipeline or Azure Data Factory:
     • 90+ data sources including files, databases, SaaS, PaaS and more
     • Copy activity: supports the Azure Databricks Delta Lake connector to copy data from any supported source to a Delta Lake table, and from a Delta Lake table to any supported sink data store
     • Mapping Data Flow: supports the generic Delta format on Azure Storage as source and sink to read and write Delta files for code-free ETL, and runs on the managed Azure Integration Runtime
     Databricks:
     • Data Formats: Delta Lake, Parquet, ORC, JSON, CSV, Avro, Text and Binary
     • Data Sources: SQL Server, MariaDB, MySQL, PostgreSQL, Azure Synapse Analytics, Azure Cosmos DB, MongoDB, Cassandra, Couchbase, Elasticsearch, Neo4j, Redis, Snowflake and more
     Other Solutions:
     • Event Hub
     • IoT Hub
     • SQL Server BCP (bulk copy program)
     • PolyBase
     • SAP Data Services
     • Informatica
     • Striim
     • Fivetran
     • Qlik
     • Confluent
  9. Data Transformation
     Databricks:
     • Spark notebooks using Python, Scala, SQL and R
     Synapse Spark:
     • Spark notebooks using Python, Scala, Spark SQL, C# and R (Preview)
     Azure Synapse Pipeline and Azure Data Factory:
     • Mapping data flows: visually designed data transformations in Azure Data Factory and Azure Synapse Pipeline
     • External transformations: Azure Synapse Notebook and Databricks
  10. Architecture
  11. Lakehouse Architecture - Databricks
  12. Lakehouse Architecture - Azure Synapse
  13. Lakehouse Architecture - Azure Synapse and Databricks
  14. Power BI
     Azure Synapse:
     • Azure Synapse Analytics SQL: connector for Lake DB (Spark), Serverless DB and Dedicated SQL Pool
     • Azure Synapse Analytics workspace (beta): connector for Lake DB (Spark), Serverless DB and Dedicated SQL Pool
     • Authentication using Microsoft Account, Windows and Database
     Databricks:
     • Databricks (Beta): connector for Databricks SQL Warehouse running on AWS and using OAuth
     • Azure Databricks: for Databricks SQL Warehouse in Azure, or on AWS but not using OAuth
     • Authentication using Personal Access Token or OAuth
     Delta Sharing:
     • Import mode only
     • Authentication using token
     Delta.io connector (open source):
     • Reads Delta Lake tables natively in Power BI
     • Supports all storage systems that are supported by Power BI
     • https://github.com/delta-io/connectors/tree/master/powerbi
  15. What next?
     • Free training - Databricks Lakehouse Fundamentals: https://www.databricks.com/learn/training/lakehouse-fundamentals
     • Free training - Use Delta Lake in Azure Synapse Analytics: https://learn.microsoft.com/en-us/training/modules/use-delta-lake-azure-synapse-analytics/
     • Solution Accelerator for Financial Analytics: https://github.com/microsoft/Azure-Databricks-Solution-Accelerator-Financial-Analytics-Customer-Revenue-Growth-Factor
     • Open Education Analytics: https://github.com/microsoft/OpenEduAnalytics
     • Delta Lake: https://delta.io/
     • Dynamics 365 Finance and Operations Apps - Export to data lake: https://github.com/microsoft/Dynamics-365-FastTrack-Implementation-Assets/tree/master/Analytics/ArchitecturePatterns
  16. Q&A
     Thank you!
     Sergio Zenatti Filho - Sr Cloud Solution Architect at Microsoft
     Email: zenatti@gmail.com
     LinkedIn: https://www.linkedin.com/in/sergiozenatti/
     © Copyright Microsoft Corporation. All rights reserved.
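
Several of the Delta Lake features named on slide 6 (ACID transactions, schema enforcement, time travel, upserts and deletes) follow from one idea: every change is recorded as a new version in an append-only log. The sketch below is a toy in-memory model of that idea in plain Python. It is not the Delta Lake API and was not part of the talk; the class `ToyDeltaTable`, its methods, and the sample rows are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass, field


class SchemaError(ValueError):
    """Raised when a row does not match the table schema."""


@dataclass
class ToyDeltaTable:
    """Toy in-memory table backed by an append-only commit log.

    Hypothetical illustration only: this is not the Delta Lake API.
    """
    schema: dict                               # column name -> Python type
    key: str                                   # column used to match rows in upserts
    log: list = field(default_factory=list)    # one entry per committed version

    def _validate(self, row):
        # Schema enforcement: reject missing, extra, or mistyped columns.
        if set(row) != set(self.schema):
            raise SchemaError(f"columns {sorted(row)} do not match schema")
        for name, expected in self.schema.items():
            if not isinstance(row[name], expected):
                raise SchemaError(f"column {name!r} expects {expected.__name__}")

    def commit(self, upserts=(), deletes=()):
        # Validate every row before appending, so one bad row aborts
        # the whole commit (all-or-nothing).
        upserts = [dict(r) for r in upserts]
        for row in upserts:
            self._validate(row)
        self.log.append({"upserts": upserts, "deletes": list(deletes)})
        return len(self.log) - 1               # version number of this commit

    def read(self, version=None):
        # Time travel: replay the log up to and including `version`.
        if version is None:
            version = len(self.log) - 1
        state = {}
        for entry in self.log[: version + 1]:
            for k in entry["deletes"]:
                state.pop(k, None)
            for row in entry["upserts"]:
                state[row[self.key]] = row     # insert, or overwrite by key
        return sorted(state.values(), key=lambda r: r[self.key])


# Illustrative data only.
table = ToyDeltaTable(schema={"id": int, "city": str}, key="id")
v0 = table.commit(upserts=[{"id": 1, "city": "Sydney"}, {"id": 2, "city": "Perth"}])
v1 = table.commit(upserts=[{"id": 2, "city": "Melbourne"}], deletes=[1])

try:
    # The second row has a string id, so the whole commit is rejected,
    # including the valid row for id 3.
    table.commit(upserts=[{"id": 3, "city": "Hobart"}, {"id": "4", "city": "Darwin"}])
except SchemaError:
    pass

latest = table.read()             # current state of the table
first = table.read(version=v0)    # state as of the first commit
```

Because nothing in the log is ever modified, reading an older version only requires stopping the replay earlier. A real Delta Lake table persists its log and data files in storage and handles concurrent writers, which this sketch does not attempt.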
