SlideShare une entreprise Scribd logo
1  sur  17
SQL Saturday 715
Luan Moreno M. Maciel Pythian
Big Data Consultant
• SQL Server Lead and Architect
• Big Data on Microsoft Azure and HortonWorks
Ingerindo, Processando e Analisando Dados
com Azure Data Lake Analytics – [ADLA]
Patrocinadores
CloudComputing
CloudComputing
Big Data
UnstructuredData
Informationwithout a [Pre-DefinedModel] or is not Organized
Typically is Text-Heavywith Irregularities & Ambiguities
Difficultto Understandusing TraditionalPrograms
IDC & EMC Project Estimated DataGrowth of 40 Zettabytesby 2020
DataLake
Methodof Storing Data within a Systemor Repository
DataLakeCharacteristics
- Structured= Relational Databases [Rows & Columns]
- Semi-Structured= CSV, Logs, XML, JSON
- Unstructured = Emails, Documents, Binaries, Audio, Video, PDFs
- Open-Source = Apache Hadoop [HDFS]
- Microsoft Azure = Azure Data Lake Store [ADLS]
- AmazonAWS = Amazon S3
Usedfor
- Reporting
- Visualization
- Analytics
- Machine Learning
Single Store of Data in EnterpriseRangingfromRaw Data [Copyof SourceSystem]
CentralizedData Storefor Enterprises
Structured,Processed Structured/ Semi-Structured/ Unstructured/ Raw
Data Warehouse Data Lake
Schema-On-Read
DesignedforLow-Cost Storage
HighAgile, Configure& Reconfigure
Schema-On-Write
Expensivefor Large DataVolumes
LessAgile,FixedConfiguration
BusinessProfessional
DataLake
DataScientists
Hadoop
Distributed Storage– HDFS[HadoopDistributedFileSystem]
Open-SourceSoftwareFrameworkfor BigData Solutions
Programming Language– MapReduce
Clusterof Commodity Hardware
Inspiredby Google
Benefits:
- Scalability
- Reliability
- Flexibility
- LowCost
YARN– Yet Another Resource Negotiatoron [MRv2]
DataBricks
The UnifiedAnalytics Platform
Unifying Data Science,Engineeringand Business
Accelerateperformance
with an optimized Spark platform
Increase productivity
through interactive data science
Streamline processes
from ETL to production
Reduce costand complexity
with a fully managed, cloud-native platform
Azure Data Lake Store [ADLS]
UnlimitedStorage Capability– Petabyte-Size Files& Trillions of Objects
ScaleThroughput for Massively-Parallel Analytics
HDFS[HadoopDistributedFile-System] for Cloud[MicrosoftAzure]
1 TB – R$ 120 | 10 TB – R$ 1.962| 100TB – R$ 9.628 – by MonthlyCommitment Package
Azure Data Lake Store [ADLS]
Azure Data Lake Analytics [ADLA]
On-DemandAnalyticsJob Service
Start in Seconds, ScaleInstantly and Pay per Job
DevelopMassively Parallel Programs[MPP] withSimplicity[U-SQL]
100 Hrs – R$ 332 | 500 Hrs – R$ 1.494– by Monthly Commitment Package
[HaaS] - Hadoop-as-a-Services
Demo
Azure Data Lake – Stor
Processing & AnalyzingDataSetson Microsoft Azure
Treinamento
Big Data para DBA’s e Desenvolvedores no Ecossistemado Microsoft Azure
The Core
Understanding
Cloud Computing
Big Data
SQL Server 2016/2017
NoSQL
Apache Software Foundation
Hadoop
Spark
Microsoft Azure
HDInsight
Azure Data Lake Store/Analytics
The
Programming
Languages
Hive
Drill
Pig
PolyBase
Phoenix
Scala
PySpark
Spark-SQL
The Real-State
Databases
Azure SQL DB
Azure SQL Dw
HBase
CosmosDB
The In-Memory &
Streaming Processing
Spark
DataBricks
Storm
Azure Stream Analytics [ASA]
Kafka
The On-Demand &
Analytical Processing
Azure Data Lake Store
The Data-Transfer &
Orchestrators
Sqoop
Oozie
Apache Nifi
Apache Airflow
Azure Data Factory - [ADF]
~ 24 Hrs
Local – BeloHorizonte
Sexta, Sábado e Domingo de
09:00 Hs às 18:00 Hs
- 03/08/2018
- 04/08/2018
- 05/08/2018
http://bit.do/bhbigdata

Contenu connexe

Tendances

Definitive Guide to Select Right Data Warehouse (2020)
Definitive Guide to Select Right Data Warehouse (2020)Definitive Guide to Select Right Data Warehouse (2020)
Definitive Guide to Select Right Data Warehouse (2020)Sprinkle Data Inc
 
Using Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on DatabricksUsing Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on DatabricksDatabricks
 
Azure data bricks by Eugene Polonichko
Azure data bricks by Eugene PolonichkoAzure data bricks by Eugene Polonichko
Azure data bricks by Eugene PolonichkoAlex Tumanoff
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingYu Huang
 
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020Timothy McAliley
 
Microsoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science RecapMicrosoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science RecapMark Tabladillo
 
Part 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapsePart 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapseNilesh Gule
 
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...Rukmani Gopalan
 
Building Data Lakes with Apache Airflow
Building Data Lakes with Apache AirflowBuilding Data Lakes with Apache Airflow
Building Data Lakes with Apache AirflowGary Stafford
 
A lap around Azure Data Factory
A lap around Azure Data FactoryA lap around Azure Data Factory
A lap around Azure Data FactoryBizTalk360
 
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...Spark Summit
 
Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)
Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)
Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)Cathrine Wilhelmsen
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Introduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarIntroduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarPeter Ward
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Lace Lofranco
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Nathan Bijnens
 
Cortana Analytics Workshop: Azure Data Lake
Cortana Analytics Workshop: Azure Data LakeCortana Analytics Workshop: Azure Data Lake
Cortana Analytics Workshop: Azure Data LakeMSAdvAnalytics
 
Is there a way that we can build our Azure Data Factory all with parameters b...
Is there a way that we can build our Azure Data Factory all with parameters b...Is there a way that we can build our Azure Data Factory all with parameters b...
Is there a way that we can build our Azure Data Factory all with parameters b...Erwin de Kreuk
 

Tendances (20)

Definitive Guide to Select Right Data Warehouse (2020)
Definitive Guide to Select Right Data Warehouse (2020)Definitive Guide to Select Right Data Warehouse (2020)
Definitive Guide to Select Right Data Warehouse (2020)
 
Architecting a datalake
Architecting a datalakeArchitecting a datalake
Architecting a datalake
 
Using Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on DatabricksUsing Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on Databricks
 
Lecture1
Lecture1Lecture1
Lecture1
 
Azure data bricks by Eugene Polonichko
Azure data bricks by Eugene PolonichkoAzure data bricks by Eugene Polonichko
Azure data bricks by Eugene Polonichko
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
 
Microsoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science RecapMicrosoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science Recap
 
Part 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapsePart 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure Synapse
 
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
 
Building Data Lakes with Apache Airflow
Building Data Lakes with Apache AirflowBuilding Data Lakes with Apache Airflow
Building Data Lakes with Apache Airflow
 
A lap around Azure Data Factory
A lap around Azure Data FactoryA lap around Azure Data Factory
A lap around Azure Data Factory
 
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
 
Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)
Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)
Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Introduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarIntroduction to Azure Synapse Webinar
Introduction to Azure Synapse Webinar
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018
 
Cortana Analytics Workshop: Azure Data Lake
Cortana Analytics Workshop: Azure Data LakeCortana Analytics Workshop: Azure Data Lake
Cortana Analytics Workshop: Azure Data Lake
 
Is there a way that we can build our Azure Data Factory all with parameters b...
Is there a way that we can build our Azure Data Factory all with parameters b...Is there a way that we can build our Azure Data Factory all with parameters b...
Is there a way that we can build our Azure Data Factory all with parameters b...
 

Similaire à SQL Saturday BH - Ingerindo, Processando e Analisando Dados com Azure Data Lake Analytics

Afternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data ServicesAfternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data ServicesCCG
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Martin Bém
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Trivadis
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudMark Kromer
 
[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...
[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...
[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...DataScienceConferenc1
 
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudSQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudMark Kromer
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's includedJames Serra
 
Exploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresExploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresCCG
 
Azure Data.pptx
Azure Data.pptxAzure Data.pptx
Azure Data.pptxFedoRam1
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionJames Serra
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overviewJames Serra
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseJames Serra
 
Ralph Kemperdick – IT-Tage 2015 – Microsoft Azure als Datenplattform
Ralph Kemperdick – IT-Tage 2015 – Microsoft Azure als DatenplattformRalph Kemperdick – IT-Tage 2015 – Microsoft Azure als Datenplattform
Ralph Kemperdick – IT-Tage 2015 – Microsoft Azure als DatenplattformInformatik Aktuell
 
Microsoft Azure update
Microsoft Azure updateMicrosoft Azure update
Microsoft Azure updateKarina Matos
 
Trivadis Azure Data Lake
Trivadis Azure Data LakeTrivadis Azure Data Lake
Trivadis Azure Data LakeTrivadis
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Precisely
 
SQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastDatabricks
 
5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for Analytics5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for AnalyticsJen Stirrup
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventTrivadis
 

Similaire à SQL Saturday BH - Ingerindo, Processando e Analisando Dados com Azure Data Lake Analytics (20)

Afternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data ServicesAfternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data Services
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the Cloud
 
[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...
[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...
[DSC Europe 22] REDLake Azure - Utilizing 750 TB with Azure Components - Pred...
 
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudSQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Exploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresExploring Microsoft Azure Infrastructures
Exploring Microsoft Azure Infrastructures
 
Azure Data.pptx
Azure Data.pptxAzure Data.pptx
Azure Data.pptx
 
AZURE Data Related Services
AZURE Data Related ServicesAZURE Data Related Services
AZURE Data Related Services
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
Ralph Kemperdick – IT-Tage 2015 – Microsoft Azure als Datenplattform
Ralph Kemperdick – IT-Tage 2015 – Microsoft Azure als DatenplattformRalph Kemperdick – IT-Tage 2015 – Microsoft Azure als Datenplattform
Ralph Kemperdick – IT-Tage 2015 – Microsoft Azure als Datenplattform
 
Microsoft Azure update
Microsoft Azure updateMicrosoft Azure update
Microsoft Azure update
 
Trivadis Azure Data Lake
Trivadis Azure Data LakeTrivadis Azure Data Lake
Trivadis Azure Data Lake
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
SQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at Comcast
 
5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for Analytics5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for Analytics
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 

Dernier

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 

Dernier (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

SQL Saturday BH - Ingerindo, Processando e Analisando Dados com Azure Data Lake Analytics

  • 1. SQL Saturday 715 Luan Moreno M. Maciel Pythian Big Data Consultant • SQL Server Lead and Architect • Big Data on Microsoft Azure and HortonWorks
  • 2. Ingerindo, Processando e Analisando Dados com Azure Data Lake Analytics – [ADLA]
  • 7.
  • 8. UnstructuredData Informationwithout a [Pre-DefinedModel] or is not Organized Typically is Text-Heavywith Irregularities & Ambiguities Difficultto Understandusing TraditionalPrograms IDC & EMC Project Estimated DataGrowth of 40 Zettabytesby 2020
  • 9. DataLake Methodof Storing Data within a Systemor Repository DataLakeCharacteristics - Structured= Relational Databases [Rows & Columns] - Semi-Structured= CSV, Logs, XML, JSON - Unstructured = Emails, Documents, Binaries, Audio, Video, PDFs - Open-Source = Apache Hadoop [HDFS] - Microsoft Azure = Azure Data Lake Store [ADLS] - AmazonAWS = Amazon S3 Usedfor - Reporting - Visualization - Analytics - Machine Learning Single Store of Data in EnterpriseRangingfromRaw Data [Copyof SourceSystem] CentralizedData Storefor Enterprises
  • 10. Structured,Processed Structured/ Semi-Structured/ Unstructured/ Raw Data Warehouse Data Lake Schema-On-Read DesignedforLow-Cost Storage HighAgile, Configure& Reconfigure Schema-On-Write Expensivefor Large DataVolumes LessAgile,FixedConfiguration BusinessProfessional DataLake DataScientists
  • 11. Hadoop Distributed Storage– HDFS[HadoopDistributedFileSystem] Open-SourceSoftwareFrameworkfor BigData Solutions Programming Language– MapReduce Clusterof Commodity Hardware Inspiredby Google Benefits: - Scalability - Reliability - Flexibility - LowCost YARN– Yet Another Resource Negotiatoron [MRv2]
  • 12. DataBricks The UnifiedAnalytics Platform Unifying Data Science,Engineeringand Business Accelerateperformance with an optimized Spark platform Increase productivity through interactive data science Streamline processes from ETL to production Reduce costand complexity with a fully managed, cloud-native platform
  • 13. Azure Data Lake Store [ADLS] UnlimitedStorage Capability– Petabyte-Size Files& Trillions of Objects ScaleThroughput for Massively-Parallel Analytics HDFS[HadoopDistributedFile-System] for Cloud[MicrosoftAzure] 1 TB – R$ 120 | 10 TB – R$ 1.962| 100TB – R$ 9.628 – by MonthlyCommitment Package
  • 14. Azure Data Lake Store [ADLS]
  • 15. Azure Data Lake Analytics [ADLA] On-DemandAnalyticsJob Service Start in Seconds, ScaleInstantly and Pay per Job DevelopMassively Parallel Programs[MPP] withSimplicity[U-SQL] 100 Hrs – R$ 332 | 500 Hrs – R$ 1.494– by Monthly Commitment Package [HaaS] - Hadoop-as-a-Services
  • 16. Demo Azure Data Lake – Stor Processing & AnalyzingDataSetson Microsoft Azure
  • 17. Treinamento Big Data para DBA’s e Desenvolvedores no Ecossistemado Microsoft Azure The Core Understanding Cloud Computing Big Data SQL Server 2016/2017 NoSQL Apache Software Foundation Hadoop Spark Microsoft Azure HDInsight Azure Data Lake Store/Analytics The Programming Languages Hive Drill Pig PolyBase Phoenix Scala PySpark Spark-SQL The Real-State Databases Azure SQL DB Azure SQL Dw HBase CosmosDB The In-Memory & Streaming Processing Spark DataBricks Storm Azure Stream Analytics [ASA] Kafka The On-Demand & Analytical Processing Azure Data Lake Store The Data-Transfer & Orchestrators Sqoop Oozie Apache Nifi Apache Airflow Azure Data Factory - [ADF] ~ 24 Hrs Local – BeloHorizonte Sexta, Sábado e Domingo de 09:00 Hs às 18:00 Hs - 03/08/2018 - 04/08/2018 - 05/08/2018 http://bit.do/bhbigdata