Getting started with BigQuery
Pradeep Bhadani
Founder, Cloud Native Technologies
cntek.io
pbhadani.com
linkedin.com/in/pradeepbhadani
linkedin.com/company/cloudnativetech
22nd August 2020, Google Next OnAir Extended
About Me
IT Consultant with 9 years of experience in Big Data, Cloud & DevOps
GDE (Google Developers Expert) - Cloud
Google Cloud Authorized Trainer
HashiCorp Ambassador
Blog: pbhadani.com
Cloud Native Technologies | cntek.io
Services
● Big Data Consultancy
● Cloud & DevOps Consultancy
● Tailored Training and Workshops
Agenda
● Overview
○ What is a Data Warehouse?
○ Choosing a Data Warehouse Option?
● Introduction to BigQuery
○ What is BigQuery?
○ Why BigQuery?
○ Concepts
● Best Practices
● Interacting with BigQuery
● Demo
Data Warehouse
What is a Data Warehouse?
A data warehouse is a critical component of a Business Intelligence
solution that enables an organization to make better decisions.
A data warehouse offers:
● Scheduled & ad-hoc reporting
● Ad-hoc analysis
● Integration with visualization tools
Data Warehouse options?
Source: commons.wikimedia.org, iconfinder.com
Choosing a Data Warehouse?
BigQuery
What is BigQuery?
BigQuery is a fully managed, enterprise-grade, modern data warehouse
offered on Google Cloud Platform.
cloud.google.com/bigquery
Why BigQuery?
● Serverless
● Fast
● SQL
● Security
● Scalable
● Data Encryption
● Managed Storage
● Flexible Pricing
● Advanced Features
Advanced Features
● BigQuery ML
● BigQuery GIS
● BigQuery Omni (private alpha)
● Data QnA (private alpha)
Architecture
Columnar based storage
Diagram: row-based storage vs. column-based storage.
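To make the row-versus-column contrast concrete, here is a small Python sketch. The table contents and the column names (`name`, `country`, `amount`) are invented for illustration, and the code models only the general idea, not BigQuery's actual storage format.

```python
# Toy comparison of row-based and column-based layouts (illustrative only).
rows = [
    {"name": "alice", "country": "UK", "amount": 120},
    {"name": "bob", "country": "IN", "amount": 80},
    {"name": "carol", "country": "UK", "amount": 200},
]

# Row-based layout: all values of one record are stored together.
row_store = [tuple(r.values()) for r in rows]

# Column-based layout: all values of one column are stored together.
column_store = {col: [r[col] for r in rows] for col in rows[0]}


def total_amount_row_store(store):
    """Sum `amount` from the row layout; every field of every row is touched."""
    values_read = 0
    total = 0
    for record in store:
        values_read += len(record)  # the whole record is read
        total += record[2]          # `amount` is the third field
    return total, values_read


def total_amount_column_store(store):
    """Sum `amount` from the column layout; only one column is touched."""
    column = store["amount"]
    return sum(column), len(column)


print(total_amount_row_store(row_store))        # (400, 9)
print(total_amount_column_store(column_store))  # (400, 3)
```

Both layouts give the same total, but the column layout reads a third of the values, which is why analytical queries over a few columns tend to favour columnar storage.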
Decoupled Storage & Compute
Diagram: Storage and Compute, connected by a Petabit Network.
Resources
● An Inside Look at Google BigQuery
https://cloud.google.com/files/BigQueryTechnicalWP.pdf
● Dremel
static.googleusercontent.com/media/research.google.com/en//pubs/archive/36632.pdf
Concepts
GCP Project
A GCP project is a top-level logical container for organizing Google Cloud
Platform resources such as Storage and BigQuery.
BigQuery Datasets
A dataset is a logical container for organizing BigQuery tables.
Diagram: a GCP project containing Dataset A and Dataset B.
BigQuery Tables
A BigQuery table contains the data and the schema that describes the data.
Tables are referenced as <project_id>.<dataset_id>.<table>
Diagram: a GCP project containing Dataset A and Dataset B, each holding Table 1 and Table 2.
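The `<project_id>.<dataset_id>.<table>` naming can be sketched in a few lines of Python. The `TableRef` class and the example names are hypothetical, and the parser makes the simplifying assumption that none of the three parts contains a dot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """Hypothetical helper holding the three parts of a table name."""

    project_id: str
    dataset_id: str
    table: str

    @classmethod
    def parse(cls, full_name: str) -> "TableRef":
        # Split into exactly three non-empty, dot-separated parts.
        parts = full_name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(
                f"expected <project_id>.<dataset_id>.<table>, got {full_name!r}"
            )
        return cls(*parts)

    def __str__(self) -> str:
        return f"{self.project_id}.{self.dataset_id}.{self.table}"


ref = TableRef.parse("my-project.dataset_a.table_1")
print(ref.dataset_id)  # dataset_a
print(ref)             # my-project.dataset_a.table_1
```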
BigQuery Table Types
● Native Tables
● External Tables
● Views
Diagram: GCP Project, containing BQ Dataset, containing BQ Tables.
Slots
A BigQuery slot is a combination of CPU, memory and network resources.
BigQuery automatically calculates the number of slots required to execute a
query based on query size and complexity.
Service Limits
● Interactive queries: 100 concurrent queries
● Query execution time limit: 6 hours
● Load jobs per table per day: 1,500 (including failures)
● Maximum columns per table: 10,000
● Copy jobs per destination table per day: 1,000 (including failures)
● Number of datasets per project: No limit
● Number of tables per dataset: No limit
● Maximum number of table operations per day: 1,500
● Maximum number of partitions per partitioned table: 4,000
Please refer to cloud.google.com/bigquery/quotas for the latest service limits.
Pricing
● On-Demand
○ $5 per TB of data processed
○ First 1 TB per month is free
● Flat Rate
○ Monthly commitment - $2,000 per month per 100 slots
○ Annual commitment - $1,700 per month per 100 slots
Please refer to cloud.google.com/bigquery/pricing for the latest pricing.
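The on-demand and flat-rate figures on this slide can be compared with some simple arithmetic. The sketch below uses only the numbers quoted here (from August 2020, so check the pricing page for current rates); the function names are made up for illustration, and the comparison is purely about cost, not about whether 100 slots would be enough capacity for a given workload.

```python
# Figures taken from the pricing slide (August 2020).
ON_DEMAND_USD_PER_TB = 5.0
FREE_TB_PER_MONTH = 1.0
FLAT_RATE_MONTHLY_USD_PER_100_SLOTS = 2000.0


def on_demand_monthly_cost(tb_processed: float) -> float:
    """On-demand cost in USD for one month, after the free first 1 TB."""
    billable_tb = max(0.0, tb_processed - FREE_TB_PER_MONTH)
    return billable_tb * ON_DEMAND_USD_PER_TB


def break_even_tb(slots: int = 100) -> float:
    """TB processed per month at which on-demand cost equals the monthly flat rate."""
    flat_cost = FLAT_RATE_MONTHLY_USD_PER_100_SLOTS * slots / 100
    return flat_cost / ON_DEMAND_USD_PER_TB + FREE_TB_PER_MONTH


print(on_demand_monthly_cost(0.5))  # 0.0
print(on_demand_monthly_cost(11))   # 50.0
print(break_even_tb())              # 401.0
```

Under these figures, on-demand pricing is cheaper than a 100-slot monthly commitment until roughly 401 TB are processed in a month.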
Interacting with BigQuery
Ways to interact with BigQuery
● Web UI - Cloud Console, Classic UI
● Command Line - bq
● Client Libraries - Go, Python, Java, etc.
● Third-party tools
Web UI
Command Line tool
Client Libraries
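As a rough idea of what the Python client library looks like in use, here is a minimal sketch. It assumes the `google-cloud-bigquery` package is installed and that default credentials and a default project are configured; the `run_query` helper is a hypothetical wrapper, not part of the library.

```python
def run_query(sql, client=None):
    """Run a SQL query and return the result rows as a list of dicts."""
    if client is None:
        # Imported here so the helper can be loaded without the library installed.
        from google.cloud import bigquery

        client = bigquery.Client()  # picks up the default project and credentials

    query_job = client.query(sql)  # starts the query job
    rows = query_job.result()      # waits for the job and returns the rows
    return [dict(row.items()) for row in rows]


# Example call (needs credentials; the table is a BigQuery public sample dataset):
# run_query(
#     "SELECT name, SUM(number) AS total "
#     "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
#     "GROUP BY name ORDER BY total DESC LIMIT 5"
# )
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object when no Google Cloud project is available.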
Best Practices
Query Performance
● Avoid “SELECT *”
● Use partitions
● Denormalization
● Use wildcards on tables appropriately
● Use external data sources appropriately
● Reduce the amount of data before a JOIN
● Avoid repeating the same data transformations in SQL queries
● Use nested and repeated fields
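Two of these practices, avoiding "SELECT *" and filtering on the partition column, can be sketched as a small query builder. The `build_query` helper, the table `my-project.sales.orders` and its columns are hypothetical, and the table is assumed to be partitioned on `order_date`.

```python
def build_query(table, columns, date_column, start, end):
    """Build a query that names its columns and filters on the partition column."""
    if not columns or "*" in columns:
        raise ValueError("list the required columns instead of using SELECT *")
    column_list = ", ".join(columns)
    # Plain string formatting keeps the sketch short; with untrusted input,
    # query parameters would be the safer choice.
    return (
        f"SELECT {column_list}\n"
        f"FROM `{table}`\n"
        f"WHERE {date_column} BETWEEN DATE '{start}' AND DATE '{end}'"
    )


print(
    build_query(
        "my-project.sales.orders",
        ["customer_id", "amount"],
        "order_date",
        "2020-08-01",
        "2020-08-07",
    )
)
```

The resulting query reads two columns from one week of partitions instead of every column of the whole table.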
Cost Optimization
● Use table expiration
● Avoid data duplication
● Avoid full table scans
● Only scan required columns
● Use the caching feature
● Use partitions
● Use clustering
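The effect of partitions on the amount of data scanned can be illustrated with a toy model. The partition dates and sizes below are invented, and `bytes_scanned` is a hypothetical helper that only mimics the idea of partition pruning; it does not call BigQuery.

```python
from datetime import date

# Hypothetical daily partitions: partition date -> bytes stored.
partitions = {
    date(2020, 8, 1): 40_000,
    date(2020, 8, 2): 55_000,
    date(2020, 8, 3): 35_000,
    date(2020, 8, 4): 70_000,
}


def bytes_scanned(parts, start=None, end=None):
    """Bytes read when only partitions within [start, end] are scanned."""
    return sum(
        size
        for day, size in parts.items()
        if (start is None or day >= start) and (end is None or day <= end)
    )


# Full table scan: every partition is read.
print(bytes_scanned(partitions))  # 200000

# Filtering on the partition column: only two partitions are read.
print(bytes_scanned(partitions, date(2020, 8, 3), date(2020, 8, 4)))  # 105000
```

Because on-demand queries are billed by data processed, reading fewer partitions translates directly into lower cost.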
Demo
Photo by Markus Spiske on Unsplash
Photo by Alex Litvin on Unsplash
Image by TeroVesalainen from Pixabay
pbhadani.com
pradeepbhadani
pradeepbhadani
bhadanipradeep
bit.ly/cntek-youtube
cntek.io
CloudNativeTech
CloudNativeTech
cntekio
bit.ly/cntek-youtube

 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 

Getting started with BigQuery

  • 1. Getting started with BigQuery
    Pradeep Bhadani, Founder, Cloud Native Technologies (cntek.io)
    pbhadani.com
    linkedin.com/in/pradeepbhadani
    linkedin.com/company/cloudnativetech
    22nd August 2020, Google Next OnAir Extended
  • 2. About Me
    IT Consultant with 9 years of experience in Big Data, Cloud & DevOps
    GDE (Google Developers Expert) - Cloud
    Google Cloud Authorized Trainer
    HashiCorp Ambassador
    Blog: pbhadani.com
  • 3. Services
    ● Big Data Consultancy
    ● Cloud & DevOps Consultancy
    ● Tailored Training and Workshops
  • 4. Agenda
    ● Overview
      ○ What is a Data Warehouse?
      ○ Choosing a Data Warehouse Option?
    ● Introduction to BigQuery
      ○ What is BigQuery?
      ○ Why BigQuery?
      ○ Concepts
    ● Best Practices
    ● Interacting with BigQuery
    ● Demo
  • 5. Data Warehouse
  • 6. What is a Data Warehouse?
    A data warehouse is a critical component of a Business Intelligence solution that enables an organization to make better decisions.
    A data warehouse offers:
    ● Scheduled & ad-hoc reporting
    ● Ad-hoc analysis
    ● Integration with visualization tools
  • 7. Data Warehouse options?
    Source: commons.wikimedia.org, iconfinder.com
  • 8. Choosing a Data Warehouse?
  • 10. What is BigQuery?
    BigQuery is a fully managed, enterprise-grade, modern data warehouse offering on Google Cloud Platform.
    cloud.google.com/bigquery
  • 11. Why BigQuery?
    ● Serverless
    ● Fast SQL
    ● Security
    ● Scalable
    ● Data Encryption
    ● Managed Storage
    ● Flexible Pricing
    ● Advanced Features
  • 12. Advanced Features
    ● BigQuery ML
    ● BigQuery GIS
    ● BigQuery Omni (private alpha)
    ● DataQnA (private alpha)
  • 14. Columnar based storage
    Figure labels: Row-based storage, Column-based storage
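The row-based versus column-based contrast on slide 14 can be illustrated with a toy in-memory model. This is only a sketch: the sample records and the `to_columns` helper are made up for illustration, and it says nothing about how BigQuery lays out data internally. It simply counts how many values must be touched to sum one field in each layout.

```python
# Toy model of row-based vs. column-based layouts.
# The records and helper below are hypothetical, for illustration only.
rows = [
    {"user_id": 1, "country": "UK", "amount": 10.0},
    {"user_id": 2, "country": "IN", "amount": 25.5},
    {"user_id": 3, "country": "UK", "amount": 7.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: each whole record is read even though only "amount" is needed.
values_touched_row_layout = sum(len(record) for record in rows)
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: only the "amount" column is read.
values_touched_column_layout = len(columns["amount"])
total_from_columns = sum(columns["amount"])

print(values_touched_row_layout, values_touched_column_layout)
```

Both layouts give the same total, but the column layout reads a third of the values here, which is the intuition behind scanning only the columns a query needs.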
  • 15. Decoupled Storage & Compute
    Diagram labels: Storage, Compute, Petabit Network
  • 16. Resources
    ● An Inside Look at Google BigQuery: https://cloud.google.com/files/BigQueryTechnicalWP.pdf
    ● Dremel: static.googleusercontent.com/media/research.google.com/en//pubs/archive/36632.pdf
  • 18. GCP Project
    A GCP Project is a top-level logical container that organizes Google Cloud Platform resources such as Storage and BigQuery.
  • 19. BigQuery Datasets
    A dataset is a logical container that organizes BigQuery tables.
    Diagram: a GCP Project containing Dataset A and Dataset B
  • 20. BigQuery Tables
    BigQuery tables contain the data and the schema that describes the data.
    Tables are addressed as <project_id>.<dataset_id>.<table>
    Diagram: a GCP Project containing Dataset A and Dataset B, each holding Table 1 and Table 2
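The `<project_id>.<dataset_id>.<table>` naming on slide 20 maps directly onto the project, dataset, and table hierarchy. A minimal sketch of splitting and rebuilding such an identifier follows. `TableRef`, `parse_table_id`, and `format_table_id` are hypothetical helper names, `my-project.sales.orders` is a made-up example, and the sketch assumes none of the three parts contains a dot.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """The three levels of the hierarchy: project, dataset, table."""
    project_id: str
    dataset_id: str
    table: str


def parse_table_id(full_id: str) -> TableRef:
    """Split 'project.dataset.table' into its parts.

    Assumption: no part contains a dot, so exactly three
    non-empty segments are expected.
    """
    parts = full_id.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected <project_id>.<dataset_id>.<table>, got {full_id!r}"
        )
    return TableRef(*parts)


def format_table_id(ref: TableRef) -> str:
    """Rebuild the fully qualified identifier from its parts."""
    return f"{ref.project_id}.{ref.dataset_id}.{ref.table}"


ref = parse_table_id("my-project.sales.orders")  # hypothetical names
print(ref.dataset_id, format_table_id(ref))
```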
  • 21. BigQuery Table types
    ● Native Tables
    ● External Tables
    ● Views
    Diagram: GCP Project > BQ Dataset > BQ Tables
  • 22. Slots
    A BigQuery slot is a combination of CPU, memory and network resources. BigQuery automatically calculates the number of slots required to execute a query based on query size and complexity.
  • 23. Service Limits
    ● Interactive queries: 100 concurrent queries
    ● Query execution time limit: 6 hours
    ● Load jobs per table per day: 1,500 (including failures)
    ● Maximum columns per table: 10,000
    ● Copy jobs per destination table per day: 1,000 (including failures)
    ● Number of datasets per project: No limit
    ● Number of tables per dataset: No limit
    ● Maximum number of table operations per day: 1,500
    ● Maximum number of partitions per partitioned table: 4,000
    Please refer to cloud.google.com/bigquery/quotas for the latest service limits.
  • 24. Pricing
    ● On-Demand
      ○ $5 per TB
      ○ First 1 TB per month is free
    ● Flat Rate
      ○ Monthly: $2,000 per 100 slots
      ○ Annual: $1,700 per 100 slots
    Please refer to cloud.google.com/bigquery/pricing for the latest pricing.
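As a rough worked example of the on-demand figures on slide 24 ($5 per TB, first 1 TB per month free), the sketch below estimates a monthly bill from the number of bytes scanned. The function name is hypothetical, the rates are the ones quoted on the slide and may be out of date, and it assumes 1 TB means 2**40 bytes; adjust `BYTES_PER_TB` if the pricing page defines it differently.

```python
# Rates as quoted on the slide; check the pricing page for current values.
PRICE_PER_TB_USD = 5.0
FREE_TB_PER_MONTH = 1.0
BYTES_PER_TB = 2 ** 40  # assumption: binary terabyte


def on_demand_cost_usd(bytes_scanned_in_month: int) -> float:
    """Estimate the on-demand query charge for one month of scanning."""
    if bytes_scanned_in_month < 0:
        raise ValueError("bytes scanned cannot be negative")
    tb_scanned = bytes_scanned_in_month / BYTES_PER_TB
    # Only usage above the free monthly allowance is billed.
    billable_tb = max(0.0, tb_scanned - FREE_TB_PER_MONTH)
    return billable_tb * PRICE_PER_TB_USD


# Scanning 3 TB in a month: 1 TB free, 2 TB billed at $5 each.
print(on_demand_cost_usd(3 * BYTES_PER_TB))
```

Because the charge depends on bytes scanned, the cost-related practices later in the deck (scanning fewer columns and fewer partitions) reduce this number directly.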
  • 26. Ways to interact with BigQuery
    ● Web UI - Cloud Console, Classic UI
    ● Command Line - bq
    ● Client Libraries - Go, Python, Java, etc.
    ● Third-party tools
  • 27. Web UI
  • 28. Command Line tool
  • 29. Client Libraries
  • 30. Best Practices
  • 31. Query Performance
    ● Avoid “SELECT *”
    ● Use of Partitions
    ● Denormalization
    ● Use wildcards on tables appropriately
    ● Use external data source appropriately
    ● Reduce the amount of data before JOIN
    ● Avoid repetitive data transformation using SQL Queries
    ● Use Nested and Repeated fields
  • 32. Cost Optimization
    ● Use table expiration
    ● Avoid data duplication
    ● Avoid full table scan
    ● Only scan required columns
    ● Use caching feature
    ● Use of Partitions
    ● Use of Clustering
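Several of the practices on slides 31 and 32 (avoid "SELECT *", only scan required columns, use partitions, avoid full table scans) can be combined in one query shape. The sketch below builds such a query as a string. `build_partition_query` is a hypothetical helper, the table and column names are made up, and it assumes the table is partitioned on a DATE column. It is meant to show the query shape only; it interpolates identifiers directly, so it should not be used with untrusted input.

```python
from datetime import date
from typing import Sequence


def build_partition_query(
    table_id: str,
    columns: Sequence[str],
    partition_column: str,
    start: date,
    end: date,
) -> str:
    """Build a query that names its columns and filters on the partition column."""
    if not columns:
        raise ValueError("name at least one column instead of using SELECT *")
    if start > end:
        raise ValueError("start date must not be after end date")
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list}\n"
        f"FROM `{table_id}`\n"
        f"WHERE {partition_column} BETWEEN DATE '{start.isoformat()}' "
        f"AND DATE '{end.isoformat()}'"
    )


# Hypothetical table partitioned on order_date.
query = build_partition_query(
    "my-project.sales.orders",
    ["order_id", "amount"],
    "order_date",
    date(2020, 8, 1),
    date(2020, 8, 22),
)
print(query)
```

Naming the columns limits the data read to those columns, and the date filter lets the engine skip partitions outside the range rather than scanning the whole table.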
  • 33. Demo
    Photo by Markus Spiske on Unsplash; photo by Alex Litvin on Unsplash
  • 34. Image by TeroVesalainen from Pixabay
    pbhadani.com, pradeepbhadani, pradeepbhadani, bhadanipradeep, bit.ly/cntek-youtube