Building a Marketing Data Warehouse from Scratch
Christopher Gutknecht | @chrisgutknecht | Bergzeit
Our Plan: The Three Phases of a Data Platform
1. Greenfield
2. Dashboards
3. Operational Analytics
What You Will Take Away from this Session
1. When and why you should invest in a Marketing DWH
2. Learn the data ecosystem and the benefits of BigQuery
3. Interesting use cases by combining data sources
4. Many tactical tips for daily use
5. Design outcome-oriented questions for analytics projects
About Chris: Head of Acquisition & Optimization
Digital Marketer
Tech nerd
Climber
1997 2008 2013 2020
Dad of 2
Big Thanks to Steffi,
our Data Scientist
Bergzeit: Combining Love for Mountains & Data
Online Store for Mountain Gear
122 M Revenue in Financial Year 20/21
14 Countries & 5 Languages
Commerce. Content. Guided Tours
Let’s Set Clear Expectations for this Session
Technical deep dive
Intro to data ecosystem
What this session IS What it’s NOT
Machine learning
Customer data focused
Google Cloud focused
Practical tips & mistakes
Ecommerce use cases
Why Is Data Knowledge Important for You?
Behavioural data -> digital success
Operational analytics on the rise
Requirement: Privacy by design
Most Desirable Digital Marketing Skills 20/21
source: marketingcharts.com
The Components of a Modern Data Platform
Not so fast… Why All This Complexity?
Why Do I Need a Marketing Data Warehouse?
Connectors
If your Frustration Grows with Reporting Tools
Transformations
Volume
Operational Use Cases
Complexity
More Data Sources
Don’t Be this Guy - Know When To Scale Up
But Don’t Overengineer - Gradually Scale
Past: The Old On-Premise Data Veterans
Infrastructure Maintenance
Long upfront planning
Manual Scaling
Strict Dimensional Modeling
Today: The Cool Cloud Kids on the Block
100% cloud, no Ops
Seamless scaling
Instantly ready
The Components of a Modern Data Platform
Data Warehouse
Data Ingestion
Data Catalog & Governance
Activation
Data Quality
Job Orchestration
Visualization
Transformation
Analysts are Turning into Analytics Engineers
Data Warehouse
Data Ingestion
Data Catalog & Governance
Activation
Data Quality
Job Orchestration
Visualization
Transformation
Phase 1: Greenfield
Let’s Start from The Beginning: Data Sources
Data Ingestion
Google Ads Data Transfer
GA4 Export
Google Merchant Center
Paid Connectors, e.g. Fivetran
Custom Ingestion Scripts
Google Sheets
Cost Connectors, e.g. Funnel.io
Set up a BigQuery Ads Transfer in One Minute
1. Navigate to Data transfers
2. Configure the Transfer details
Get Your GA4 Data into BQ in Two Minutes
1. BQ Linking in Admin UI
2. Storage & Transfer
Easily Connect Your Sheets as BigQuery Tables
Data Loader Vendors: Easy Setup, Instant Results
Suggestions for Interesting Data Sources
Domain Data Source Available in Data Loaders?
SEO Google Search Console
SEO Pagespeed Insights & Lighthouse
SEO Google Bot Logfiles
Ecom Inventory Data & Attributes
Ecom Trusted Shops Reviews
Ecom Awin Open Orders
Social Instagram
Social Facebook
.... ...
The Best Custom Way to Ingest Data into BQ
Data Source
1. Cloud Function
2. Storage & Transfer
3. BigQuery
Data Fetch with Python and Pandas
Data Transfer handles ingest job
Observability via alerts
How Do We Batch-Ingest New Data?
Change Data Capture
Snapshots Copies
Full History Daily State
very easy
For <300k rows
rather easy
duplicate rows
storage efficient
complex architecture
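The three batch strategies on this slide can be compared in a minimal Python sketch. It is an illustration, not the pipeline used at Bergzeit: the in-memory lists stand in for warehouse tables, and the function names `full_copy`, `append_snapshot` and `capture_changes` as well as the sample rows are hypothetical. It follows one plausible reading of the slide: a copy keeps only the current state, daily snapshots keep the full history at the price of duplicate rows, and change data capture stores only what changed.

```python
from datetime import date


def full_copy(source_rows):
    """Copy: replace the target with the current source state (no history)."""
    return [dict(row) for row in source_rows]


def append_snapshot(target_rows, source_rows, snapshot_date):
    """Snapshot: append today's full state, stamped with the load date."""
    stamped = [{**row, "snapshot_date": snapshot_date} for row in source_rows]
    return target_rows + stamped


def capture_changes(previous_rows, source_rows, key="sku"):
    """Change data capture (simplified): keep only new or changed rows.

    Deleted rows are ignored here; a real CDC setup has to track them too.
    """
    previous = {row[key]: row for row in previous_rows}
    return [row for row in source_rows if previous.get(row[key]) != row]


# Hypothetical product states on two consecutive days.
day1 = [{"sku": "A1", "price": 100}, {"sku": "B2", "price": 50}]
day2 = [{"sku": "A1", "price": 80}, {"sku": "B2", "price": 50}]

copy_table = full_copy(day2)
snapshot_table = append_snapshot([], day1, date(2022, 1, 1))
snapshot_table = append_snapshot(snapshot_table, day2, date(2022, 1, 2))
changes = capture_changes(day1, day2)
```

After two days the copy holds two rows, the snapshot table holds four (the unchanged B2 row is stored twice), and the change capture holds only the repriced A1 row.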
Phase 2: Dashboards
We’ve Got Data: Show Me Shiny Reports!
Data Warehouse
Data Ingestion Visualization
How do Price Discounts affect Sales? Sources:
Price Discounts Data Model Sources:
date
sku
detail_views
product order value
ga sessions
date
sku
price
sale_price
diff_price_to_sale
diff_price_to_sale_grouped
products
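A hedged sketch of how this data model could be queried: the two sources are joined on date and sku, and sales are summed per discount group. SQLite from the Python standard library stands in for BigQuery here, so the SQL dialect differs slightly. The table names `ga_sessions` and `products`, the definition of `diff_price_to_sale` as a relative discount, the bucket boundaries and all sample values are assumptions, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ga_sessions (
        date TEXT, sku TEXT, detail_views INTEGER, product_order_value REAL
    );
    CREATE TABLE products (date TEXT, sku TEXT, price REAL, sale_price REAL);
    -- Hypothetical sample rows
    INSERT INTO ga_sessions VALUES
        ('2022-01-01', 'A1', 100, 400.0),
        ('2022-01-01', 'B2', 50, 90.0),
        ('2022-01-01', 'C3', 80, 120.0);
    INSERT INTO products VALUES
        ('2022-01-01', 'A1', 200.0, 140.0),
        ('2022-01-01', 'B2', 100.0, 90.0),
        ('2022-01-01', 'C3', 60.0, 60.0);
""")

query = """
    WITH discounts AS (
        SELECT date, sku,
               (price - sale_price) / price AS diff_price_to_sale
        FROM products
    )
    SELECT
        CASE
            WHEN d.diff_price_to_sale = 0 THEN '0%'
            WHEN d.diff_price_to_sale <= 0.2 THEN '1-20%'
            ELSE '>20%'
        END AS diff_price_to_sale_grouped,
        SUM(s.detail_views) AS detail_views,
        SUM(s.product_order_value) AS product_order_value
    FROM ga_sessions AS s
    JOIN discounts AS d ON d.date = s.date AND d.sku = s.sku
    GROUP BY diff_price_to_sale_grouped
    ORDER BY diff_price_to_sale_grouped
"""
rows = conn.execute(query).fetchall()
```

Each result row holds one discount group with its summed detail views and order value, which is the shape a dashboard chart on discounts versus sales would read from.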
How do Season Categories Affect Sales?
Season Category Data Model Sources:
date
sku
detail_views
product order value
ga sessions
date
sku
season_category
products
Category Revenue Share without Ratings
Category Rev & Ratings Data Model Sources:
date
sku
top_category
product order value
ga sessions
date
sku
rating_count
rating_count_grouped
products ratings
Which Category Has More Selection Orders?
Selection Orders Data Model Sources:
date
transaction_id
sub_category
is_multi_same_category
is_multi_same_color
is_multi_same_size
ga sessions
Wait… Don’t You Need to Know SQL for This?
You Need to Learn SQL: Take a BigQuery Course
Self-paced
45€ / User / Month
Takes 4-8 weeks
Certificate
WAIT! Before you build 100s of Dashboards...
Avoid code repetition
Style Guidelines
Test Coverage
Apply DEV Best Practices
SQL Version Control
Warning Signs You’re Doing Analytics Wrong
Don’t Create Datasets in the US, use EU
Always keep in EU
Can’t join with EU datasets
Avoid Saved Queries - They Don’t Scale
This Report is Broken. Who’s the Owner?
No Idea,
Help Yourself!
Don’t Use Custom Queries as Data Sources
Each Analyst uses a different SQL Code & Naming
Instead, Apply “Data Product” Thinking
Data Product Owner
Scrum Process
Data SLAs
Treat Data as a Product
#1 Pick And Implement A SQL Styleguide
https://github.com/mattm/sql-style-guide
#2 Define A Dataset & Table Naming Convention
Fields
Domains
Datasets
Tables
product
seo
seo_google_search_console
query_by_page_daily
ga_product_order_value
page
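A naming convention is easier to keep when it is checked automatically. The sketch below assumes the convention is snake_case names with datasets prefixed by their domain, as the example `seo_google_search_console` suggests; the function `check_table_path` and the exact rules are hypothetical and would need to match whatever convention a team actually agrees on.

```python
import re

# Lowercase words separated by single underscores, e.g. query_by_page_daily.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_table_path(domain, dataset, table):
    """Return a list of naming violations for a dataset.table path."""
    problems = []
    for label, name in (("domain", domain), ("dataset", dataset), ("table", table)):
        if not SNAKE_CASE.match(name):
            problems.append(f"{label} '{name}' is not snake_case")
    # Assumed rule: every dataset name starts with its domain.
    if not dataset.startswith(domain + "_"):
        problems.append(f"dataset '{dataset}' lacks the '{domain}_' prefix")
    return problems


ok = check_table_path("seo", "seo_google_search_console", "query_by_page_daily")
bad = check_table_path("seo", "SearchConsole", "Query By Page")
```

The first call returns no violations; the second reports the dataset and table casing as well as the missing domain prefix.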
#3 Use dbt for all SQL Transformations
Build Your Data Product like a German House
Phase 3: Operational Analytics in Production
You Need Clean Data for Operational Analytics
What is Operational Analytics for Bergzeit?
ML
Products
Profit Bidding
Rule-based
Products
Data Uploads Attribution Model
Updating Affiliate Sales
Case: Upload Your Shopping Feed Every 10 Min
1. Get GCS Bucket Name
2. Cloud Function with 15 lines of code
3. Schedule Cronjob
Code samples: https://gist.github.com/ChrisGutknecht/fde93092e21039299ab76715596eac01
Case: Profit Bidding & Report
More Details: https://www.slideshare.net/ChristopherGutknecht/gross-profit-bidding-for-ecommerce-smx-virtual-2021
Operational Data Errors Can be Really Costly
We Need To Prepare for Data Pipeline Errors
1. Source Data: Test Coverage
2. Execution: Retry Policies
3. Target Table: Test Coverage
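A retry policy for the execution step can be sketched in a few lines of plain Python. This is a generic illustration with exponential backoff, not the configuration of any Google Cloud service; `run_with_retries` and `flaky_load` are hypothetical names, and the injected `sleep` parameter exists only so the example runs without waiting.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run task(); on failure wait base_delay * 2**attempt and try again."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error for alerting
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_load():
    # Hypothetical load job that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("source not ready")
    return "loaded"


delays = []  # record the backoff delays instead of sleeping
result = run_with_retries(flaky_load, sleep=delays.append)
```

The job succeeds on the third attempt after backoff delays of 1 and 2 seconds. If all attempts fail, the last exception is re-raised so that monitoring can pick it up.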
How Can We Solve Our Missing Data Problem?
Solution: Define a Table Freshness Alert with dbt
Define All Data Quality Tests in dbt
Non Null Values
Restricted Values
Uniqueness
Data Freshness
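To make the four test types concrete, here is a plain Python sketch of what each one checks. It does not use dbt or its syntax, where such tests are declared in configuration rather than written as functions; the function names, the column names `sku`, `season_category` and `loaded_at`, and the sample rows are assumptions for illustration.

```python
from datetime import datetime, timedelta


def not_null(rows, column):
    """Non-null values: every row has a value in the column."""
    return all(row.get(column) is not None for row in rows)


def accepted_values(rows, column, allowed):
    """Restricted values: the column only contains allowed values."""
    return all(row[column] in allowed for row in rows)


def unique(rows, column):
    """Uniqueness: no value appears twice in the column."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


def is_fresh(rows, column, now, max_age):
    """Data freshness: the newest row is not older than max_age."""
    latest = max(row[column] for row in rows)
    return now - latest <= max_age


# Hypothetical rows of a product table.
rows = [
    {"sku": "A1", "season_category": "winter", "loaded_at": datetime(2022, 1, 2, 6, 0)},
    {"sku": "B2", "season_category": "summer", "loaded_at": datetime(2022, 1, 2, 6, 0)},
]

results = {
    "not_null": not_null(rows, "sku"),
    "accepted_values": accepted_values(rows, "season_category", {"winter", "summer"}),
    "unique": unique(rows, "sku"),
    "fresh": is_fresh(rows, "loaded_at", datetime(2022, 1, 2, 12, 0), timedelta(hours=24)),
}
stale = is_fresh(rows, "loaded_at", datetime(2022, 1, 4, 0, 0), timedelta(hours=24))
```

All four checks pass for the sample rows, while the second freshness check fails because the newest row is more than 24 hours old at that point. A failed freshness check is the signal a table freshness alert would act on.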
Simple: Cloud Scheduler for Retry Execution
Advanced: Use Airflow For Data Task Graphs
Retries on Every Task
Alerting & Monitoring
Data Tasks in DAGs
Snapshots & Backfills
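The idea of data tasks in a DAG can be shown without Airflow itself. The sketch below uses `graphlib` from the Python standard library (Python 3.9 or later) to derive a valid run order from task dependencies; it is a stand-in for the concept, not Airflow's API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical task names).
dag = {
    "ingest_ads": set(),
    "ingest_ga4": set(),
    "transform": {"ingest_ads", "ingest_ga4"},
    "quality_tests": {"transform"},
    "upload_feed": {"quality_tests"},
}

# A valid execution order: every task appears after all of its dependencies.
run_order = list(TopologicalSorter(dag).static_order())
```

Both ingestion tasks come before the transformation, and the feed upload runs last, only after the quality tests. An orchestrator adds retries, alerting and backfills on top of this ordering.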
Let’s Finish With a Strategic View From the Peak
How Can We Generate Value? Focus on Actions
1. Define Actions: What Will You Do Differently If You Have the Data?
2. Success Metrics: What Would Success in Metrics Look Like?
3. Factors: Which Factors Influence Success?
4. Tests: How Can We Test Actions on These Factors?
Who Should Be Your Data Hire?
1. Focus: SQL & Warehouse
2. Focus: Data Pipelines
3. Focus: ML Models
Your Takeaways from this Session
1. When and why you should invest in a Marketing DWH
2. Learn the data ecosystem and the benefits of BigQuery
3. How to explore use cases by combining data sources
4. Many tactical tips for daily use
5. Design outcome-oriented questions for analytics projects
Thanks for Your Time.
Looking Forward To Questions!
Chris Gutknecht | Teamlead A&O | Hiring a PPC!
ANNEX: The Three Phases of a Data Platform
1. Greenfield
2. Dashboards
3. Operational Analytics
Data Warehouse vs Data Lake
Structured Data
Table Schemas
Transactions
Sharded Files
Unstructured Data
Lakehouse
Why Focus On Google Cloud & BigQuery?
Market Leader in Data Analytics*
Free Google Data Connectors
Seamless low-tech scaling
*Source: Forrester Research 2021
The Best Cloud Data Warehouse? It Depends
Source: https://medium.com/pocket-gems/a-comparative-analysis-between-bigquery-redshift-and-snowflake-8d194fdf5693
Google Data Sources BigQuery = Google Cloud
How often do we Ingest Data?
Real-Time
Stream Processing
Batch Processing
or
Connected Sheets = ‘NoCode’ Analysis on BQ
Dom Woodman’s Search Console Downloader
https://www.pipedout.com/resources/tools/download-search-console/
Data Modeling: Star Schema & 3rd Normal Form
Third Normal Form
Data Modeling Choices: Denormalized
Third Normal Form
Denormalized
Or Pick the Official “Data Analytics” Certificate
8 Courses
4-6 Months (longer)
Intro to R
Certificate
Export Your GCP Costs to BigQuery
Monitor Your BigQuery Costs in a Dashboard
Monitor Expensive Queries: https://www.pascallandau.com/bigquery-snippets/monitor-query-costs/
How To Sync Segments: CDP vs Reverse ETL?
Customer Data Platform
Reverse ETL
(DWH = Central)

Contenu connexe

Tendances

14 2 2023 - AI & Marketing - Hugues Rey.pdf
14 2 2023 - AI & Marketing - Hugues Rey.pdf14 2 2023 - AI & Marketing - Hugues Rey.pdf
14 2 2023 - AI & Marketing - Hugues Rey.pdfHugues Rey
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsAndrzej Michałowski
 
Scaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksScaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksDatabricks
 
Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)
Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)
Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)Christopher Gutknecht
 
[LondonSEO 2020] BigQuery & SQL for SEOs
[LondonSEO 2020] BigQuery & SQL for SEOs[LondonSEO 2020] BigQuery & SQL for SEOs
[LondonSEO 2020] BigQuery & SQL for SEOsAreej AbuAli
 
Our Story With ClickHouse at seo.do
Our Story With ClickHouse at seo.doOur Story With ClickHouse at seo.do
Our Story With ClickHouse at seo.doMetehan Çetinkaya
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudMárton Kodok
 
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...Koray Tugberk GUBUR
 
PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...
PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...
PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...LazarinaStoyanova
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLMárton Kodok
 
Unified MLOps: Feature Stores & Model Deployment
Unified MLOps: Feature Stores & Model DeploymentUnified MLOps: Feature Stores & Model Deployment
Unified MLOps: Feature Stores & Model DeploymentDatabricks
 
How to Do Anything You Want in Google Data Studio - Google Marketing Platform...
How to Do Anything You Want in Google Data Studio - Google Marketing Platform...How to Do Anything You Want in Google Data Studio - Google Marketing Platform...
How to Do Anything You Want in Google Data Studio - Google Marketing Platform...In Marketing We Trust
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Measurefest - GA4 From Migration to Measurement - The Key To Success.pptx
Measurefest - GA4 From Migration to Measurement - The Key To Success.pptxMeasurefest - GA4 From Migration to Measurement - The Key To Success.pptx
Measurefest - GA4 From Migration to Measurement - The Key To Success.pptxSam Thomas
 
First Party Conversion Tracking [SEAcamp]
First Party Conversion Tracking [SEAcamp]First Party Conversion Tracking [SEAcamp]
First Party Conversion Tracking [SEAcamp]📊 Markus Baersch
 
Lake Database Database Template Map Data in Azure Synapse Analytics
Lake Database  Database Template  Map Data in Azure Synapse AnalyticsLake Database  Database Template  Map Data in Azure Synapse Analytics
Lake Database Database Template Map Data in Azure Synapse AnalyticsErwin de Kreuk
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 

Tendances (20)

14 2 2023 - AI & Marketing - Hugues Rey.pdf
14 2 2023 - AI & Marketing - Hugues Rey.pdf14 2 2023 - AI & Marketing - Hugues Rey.pdf
14 2 2023 - AI & Marketing - Hugues Rey.pdf
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
Scaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksScaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with Databricks
 
Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)
Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)
Scaling Search Campaigns With Bulk Uploads and Ad Customizers (SMX 2023)
 
[LondonSEO 2020] BigQuery & SQL for SEOs
[LondonSEO 2020] BigQuery & SQL for SEOs[LondonSEO 2020] BigQuery & SQL for SEOs
[LondonSEO 2020] BigQuery & SQL for SEOs
 
Our Story With ClickHouse at seo.do
Our Story With ClickHouse at seo.doOur Story With ClickHouse at seo.do
Our Story With ClickHouse at seo.do
 
GA4 LAND - Trendigital 2023
GA4 LAND - Trendigital 2023GA4 LAND - Trendigital 2023
GA4 LAND - Trendigital 2023
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Bigquery 101
Bigquery 101Bigquery 101
Bigquery 101
 
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
 
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
 
PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...
PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...
PubCon, Lazarina Stoy. - Machine Learning in Search: Google's ML APIs vs Open...
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
 
Unified MLOps: Feature Stores & Model Deployment
Unified MLOps: Feature Stores & Model DeploymentUnified MLOps: Feature Stores & Model Deployment
Unified MLOps: Feature Stores & Model Deployment
 
How to Do Anything You Want in Google Data Studio - Google Marketing Platform...
How to Do Anything You Want in Google Data Studio - Google Marketing Platform...How to Do Anything You Want in Google Data Studio - Google Marketing Platform...
How to Do Anything You Want in Google Data Studio - Google Marketing Platform...
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Measurefest - GA4 From Migration to Measurement - The Key To Success.pptx
Measurefest - GA4 From Migration to Measurement - The Key To Success.pptxMeasurefest - GA4 From Migration to Measurement - The Key To Success.pptx
Measurefest - GA4 From Migration to Measurement - The Key To Success.pptx
 
First Party Conversion Tracking [SEAcamp]
First Party Conversion Tracking [SEAcamp]First Party Conversion Tracking [SEAcamp]
First Party Conversion Tracking [SEAcamp]
 
Lake Database Database Template Map Data in Azure Synapse Analytics
Lake Database  Database Template  Map Data in Azure Synapse AnalyticsLake Database  Database Template  Map Data in Azure Synapse Analytics
Lake Database Database Template Map Data in Azure Synapse Analytics
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 

Similaire à Building a Marketing Data Warehouse from Scratch - SMX Advanced 202

Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...
Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...
Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...e-dialog GmbH
 
Building a 360 Degree View of Your Customers on BICS
Building a 360 Degree View of Your Customers on BICSBuilding a 360 Degree View of Your Customers on BICS
Building a 360 Degree View of Your Customers on BICSPerficient, Inc.
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOProduct School
 
Google Cloud Machine Learning
 Google Cloud Machine Learning  Google Cloud Machine Learning
Google Cloud Machine Learning India Quotient
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
Building a Marketing Data Warehouse in Google BigQuery with Supermetrics
Building a Marketing Data Warehouse in Google BigQuery with SupermetricsBuilding a Marketing Data Warehouse in Google BigQuery with Supermetrics
Building a Marketing Data Warehouse in Google BigQuery with SupermetricsIn Marketing We Trust
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsDenodo
 
GraphSummit - Process Tempo - Build Graph Applications.pdf
GraphSummit - Process Tempo - Build Graph Applications.pdfGraphSummit - Process Tempo - Build Graph Applications.pdf
GraphSummit - Process Tempo - Build Graph Applications.pdfNeo4j
 
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...Elemica
 
Google Data Studio for business
Google Data Studio for businessGoogle Data Studio for business
Google Data Studio for businessOWOX BI
 
SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365
SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365
SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365Brian Culver
 
SPT 104 Unlock your big data with analytics and BI on Office 365
SPT 104 Unlock your big data with analytics and BI on Office 365SPT 104 Unlock your big data with analytics and BI on Office 365
SPT 104 Unlock your big data with analytics and BI on Office 365Brian Culver
 
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsPower to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsLooker
 
Atlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdfAtlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdfSubrat Kumar Dash
 
Applying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsMárton Kodok
 
The 5 Keys to a Killer Data Lake
The 5 Keys to a Killer Data LakeThe 5 Keys to a Killer Data Lake
The 5 Keys to a Killer Data LakeDataWorks Summit
 
Enabling a Bimodal IT Framework for Advanced Analytics with Data Virtualization
Enabling a Bimodal IT Framework for Advanced Analytics with Data VirtualizationEnabling a Bimodal IT Framework for Advanced Analytics with Data Virtualization
Enabling a Bimodal IT Framework for Advanced Analytics with Data VirtualizationDenodo
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenChristopher Bergh
 
Introduction To SQL Server 2014
Introduction To SQL Server 2014Introduction To SQL Server 2014
Introduction To SQL Server 2014Vishal Pawar
 

Similaire à Building a Marketing Data Warehouse from Scratch - SMX Advanced 202 (20)

Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...
Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...
Google Analytics Konferenz 2019_Google Cloud Platform_Carl Fernandes & Ksenia...
 
Building a 360 Degree View of Your Customers on BICS
Building a 360 Degree View of Your Customers on BICSBuilding a 360 Degree View of Your Customers on BICS
Building a 360 Degree View of Your Customers on BICS
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPO
 
Google Cloud Machine Learning
 Google Cloud Machine Learning  Google Cloud Machine Learning
Google Cloud Machine Learning
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
Building a Marketing Data Warehouse in Google BigQuery with Supermetrics
Building a Marketing Data Warehouse in Google BigQuery with SupermetricsBuilding a Marketing Data Warehouse in Google BigQuery with Supermetrics
Building a Marketing Data Warehouse in Google BigQuery with Supermetrics
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
GraphSummit - Process Tempo - Build Graph Applications.pdf
GraphSummit - Process Tempo - Build Graph Applications.pdfGraphSummit - Process Tempo - Build Graph Applications.pdf
GraphSummit - Process Tempo - Build Graph Applications.pdf
 
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
 
Google Data Studio for business
Google Data Studio for businessGoogle Data Studio for business
Google Data Studio for business
 
SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365
SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365
SPS Utah 2016 - Unlock your big data with analytics and BI on Office 365
 
SPT 104 Unlock your big data with analytics and BI on Office 365
SPT 104 Unlock your big data with analytics and BI on Office 365SPT 104 Unlock your big data with analytics and BI on Office 365
SPT 104 Unlock your big data with analytics and BI on Office 365
 
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsPower to the People: A Stack to Empower Every User to Make Data-Driven Decisions
Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions
 
Atlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdfAtlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdf
 
Applying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analytics
 
The 5 Keys to a Killer Data Lake
The 5 Keys to a Killer Data LakeThe 5 Keys to a Killer Data Lake
The 5 Keys to a Killer Data Lake
 
Enabling a Bimodal IT Framework for Advanced Analytics with Data Virtualization
Enabling a Bimodal IT Framework for Advanced Analytics with Data VirtualizationEnabling a Bimodal IT Framework for Advanced Analytics with Data Virtualization
Enabling a Bimodal IT Framework for Advanced Analytics with Data Virtualization
 
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchenBdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchen
 
Introduction To SQL Server 2014
Introduction To SQL Server 2014Introduction To SQL Server 2014
Introduction To SQL Server 2014
 

Plus de Christopher Gutknecht

PMAX Product structures with BigQuery [GERMAN]
PMAX Product structures with BigQuery [GERMAN]PMAX Product structures with BigQuery [GERMAN]
PMAX Product structures with BigQuery [GERMAN]Christopher Gutknecht
 
How to recover from an unsuccessful SEO relaunch by activating your data (SMX...
How to recover from an unsuccessful SEO relaunch by activating your data (SMX...How to recover from an unsuccessful SEO relaunch by activating your data (SMX...
How to recover from an unsuccessful SEO relaunch by activating your data (SMX...Christopher Gutknecht
 
MeasureCamp_Custom GA4 Channel Groups with dbt
MeasureCamp_Custom GA4 Channel Groups with dbtMeasureCamp_Custom GA4 Channel Groups with dbt
MeasureCamp_Custom GA4 Channel Groups with dbtChristopher Gutknecht
 
Gross Profit Bidding for Ecommerce | SMX Virtual 2021
Gross Profit Bidding for Ecommerce | SMX Virtual 2021Gross Profit Bidding for Ecommerce | SMX Virtual 2021
Gross Profit Bidding for Ecommerce | SMX Virtual 2021Christopher Gutknecht
 
Data Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov ChainsData Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov ChainsChristopher Gutknecht
 
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...Christopher Gutknecht
 
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)Christopher Gutknecht
 

Plus de Christopher Gutknecht (7)

PMAX Product structures with BigQuery [GERMAN]
PMAX Product structures with BigQuery [GERMAN]PMAX Product structures with BigQuery [GERMAN]
PMAX Product structures with BigQuery [GERMAN]
 
How to recover from an unsuccessful SEO relaunch by activating your data (SMX...
How to recover from an unsuccessful SEO relaunch by activating your data (SMX...How to recover from an unsuccessful SEO relaunch by activating your data (SMX...
How to recover from an unsuccessful SEO relaunch by activating your data (SMX...
 
MeasureCamp_Custom GA4 Channel Groups with dbt
MeasureCamp_Custom GA4 Channel Groups with dbtMeasureCamp_Custom GA4 Channel Groups with dbt
MeasureCamp_Custom GA4 Channel Groups with dbt
 
Gross Profit Bidding for Ecommerce | SMX Virtual 2021
Gross Profit Bidding for Ecommerce | SMX Virtual 2021Gross Profit Bidding for Ecommerce | SMX Virtual 2021
Gross Profit Bidding for Ecommerce | SMX Virtual 2021
 
Data Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov ChainsData Driven Attribution in BigQuery with Shapley Values and Markov Chains
Data Driven Attribution in BigQuery with Shapley Values and Markov Chains
 
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
Questioning data quality and troubleshooting tracking gaps (version2 | Smx Su...
 
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
Questioning Data Quality and Troubleshooting Tracking Gaps (SMX Munich 2020)
 

Building a Marketing Data Warehouse from Scratch - SMX Advanced 202

  • 1. Building a Marketing Data Warehouse from Scratch Christopher Gutknecht | @chrisgutknecht | Bergzeit
  • 2. Our Plan: The Three Phases of a Data Platform 1. Greenfield 2. Dashboards 3. Operational Analytics
  • 3. What You Will Take Away from this Session 1. When and why you should invest in a Marketing DWH 2. Learn the data ecosystem and the benefits of BigQuery 3. Interesting use cases by combining data sources 4. Many tactical tips for daily use 5. Design outcome-oriented questions for analytics projects
  • 4. About Chris: Head of Acquisition & Optimization Digital Marketer Tech nerd Climber 1997 2008 2013 2020 Dad of 2 Big Thanks to Steffi, our Data Scientist
  • 5. Bergzeit: Combining Love for Mountains & Data Online Store for Mountain Gear 122 M Revenue in Financial Year 20/21 14 Countries & 5 Languages Commerce. Content. Guided Tours
  • 6. Let’s Set Clear Expectations for this Session Technical deep dive Intro to data ecosystem What this session IS What it’s NOT Machine learning Customer data focused Google Cloud focused Practical tips & mistakes Ecommerce use cases
  • 7. Why Is Data Knowledge Important for You? Behavioural data -> digital success Operational analytics on the rise Requirement Privacy by design
  • 8. Most Desirable Digital Marketing Skills 20/21 source: marketingcharts.com
  • 10. Not so fast… Why All This Complexity?
  • 11. Why Do I Need a Marketing Data Warehouse? Connectors
  • 12. If your Frustration Grows with Reporting Tools Transformations Volume Operational Use Cases Complexity More Data Sources
  • 13. Don’t Be this Guy - Know When To Scale Up
  • 14. But Don’t Overengineer - Gradually Scale
  • 15. Past: The Old On-Premise Data Veterans Infrastructure Maintenance Long Upfront Planning Manual Scaling Strict Dimensional Modeling
  • 16. Today: The Cool Cloud Kids on the Block 100% cloud, no Ops Seamless scaling Instantly ready
  • 17. The Components of a Modern Data Platform Data Warehouse Data Ingestion Data Catalog & Governance Activation Data Quality Job Orchestration Visualization Transformation
  • 18. Analysts are Turning into Analytics Engineers Data Warehouse Data Ingestion Data Catalog & Governance Activation Data Quality Job Orchestration Visualization Transformation
  • 20. Let’s Start from The Beginning: Data Sources Data Ingestion Google Ads Data Transfer GA4 Export Google Merchant Center Paid Connectors, e.g. Fivetran Custom Ingestion Scripts Google Sheets Cost Connectors, e.g. Funnel.io
  • 21. Set up a BigQuery Ads Transfer in One Minute 1. Navigate to Data transfers 2. Configure the transfer details
  • 22. Get Your GA4 Data into BQ in Two Minutes 1. BQ Linking in Admin UI 2. Storage & Transfer
  • 23. Easily Connect Your Sheets as BigQuery Tables
  • 24. Data Loader Vendors: Easy Setup, Instant Results
  • 25. Suggestions for Interesting Data Sources (Domain | Data Source | Available in Data Loaders?): SEO | Google Search Console; SEO | Pagespeed Insights & Lighthouse; SEO | Google Bot Logfiles; Ecom | Inventory Data & Attributes; Ecom | Trusted Shops Reviews; Ecom | Awin Open Orders; Social | Instagram; Social | Facebook; ...
  • 26. The Best Custom Way to Ingest Data into BQ Data Source 1. Cloud Function 2. Storage & Transfer 3. BigQuery Data Fetch with Python and Pandas Data Transfer handles ingest job Observability via alerts
  • 27. How Do We Batch-Ingest New Data? Change Data Capture Snapshots Copies Full History Daily State very easy For <300k rows rather easy duplicate rows storage efficient complex architecture
  • 29. We’ve Got Data: Show Me Shiny Reports! Data Warehouse Data Ingestion Visualization
  • 30. How do Price Discounts affect Sales? Sources:
  • 31. Price Discounts Data Model Sources: date sku detail_views product order value ga sessions date sku price sale_price diff_price_to_sale diff_price_to_sale_grouped products
  • 32. How do Season Categories Affect Sales?
  • 33. Season Category Data Model Sources: date sku detail_views product order value ga sessions date sku season_category products
  • 34. Category Revenue Share without Ratings
  • 35. Category Rev & Ratings Data Model Sources: date sku top_category product order value ga sessions date sku rating_count rating_count_grouped products ratings
  • 36. Which Category Has More Selection Orders?
  • 37. Selection Orders Data Model Sources: date transaction_id sub_category is_multi_same_category is_multi_same_color is_multi_same_size ga sessions
  • 38. Wait… Don’t You Need to Know SQL for This?
  • 39. You Need to Learn SQL: Take a BigQuery Course Self-paced 45€ / User / Month Takes 4-8 weeks Certificate
  • 40. WAIT! Before you build 100s of Dashboards... Avoid code repetition Style Guidelines Test Coverage Apply DEV Best Practices SQL Version Control
  • 41. Warning Signs You’re Doing Analytics Wrong
  • 42. Don’t Create Datasets in the US, use EU Always keep in EU Can’t join with EU datasets
  • 43. Avoid Saved Queries - They Don’t Scale
  • 44. This Report is Broken. Who’s the Owner? No Idea, Help Yourself!
  • 45. Don’t Use Custom Queries as Data Sources
  • 46. Each Analyst uses a different SQL Code & Naming
  • 47. Instead, Apply “Data Product” Thinking Data Product Owner Scrum Process Data SLAs Treat Data as a Product
  • 48. #1 Pick And Implement A SQL Styleguide https://github.com/mattm/sql-style-guide
  • 49. #2 Define A Dataset & Table Naming Convention Fields Domains Datasets Tables product seo seo_google_search_console query_by_page_daily ga_product_order_value page
  • 50. #3 Use dbt for all SQL Transformations
  • 51. Build Your Data Product like a German House
  • 52. Phase 3: Operational Analytics in Production
  • 53. You Need Clean Data for Operational Analytics
  • 54. What is Operational Analytics for Bergzeit? ML Products Profit Bidding Rule-based Products Data Uploads Attribution Model Updating Affiliate Sales
  • 55. Case: Upload Your Shopping Feed Every 10 Min 2. Cloud Function with 15 lines of code 3. Schedule Cronjob 1. Get GCS Bucket Name Code samples: https://gist.github.com/ChrisGutknecht/fde93092e21039299ab76715596eac01
  • 56. Case: Profit Bidding & Report More Details: https://www.slideshare.net/ChristopherGutknecht/gross-profit-bidding-for-ecommerce-smx-virtual-2021
  • 57. Operational Data Errors Can be Really Costly
  • 58. We Need To Prepare for Data Pipeline Errors 1. Source Data 2. Execution 3. Target Table Test Coverage Test Coverage Retry Policies
  • 59. How Can We Solve Our Missing Data Problem?
  • 60. Solution: Define a Table Freshness Alert with dbt
  • 61. Define All Data Quality Tests in dbt Non Null Values Restricted Values Uniqueness Data Freshness
  • 62. Simple: Cloud Scheduler for Retry Execution
  • 63. Advanced: Use Airflow For Data Task Graphs Retries on Every Task Alerting & Monitoring Data Tasks in DAGs Snapshots & Backfills
  • 64. Let’s Finish With a Strategic View From the Peak
  • 65. How Can We Generate Value? Focus on Actions 1. Define Actions: What Will You Do Differently If You Have the Data? 2. Success Metrics: What Would Success in Metrics Look Like? 3. Factors: Which Factors Influence Success? 4. Tests: How Can We Test Actions on These Factors?
  • 66. Who Should Be Your Data Hire? 1. Focus: SQL & Warehouse 2. Focus: Data Pipelines 3. Focus: ML Models
  • 67. Your Takeaways from this Session 1. When and why you should invest in a Marketing DWH 2. Learn the data ecosystem and the benefits of BigQuery 3. How to explore use cases by combining data sources 4. Many tactical tips for daily use 5. Design outcome-oriented questions for analytics projects
  • 68. Thanks for Your Time. Looking Forward To Questions! Chris Gutknecht | Teamlead A&O | Hiring a PPC!
  • 69. ANNEX: The Three Phases of a Data Platform 1. Greenfield 2. Dashboards 3. Operational Analytics
  • 70. Data Warehouse vs Data Lake Structured Data Table Schemas Transactions Sharded Files Unstructured Data Lakehouse
  • 71. Why Focus On Google Cloud & BigQuery? Market Leader in Data Analytics* Free Google Data Connectors Seamless low-tech scaling *Source: Forrester Research 2021
  • 72. The Best Cloud Data Warehouse? It Depends Source: https://medium.com/pocket-gems/a-comparative-analysis-between-bigquery-redshift-and-snowflake-8d194fdf5693 Google Data Sources BigQuery = Google Cloud
  • 73. How Often Do We Ingest Data? Real-Time Stream Processing or Batch Processing
  • 74. Connected Sheets = ‘NoCode’ Analysis on BQ
  • 75. Dom Woodman’s Search Console Downloader https://www.pipedout.com/resources/tools/download-search-console/
  • 76. Data Modeling: Star Schema & 3rd Normal Form Third Normal Form
  • 77. Data Modeling Choices: Denormalized Third Normal Form Denormalized
  • 78. ANNEX: The Three Phases of a Data Platform 1. Greenfield 2. Dashboards 3. Operational Analytics
  • 79. Or Pick the Official “Data Analytics” Certificate 8 Courses 4-6 Months (longer) Intro to R Certificate
  • 80. Export Your GCP Costs to BigQuery
  • 81. Monitor Your BigQuery Costs in a Dashboard Monitor Expensive Queries: https://www.pascallandau.com/bigquery-snippets/monitor-query-costs/
  • 82. ANNEX: The Three Phases of a Data Platform 1. Greenfield 2. Dashboards 3. Operational Analytics
  • 83. How To Sync Segments: CDP vs Reverse ETL? Customer Data Platform Reverse ETL (DWH = Central)
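
Slide 27 contrasts daily snapshots with change data capture when batch-ingesting data. As a rough illustration of why storing only changes is more storage efficient, the sketch below compares two full snapshots and emits only the rows that differ. Note that this is plain snapshot diffing, not log-based change data capture, and that the function name, the SKU keys, and the row shape are assumptions made for the example rather than anything shown in the talk.

```python
from typing import Any

Row = dict[str, Any]


def diff_snapshots(previous: dict[str, Row], current: dict[str, Row]) -> list[dict[str, Any]]:
    """Compare two full snapshots keyed by primary key and return change records."""
    changes: list[dict[str, Any]] = []
    # Rows that are new or whose content differs from the previous snapshot.
    for key, row in current.items():
        if key not in previous:
            changes.append({"op": "insert", "key": key, "row": row})
        elif previous[key] != row:
            changes.append({"op": "update", "key": key, "row": row})
    # Rows that disappeared since the previous snapshot.
    for key in previous:
        if key not in current:
            changes.append({"op": "delete", "key": key, "row": None})
    return changes


# Hypothetical product snapshots from two consecutive days.
yesterday = {"A1": {"price": 100.0}, "B2": {"price": 59.95}, "C3": {"price": 20.0}}
today = {"A1": {"price": 100.0}, "B2": {"price": 49.95}, "D4": {"price": 15.0}}
changes = diff_snapshots(yesterday, today)
```

Storing `changes` instead of a full daily copy avoids repeating unchanged rows such as `A1`, at the cost of having to replay the changes to reconstruct the state on a given day.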
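
The price discount data model on slide 31 lists the columns price, sale_price, diff_price_to_sale and diff_price_to_sale_grouped. One plausible reading is that the difference is expressed as a percentage of the list price and then bucketed for reporting. The sketch below follows that reading; the percentage definition, the bucket boundaries, and the labels are assumptions, not values from the talk.

```python
def diff_price_to_sale(price: float, sale_price: float) -> float:
    """Discount as a percentage of the list price, rounded to one decimal."""
    if price <= 0:
        raise ValueError("price must be positive")
    return round((price - sale_price) / price * 100, 1)


def diff_price_to_sale_grouped(discount_pct: float) -> str:
    """Bucket a discount percentage into a coarse reporting group (assumed boundaries)."""
    if discount_pct <= 0:
        return "no discount"
    if discount_pct < 10:
        return "<10%"
    if discount_pct < 20:
        return "10-20%"
    if discount_pct < 30:
        return "20-30%"
    return ">=30%"


# Hypothetical product rows using the column names from the slide.
products = [
    {"sku": "A1", "price": 100.0, "sale_price": 80.0},
    {"sku": "B2", "price": 200.0, "sale_price": 200.0},
]
for product in products:
    pct = diff_price_to_sale(product["price"], product["sale_price"])
    product["diff_price_to_sale"] = pct
    product["diff_price_to_sale_grouped"] = diff_price_to_sale_grouped(pct)
```

Grouping the discount keeps the dashboard dimension small enough to compare sales across a handful of discount levels instead of one bar per distinct percentage.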
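
Slides 60 and 61 name four kinds of data quality tests: non-null values, restricted values, uniqueness, and data freshness. dbt declares such tests in its own project configuration; the Python below does not use dbt and only illustrates the logic each check would apply to a table. The function names, the example columns, and the 24-hour freshness threshold are assumptions for this sketch.

```python
from datetime import datetime, timedelta, timezone
from typing import Any

Row = dict[str, Any]


def not_null(rows: list[Row], column: str) -> bool:
    """Every row has a value in the column."""
    return all(row.get(column) is not None for row in rows)


def accepted_values(rows: list[Row], column: str, allowed: set[Any]) -> bool:
    """Every value in the column belongs to the allowed set."""
    return all(row.get(column) in allowed for row in rows)


def unique(rows: list[Row], column: str) -> bool:
    """No value in the column occurs more than once."""
    values = [row.get(column) for row in rows]
    return len(values) == len(set(values))


def is_fresh(rows: list[Row], column: str, max_age: timedelta, now: datetime) -> bool:
    """The newest timestamp in the column is at most max_age old."""
    timestamps = [row[column] for row in rows if row.get(column) is not None]
    return bool(timestamps) and now - max(timestamps) <= max_age


# Hypothetical table contents and evaluation time.
now = datetime(2021, 6, 2, 12, 0, tzinfo=timezone.utc)
rows = [
    {"sku": "A1", "season_category": "summer",
     "loaded_at": datetime(2021, 6, 2, 6, 0, tzinfo=timezone.utc)},
    {"sku": "B2", "season_category": "winter",
     "loaded_at": datetime(2021, 6, 1, 6, 0, tzinfo=timezone.utc)},
]
results = {
    "not_null": not_null(rows, "sku"),
    "accepted_values": accepted_values(rows, "season_category", {"summer", "winter"}),
    "unique": unique(rows, "sku"),
    "fresh": is_fresh(rows, "loaded_at", timedelta(hours=24), now),
}
```

A failing freshness check is the one that would catch the missing data problem raised on slide 59, since the other three checks pass trivially on a table that simply stopped receiving rows.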
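
Slides 58 and 62 mention retry policies for the execution step of a pipeline. Cloud Scheduler and Airflow provide their own retry settings; the sketch below is not their API and only shows the general idea of retrying a failing task with exponential backoff. The function name, the default attempt count, and the delay values are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

The `sleep` parameter exists so the waiting can be replaced in tests or dry runs. A usage example would be `run_with_retries(load_feed, max_attempts=5)`, where `load_feed` is a hypothetical function that fetches and uploads the data.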
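
Slide 63 describes organising data tasks in DAGs. The sketch below does not use Airflow; it uses the `graphlib` module from the Python standard library (Python 3.9 or later) to show what a task graph buys you: an execution order in which every task runs only after its upstream tasks. The task names are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return task names ordered so that each task follows all of its upstream tasks.

    dependencies maps a task name to the set of tasks it depends on.
    graphlib raises CycleError if the graph contains a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical pipeline: two ingestion tasks feed a transformation and a report.
pipeline = {
    "ingest_ga4": set(),
    "ingest_ads": set(),
    "transform_sessions": {"ingest_ga4"},
    "build_report": {"transform_sessions", "ingest_ads"},
}
order = execution_order(pipeline)
```

An orchestrator adds scheduling, per-task retries, alerting, and backfills on top of this ordering, which is what the slide lists as the reasons to move beyond a simple scheduler.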