Data-driven companies need to make their data easily accessible to the people who analyze it. Many organizations have adopted Looker on AWS, which pairs a centralized analytical database with a user-friendly interface so that employees can ask and answer their own questions and make informed business decisions.
Join our webinar to learn how our customer Casper, an online mattress retailer, made the switch from a transactional database to Looker’s data analytics platform on Amazon Redshift. Looker on Amazon Redshift can help you shorten your analytics lifecycle with a simplified infrastructure and rapid cloud scaling.
Join us to learn:
• How to use LookML to build reusable definitions and logic for your data
• Best practices for architecting a centralized analytical database
• How Casper leveraged Looker and Amazon Redshift to provide all their employees access to their data and metrics
Who should attend: Heads of Analytics, Heads of BI, Analytics Managers, BI Teams, Senior Analysts
1. A Data Culture with Embedded Analytics in Action
Dave Rocamora • Solutions Architect, AWS
Erin Franz • Senior Analyst, Alliances, Looker
Scott Breitenother • VP, Data and Analytics, Casper
2. Data is Growing
• 1.7MB of new data will be created every second for every human being on the planet by 2020
• 58% compound annual growth rate forecasted for the Hadoop market, surpassing $1 billion by 2020
• <0.5% of all data is ever analyzed and used at the moment
Sources:
http://www.ap-institute.com/big-data-articles/big-data-what-is-hadoop-%E2%80%93-an-explanation-for-absolutely-anyone.aspx
http://www.marketanalysis.com/?p=279
http://www.technologyreview.com/news/514346/the-data-made-me-do-it/
http://www.whizpr.be/upload/medialab/21/company/Media_Presentation_2012_DigiUniverseFINAL1.pdf
3. Big Data is for Everyone
The market for Big Data technologies is growing more than six times faster than the information technology market as a whole…
…and those companies who use their data will win.
4. Why AWS for Big Data?
• Immediately Available
• Broad and Deep Capabilities
• Trusted and Secure
• Scalable
5. Collect, Store, Analyze, and Visualize
It’s easy to get data to AWS, store it securely, and analyze it with the engine of your choice, without any long-term commitment or vendor lock-in.
Collect
AWS Import/Export
AWS Snowball
Direct Connect
VM Import/Export
Store
Amazon S3
Amazon EMR
Amazon Glacier
Amazon Redshift
DynamoDB
Amazon Aurora
Analyze
Amazon Kinesis
AWS Lambda
Amazon EMR
Amazon EC2
6. AWS Provides the Most Complete Platform for Big Data
What can you do with Big Data on AWS?
• Big Data Repositories
• Clickstream Analysis
• ETL Offload
• Machine Learning
• Online Ad Serving
• BI Applications
7. A Data Culture with Embedded Analytics in Action
Erin Franz • Senior Analyst, Alliances, Looker
8. Make it easy for everyone to find, explore and understand the data that drives your business
9. Looker: A Self-Service Data Platform
Find, explore and understand the data
• Explore Everything: Find, explore and understand all the data
• Create Standards: Define your data and business metrics
• Any SQL Database: Analyze all of your data where it is stored
• Build a Data Culture: Anyone can ask and answer questions
10. Looker for Amazon Web Services
RDS Redshift EMR Aurora
• Deployment: Easy deployment on Amazon EC2
• Data Sources: Connect to Amazon RDS, Amazon Redshift, Amazon Aurora and Amazon EMR (Spark SQL and Presto)
• Data Modeling Layer: Define your data and business metrics
• Explore: Find, explore and understand your data
11. The Technical Pillars that Make it Possible
100% in Database
• Leverage all your data
• Avoid summarizing or moving it
Modern Web Architecture
• Access from anywhere
• Share and collaborate
• Extend to anyone
LookML Intelligent Modeling Layer
• Describe the data
• Create reusable and shareable business logic
12. Looker/Redshift Integration Highlights
In-Database Architecture
The power of Amazon Redshift is directly leveraged by Looker because all transformation is done in-database
• As real-time as data in Amazon Redshift
• Shared compute, scalability, caching all utilized by Looker
Looker: A Standard for Amazon Redshift
Some of the most demanding Amazon Redshift deployments choose Looker for data exploration, including:
• Sony
• Lyft
• Yahoo!
• Kohler
• Docker
Highest Level of Looker Features
We’ve invested in providing Looker features for Amazon Redshift to make the best experience possible, including:
• Persistent derived tables
• Symmetric aggregates
• Query killing
• Lat/Long location
13. Companies Winning with Redshift + Looker
eCommerce, Technology, Marketplaces, Fin Services, Media/Ad Tech
14. A Data Culture with Embedded Analytics in Action
Scott Breitenother • VP, Data and Analytics, Casper
16. Data Powers Everything that We Do
Data Team Mission: Enable better, faster decisions through information visibility and analytical expertise
17. Until We Outgrew Our Data Infrastructure
Big Excel Files
• Required data refresh by the data team
• File speed and data size limitations
• Intimidating presentation of information
• Analysis is siloed in files
Production Databases
• Cannot query across sources, must download data to join
• Difficult to manage ad hoc queries
• No one place holds all the information
• Inconsistent definitions (and a lot of work if you make a small change!)
Solution was not efficient or scalable
18. Enter Looker & AWS
Amazon Redshift
• Central warehouse for all data
• Join previously siloed data for better analysis
• Dialect is very similar to PostgreSQL
• We use AWS ecosystem (AWS Lambda, Amazon RDS, Amazon EC2)
Looker
• Efficient data modeling
• Easy to manage source of truth
• Visualization layer
• Intuitive UI for business users
• No SQL for business users!!
19. We Implemented in Phases
• Phase 1: Copy. Batch copy production databases
• Phase 2: Copy Faster. Frequent, faster and incremental copy
• Phase 3: ELT. Build specific data marts
20. Phase 1: Copy
How We Did It
• Open source project from DonorsChoose.org
• Bash script with regex translations from Postgres to Redshift
• Full refresh with up to 40 min load time
• Whitelist of tables to copy for each database
Results
• Data updated every 6 hours
• Missing certain key aggregations
• Not read performant
• Unwieldy to manage
• Poor UX on Looker front end
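The Phase 1 approach (regex translations from Postgres to Redshift, limited to a whitelist of tables) might look roughly like the sketch below. It is written in Python for illustration and is not the DonorsChoose.org Bash script: the type mappings in `TYPE_RULES`, the `translate_ddl` function, and the table names are all assumptions made for this example.

```python
import re

# Hypothetical Postgres -> Redshift type rewrites (illustrative only;
# the actual script's rules are not shown in the deck).
TYPE_RULES = [
    (re.compile(r"\bjsonb?\b", re.IGNORECASE), "VARCHAR(65535)"),
    (re.compile(r"\btext\b", re.IGNORECASE), "VARCHAR(65535)"),
    (re.compile(r"\buuid\b", re.IGNORECASE), "CHAR(36)"),
    (re.compile(r"\bbigserial\b", re.IGNORECASE), "BIGINT"),
    (re.compile(r"\bserial\b", re.IGNORECASE), "INTEGER"),
]

# Hypothetical whitelist of tables to copy for one database.
WHITELIST = {"orders", "customers"}

CREATE_TABLE = re.compile(r"CREATE TABLE\s+(\w+)", re.IGNORECASE)


def translate_ddl(statements):
    """Return rewritten DDL for whitelisted tables only."""
    translated = []
    for ddl in statements:
        match = CREATE_TABLE.search(ddl)
        if not match or match.group(1).lower() not in WHITELIST:
            continue  # skip tables that are not on the whitelist
        for pattern, replacement in TYPE_RULES:
            ddl = pattern.sub(replacement, ddl)
        translated.append(ddl)
    return translated


postgres_ddl = [
    "CREATE TABLE orders (id serial, notes text, payload jsonb)",
    "CREATE TABLE sessions (id uuid, data json)",
]
result = translate_ddl(postgres_ddl)
for statement in result:
    print(statement)
```

Plain regex substitution is easy to write but fragile (a column literally named `text` would also be rewritten), which is consistent with the "unwieldy to manage" result reported on this slide.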
21. Phase 2: Copy Faster
How We Did It
• Stitch (formerly RJMetrics Pipeline)
• Integrates with Postgres as well as other common third party sources
• 30 minute refresh cycle
• Point and click to add tables and integrations
Results
• Easy to use UI
• Incremental copy
• Pre-existing integrations and expertise (multiple engineers, customer support)
• Fully managed and relatively inexpensive
• Transparent logging (rows replicated, errors)
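To illustrate what "incremental copy" means compared with the full refresh of Phase 1, here is a minimal Python sketch of one common pattern: keep a high-water mark and copy only rows modified since the last run. This is not how Stitch is implemented internally (the deck does not say); the `updated_at` column, the `incremental_copy` function, and the sample rows are assumptions for the example.

```python
from datetime import datetime


def incremental_copy(source_rows, target, watermark):
    """Upsert rows changed after `watermark` into `target` (keyed by id).

    Returns the new watermark: the latest `updated_at` that was copied,
    or the old watermark if nothing changed.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] > watermark:
            target[row["id"]] = row  # insert new row or overwrite old copy
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical production rows.
source = [
    {"id": 1, "status": "placed", "updated_at": datetime(2016, 1, 1, 8, 0)},
    {"id": 2, "status": "placed", "updated_at": datetime(2016, 1, 1, 9, 20)},
    {"id": 3, "status": "shipped", "updated_at": datetime(2016, 1, 1, 9, 10)},
]

# Hypothetical warehouse state after the previous run at 08:30.
warehouse = {
    1: dict(source[0]),
    3: {"id": 3, "status": "placed", "updated_at": datetime(2016, 1, 1, 7, 45)},
}

last_run = datetime(2016, 1, 1, 8, 30)
new_watermark = incremental_copy(source, warehouse, last_run)
```

Only rows 2 and 3 are touched in this run; row 1 is left alone because it has not changed since the previous watermark.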
22. Phase 3: ELT
How We Did It
• Data Build Tool (dbt) from Fishtown Analytics
• Looker-like abstraction of tables and views
• SQL that references other SQL
• Manages dependency graph
• Options for materializing SQL (CTE, view, table)
• Set sort and distribution keys
• Simple repo deployed to EC2 tiny
Results
• Pre-aggregated tables
  • Marts: de-normalized table for an area of the business
  • Lookups: attributes for a product, location
  • Rollups: time-series aggregations for summary reporting
  • Facts: aggregations on key “business objects” (orders, customers)
• Updates every 30 minutes
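The idea of "SQL that references other SQL" with a managed dependency graph can be sketched as follows. This is a simplified Python illustration, not dbt's implementation: it scans each model for `{{ ref('...') }}` calls (the reference syntax dbt uses) and orders the models so that every model is built after the ones it depends on. The model names and SQL are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL that may reference other models.
MODELS = {
    "orders_fact": (
        "select * from {{ ref('stg_orders') }} "
        "join {{ ref('product_lookup') }} using (product_id)"
    ),
    "stg_orders": "select * from raw.orders",
    "product_lookup": "select * from raw.products",
    "daily_rollup": (
        "select order_date, count(*) "
        "from {{ ref('orders_fact') }} group by 1"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF = re.compile(r"\{\{\s*ref\('(\w+)'\)\s*\}\}")


def build_order(models):
    """Return model names ordered so dependencies come first."""
    # Map each model to the set of models it references.
    graph = {name: set(REF.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(MODELS)
print(order)
```

In this example the lookup and staging models are built first, then the fact table, then the rollup that reads from it, mirroring the lookups, facts, and rollups listed in the results.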
24. This Is What Success Looks Like
Access for Business Users
• Find many answers themselves
• Easy to filter, pivot and visualize
• Access to all existing analysis
• Data is refreshed and up-to-date
• Multiple ways to consume (web, email, links via slack)
Simple Management for Data Team
• Single source of truth
• Insight into usage
• Centralized business logic
• Git managed, easy collaboration
• Keep pace with evolving business (new countries and products)
25. Success Story: Supply Chain
Challenge
Operations team needed to ensure fast delivery of our highly in-demand products
Solution
Monitoring 2 KPIs:
• Operational Metric: daily Days on Hand (DOH)
• Success Metric: weekly Order to Ship SLA
[Charts: Mattress Inventory DOH; Order to Ship SLA]
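The deck does not define how the two KPIs are calculated. As a rough sketch, the Python below uses common definitions: Days on Hand as current inventory divided by average daily demand, and Order to Ship SLA as the share of orders shipped within a target number of days. Both formulas, the function names, and the sample numbers are assumptions, not Casper's actual logic.

```python
def days_on_hand(units_on_hand, daily_units_sold):
    """Days on Hand = current inventory / average daily demand."""
    avg_daily_demand = sum(daily_units_sold) / len(daily_units_sold)
    if avg_daily_demand == 0:
        return float("inf")  # no demand: inventory never runs out
    return units_on_hand / avg_daily_demand


def order_to_ship_sla(days_to_ship, sla_days):
    """Share of orders shipped within `sla_days` days of being placed."""
    on_time = sum(1 for days in days_to_ship if days <= sla_days)
    return on_time / len(days_to_ship)


# Hypothetical inputs: 1,200 units in stock, three days of sales history,
# and four orders with their order-to-ship times in days.
doh = days_on_hand(units_on_hand=1200, daily_units_sold=[100, 140, 120])
sla = order_to_ship_sla(days_to_ship=[1, 2, 2, 4], sla_days=2)
print(doh, sla)
```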
26. Success Story: Executive Reporting
Challenge
Executive team needed actionable metrics and the ability to track against goals while seeing trends over time
Solution
• Created a dashboard that highlights key metrics from each department
• Each metric has a goal and includes a weekly, MTD, QTD and trend view
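The MTD and QTD views mentioned on this slide come down to summing a metric from the start of the current month or quarter through today and comparing it with a goal. The Python sketch below shows one way to compute them; the daily values, the goal, and the function names are hypothetical, and calendar quarters starting in January are assumed.

```python
from datetime import date


def month_start(day):
    """First day of the month containing `day`."""
    return day.replace(day=1)


def quarter_start(day):
    """First day of the calendar quarter containing `day`."""
    first_month = 3 * ((day.month - 1) // 3) + 1
    return date(day.year, first_month, 1)


def to_date_total(daily_values, start, end):
    """Sum values whose date falls in [start, end]."""
    return sum(v for d, v in daily_values.items() if start <= d <= end)


# Hypothetical daily order counts for one metric.
orders = {
    date(2016, 3, 31): 7,
    date(2016, 4, 1): 10,
    date(2016, 5, 2): 20,
    date(2016, 5, 16): 5,
}
today = date(2016, 5, 16)

mtd = to_date_total(orders, month_start(today), today)
qtd = to_date_total(orders, quarter_start(today), today)

monthly_goal = 50  # hypothetical goal for the month
mtd_attainment = mtd / monthly_goal
print(mtd, qtd, mtd_attainment)
```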
27. There Will Always Be More Questions
• Increased Access to Data
• More Sophisticated Clients
• Tougher Questions
28. Q&A
Dave Rocamora • Solutions Architect, AWS
Erin Franz • Senior Analyst, Alliances, Looker
Scott Breitenother • VP, Data and Analytics, Casper