Data Science at Pebble
Analyzing Data to Make Smarter Watches
June 2, 2015
Today’s speakers
Scott Ward
Solutions Architect
Amazon Web Services
Kiyoto Tamura
Head of Marketing
Treasure Data
Susan Holcomb
Head of Analytics
Pebble
Data at Pebble
What is Pebble?
• Customizable smart watch with crowd-pleasing history
• $10.3MM on Kickstarter with first product
• In March, $20MM on Kickstarter with new product
Pebble Data Team: Then vs. Now
One year ago…
• No data team
• No analytics infrastructure
• Barely any data
• Barely any insights
Today…
• 5-person team (& growing!)
• Scalable analytics infrastructure via Treasure Data
• ~60MM records per day
• New product influenced by data insights
Data Science Workflow
Define the problem
Acquire the data
Fit the model
(diagram labels: "the work", "the hype")
Pebble’s First Problem
How should we measure product success?
Engagement Definition
• How can we tell someone likes the watch?
– Button presses?
– Apps downloaded / launched?
– Minimized SW bugs?
– A crazy formula combining these?
• Simplest: They are wearing the watch
– Use accelerometer
Accessing Data
60MM records per day
Scheduled jobs in TD to post-process & aggregate data
Ad hoc queries in TD to explore data (Presto, Hive)
Dashboards
Standardized output
Process: ~30 queries to get one result
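The post-process-and-aggregate step can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for a Presto or Hive query in Treasure Data; the table name, columns, and sample rows are hypothetical and are not Pebble's actual schema.

```python
import sqlite3

# In-memory database standing in for the event store (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (device_id TEXT, day TEXT, motion REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("dev_a", "2015-05-22", 4.0),
        ("dev_a", "2015-05-22", 0.1),
        ("dev_a", "2015-05-23", 6.0),
        ("dev_b", "2015-05-22", 0.2),
    ],
)

# Aggregate raw records into one row per device per day.
rows = conn.execute(
    """
    SELECT device_id, day, COUNT(*) AS records, MAX(motion) AS peak_motion
    FROM events
    GROUP BY device_id, day
    ORDER BY device_id, day
    """
).fetchall()

for row in rows:
    print(row)
```

An aggregate like this could feed a dashboard, while the same table stays available for ad hoc exploration.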
Accelerometer noise threshold
• Accelerometer picks up gestures, net motion (so we can enable cool features)
• Sensitive enough to pick up vibrations of passing train
• Goal: Determine threshold for noise so we can assess when watch is really in use
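A minimal sketch of the thresholding idea, assuming hypothetical per-minute motion readings in arbitrary units. The sample values, the two thresholds, and the function name are illustrative only, not Pebble's data or method.

```python
from typing import Iterable


def minutes_in_use(motion_per_minute: Iterable[float], noise_threshold: float) -> int:
    """Count the minutes whose motion reading exceeds the noise threshold."""
    return sum(1 for m in motion_per_minute if m > noise_threshold)


# Hypothetical readings: small values mimic ambient vibration, large values real wear.
readings = [0.0, 0.2, 0.1, 5.0, 7.5, 0.3, 6.1, 0.0]

worn_low = minutes_in_use(readings, noise_threshold=0.05)   # counts vibration as use
worn_high = minutes_in_use(readings, noise_threshold=1.0)   # ignores small vibration
print(worn_low, worn_high)
```

With a threshold that is too low, ambient vibration is counted as wear time, which is why the choice of threshold matters for any engagement metric built on it.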
Accelerometer noise threshold
First result
???
Raising the threshold
peaks shift left
spike remains
backlight data matches original threshold!!
Further validated by survey of users
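The effect of raising the threshold can be sketched as a sweep over candidate values. Everything below is an assumption for illustration: the device names, the readings, and the thresholds are made up, and the slides do not say how the actual comparison was computed.

```python
from collections import Counter


def active_minutes(readings, threshold):
    """Number of readings above the threshold."""
    return sum(1 for r in readings if r > threshold)


# Hypothetical per-minute readings for three devices.
devices = {
    "dev_a": [0.2, 0.4, 3.0, 4.0, 0.1, 5.0],
    "dev_b": [0.3, 0.3, 0.2, 2.5, 0.1, 0.2],
    "dev_c": [9.0, 9.0, 9.0, 9.0, 9.0, 9.0],  # always far above any threshold
}


def histogram(threshold):
    """Map 'active minutes' to the number of devices, for one threshold."""
    return Counter(active_minutes(r, threshold) for r in devices.values())


for t in (0.15, 0.25, 1.0):
    print(t, sorted(histogram(t).items()))
```

In this toy data the counts for dev_a and dev_b fall as the threshold rises, while dev_c stays at the maximum regardless, a rough analogue of peaks that shift and a spike that remains.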
Why this worked
• Rapid, repeated ad hoc querying lets you get an intuitive picture of the data
– What is the range?
– Where are the errors?
– Where are the inflection points?
• Few analytics infrastructure tools optimize for this
– Too focused on standardized reporting
– Want to sell you black box that spits out “insights”
Problems 2-n
• Building scalable reporting system
• Delivering insights that shaped interface for new product
• Discovering signals on user attrition
• Designing models to segment use cases
• Analyzing dozens of product elements to improve product experience
thanks <3
Product Overview
Kiyoto Tamura
Director of Developer Relations
Event Data is Everywhere…
Smartphones
Websites
Home Automation
Wearable Devices
Connected Vehicles
{
  "timestamp": "2015-05-22T13:50:00-0600",
  "event": "tap",
  "object": "button_32",
  "user": {
    "name": "Luca",
    "email": "luca@treasuredata.com",
    "twitter": "luckymethod"
  }
}
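As a rough illustration of how an event like this might be handled before loading into a schema-free store, the sketch below parses the JSON, reads the timestamp, and flattens the nested user object into dotted column names. The flatten helper and the dotted naming are assumptions made for the example, not Treasure Data's documented behavior.

```python
import json
from datetime import datetime

raw = """{
  "timestamp": "2015-05-22T13:50:00-0600",
  "event": "tap",
  "object": "button_32",
  "user": {
    "name": "Luca",
    "email": "luca@treasuredata.com",
    "twitter": "luckymethod"
  }
}"""

event = json.loads(raw)

# %z accepts a UTC offset written as -0600.
ts = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%S%z")


def flatten(obj, prefix=""):
    """Turn nested dicts into a single level, e.g. user.name (hypothetical helper)."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


row = flatten(event)
print(ts.isoformat())
print(sorted(row))
```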
Connecting the (big) data dots is hard
credit: Matt Turck @ FirstMark Capital
We provide a simple solution
Ingest Analyze Distribute
and more…
Why is Treasure Data better?
Ingest
• Streaming or batch ingestion (or both) with Treasure Agent and Embulk
• Don’t worry about changing the way you send data, Treasure Data handles it all
• 99.99% uptime, our team takes care of running the show so you don’t have to
Analyze
• Query all your data using SQL, no schema required
• Control Treasure Data through our Console, our Command Line Interface or Luigi-TD for complex automated data pipelines
• Choose Hive or Presto
• Run machine learning at scale with Hivemall
Distribute
• Expansive collection of export plugins: send data to Google Docs, Tableau, Excel, PostgreSQL…
• Connect your favorite BI tool
• Fine-grained user access control to your data
Commerce Technology Gaming Media & Ad Tech
Our growing customer base
Energy
Company
IoT
Treasure Data on AWS
EC2
• API Servers (c3.2xlarge)
• Hadoop workers (c3.8xlarge)
• Generic workers (c3.4xlarge)
S3
• Powers our schema-free, columnar store
• 50 billion events/day
• No capacity planning needed!
RDS
• Both MySQL & PostgreSQL
• Reduced ops cost
• No dedicated devops for 2.5 years
Amazon Relational Database Service (RDS)
Amazon RDS is a fully managed relational DB service that is:
– Simple to deploy
– Easy to scale
– Reliable
– Cost-effective
Ease of deployment and patching
Push-button scalability
Choice of DB Engines
Automated backups
User snapshots and cloning
Monitoring and automatic host replacement
PostgreSQL
Amazon RDS for Aurora (Preview)
Amazon RDS - Multi-Availability Zone Configuration
• Configure your RDS environment for high availability and DR
• Primary database running in one Availability Zone with Standby in another
• If the RDS instance or its Availability Zone becomes unhealthy, the DNS record for the database endpoint is repointed to the Standby
[Diagram: RDS Multi-Availability Zone Architecture. Availability Zone #1 and Availability Zone #2 each contain Web Tier and App Tier instances in Auto Scaling groups, with an RDPGW host shown in Availability Zone #1.]
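The failover behavior can be pictured with a toy model: clients connect to one stable endpoint name, and a failover changes which Availability Zone that name resolves to. This is a conceptual sketch only. The class, the endpoint name, and the zone labels are hypothetical, and no AWS API is called.

```python
class MultiAZEndpoint:
    """Toy model of a stable endpoint name that points at the current primary."""

    def __init__(self, name, primary_az, standby_az):
        self.name = name
        self.primary_az = primary_az
        self.standby_az = standby_az

    def fail_over(self):
        """Promote the standby; the endpoint name itself does not change."""
        self.primary_az, self.standby_az = self.standby_az, self.primary_az

    def resolve(self):
        """Return the zone currently serving the endpoint."""
        return self.primary_az


endpoint = MultiAZEndpoint("mydb.example.internal", "az-1", "az-2")
before = endpoint.resolve()
endpoint.fail_over()
after = endpoint.resolve()
print(endpoint.name, before, "->", after)
```

Because the application only ever knows the endpoint name, it can reconnect after a failover without a configuration change.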
Amazon RDS - Read Replicas
Region #1 Region #2
Questions?
Treasure Data
Kiyoto Tamura
@kiyototamura
treasuredata.com
Pebble
Susan Holcomb
getpebble.com
AWS
Scott Ward
aws.amazon.com
Contact us to learn more

Contenu connexe

Tendances

Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudAmazon Web Services
 
Tapping the cloud for real time data analytics
 Tapping the cloud for real time data analytics Tapping the cloud for real time data analytics
Tapping the cloud for real time data analyticsAmazon Web Services
 
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...Amazon Web Services
 
Big Data, Analytics, and Content Recommendations on AWS
Big Data, Analytics, and Content Recommendations on AWSBig Data, Analytics, and Content Recommendations on AWS
Big Data, Analytics, and Content Recommendations on AWSAmazon Web Services
 
Optimize Content Processing in the Cloud with GPU and Spot Instances
Optimize Content Processing in the Cloud with GPU and Spot InstancesOptimize Content Processing in the Cloud with GPU and Spot Instances
Optimize Content Processing in the Cloud with GPU and Spot InstancesAmazon Web Services
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
Building your First Big Data Application on AWS
Building your First Big Data Application on AWSBuilding your First Big Data Application on AWS
Building your First Big Data Application on AWSAmazon Web Services
 
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...Amazon Web Services
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)
AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)
AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)Amazon Web Services
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantageAmazon Web Services
 
The New Normal - AWSome Day Zurich 112016
The New Normal - AWSome Day Zurich 112016The New Normal - AWSome Day Zurich 112016
The New Normal - AWSome Day Zurich 112016Amazon Web Services
 
Getting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesGetting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesAmazon Web Services
 
16h00 globant - aws globant-big-data_summit2012
16h00   globant - aws globant-big-data_summit201216h00   globant - aws globant-big-data_summit2012
16h00 globant - aws globant-big-data_summit2012infolive
 
Welcome Keynote - AWS Summit Stockholm
Welcome Keynote - AWS Summit Stockholm Welcome Keynote - AWS Summit Stockholm
Welcome Keynote - AWS Summit Stockholm Amazon Web Services
 
Cloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GoogleCloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GooglePatrick Pierson
 
AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)
AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)
AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)Amazon Web Services
 
AWS APAC Webinar Week - 2015 An Amazing Year in AWS
AWS APAC Webinar Week - 2015 An Amazing Year in AWSAWS APAC Webinar Week - 2015 An Amazing Year in AWS
AWS APAC Webinar Week - 2015 An Amazing Year in AWSAmazon Web Services
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksAmazon Web Services
 

Tendances (20)

Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS Cloud
 
2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days
 
Tapping the cloud for real time data analytics
 Tapping the cloud for real time data analytics Tapping the cloud for real time data analytics
Tapping the cloud for real time data analytics
 
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
 
Big Data, Analytics, and Content Recommendations on AWS
Big Data, Analytics, and Content Recommendations on AWSBig Data, Analytics, and Content Recommendations on AWS
Big Data, Analytics, and Content Recommendations on AWS
 
Optimize Content Processing in the Cloud with GPU and Spot Instances
Optimize Content Processing in the Cloud with GPU and Spot InstancesOptimize Content Processing in the Cloud with GPU and Spot Instances
Optimize Content Processing in the Cloud with GPU and Spot Instances
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
Building your First Big Data Application on AWS
Building your First Big Data Application on AWSBuilding your First Big Data Application on AWS
Building your First Big Data Application on AWS
 
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
 
AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)
AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)
AWS re:Invent 2016: High Performance Cinematic Production in the Cloud (MAE304)
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantage
 
The New Normal - AWSome Day Zurich 112016
The New Normal - AWSome Day Zurich 112016The New Normal - AWSome Day Zurich 112016
The New Normal - AWSome Day Zurich 112016
 
Getting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesGetting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpaces
 
16h00 globant - aws globant-big-data_summit2012
16h00   globant - aws globant-big-data_summit201216h00   globant - aws globant-big-data_summit2012
16h00 globant - aws globant-big-data_summit2012
 
Welcome Keynote - AWS Summit Stockholm
Welcome Keynote - AWS Summit Stockholm Welcome Keynote - AWS Summit Stockholm
Welcome Keynote - AWS Summit Stockholm
 
Cloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GoogleCloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs Google
 
AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)
AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)
AWS re:Invent 2016: Workshop: Addressing Your Business Needs with AWS (ARC210)
 
AWS APAC Webinar Week - 2015 An Amazing Year in AWS
AWS APAC Webinar Week - 2015 An Amazing Year in AWSAWS APAC Webinar Week - 2015 An Amazing Year in AWS
AWS APAC Webinar Week - 2015 An Amazing Year in AWS
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
 

En vedette

Four Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics SystemFour Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics SystemTreasure Data, Inc.
 
Treasure Data: Move your data from MySQL to Redshift with (not much more tha...
Treasure Data:  Move your data from MySQL to Redshift with (not much more tha...Treasure Data:  Move your data from MySQL to Redshift with (not much more tha...
Treasure Data: Move your data from MySQL to Redshift with (not much more tha...Treasure Data, Inc.
 
Building a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaBuilding a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaTreasure Data, Inc.
 
Introduction to New features and Use cases of Hivemall
Introduction to New features and Use cases of HivemallIntroduction to New features and Use cases of Hivemall
Introduction to New features and Use cases of HivemallTreasure Data, Inc.
 
Scaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataScaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataTreasure Data, Inc.
 
The overview of Server-ide Bulk Loader
 The overview of Server-ide Bulk Loader The overview of Server-ide Bulk Loader
The overview of Server-ide Bulk LoaderTreasure Data, Inc.
 
Useful Design Background, Process and Execution
Useful Design Background, Process and ExecutionUseful Design Background, Process and Execution
Useful Design Background, Process and ExecutionTreasure Data, Inc.
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudTreasure Data, Inc.
 
Frontend Application Architecture, Patterns, and Workflows
Frontend Application Architecture, Patterns, and WorkflowsFrontend Application Architecture, Patterns, and Workflows
Frontend Application Architecture, Patterns, and WorkflowsTreasure Data, Inc.
 
The architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWSThe architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWSTreasure Data, Inc.
 
Hadoop meets Cloud with Multi-Tenancy
Hadoop meets Cloud with Multi-TenancyHadoop meets Cloud with Multi-Tenancy
Hadoop meets Cloud with Multi-TenancyTreasure Data, Inc.
 
Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Treasure Data, Inc.
 

En vedette (20)

Using Embulk at Treasure Data
Using Embulk at Treasure DataUsing Embulk at Treasure Data
Using Embulk at Treasure Data
 
Four Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics SystemFour Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics System
 
Treasure Data: Move your data from MySQL to Redshift with (not much more tha...
Treasure Data:  Move your data from MySQL to Redshift with (not much more tha...Treasure Data:  Move your data from MySQL to Redshift with (not much more tha...
Treasure Data: Move your data from MySQL to Redshift with (not much more tha...
 
Building a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaBuilding a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with Rocana
 
Introduction to New features and Use cases of Hivemall
Introduction to New features and Use cases of HivemallIntroduction to New features and Use cases of Hivemall
Introduction to New features and Use cases of Hivemall
 
Scaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big DataScaling to Infinity - Open Source meets Big Data
Scaling to Infinity - Open Source meets Big Data
 
Treasure Data and Heroku
Treasure Data and HerokuTreasure Data and Heroku
Treasure Data and Heroku
 
The overview of Server-ide Bulk Loader
 The overview of Server-ide Bulk Loader The overview of Server-ide Bulk Loader
The overview of Server-ide Bulk Loader
 
Useful Design Background, Process and Execution
Useful Design Background, Process and ExecutionUseful Design Background, Process and Execution
Useful Design Background, Process and Execution
 
Prototipando con Indigo Studio
Prototipando con Indigo StudioPrototipando con Indigo Studio
Prototipando con Indigo Studio
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the Cloud
 
Frontend Application Architecture, Patterns, and Workflows
Frontend Application Architecture, Patterns, and WorkflowsFrontend Application Architecture, Patterns, and Workflows
Frontend Application Architecture, Patterns, and Workflows
 
The architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWSThe architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWS
 
hotdog a TD tool for DD
hotdog a TD tool for DDhotdog a TD tool for DD
hotdog a TD tool for DD
 
Treasure Data and Fluentd
Treasure Data and FluentdTreasure Data and Fluentd
Treasure Data and Fluentd
 
Hadoop meets Cloud with Multi-Tenancy
Hadoop meets Cloud with Multi-TenancyHadoop meets Cloud with Multi-Tenancy
Hadoop meets Cloud with Multi-Tenancy
 
Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -
 
Treasure Data Cloud Strategy
Treasure Data Cloud StrategyTreasure Data Cloud Strategy
Treasure Data Cloud Strategy
 
Internals of Presto Service
Internals of Presto ServiceInternals of Presto Service
Internals of Presto Service
 
Scalable Hadoop in the cloud
Scalable Hadoop in the cloudScalable Hadoop in the cloud
Scalable Hadoop in the cloud
 

Similaire à Partner webinar presentation aws pebble_treasure_data

Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studydeep.bi
 
Levelling up your data infrastructure
Levelling up your data infrastructureLevelling up your data infrastructure
Levelling up your data infrastructureSimon Belak
 
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with AnalyticsWSO2
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Datasabnees
 
Digital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdfDigital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdfssuserd23711
 
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스Amazon Web Services Korea
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusersBob Hardaway
 
Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Crate.io
 
Top Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwareTop Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwarePanorama Software
 
Ingesting Click Data for Analytics
Ingesting Click Data for AnalyticsIngesting Click Data for Analytics
Ingesting Click Data for AnalyticsClickMeter
 
CrateDB Machine Data Platform Webinar
CrateDB Machine Data Platform Webinar CrateDB Machine Data Platform Webinar
CrateDB Machine Data Platform Webinar Caroline Stewart
 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...Mihai Criveti
 
Machine Learning on dirty data - Dataiku - Forum du GFII 2014
Machine Learning on dirty data - Dataiku - Forum du GFII 2014Machine Learning on dirty data - Dataiku - Forum du GFII 2014
Machine Learning on dirty data - Dataiku - Forum du GFII 2014Le_GFII
 
Lecture 1-big data engineering (Introduction).pdf
Lecture 1-big data engineering (Introduction).pdfLecture 1-big data engineering (Introduction).pdf
Lecture 1-big data engineering (Introduction).pdfahmedibrahimghnnam01
 
AWS Initiate Day Manchester 2019 – AWS Big Data Meets AI
AWS Initiate Day Manchester 2019 – AWS Big Data Meets AIAWS Initiate Day Manchester 2019 – AWS Big Data Meets AI
AWS Initiate Day Manchester 2019 – AWS Big Data Meets AIAmazon Web Services
 
A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionAmazon Web Services
 
How Celtra Optimizes its Advertising Platform with Databricks
How Celtra Optimizes its Advertising Platformwith DatabricksHow Celtra Optimizes its Advertising Platformwith Databricks
How Celtra Optimizes its Advertising Platform with DatabricksGrega Kespret
 
Data Analytics in your IoT Solution Fukiat Julnual, Technical Evangelist, Mic...
Data Analytics in your IoT SolutionFukiat Julnual, Technical Evangelist, Mic...Data Analytics in your IoT SolutionFukiat Julnual, Technical Evangelist, Mic...
Data Analytics in your IoT Solution Fukiat Julnual, Technical Evangelist, Mic...BAINIDA
 

Similaire à Partner webinar presentation aws pebble_treasure_data (20)

Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case study
 
Levelling up your data infrastructure
Levelling up your data infrastructureLevelling up your data infrastructure
Levelling up your data infrastructure
 
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Data
 
Digital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdfDigital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdf
 
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
 
IoT – The reality of real world solutions
IoT – The reality of real world solutions IoT – The reality of real world solutions
IoT – The reality of real world solutions
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusers
 
Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?
 
Top Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwareTop Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama Software
 
Ingesting click events for analytics
Ingesting click events for analyticsIngesting click events for analytics
Ingesting click events for analytics
 
Ingesting Click Data for Analytics
Ingesting Click Data for AnalyticsIngesting Click Data for Analytics
Ingesting Click Data for Analytics
 
CrateDB Machine Data Platform Webinar
CrateDB Machine Data Platform Webinar CrateDB Machine Data Platform Webinar
CrateDB Machine Data Platform Webinar
 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
 
Machine Learning on dirty data - Dataiku - Forum du GFII 2014
Machine Learning on dirty data - Dataiku - Forum du GFII 2014Machine Learning on dirty data - Dataiku - Forum du GFII 2014
Machine Learning on dirty data - Dataiku - Forum du GFII 2014
 
Lecture 1-big data engineering (Introduction).pdf
Lecture 1-big data engineering (Introduction).pdfLecture 1-big data engineering (Introduction).pdf
Lecture 1-big data engineering (Introduction).pdf
 
AWS Initiate Day Manchester 2019 – AWS Big Data Meets AI
AWS Initiate Day Manchester 2019 – AWS Big Data Meets AIAWS Initiate Day Manchester 2019 – AWS Big Data Meets AI
AWS Initiate Day Manchester 2019 – AWS Big Data Meets AI
 
A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in Action
 
How Celtra Optimizes its Advertising Platform with Databricks
How Celtra Optimizes its Advertising Platformwith DatabricksHow Celtra Optimizes its Advertising Platformwith Databricks
How Celtra Optimizes its Advertising Platform with Databricks
 
Data Analytics in your IoT Solution Fukiat Julnual, Technical Evangelist, Mic...
Data Analytics in your IoT SolutionFukiat Julnual, Technical Evangelist, Mic...Data Analytics in your IoT SolutionFukiat Julnual, Technical Evangelist, Mic...
Data Analytics in your IoT Solution Fukiat Julnual, Technical Evangelist, Mic...
 

Plus de Treasure Data, Inc.

GDPR: A Practical Guide for Marketers
GDPR: A Practical Guide for MarketersGDPR: A Practical Guide for Marketers
GDPR: A Practical Guide for MarketersTreasure Data, Inc.
 
AR and VR by the Numbers: A Data First Approach to the Technology and Market
AR and VR by the Numbers: A Data First Approach to the Technology and MarketAR and VR by the Numbers: A Data First Approach to the Technology and Market
AR and VR by the Numbers: A Data First Approach to the Technology and MarketTreasure Data, Inc.
 
Introduction to Customer Data Platforms
Introduction to Customer Data PlatformsIntroduction to Customer Data Platforms
Introduction to Customer Data PlatformsTreasure Data, Inc.
 
Hands-On: Managing Slowly Changing Dimensions Using TD Workflow
Hands-On: Managing Slowly Changing Dimensions Using TD WorkflowHands-On: Managing Slowly Changing Dimensions Using TD Workflow
Hands-On: Managing Slowly Changing Dimensions Using TD WorkflowTreasure Data, Inc.
 
Brand Analytics Management: Measuring CLV Across Platforms, Devices and Apps
Brand Analytics Management: Measuring CLV Across Platforms, Devices and AppsBrand Analytics Management: Measuring CLV Across Platforms, Devices and Apps
Brand Analytics Management: Measuring CLV Across Platforms, Devices and AppsTreasure Data, Inc.
 
How to Power Your Customer Experience with Data


Partner webinar presentation aws pebble_treasure_data

  • 1. Data Science at Pebble: Analyzing Data to Make Smarter Watches. June 2, 2015
  • 2. Today’s speakers: Scott Ward, Solutions Architect, Amazon Web Services; Kiyoto Tamura, Head of Marketing, Treasure Data; Susan Holcomb, Head of Analytics, Pebble
  • 4. What is Pebble? • Customizable smart watch with crowd-pleasing history • $10.3MM on Kickstarter with first product • In March, $20MM on Kickstarter with new product
  • 5. Pebble Data Team: Then vs. Now. One year ago… No data team; no analytics infrastructure; barely any data; barely any insights. Today… 5-person team (& growing!); scalable analytics infrastructure via Treasure Data; ~60MM records per day; new product influenced by data insights
  • 6. Data Science Workflow: Define the problem; Acquire the data; Fit the model [diagram labels: "the work", "the hype"]
  • 7. Pebble’s First Problem: How should we measure product success?
  • 8. Engagement Definition • How can we tell someone likes the watch? – Button presses? – Apps downloaded / launched? – Minimized SW bugs? – A crazy formula combining these? • Simplest: They are wearing the watch – Use accelerometer
  • 9. Accessing Data: 60MM records per day; scheduled jobs in TD to post-process & aggregate data; ad hoc queries in TD to explore data (Presto, Hive); dashboards; standardized output; process: ~30 queries to get one result
  • 10. Accelerometer noise threshold • Accelerometer picks up gestures, net motion (so we can enable cool features) • Sensitive enough to pick up vibrations of passing train • Goal: Determine threshold for noise so we can assess when watch is really in use
  • 13. Raising the threshold: peaks shift left; spike remains; backlight data matches original threshold!! Further validated by survey of users
  • 14. Why this worked • Rapid, repeated ad hoc querying lets you get an intuitive picture of the data – What is the range? – Where are the errors? – Where are the inflection points? • Few analytics infrastructure tools optimize for this – Too focused on standardized reporting – Want to sell you black box that spits out “insights”
  • 15. Problems 2-n • Building scalable reporting system • Delivering insights that shaped interface for new product • Discovering signals on user attrition • Designing models to segment use cases • Analyzing dozens of product elements to improve product experience
  • 17. Product Overview: Kiyoto Tamura, Director of Developer Relations
  • 18. Event Data is Everywhere… Smartphones, Websites, Home Automation, Wearable Devices, Connected Vehicles
  • 19. Event Data is Everywhere… Smartphones, Websites, Home Automation, Wearable Devices, Connected Vehicles. Example event: { "timestamp": "2015-05-22T13:50:00-0600", "event": "tap", "object": "button_32", "user": { "name": "Luca", "email": "luca@treasuredata.com", "twitter": "luckymethod" } }
  • 20. Connecting the (big) data dots is hard (credit: Matt Turck @ FirstMark Capital)
  • 21. We provide a simple solution: Ingest, Analyze, Distribute, and more…
  • 22. Why is Treasure Data better? (Ingest, Analyze, Distribute) • Streaming or Batch ingestion (or both) with Treasure Agent and Embulk • Don’t worry about changing the way you send data, Treasure Data handles it all • 99.99% uptime, our team takes care of running the show so you don’t have to • Query all your data using SQL, no schema required • Control Treasure Data through our Console, our Command Line Interface or Luigi-TD for complex automated data pipelines • Choose Hive or Presto • Run machine learning at scale with Hivemall • Expansive collection of export plugins: send data to Google Docs, Tableau, Excel, PostgreSQL… • Connect your favorite BI tool • Fine grained user access control to your data
  • 23. Our growing customer base: Commerce, Technology, Gaming, Media & Ad Tech, Energy Company, IoT
  • 24. Treasure Data on AWS. EC2: • API Servers (c3.2xlarge) • Hadoop workers (c3.8xlarge) • Generic workers (c3.4xlarge). S3: • Powers our schema-free, columnar store • 50 billion events/day • No capacity planning needed! RDS: • Both MySQL & PostgreSQL • Reduced ops cost • No dedicated devops for 2.5 years
  • 25.
  • 26.
  • 27. Amazon Relational Database Service (RDS). Amazon RDS is a fully managed relational DB service that is: – Simple to deploy – Easy to scale – Reliable – Cost-effective. Features: ease of deployment and patching; push-button scalability; choice of DB engines; automated backups; user snapshots and cloning; monitoring and automatic host replacement; Amazon RDS for Aurora (Preview) [DB engine logos, including PostgreSQL]
  • 28. Amazon RDS - Multi-Availability Zone Configuration • Configure your RDS environment for high availability and DR • Primary database running in one Availability Zone with Standby in another • DNS name changes due to unhealthy RDS instance or Availability Zone
  • 29. RDS Multi-Availability Zone Architecture [diagram: Availability Zone #1 and Availability Zone #2, each with Web Tier and App Tier Auto Scaling groups; RDPGW]
  • 30. Amazon RDS - Read Replicas [diagram: Region #1, Region #2]
  • 31.
  • 32. Questions? Treasure Data: Kiyoto Tamura, @kiyototamura, treasuredata.com. Pebble: Susan Holcomb, getpebble.com. AWS: Scott Ward, aws.amazon.com. Contact us to learn more
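
The noise-threshold search described in slides 10 and 13 can be sketched as a simple sweep: count how many minutes register as "active" at each candidate threshold and look for the point where the count stops falling. This is only an illustration under stated assumptions, not Pebble's actual method. The per-minute motion readings, the candidate thresholds, and the function names `active_minutes` and `sweep_thresholds` are all hypothetical.

```python
from typing import Dict, Iterable, List


def active_minutes(motion: Iterable[int], threshold: int) -> int:
    """Count the minutes whose motion reading is strictly above the threshold."""
    return sum(1 for value in motion if value > threshold)


def sweep_thresholds(motion: List[int], thresholds: Iterable[int]) -> Dict[int, int]:
    """Return {threshold: active minute count} for each candidate threshold."""
    return {t: active_minutes(motion, t) for t in thresholds}


# Hypothetical per-minute motion readings for one watch (arbitrary units).
# Small values stand in for ambient vibration, large ones for real wear.
sample_motion = [0, 1, 2, 1, 0, 45, 60, 52, 2, 1, 48, 0]

if __name__ == "__main__":
    results = sweep_thresholds(sample_motion, [0, 1, 2, 10])
    for threshold, minutes in results.items():
        print(f"threshold={threshold:>2} active_minutes={minutes}")
```

On this toy data the count drops from 9 to 6 to 4 and then stays at 4, which suggests that readings of 2 or below are noise. In the talk, the chosen threshold was additionally checked against backlight data and a user survey; a sweep like this only narrows down the candidates.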

Editor's notes

  1. Add Pebble logo
  2. KEY MESSAGES: Looker is one of the fastest growing data and analytics companies in history, both in terms of customer growth and revenue growth. Organizations that use Looker see incredible levels of engagement by both data analysts and business users.
  3. KEY MESSAGES: Looker is one of the fastest growing data and analytics companies in history, both in terms of customer growth and revenue growth. Organizations that use Looker see incredible levels of engagement by both data analysts and business users.
  4. KEY MESSAGES: Looker is one of the fastest growing data and analytics companies in history, both in terms of customer growth and revenue growth. Organizations that use Looker see incredible levels of engagement by both data analysts and business users.
  5. KEY MESSAGES: Looker is one of the fastest growing data and analytics companies in history, both in terms of customer growth and revenue growth. Organizations that use Looker see incredible levels of engagement by both data analysts and business users.
  6. KEY MESSAGES: This is a completely new architecture that fundamentally changes the way you connect, describe, and explore your data.
  7. KEY MESSAGE: We’re seeing growth across industries, each of which has its own unique use cases for the tool.
  8. KEY MESSAGES: This is a completely new architecture that fundamentally changes the way you connect, describe, and explore your data.
  9. Start out
  10. To summarise what Amazon RDS offers, across three ‘flavours’ of the service, you can think about the feature set in three main areas. Deployment: a choice of database engines and overall application compatibility; ease of deployment with pre-configured parameters and settings. Management: automated backups and disaster recovery; user snapshots and cloning, plus software patching and upgrades. Scaling: push-button scaling through the AWS Management Console.
  11. Here we are focusing on a Multi-Availability Zone (Multi-AZ) configuration as it relates to RDS. A Multi-AZ configuration lets you run a master database in one AZ and keep a copy of the data in sync on another instance in a different AZ of the region you are operating in. When a problem is detected with the RDS instance or the production AZ, the DNS records are switched to the standby database, and your applications then work against the standby. When the original production instance comes back up, it becomes the new standby.
  12. This functionality exists for MySQL, PostgreSQL, and Aurora. Some databases need to support a large number of read-only operations. Running all of these reads against the same production database that handles all your writes can negatively impact the database and slow down all operations. In that case it may be appropriate to run a read replica of your database to take the read load off the production database.
  13. Closing
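
The read-replica idea in note 12 can be sketched as a small routing layer in the application: writes go to the production (primary) endpoint, and reads are spread across replicas. This is a minimal sketch under stated assumptions, not an AWS API. The class `ReplicaRouter` and the endpoint names are hypothetical, and treating every statement that starts with `SELECT` as a read is a deliberate simplification.

```python
from itertools import cycle
from typing import Iterator, List


class ReplicaRouter:
    """Send writes to the primary endpoint and spread reads over replicas."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._read_targets: Iterator[str] = cycle(replicas or [primary])

    def endpoint_for(self, sql: str) -> str:
        """Pick an endpoint: round-robin replica for SELECT, primary otherwise."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_targets) if is_read else self.primary


# Hypothetical endpoint names, for illustration only.
router = ReplicaRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
```

Because replicas are typically updated asynchronously, a read routed this way may briefly lag behind a write that has just completed, so reads that must see the latest data would still go to the primary.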