MetaScale is a subsidiary of
Sears Holdings Corporation
Ankur Gupta
General Manager
Big Data Business Wins: Real-Time
Inventory Tracking with Hadoop
MetaScale
A big data technology solution provider and subsidiary of Sears Holdings that delivers a full spectrum of services focused on big data and Hadoop
MetaScale is a ‘big data accelerator’. We provide the technology, talent, and solutions to accelerate the value from big data
Offices in Chicago, San Jose, and Pune, India
Our Parent Company: Sears Holdings
A Fortune 100 company with nearly $40 billion in annual revenue
The nation’s fourth-largest broadline retailer, with almost 2,500 full-line and specialty retail stores in the US and Canada
A front runner in big data efforts, including driving personalized marketing and generating savings from legacy migration
Runs one of the biggest rewards programs, which captures and analyzes a very large number of customer transactions quickly
Objectives
• What tools can be used to migrate Point-of-Sale (POS) data from different legacy systems to Hadoop
• How to establish an Enterprise Data Hub with Hadoop in order to create a single version of truth
• What a reference architecture for near real-time inventory tracking looks like
Challenge of Achieving Big Data ROI
From a recent Wikibon survey:
• Enterprise practitioners believe the potential value of Big Data is significant
• However, many are struggling to derive maximum value from their big data investments
  • 46% of Big Data practitioners report that they have only realized partial value from their Big Data deployments
  • 2% declared their Big Data deployments total failures, with no value achieved
Source: Enterprises Struggling to Derive Maximum Value from Big Data, Wikibon, Sep 2013
http://wikibon.org/wiki/v/Enterprises_Struggling_to_Derive_Maximum_Value_from_Big_Data
According to Wikibon, there are three compelling reasons for this struggle to achieve maximum business value from big data:
1. A lack of skilled Big Data practitioners
2. "Raw" and relatively immature technology
3. A lack of a compelling business use case
Making Business Decisions Quickly
• The Hadoop ecosystem lets a business create value from its data by processing and storing vast amounts of data from disparate sources.
• Hadoop enables faster processing of larger data sets for analytics and deep analytics.
• Storm, Kafka and Cassandra provide the technology for real-time analytics that makes the business agile.
Keys for Achieving Big Data Success
• Bring IT and Business together
• Define realistic success criteria
• Ask “what are you really trying to accomplish?”
• Understand how Hadoop will fit into your environment
• See the end results first before you start your journey
• Discover your big data use case!
Real-Time Inventory Management
Real-Time Analytics with Cassandra
By implementing Hadoop and Cassandra in a traditional environment, Business Intelligence teams are able to provide more accurate, real-time inventory, pricing, sales and return data, as well as predict ideal floor plans.
Managing inventory with up-to-the-second data...
In-Store Purchases
Online Purchases
Real-time inventory data ensures that items ordered are in stock.
CHALLENGE
• POS data was stored in different formats in different legacy systems (Mainframe and Teradata)
• No single version of truth
• No real-time capability
Inventory Batch File Sent ONCE A DAY
This latency resulted in potential lost sales and customer dissatisfaction when customers ordered items that were no longer in stock.
Real-Time Analytics with Cassandra
POS Volume
• Average of 100,000 messages per day
• Peak of 77,000 messages in 1 hour, at 4:00 am the day after Thanksgiving
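To put the two volume figures on the same scale, the short calculation below converts both to messages per second. It is a rough sketch that assumes messages arrive evenly within each period, which real POS traffic will not do; the constant names are ours, only the two figures come from the slide.

```python
# Figures quoted on the slide.
AVG_MESSAGES_PER_DAY = 100_000
PEAK_MESSAGES_PER_HOUR = 77_000

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60

# Assumption: messages are spread evenly within the day and within the peak hour.
avg_per_second = AVG_MESSAGES_PER_DAY / SECONDS_PER_DAY
peak_per_second = PEAK_MESSAGES_PER_HOUR / SECONDS_PER_HOUR
peak_to_average = peak_per_second / avg_per_second

print(f"average: {avg_per_second:.2f} msg/s")    # average: 1.16 msg/s
print(f"peak:    {peak_per_second:.2f} msg/s")   # peak:    21.39 msg/s
print(f"peak is about {peak_to_average:.1f}x the average rate")  # about 18.5x
```

Under that assumption, the pipeline would need to be sized for the peak hour rather than the daily average.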
SOLUTION – Phase 1
• Consolidate all POS data from different legacy systems and applications into a Hadoop Enterprise Data Hub
• Create a Single Version of Truth
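One way to picture the "single version of truth" step is as a set of per-source parsers that all emit the same record type. The sketch below is illustrative only: the `PosRecord` schema, both record layouts, and the sample lines are hypothetical and are not taken from the Sears systems described in the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PosRecord:
    """Common schema every legacy source is converted into (hypothetical)."""
    store_id: str
    sku: str
    quantity: int
    sold_at: datetime


def parse_mainframe(line: str) -> PosRecord:
    # Hypothetical fixed-width layout: store(5) sku(8) qty(4) timestamp(14)
    return PosRecord(
        store_id=line[0:5].strip(),
        sku=line[5:13].strip(),
        quantity=int(line[13:17]),
        sold_at=datetime.strptime(line[17:31], "%Y%m%d%H%M%S"),
    )


def parse_warehouse_export(line: str) -> PosRecord:
    # Hypothetical pipe-delimited export: sku|store|timestamp|qty
    sku, store_id, sold_at, quantity = line.split("|")
    return PosRecord(
        store_id=store_id.strip(),
        sku=sku.strip(),
        quantity=int(quantity),
        sold_at=datetime.fromisoformat(sold_at),
    )


# Made-up sample lines describing the same sale in both formats.
mainframe_line = "01234SKU000010002" + "20131129040000"
export_line = "SKU00001|01234|2013-11-29T04:00:00|2"

a = parse_mainframe(mainframe_line)
b = parse_warehouse_export(export_line)
print(a == b)  # True: both sources now yield the same normalized record
```

Once every source maps to one schema, downstream jobs can treat the data hub as a single table regardless of where a record originated.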
Real-Time Analytics with Cassandra
Hadoop enables a single version of truth for deep analytics,
but there is still no real-time capability…
SOLUTION – Phase 2
Real-Time Analytics with Cassandra
• Use Kafka to extract messages from the POS queue
• Kafka sends messages to Cassandra for real-time processing
SOLUTION – End-to-End
Messages are sent from Cassandra to
Hadoop for back-end, deep analytics.
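The end-to-end flow can be sketched with in-memory stand-ins: a queue in place of the Kafka-fed POS queue, a dictionary in place of the Cassandra real-time table, and a list in place of the Hadoop landing area. None of this uses the real Kafka, Cassandra, or Hadoop client APIs; the message fields and function names are assumptions made for illustration.

```python
from collections import defaultdict, deque

# In-memory stand-ins, NOT real client APIs:
pos_queue = deque()          # plays the role of the POS queue read via Kafka
on_hand = defaultdict(int)   # plays the role of the Cassandra real-time table
analytics_log = []           # plays the role of the Hadoop landing area


def apply_message(msg: dict) -> None:
    """Update the real-time view, then forward the message for deep analytics."""
    key = (msg["store_id"], msg["sku"])
    # Sales reduce stock; any other message type (e.g. a return) adds it back.
    delta = -msg["quantity"] if msg["type"] == "sale" else msg["quantity"]
    on_hand[key] += delta
    analytics_log.append(msg)


def is_in_stock(store_id: str, sku: str) -> bool:
    return on_hand[(store_id, sku)] > 0


# Hypothetical starting stock and messages, for illustration only.
on_hand[("01234", "SKU00001")] = 2
pos_queue.extend([
    {"type": "sale", "store_id": "01234", "sku": "SKU00001", "quantity": 1},
    {"type": "sale", "store_id": "01234", "sku": "SKU00001", "quantity": 1},
])

while pos_queue:
    apply_message(pos_queue.popleft())

print(is_in_stock("01234", "SKU00001"))  # False: both units were sold
print(len(analytics_log))                # 2
```

The point of the split is that an online order can check `is_in_stock` against the continuously updated view instead of a batch file that is a day old, while the same messages still reach the analytics store.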
Real-Time Analytics with Cassandra
4 Node 4 Node
11 Node
Faster decision making…
Business Intelligence teams are able to provide more accurate, real-time inventory, pricing, sales and return data.
BEFORE Cassandra Real-Time Solution: Inventory Batch File Sent Once a Day
AFTER Cassandra Real-Time Solution: Inventory Data Sent in Sub-Milliseconds
RESULT
Increased sales by improving item
availability.
Real-Time Analytics with Cassandra
Value for the Organization
Increased customer satisfaction because customers are able to get what they ordered.
Real-Time Analytics with Cassandra
Value for the Organization
Cost savings from reduced
customer service center calls.
Aha Moments
Cost savings from reduced truck
load times.
Additional Components
Hadoop Enterprise Data Hub gives business users access to
more data from more sources for deep analytics.
Hadoop Enterprise Data Hub
Single Version of Truth
Firewall Issues
Normally, either Storm or Kafka can be used to send POS messages to Cassandra.
In certain situations where a firewall exists between the data source and the processing cluster (for example, one created by a merger or spin-out), Storm and Kafka can be used together to send messages across the firewall.
Unique Challenge for a Complex Enterprise
Real-Time Over Firewall
Unique Challenge for a Complex Enterprise
3 Node
Storm Cluster
Advanced Analytics
Data-Driven Decision Making
Once the Hadoop / Cassandra framework is in place, data from virtually any source can be consumed in the Enterprise Data Hub for Advanced Analytics.
Inventory forecasting with Machine Learning on data from weather reports
New ways to use social, geo, and sensor data to develop predictive models…
KEY TAKEAWAYS
• Enterprise Data Hub and single version of truth for all data
• Hadoop can help you answer questions that were difficult or cost prohibitive to answer before
• Hadoop can transform your organization’s approach to how you use data and ask questions you never even thought of
• Must have a clear strategy and long-term plan
• Leverage the right partnerships to achieve your goals
Big Data Business Wins
Q & A
Questions?
Your One-Stop Big Data Helpline
phone: 1-800-234-8769
email: contact@metascale.com
visit: www.metascale.com

Big Data Business Wins: Real-time Inventory Tracking with Hadoop

  • 1. MetaScale is a subsidiary of Sears Holdings Corporation MetaScale is a subsidiary of Sears Holdings Corporation Ankur Gupta General Manager Big Data Business Wins: Real-Time Inventory Tracking with Hadoop
  • 2. MetaScale A big data technology solution provider and subsidiary of Sears Holdings that delivers a full spectrum of services focused on big data and Hadoop MetaScale is a ‘big data accelerator’. We provide the technology, talent, and solutions to accelerate the value from big data Offices in Chicago, San Jose, and Pune, India A Fortune 100 company, nearly $40 billion in annual revenue The nation’s fourth largest broad line retailer with almost 2,500 full- line and specialty retail stores in the US and Canada A front runner in big data efforts including driving personalized marketing and generating savings from legacy migration Running one of the biggest rewards programs that captures and analyzes very large number of customer transactions quickly Our Parent Company Sears Holdings 2
  • 3. Objectives
      • What tools can be used to migrate Point-of-Sales (POS) data from different legacy systems to Hadoop
      • Establishing an Enterprise Data Hub with Hadoop in order to create a single version of truth
      • What is a reference architecture for near real-time inventory tracking
  • 4. Challenge of Achieving Big Data ROI
      From a recent Wikibon survey:
      • Enterprise practitioners believe the potential value of Big Data is significant
      • However, many are struggling to derive maximum value from their big data investments
          • 46% of Big Data practitioners report that they have only realized partial value from their Big Data deployments
          • 2% declared their Big Data deployments total failures, with no value achieved
      Source: Enterprises Struggling to Derive Maximum Value from Big Data, Wikibon, Sep 2013 http://wikibon.org/wiki/v/Enterprises_Struggling_to_Derive_Maximum_Value_from_Big_Data
  • 5. Challenge of Achieving Big Data ROI
      According to Wikibon, there are three compelling reasons for this struggle to achieve maximum business value from big data:
      1. A lack of skilled Big Data practitioners
      2. "Raw" and relatively immature technology
      3. A lack of a compelling business use case
      Source: Enterprises Struggling to Derive Maximum Value from Big Data, Wikibon, Sep 2013 http://wikibon.org/wiki/v/Enterprises_Struggling_to_Derive_Maximum_Value_from_Big_Data
  • 6. Making Business Decisions Quickly
      • The Hadoop ecosystem gives business the ability to create value from its data by processing and storing vast amounts of data from disparate sources.
      • Hadoop enables faster processing on larger data sets for analytics and deep analytics.
      • Storm, Kafka and Cassandra provide the technology for real-time analytics to make business agile.
  • 7. Keys for Achieving Big Data Success
      • Bring IT and Business together
      • Define realistic success criteria
      • Ask “what are you really trying to accomplish?”
      • Understand how Hadoop will fit into your environment
      • See the end results first before you start your journey
      • Discover your big data use case!
  • 9. Real-Time Analytics with Cassandra: Managing inventory with up-to-the-second data
      By implementing Hadoop and Cassandra in a traditional environment, Business Intelligence teams are able to provide more accurate and real-time inventory, pricing, sales and return data, as well as predict ideal floor plans.
      • In-Store Purchases
      • Online Purchases
      Real-time inventory data ensures that items ordered are in stock.
  • 10. Real-Time Analytics with Cassandra: Challenge
      • POS data was stored in different formats in different legacy systems (Mainframe and Teradata)
      • No single version of truth
      • No real-time capability
      Inventory batch file sent ONCE A DAY. This latency resulted in potential loss of sales and customer dissatisfaction when items were ordered that were no longer in stock.
      POS Volume
      • Average 100,000 messages per day
      • Peak 77,000 messages in 1 hour at 4:00am the day after Thanksgiving
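The POS volume figures on slide 10 can be turned into per-second rates to show how far the peak sits above the average. This is a back-of-the-envelope sketch only: it assumes the daily average is spread evenly over 24 hours, which the slide does not state.

```python
# Figures taken from slide 10; the even-spread assumption is ours.
AVG_MESSAGES_PER_DAY = 100_000
PEAK_MESSAGES_PER_HOUR = 77_000

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60

avg_rate = AVG_MESSAGES_PER_DAY / SECONDS_PER_DAY      # messages per second
peak_rate = PEAK_MESSAGES_PER_HOUR / SECONDS_PER_HOUR  # messages per second
peak_to_avg = peak_rate / avg_rate

print(f"average: {avg_rate:.2f} msg/s")
print(f"peak:    {peak_rate:.2f} msg/s")
print(f"peak is about {peak_to_avg:.1f}x the average rate")
```

Under that assumption the peak hour runs at roughly 18 times the average rate, which is the kind of burst a once-a-day batch file cannot reflect.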
  • 11. Real-Time Analytics with Cassandra: SOLUTION – Phase 1
      • Condense all POS data from different legacy systems and applications into Hadoop Enterprise Data Hub
      • Create a Single Version of Truth
      Hadoop enables a single version of truth for deep analytics, but there is still no real-time capability…
  • 12. Real-Time Analytics with Cassandra: SOLUTION – Phase 2
      • Use Kafka to extract messages from POS queue
      • Kafka sends messages to Cassandra for real-time processing
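The Phase 2 flow (POS queue, then Kafka, then Cassandra) can be illustrated with in-memory stand-ins. This is a minimal sketch, not the deck's implementation: the `deque` plays the role of the message queue, the dictionary plays the role of a Cassandra table, and the message shape `(store_id, sku, quantity_delta)`, the sample values, and the `consume` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical POS message shape: (store_id, sku, quantity_delta).
# Negative deltas are sales; positive deltas are returns or restocks.
pos_queue = deque([
    ("store-001", "sku-42", -1),
    ("store-001", "sku-42", -2),
    ("store-002", "sku-42", -1),
    ("store-001", "sku-42", +1),
])

# Stand-in for a real-time inventory table keyed by (store_id, sku).
inventory = defaultdict(int)
inventory[("store-001", "sku-42")] = 10
inventory[("store-002", "sku-42")] = 5


def consume(queue, table):
    """Drain the queue, applying each message to the table as it arrives."""
    processed = 0
    while queue:
        store_id, sku, delta = queue.popleft()
        table[(store_id, sku)] += delta
        processed += 1
    return processed


count = consume(pos_queue, inventory)
print(count, dict(inventory))
```

The point of the sketch is that stock levels change per message rather than once per daily batch; a real deployment would replace the stand-ins with actual Kafka consumers and Cassandra writes.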
  • 13. Real-Time Analytics with Cassandra: SOLUTION – End-to-End
      Messages are sent from Cassandra to Hadoop for back-end, deep analytics.
      Diagram labels: 4 Node, 4 Node, 11 Node
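The hand-off from the real-time store to Hadoop could take the form of grouping events into per-day batches before they are written out. The sketch below assumes that approach; the record fields, the sample values, the `partition_by_day` helper, and the Hive-style `dt=YYYY-MM-DD` partition key are illustrative assumptions, not details from the deck.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event records as they might be read back from the real-time store.
events = [
    {"ts": "2014-01-01T04:00:05+00:00", "store_id": "store-001", "sku": "sku-42", "delta": -1},
    {"ts": "2014-01-01T09:15:00+00:00", "store_id": "store-002", "sku": "sku-42", "delta": -1},
    {"ts": "2014-01-02T11:30:00+00:00", "store_id": "store-001", "sku": "sku-42", "delta": 1},
]


def partition_by_day(records):
    """Group records under a per-day partition key for batch export."""
    batches = defaultdict(list)
    for record in records:
        day = datetime.fromisoformat(record["ts"]).date().isoformat()
        batches[f"dt={day}"].append(record)
    return dict(batches)


batches = partition_by_day(events)
for key in sorted(batches):
    print(key, len(batches[key]))
```

Each batch could then be written to the data hub as one file per day, keeping the real-time path and the deep-analytics path separate.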
  • 14. Real-Time Analytics with Cassandra: RESULT
      Faster decision making: Business Intelligence teams are able to provide more accurate and real-time inventory, pricing, sales and return data.
      • BEFORE Cassandra Real-Time Solution: Inventory Batch File Sent Once a Day
      • AFTER Cassandra Real-Time Solution: Inventory Data Sent in Sub-Milliseconds
  • 15. Real-Time Analytics with Cassandra: Value for the Organization
      • Increased sales by improving item availability.
      • Increased customer satisfaction because the customer is able to get what was ordered.
  • 16. Real-Time Analytics with Cassandra: Value for the Organization
      • Cost savings from reduced customer service center calls.
      • Aha Moments: Cost savings from reduced truck load times.
  • 18. Hadoop Enterprise Data Hub: Single Version of Truth
      Hadoop Enterprise Data Hub gives business users access to more data from more sources for deep analytics.
  • 19. Unique Challenge for a Complex Enterprise: Firewall Issues
      Normally, Storm or Kafka can be used to send POS messages to Cassandra. In certain situations where a firewall exists between the data source and the processing cluster, such as one created by a merger or spin-out, both Storm and Kafka can be used to send messages over the firewall.
  • 20. Unique Challenge for a Complex Enterprise: Real-Time Over Firewall
      Diagram label: 3 Node Storm Cluster
  • 21. Data-Driven Decision Making: Advanced Analytics
      Once the Hadoop / Cassandra framework is in place, data from virtually any source can be consumed in the Enterprise Data Hub for Advanced Analytics.
      • Inventory forecasting with Machine Learning on data from Weather Reports
      • New ways to use Social, Geo, Sensor data to develop predictive models…
  • 23. Big Data Business Wins
      • Enterprise Data Hub and single version of truth for all data
      • Hadoop can help you answer questions that were difficult or cost prohibitive to answer before
      • Hadoop can transform your organization’s approach to how you use data and ask questions you never even thought of
      • Must have a clear strategy and long-term plan
      • Leverage the right partnerships to achieve your goals
  • 25. Your One-Stop Big Data Helpline
      • phone: 1-800-234-8769
      • email: contact@metascale.com
      • visit: www.metascale.com