This deck gives an overview of analytics and big data in practice, with a set of industry use cases from different domains. It should be useful for anyone trying to understand analytics and big data.
5. Analytics
Analytics is the iterative, methodical exploration of an organization’s data, with an emphasis on statistical analysis, to enable data-driven decision making.
7. Why do companies care…?
● Digital innovation and disruptions
○ Netflix vs Blockbuster
○ Amazon’s disruptive innovation
○ Google vs GPS
○ Traditional advertising vs social media advertising
○ Competitive advantage
8. Why now?
● Storage has become cheaper
● Infrastructure is readily available in the cloud
● Open source
● Data science and machine learning are moving beyond research
9. Data everywhere in every domain
❖ Web - content, link structure, clicks
❖ Retail - customer details, point of sale, inventory
❖ Medical - literature, patient history, drug details …
❖ Financial - stocks, currencies, financial news, commodities
❖ Insurance - customer history, claim details …
❖ Telecom - call detail records, customer history & profile …
❖ Banking - customer transactions, profile …
❖ Travel & Hospitality - travel itinerary, schedule …
10. Industries
● Medical, Healthcare and Life Sciences
● Automobile and Manufacturing
● Travel and Hospitality
● Retail and Ecommerce
● Web, Social Media and Digital Media
● Telecommunication
● Banking, Finance and Insurance
● Energy
● Sports, Media and Entertainment
● Niche areas such as autonomous driving and image and video processing
11. Medical, Healthcare and Life Sciences
● Cancer research using pattern recognition on cells
● Clinical trials involving millions of drug compositions
● Disease prediction from tests and probabilistic studies, e.g. diabetes and Down syndrome prediction
● Collection and storage of test results such as scan reports and blood test reports
● Image processing, text processing and complex pattern recognition analysis
● Analyzing literature and patents to find cures for diseases
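
As a rough illustration of the "tests and probabilistic studies" bullet, the sketch below applies Bayes' theorem to a single screening test. The function name and all numbers (prevalence, sensitivity, specificity) are hypothetical and chosen only for illustration; they are not figures from any real study.

```python
def posterior_probability(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem."""
    # Probability of a positive result from someone who has the disease
    true_positive = sensitivity * prevalence
    # Probability of a positive result from someone who does not
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs: 1% prevalence, 90% sensitivity, 95% specificity
p = posterior_probability(prevalence=0.01, sensitivity=0.90, specificity=0.95)
print(f"P(disease | positive) = {p:.3f}")  # prints 0.154
```

Even with a fairly accurate test, a positive result on a rare condition leaves the probability of disease low, which is why such predictions are treated probabilistically.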
12. Automobile and manufacturing
● One of the front-runners in adopting big data and analytics, even before cloud computing (during the cluster computing days)
● Analyzes vast amounts of:
○ Customer feedback
○ Inventory data
○ Parts repair and lifetime reports
○ Competitive information
○ Market research data
● The goal is to arrive at the best design, one that will last a long time in the market
● Some of these analyses can run for months
● The resulting design is then tested in a simulation environment
13. Travel and Hospitality
● Revenue management was one of the techniques that revived the airline industry, which was close to collapse in the early 1990s
● Similar techniques are used in the hospitality industry as well, given the growing number of hotels and how competitive the market has become
● The growing number of online portals shows how much competition there is in this industry
● The volume of data generated and consumed in this industry is huge
14. Retail and Ecommerce
● Inventory tracking across franchises
● Relationship between inventory overrun and
discounts
● Recommending the right products in under a second so that customers complete their purchase
● Imagine the scaling problems faced by online retailers such as Amazon and Flipkart, with millions of products and millions of customers to handle
● The capability to handle price elasticity in the market
● Example use case: Best Buy vs Amazon
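
To make the price elasticity bullet concrete, here is a minimal sketch that computes the midpoint (arc) price elasticity of demand from two price/quantity observations. The function name and the sample prices and quantities are assumptions made for illustration, not data from any retailer.

```python
def arc_elasticity(p0: float, p1: float, q0: float, q1: float) -> float:
    """Midpoint (arc) price elasticity of demand between two observations."""
    # Percentage changes measured against the midpoint of each pair
    pct_change_quantity = (q1 - q0) / ((q0 + q1) / 2)
    pct_change_price = (p1 - p0) / ((p0 + p1) / 2)
    return pct_change_quantity / pct_change_price


# Hypothetical observation: price cut from 100 to 90, units sold rise from 1000 to 1200
e = arc_elasticity(p0=100.0, p1=90.0, q0=1000.0, q1=1200.0)
print(f"elasticity = {e:.2f}")  # prints -1.73
```

An absolute value above 1 suggests demand is elastic at this price point, so a discount of this size would be expected to increase revenue.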
15. Web, Social media and Digital media
● With the volume of tweets and posts that Twitter and Facebook handle, notifying the right set of people is a daunting task
● The job recommendations and PYMK (People You May Know) that LinkedIn provides are really hard problems at that scale
● The digital advertising industry has a really complicated ecosystem,
○ With so many publishers, agencies and advertisements
○ That has to satisfy many parameters such as number of impressions, CTR and conversions
● This ecosystem runs online bidding within milliseconds to choose the advertisement to show on each page
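
One simple way to picture the ad selection step is to rank candidate bids by expected revenue per impression, that is, bid per click multiplied by predicted CTR. The sketch below is an illustrative simplification, not how any particular ad exchange works; the `Bid` class, `pick_ad` function and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    advertiser: str
    bid_per_click: float  # amount the advertiser pays for a click (assumed)
    predicted_ctr: float  # estimated click-through rate for this page view (assumed)


def pick_ad(bids: List[Bid]) -> Optional[Bid]:
    """Return the bid with the highest expected revenue per impression."""
    if not bids:
        return None
    return max(bids, key=lambda b: b.bid_per_click * b.predicted_ctr)


# Hypothetical candidates for a single page view
bids = [Bid("A", 2.00, 0.010), Bid("B", 0.80, 0.040), Bid("C", 1.50, 0.015)]
winner = pick_ad(bids)
print(winner.advertiser)  # prints B
```

The highest raw bid does not win here: advertiser B bids less per click but is far more likely to be clicked, so its expected value per impression is highest.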
16. Telecommunication
● More than 16 players in India operate on very tight call-rate margins
● To earn revenue they have to draw interest from every single customer,
○ By targeting them with the right offer and promotion at the right time
○ They operate on micro-segments of a few thousand customers each, out of their 160 million customers
● A huge number of mobile subscribers are on the move and making a lot of calls
● All of this generates a lot of data in the form of CDRs (call detail records), etc.
● All of it needs to be processed, stored, analyzed and archived appropriately
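
The micro-segmentation idea above can be sketched as grouping customers by a few attributes so that each group can receive its own offer. The attribute names, the 500-minute threshold and the sample records below are all hypothetical; real segmentation would use many more dimensions and far more data.

```python
from collections import defaultdict


def segment_key(customer: dict) -> tuple:
    """Build a segment key from region, plan type and a usage band."""
    # The 500-minute cut-off is an arbitrary illustrative threshold
    usage_band = "high" if customer["monthly_minutes"] >= 500 else "low"
    return (customer["region"], customer["plan"], usage_band)


def build_segments(customers: list) -> dict:
    """Group customer ids by segment key."""
    segments = defaultdict(list)
    for customer in customers:
        segments[segment_key(customer)].append(customer["id"])
    return dict(segments)


# Hypothetical customer records
customers = [
    {"id": 1, "region": "south", "plan": "prepaid", "monthly_minutes": 620},
    {"id": 2, "region": "south", "plan": "prepaid", "monthly_minutes": 150},
    {"id": 3, "region": "south", "plan": "prepaid", "monthly_minutes": 700},
    {"id": 4, "region": "north", "plan": "postpaid", "monthly_minutes": 90},
]
segments = build_segments(customers)
for key, ids in segments.items():
    print(key, ids)
```

Each resulting segment can then be matched with an offer suited to it, for example a talk-time pack for high-usage prepaid customers.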
17. Banking, Finance and Insurance
● Banks run a lot of promotions by sending emails, SMS, etc. to their customers
● They profit from every single conversion that comes out of these campaigns
● Imagine how hard it is to choose the right set of customers to target with the right set of offers to maximize revenue from these campaigns
● People who work in finance, such as the stock market, have a large volume of data in a wide variety of forms to consume and mine for meaningful insights in order to arrive at the right investment strategy
● Processing claims and detecting fraud is a really hard problem to solve at scale
● Insurance firms have started using sophisticated techniques such as text processing on claim statements to detect fraud
18. Energy
● The amount of image processing involved in analyzing satellite images to locate energy sources is enormous
● Even a small error in precision can cause a huge loss
● Hence the results need to be refined over a huge number of iterations to minimize the error
19. Sports, Media and Entertainment
● Football clubs and IPL franchises have started modeling players to arrive at an optimal playing strategy
● For example, the NZ cricket team at one point started using such systems to the extent of automating team selection
● Media and entertainment companies need to keep up with social media, competing both with it and with their peers
24. Why should I bother…?
● Industry growing rapidly
● More organizations adopting
● Technology trends
● Skill gap and projection
● Skills getting obsolete
27. Industry 4.0
Cyber-Physical Systems (CPS) are integrations of computation, networking, and physical processes. Embedded computers and networks monitor and control the physical processes, with feedback loops where physical processes affect computations and vice versa.
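
The feedback loop described above can be sketched with a toy simulation of a heater controlled by an embedded computer. Everything here is an assumption made for illustration: the proportional control rule, the gain of 0.5, and the heating and heat-loss coefficients are not taken from any real system.

```python
def simulate(setpoint: float = 22.0, start_temp: float = 15.0,
             ambient: float = 10.0, steps: int = 200) -> list:
    """Simulate a heater under proportional control and return the temperature history."""
    temp = start_temp
    history = []
    for _ in range(steps):
        # Computation: read the sensor and choose a heater power between 0 and 1
        error = setpoint - temp
        power = min(1.0, max(0.0, 0.5 * error))
        # Physical process: the heater adds heat, the room loses heat to the outside
        temp += 2.0 * power - 0.05 * (temp - ambient)
        history.append(temp)
    return history


history = simulate()
print(f"final temperature = {history[-1]:.2f}")  # prints 21.43
```

The measured temperature drives the computation, and the computed heater power drives the temperature, which is the two-way coupling in the definition. With these made-up coefficients the room settles slightly below the setpoint, a known limitation of purely proportional control.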