LECTURE L21
L21 Big Data and Analytics
Big Data
With the computer revolution, digital data became possible

Over the years, data has grown exponentially
“Big Data” has become a
platform in itself, with new
possibilities
Global Data is Growing Fast
Data in Digital Universe vs. Data Storage Cost, 2010-2015
Source: Mary Meeker, KPCB
Data is a New Growth Platform
The Network: Large investments in fiber optic & last-mile cable created
connectivity that facilitated the early Internet growth

The Software: Optimising the network with software became far more capital
efficient than additional capital expenditure buildouts, ultimately resulting in
the creation of pervasive networks (Siloed DCs -> AWS) and pervasive software
(Siebel -> Salesforce)

The Infrastructure: Emergence of pervasive software created the need to optimise
the performance of the network and store extraordinary amounts of data at
extremely low prices

The Data: Next Big Wave: Leveraging this unlimited connectivity and storage to
collect / aggregate / correlate / interpret all of this data to improve people’s
lives and enable enterprises to operate more efficiently
Evolution of Data Platform
Source: Mary Meeker, KPCB
Data Generators
Source: Mary Meeker, KPCB
Improve people’s lives and enable
enterprises to operate more efficiently
“Data is moving from something you use outside
the workstream to becoming a part of the
business app itself.”
— Frank Bien, CEO of Looker
Big Data Examples
Macy's Inc. and real-time pricing
The retailer adjusts pricing in near-real time for 73 million
items, based on demand and inventory.
Source: Ten big data case studies in a nutshell
Big Data Examples
Tipp24 AG, a platform for placing bets
The company uses software to analyse billions of
transactions and hundreds of customer attributes, and to
develop predictive models that target customers and
personalise marketing messages on the fly.
Source: Ten big data case studies in a nutshell
Big Data Examples
Wal-Mart Stores Inc. and search
The mega-retailer's latest search engine for Walmart.com
includes semantic data. The platform, designed in-house,
relies on text analysis, machine learning and even
synonym mining to produce relevant search results.

Wal-Mart says adding semantic search has increased the share of
online shoppers completing a purchase by 10% to 15%.
Source: Ten big data case studies in a nutshell
Big Data Examples
PredPol Inc. and repurposing
The Los Angeles and Santa Cruz police departments, a team
of educators and a company called PredPol have taken an
algorithm used to predict earthquakes, tweaked it and
started feeding it crime data.

The software can predict where crimes are likely to occur
down to 500 square feet. In LA, there's been a 33%
reduction in burglaries and 21% reduction in violent crimes
in areas where the software is being used.
Source: Ten big data case studies in a nutshell
Big Data Examples
American Express and business intelligence
AmEx started looking for indicators that could really predict
loyalty and developed sophisticated predictive models to
analyse historical transactions and 115 variables to forecast
potential churn.

The company believes it can now identify 24% of Australian
accounts that will close within the next four months.
Source: Ten big data case studies in a nutshell
Big Data Examples
A Bank and IBM
A large US bank uses IBM machine learning technologies to
analyse credit card transactions.
Using machine learning and stream computing to detect financial fraud
TEDxUofM - Jameson Toole - Big Data for Tomorrow
What is Big Data?
Big data is high-volume, high-velocity and/or high-variety
information assets that demand cost-effective, innovative
forms of information processing that enable enhanced
insight, decision making, and process automation. 

Gartner
What is Big Data?
Big data refers to a process that is used when traditional
data mining and handling techniques cannot uncover the
insights and meaning of the underlying data. Data that is
unstructured or time sensitive or simply very large cannot be
processed by relational database engines. This type of data
requires a different processing approach called big data,
which uses massive parallelism on readily-available
hardware. 

Techopedia
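The "massive parallelism" in the definition above is often organised as a map step, a grouping step and a reduce step. The sketch below is only an illustration of that idea on a single machine: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, the chunks are processed one after another rather than on separate machines, and it is not the API of Hadoop or any other tool.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each group of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Count words across chunks; each chunk is mapped independently."""
    pairs = []
    for chunk in chunks:  # in a real cluster, each chunk could go to a different machine
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big insights"]))
```

Because each chunk is mapped without looking at any other chunk, the map step is the part that can be spread over many inexpensive machines.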
What is Big Data?
“Big data is the oil of the 21st century and analytics is the
combustion engine.”
Peter Sondergaard, Gartner Research
What is Big Data?
Byte: one grain of rice

Kilobyte: handful of rice

Megabyte: big pot of rice

Gigabyte: truck full of rice

Terabyte: container ship full of rice

Petabyte: covers Manhattan

Exabyte: covers the west coast of the US

Zettabyte: fills the Pacific

Yottabyte: Earth-sized rice ball
David Wellman: What is Big Data?
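Each step in the rice analogy is a factor of 1,000. As a rough sketch of that arithmetic, assuming decimal (SI) units where 1 kilobyte = 1,000 bytes (binary units of 1,024 are also in common use), the hypothetical helpers below convert between units:

```python
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
    "petabyte", "exabyte", "zettabyte", "yottabyte",
]


def bytes_in(unit):
    """Number of bytes in one decimal (SI) unit, e.g. 'gigabyte' -> 10**9."""
    return 1000 ** UNITS.index(unit)


def describe(n_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(n_bytes)
    index = 0
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return value, UNITS[index]


if __name__ == "__main__":
    print(bytes_in("zettabyte"))
    print(describe(2_500_000))
```

On this scale a zettabyte is 10^21 bytes, which is why the analogy needs an ocean rather than a truck.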
[Figure labels: Big Data, Internet, Computers, Early computers]
What is Big Data?
Big Data is not about the size of the
data, it’s about the value within the
data

This value can be used for marketing,
business optimisation, getting
insights, improving health, security
etc.
Data Analytics
Why Big Data Analytics?
Understand the data the company has

Process data to see patterns,
correlations and information that can
be used to make better decisions

Obtain insights that are otherwise not
known
Data Analytics
TRADITIONAL APPROACH
Structured and Repeatable Analysis
Business users: Determine what questions to ask
IT: Structures the data to answer the question

BIG DATA APPROACH
Iterative and Exploratory Analysis
IT: Delivers a platform to enable creative discovery
Business users: Explore what questions could be asked
Tools for Data Analytics
NoSQL databases: MongoDB, Cassandra, HBase, Hypertable

Storage: S3, Hadoop Distributed File System

Servers: EC2, Google App Engine, Heroku

MapReduce: Hadoop, Hive, Pig, Cascading, S4, MapR

Processing: R, Yahoo! Pipes, Solr/Lucene, BigSheets
Two Types of Data Analysis Problems
Supervised Learning:
Learn from data where we have labels
for all the data we’ve seen so far

Example: Determining Spam Emails

Unsupervised Learning:
Learn from data where we don’t have
any labels

Example: Grouping Emails
Learning is about discovering hidden patterns in data
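To make the supervised case concrete, here is a minimal sketch of learning from labelled emails. It is a crude word-count scorer, not a production spam filter: the four training emails, the labels "spam" and "ham", and the function names `train` and `classify` are all made up for illustration.

```python
from collections import Counter


def train(labelled_emails):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in labelled_emails:
        counts[label].update(text.lower().split())
    return counts


def classify(counts, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {
        label: sum(word_counts[word] for word in words)
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical labelled data: the labels are what make this "supervised".
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)

if __name__ == "__main__":
    print(classify(model, "free money offer"))
    print(classify(model, "monday meeting"))
```

In the unsupervised case the second element of each training pair would be missing, so the program could only group similar emails together without knowing what the groups mean.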
Clustering
One of the oldest problems in unsupervised data analysis

In clustering the goal is to group data according to similarity

Algorithms such as K-means are used for clustering
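A minimal sketch of K-means on 2-D points, using only the standard library. Two assumptions that are not from the slides: the centroids are seeded with the first k points (real implementations usually seed randomly), and the loop runs for a fixed number of iterations instead of checking for convergence.

```python
import math


def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # assumption: seed with the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(labels)
    print(centroids)
```

Similarity here is plain Euclidean distance, so points that are close together end up with the same label.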
Clustering
For each artifact found, its
distance north (N) and east (E) of
the marker is recorded

That is a Data Set

Before the dig, a historian
has said that three families
lived in the location
Clustering
Similar: close in physical
distance

You assign each data point
to one and only one group

The groups are called
clusters
Clustering
Clustering is the unsupervised learning problem
where you take your data and assign each data point to
exactly one group, or cluster

Uses unlabelled data
Clustering
We may have collected data but we don’t know what to
do with it

We might want to explore the data without a particular
end goal in mind

Perhaps the data will suggest interesting avenues for
further analysis

In this case, we say that we're performing exploratory
data analysis
Exploratory data analysis
We don’t know what we are looking for

Data point = color of pixel and location of pixel

Dissimilarity is the distance in color
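One way to make "distance in color" concrete is the Euclidean distance between two (R, G, B) triples. This is an assumption for illustration; the slide does not say which colour distance is meant, and the function names below are hypothetical.

```python
import math


def colour_distance(a, b):
    """Euclidean distance between two (R, G, B) colours."""
    return math.dist(a, b)


def nearest_colour(pixel, palette):
    """Return the palette colour that is least dissimilar to the pixel."""
    return min(palette, key=lambda colour: colour_distance(pixel, colour))


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    print(colour_distance((0, 0, 0), (3, 4, 0)))
    print(nearest_colour((250, 10, 10), [red, blue]))
```

A clustering algorithm can use this distance in place of physical distance, so pixels of similar colour fall into the same group.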
Exploratory data analysis
In some cases
labelling is too
expensive

For example,
the news changes
every day and
there is too
much of it
Using Big Data to Influence People
Alexander Nix, CEO Cambridge Analytica
Data Analysis as a Platform
THEN
Complex tools operated by Data Analysts
Chaos of data silos across the company

NOW
Real-time data analytics platform like Looker
Customer Data as a Platform
THEN
Difficult to customise, lack of
automated customer insights

NOW
Real-time intelligence that automatically tracks
and analyses interactions with customers
Mapping Data as a Platform
THEN
Difficult and expensive to collect data
Limited in-app digital map usage

NOW
Mapping platforms like Mapbox
Cloud Data Monitoring as a Platform
THEN
Expensive and clunky point solution
Lengthy implementation cycles
Only used by System Administrators

NOW
Cloud monitoring platforms like Datadog
Next
Games and gamification
Why you should play video games and why it is important
that your kids play video games

 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

L21 Big Data and Analytics

  • 1. LECTURE L21 L21 Big Data and Analytics
  • 2. Big Data With the computer revolution, digital data becomes possible. Over the years, data has grown exponentially. “Big Data” has become a platform by itself with new possibilities.
  • 3. Global Data is Growing Fast Data in Digital Universe vs. Data Storage Cost, 2010-2015 Source: Mary Meeker, KPCB
  • 4. Data is a New Growth Platform: The Network, The Software, The Infrastructure, The Data. Large investments in fiber optic & last-mile cable created connectivity that facilitated early Internet growth. Optimising the network with software became far more capital efficient than additional capital expenditure buildouts, ultimately resulting in the creation of pervasive networks (Siloed DCs -> AWS) and pervasive software (Siebel -> Salesforce). Emergence of pervasive software created the need to optimise the performance of the network and store extraordinary amounts of data at extremely low prices. Next Big Wave: Leveraging this unlimited connectivity and storage to collect / aggregate / correlate / interpret all of this data to improve people’s lives and enable enterprises to operate more efficiently.
  • 5. Evolution of Data Platform Source: Mary Meeker, KPCB
  • 7. Improve people’s lives and enable enterprises to operate more efficiently
  • 8. “Data is moving from something you use outside the workstream to becoming a part of the business app itself.” — Frank Bien, CEO of Looker
  • 10. Big Data Examples Macy's Inc. and real-time pricing The retailer adjusts pricing in near-real time for 73 million items, based on demand and inventory. Source: Ten big data case studies in a nutshell
  • 11. Big Data Examples Tipp24 AG, a platform for placing bets The company uses software to analyse billions of transactions and hundreds of customer attributes, and to develop predictive models that target customers and personalise marketing messages on the fly. Source: Ten big data case studies in a nutshell
  • 12. Big Data Examples Wal-Mart Stores Inc. and search The mega-retailer's latest search engine for Walmart.com includes semantic data. The platform, which was designed in-house, relies on text analysis, machine learning and even synonym mining to produce relevant search results. Wal-Mart says adding semantic search has increased the rate of online shoppers completing a purchase by 10% to 15%. Source: Ten big data case studies in a nutshell
  • 13. Big Data Examples PredPol Inc. and repurposing The Los Angeles and Santa Cruz police departments, a team of educators and a company called PredPol have taken an algorithm used to predict earthquakes, tweaked it and started feeding it crime data. The software can predict where crimes are likely to occur down to 500 square feet. In LA, there's been a 33% reduction in burglaries and 21% reduction in violent crimes in areas where the software is being used. Source: Ten big data case studies in a nutshell
  • 14. Big Data Examples American Express and business intelligence AmEx started looking for indicators that could really predict loyalty and developed sophisticated predictive models to analyse historical transactions and 115 variables to forecast potential churn. The company believes it can now identify 24% of Australian accounts that will close within the next four months. Source: Ten big data case studies in a nutshell
  • 15. Big Data Examples A Bank and IBM A large US bank uses IBM machine learning technologies to analyse credit card transactions. Using machine learning and stream computing to detect financial fraud
  • 16. TEDxUofM - Jameson Toole - Big Data for Tomorrow
  • 17.
  • 18. What is Big Data?
  • 19. What is Big Data? Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation. Gartner
  • 20. What is Big Data? Big data refers to a process that is used when traditional data mining and handling techniques cannot uncover the insights and meaning of the underlying data. Data that is unstructured or time sensitive or simply very large cannot be processed by relational database engines. This type of data requires a different processing approach called big data, which uses massive parallelism on readily-available hardware. Techopedia
  • 21. What is Big Data? “Big data is the oil of the 21st century and analytics is the combustion engine.” —Peter Sondergaard, Gartner Research
  • 22. What is Big Data? Byte: one rice David Wellman: What is Big Data?
  • 23. What is Big Data? Byte: one rice Kilobyte: handful of rice David Wellman: What is Big Data?
  • 24. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice David Wellman: What is Big Data?
  • 25. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice Gigabyte: Truck full of rice David Wellman: What is Big Data?
  • 26. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice Gigabyte: Truck full of rice Terabyte: Containership full of rice Petabyte: Covers Manhattan David Wellman: What is Big Data?
  • 27. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice Gigabyte: Truck full of rice Terabyte: Containership full of rice Petabyte: Covers Manhattan Exabyte: Covers the west coast of US David Wellman: What is Big Data?
  • 28. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice Gigabyte: Truck full of rice Terabyte: Containership full of rice Petabyte: Covers Manhattan Exabyte: Covers the west coast of US Zettabyte: Fills the Pacific David Wellman: What is Big Data?
  • 29. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice Gigabyte: Truck full of rice Terabyte: Containership full of rice Petabyte: Covers Manhattan Exabyte: Covers the west coast of US Zettabyte: Fills the Pacific Yottabyte: Earth size riceball David Wellman: What is Big Data?
  • 30. What is Big Data? Byte: one rice Kilobyte: handful of rice Megabyte: Big pot of rice Gigabyte: Truck full of rice Terabyte: Containership full of rice Petabyte: Covers Manhattan Exabyte: Covers the west coast of US Zettabyte: Fills the Pacific Yottabyte: Earth size riceball David Wellman: What is Big Data? Big Data Internet Computers Early computers
  • 31. What is Big Data? Big Data is not about the size of the data, it’s about the value within the data. This value can be used for marketing, business optimisation, getting insights, improving health, security etc.
  • 33. Why Big Data Analytics? Understand the data the company has. Process data to see patterns, correlations and information that can be used to make better decisions. Obtain insights that are otherwise not known.
  • 34. Data Analytics TRADITIONAL APPROACH: Structured and Repeatable Analysis. Business users determine what questions to ask; IT structures the data to answer the question. BIG DATA APPROACH: Iterative and Exploratory Analysis. IT delivers a platform to enable creative discovery; business users explore what questions could be asked.
  • 35. Tools for Data Analytics NoSQL databases: MongoDB, Cassandra, HBase, Hypertable. Storage: S3, Hadoop Distributed File System. Servers: EC2, Google App Engine, Heroku. MapReduce: Hadoop, Hive, Pig, Cascading, S4, MapR. Processing: R, Yahoo! Pipes, Solr/Lucene, BigSheets
  • 36. Two Types of Data Analysis Problems Supervised Learning: Learn from data where we have labels for all the data we’ve seen so far. Example: Determining spam emails. Unsupervised Learning: Learn from data but we don’t have any labels. Example: Grouping emails. Learning is about discovering hidden patterns in data.
  • 37. Clustering One of the oldest problems in unsupervised data analysis In clustering the goal is to group data according to similarity Algorithms such as K-means are used for clustering
  • 38. Clustering For each artifact found, its location north (N) and east (E) of the marker is recorded. That is a data set. Before the dig, a historian has said that three families lived in the location.
  • 39. Clustering Similar: close in physical distance You assign each data point to one and only one group The groups are called clusters
  • 40. Clustering Clustering is the unsupervised learning problem where you take your data and assign each data point to exactly one group, or cluster. Uses unlabelled data.
  • 41. Clustering We may have collected data but we don’t know what to do with it. We might want to explore the data without a particular end goal in mind. Perhaps the data will suggest interesting avenues for further analysis. In this case, we say that we're performing exploratory data analysis.
  • 42. Exploratory data analysis We don’t know what we are looking for Data point = color of pixel and location of pixel Dissimilarity is the distance in color
  • 43. Exploratory data analysis In some cases labelling is too expensive. For example, the news changes every day and there is too much of it.
  • 44. Using Big Data to Influence People
  • 45. Alexander Nix, CEO Cambridge Analytica
  • 46.
  • 47. Data Analysis as a Platform THEN: Complex tools operated by Data Analysts; chaos of data silos across the company. NOW: Real-time data analytics platform like Looker
  • 48. Customer Data as a Platform THEN: Difficult to customise, lack of automated customer insights. NOW: Real-time intelligence that automatically tracks and analyses interactions with customers
  • 49. Mapping Data as a Platform THEN: Difficult and expensive to collect data; limited in-app digital map usage. NOW: Mapping platforms like Mapbox
  • 50. Cloud Data Monitoring as a Platform THEN: Expensive and clunky point solution; lengthy implementation cycles; only used by System Administrators. NOW: Cloud monitoring platforms like Datadog
  • 51. Next Games and gamification Why you should play video games and why it is important that your kids play video games
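
The clustering slides (37 to 40) describe recording each artifact's position north and east of a marker and then grouping the points by physical distance into three clusters, one per family. The following is a minimal sketch of K-means for that setting, not the lecture's own code. The coordinates in `artifacts` are made up for illustration, the function name `kmeans` is hypothetical, and the farthest-first choice of starting centroids is an assumption made so that the result is deterministic.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain K-means (Lloyd's algorithm). Returns (centroids, labels)."""
    # Assumed initialisation: start at the first point, then repeatedly
    # add the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins exactly one cluster,
        # the one whose centroid is nearest.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
    return centroids, labels


# Hypothetical (N, E) offsets from the marker, in metres.
artifacts = [
    (1.0, 1.2), (1.5, 0.8), (0.8, 1.1),
    (8.0, 8.3), (8.4, 7.9), (7.7, 8.1),
    (1.1, 8.0), (0.9, 8.5), (1.4, 7.8),
]

centroids, labels = kmeans(artifacts, k=3)
print(labels)
print(centroids)
```

No labels are supplied anywhere: the only inputs are the unlabelled points and the number of clusters, which here comes from the historian's statement that three families lived at the site.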