A LITTLE BEE BOOK
What is
Big Data?
from a blog post by Mike Ferguson
with additional material from Bob Yelland
For more copies of this book, or to read others in the series, visit: littlebeelibrary.com
Everywhere you look today, people are looking
down at a mobile device. They are online: browsing,
collaborating, shopping for goods and services, and
transacting business.
And it’s not just the consumer. It is happening in
business-to-business as well.
In today’s online world, customers and prospects
have become all-powerful.
Social networks, comparison websites, and review
websites have allowed them to quickly become well
informed before they buy, and with information on
their side, they will sacrifice loyalty in the blink of an
eye or the click of a mouse if quality and service are
not good.
In such a fast-moving, online, mobile economy, where
company lifespans are shrinking, it is not surprising
that company CEOs are so customer-focused.
This is about business survival where companies
need to:
•	 Retain and grow customers
•	 Optimise business operations
•	 Reduce risk
•	 Improve financial management
Business survival is becoming as much about
customer retention as it is about growth, and if
companies are going to survive, they need to know as
much as possible about their customers.
So the question is: what do you know about them?
Most of what companies know is typically held
in a data warehouse – a database that collects
transactions and looks at customer transaction
activity over time to understand who is buying what
through which channel.
So unless there is a transaction, we know very little.
With competition breathing down your neck, the
next question is: can you get more data, analyse it
and enrich what you already know about customers?
The answer, of course, is yes.
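To make the data warehouse idea concrete, here is a minimal sketch of the "who is buying what through which channel" question, using Python's built-in sqlite3 module as a stand-in for a real warehouse. The table name, column names and rows are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical transaction table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(customer TEXT, product TEXT, channel TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("alice", "laptop", "web", 900.0),
        ("alice", "mouse", "store", 20.0),
        ("bob", "laptop", "web", 950.0),
        ("bob", "case", "web", 30.0),
    ],
)


def spend_by_channel(connection):
    """Return total spend per customer and channel, sorted by both."""
    rows = connection.execute(
        "SELECT customer, channel, SUM(amount) FROM transactions "
        "GROUP BY customer, channel ORDER BY customer, channel"
    )
    return rows.fetchall()


summary = spend_by_channel(conn)
for customer, channel, total in summary:
    print(customer, channel, total)
```

Note what the query cannot see: anything a customer did that never ended in a transaction.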
We can get a lot of data from a wide range of new
sources. These include:
1.	Click stream data
(all the clicks of every visitor on your website(s))
Analysing this type of data allows you to
understand site navigation behaviour, the paths
people take to buying products and services, what
else they looked at on the way to buying, paths
that led to abandonment, etc.
This helps improve customer experience and
conversion. It may also be possible to associate
clicks with customers and prospects.
2.	Shopping cart data from your website
This lets you see what people are putting into and
taking out of shopping carts en route to your online
checkout.
3. Social network data
(e.g. Twitter, Facebook, LinkedIn)
Analysing this type of data allows you to get
additional information about customers that you
don’t yet have and identify sentiment, i.e. what
people are saying about your products, your
customer service, your brand, and their likes and
dislikes.
Analysis also allows you to identify who the
influencers are in the network and how people are
connected across multiple communities.
Targeting influencers with marketing campaigns
could significantly boost sales.
4. Sensor data
This is data from smart products (e.g. GPS
sensors in phones) to give you information on
product usage or location.
Sensor data may also exist to help monitor
production lines, asset performance, medical
equipment, supply chains and distribution
chains, e.g. to see if customers are getting
deliveries on time.
The challenge with these four data types is that
they don’t all fit in the traditional tables of rows and
columns that we are used to in relational databases.
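As an illustration of the clickstream analysis described earlier, the sketch below rebuilds each visitor's path from raw click events and then looks for abandoned carts and the overall conversion rate. The event format, the page names and the function names are assumptions made for this example, not taken from any particular analytics product.

```python
from collections import defaultdict

# Hypothetical click events in time order: (visitor_id, page).
clicks = [
    ("v1", "home"), ("v1", "product"), ("v1", "cart"), ("v1", "checkout"),
    ("v2", "home"), ("v2", "product"), ("v2", "cart"),
    ("v3", "home"), ("v3", "search"),
]


def paths_by_visitor(events):
    """Group clicks into one ordered page path per visitor."""
    paths = defaultdict(list)
    for visitor, page in events:
        paths[visitor].append(page)
    return dict(paths)


def abandoned_carts(paths):
    """Visitors who reached the cart but never the checkout."""
    return sorted(
        visitor
        for visitor, pages in paths.items()
        if "cart" in pages and "checkout" not in pages
    )


def conversion_rate(paths):
    """Share of visitors whose path includes the checkout page."""
    converted = sum(1 for pages in paths.values() if "checkout" in pages)
    return converted / len(paths)


paths = paths_by_visitor(clicks)
print(abandoned_carts(paths))
print(round(conversion_rate(paths), 2))
```

Each visitor's path has a different length, which is one simple reason this kind of data sits awkwardly in fixed rows and columns.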
A key reason for this is that the format varies
by data type. Text, for example, is unstructured data
but we still want to analyse it to extract details about
people, products, locations, monetary amounts,
dates, and times and to understand sentiment.
Also, data volumes can be very large, and there is an
increasing requirement to capture and analyse high
velocity data in real time.
These three attributes of big data (volume, variety
and velocity) have necessitated a new technological
approach to both the storage and analysis of data.
Companies need a single approach to analyse
structured (e.g. transactions), semi-structured (e.g.
JSON, XML), and unstructured (e.g. text, image) data.
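To show what "semi-structured" means in practice, the following sketch takes two JSON records with different shapes and flattens them into a table with a shared set of columns. The records and the `flatten` helper are hypothetical; a real pipeline would need to decide how to handle lists, missing fields and type conflicts.

```python
import json

# Hypothetical semi-structured records: the second has an extra "note" field.
raw_events = [
    '{"customer": "alice", "order": {"id": 1, "items": ["laptop", "mouse"]}}',
    '{"customer": "bob", "order": {"id": 2, "items": ["case"]}, "note": "gift"}',
]


def flatten(record, prefix=""):
    """Turn nested objects into dotted column names, e.g. "order.id"."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


rows = [flatten(json.loads(line)) for line in raw_events]

# The column set is only known after reading the data; gaps become None.
columns = sorted({name for row in rows for name in row})
table = [[row.get(col) for col in columns] for row in rows]

print(columns)
for line in table:
    print(line)
```

The schema here is discovered from the data rather than declared up front, the reverse of how a relational table is normally defined.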
Because big data is more complex, larger in volume
and arriving very quickly, new types of analysis have
emerged. These new analytical workloads include:
•	 Analysis of data in motion (streaming analytics)
•	 Analytics at the edge – for sensor data
•	 Complex analysis of structured data
•	 Machine learning to find patterns and
correlations in data
•	 Exploratory analysis of un-modelled
multi‑structured data such as Twitter data for
sentiment analytics
•	 Graph analysis, e.g. social networks, fraud
detection and real-time recommendation engines
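As a small illustration of graph analysis on social network data, the sketch below ranks accounts by how many others follow them, a very rough proxy for influence. The edge list and the `top_influencers` function are hypothetical, and real influencer analysis would typically use richer measures than a simple follower count.

```python
from collections import Counter

# Hypothetical "follows" edges: (follower, followed).
follows = [
    ("ann", "cat"), ("bob", "cat"), ("dan", "cat"),
    ("cat", "eve"), ("ann", "eve"),
    ("eve", "bob"),
]


def top_influencers(edges, n=2):
    """Rank accounts by in-degree, i.e. number of followers."""
    in_degree = Counter(followed for _, followed in edges)
    return in_degree.most_common(n)


print(top_influencers(follows))
```

The data is naturally a set of relationships rather than a set of records, which is why graph analysis is treated as a workload of its own.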
The result has been the emergence of new platforms
better suited to these analytical workloads. They
extend the analytical environment beyond the data
warehouse into multiple types of data store.
NoSQL databases are non-relational data stores (e.g.
key-value, document and graph stores) suited to
unstructured or semi-structured data.
Apache Hadoop is an open-source software
framework used for distributed storage and
processing of very large data sets.
Apache Spark is a fast, in-memory data
processing engine for streaming and machine
learning workloads.
These will be discussed further in other Little Bee
books.
The IBM Watson Data Platform is a cloud-based
data and analytics platform designed to integrate
all types of data and enable artificial
intelligence-powered decision-making, such as natural
language-based discovery, machine learning and
cognitive analytics services.
The Watson Data Platform is based on Apache
Spark technology and encompasses relational
databases, document stores, graph and Hadoop
environments.
The goal: to make it simple for business leaders and
data professionals to collect, organise, govern and
secure all data, so they can get the insights needed
to become a cognitive business.
Why not sign up for a free trial?
© Copyright IBM Corporation 2017. All Rights Reserved.
IBM, the IBM logo and ibm.com are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other countries, or both.
Other product, company or service names may be trademarks or service marks of others.
