2. Big data in context
[Diagram. Labels: Internet of things; Ubiquitous computing; Social media; Data generation; Big data; Data management; Data storage and management; The cloud; Data science; Data analysis; Data visualization; Data analysis and presentation; Better decisions]
Vidgen, R. (2014). Big data: an introduction. The BigDataScience blog. http://datasciencebusiness.wordpress.com/
3. Big data
• Big data is a general term used to describe the
voluminous amount of unstructured and semi-
structured data a company creates -- data that would
take too much time and cost too much money to load
into a relational database for analysis
• Although Big data doesn't refer to any specific
quantity, the term is often used when speaking about
petabytes and exabytes of data
http://searchcloudcomputing.techtarget.com/definition/big-data-Big-Data
4. Data volumes
• 1 Gigabyte = 1000 megabytes
• 1 Terabyte = 1000 gigabytes
• 1 Petabyte = 1000 terabytes
• 1 Exabyte = 1000 petabytes
• 1 Zettabyte = 1000 exabytes
• 1 Yottabyte = 1000 zettabytes
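The unit ladder above can be sketched in a few lines of Python. This is a minimal illustration that assumes the 1000-based steps listed on this slide; the names `UNITS` and `convert` are chosen here for illustration only.

```python
# Decimal (1000-based) storage units, smallest to largest, as listed on the slide.
UNITS = ["megabyte", "gigabyte", "terabyte", "petabyte",
         "exabyte", "zettabyte", "yottabyte"]

def convert(value, from_unit, to_unit):
    """Convert between units, using a factor of 1000 per step on the ladder."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000 ** steps

# 15 petabytes (the Large Hadron Collider figure quoted on this slide) in gigabytes
lhc_gigabytes = convert(15, "petabyte", "gigabyte")
print(lhc_gigabytes)  # 15000000
```

Converting towards a larger unit returns a fraction, for example `convert(1, "gigabyte", "terabyte")` gives 0.001.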
Big data: the Large Hadron Collider generates 15 petabytes of data p.a.
Big is only big in a context: it is not just about gigabytes – what counts is how data can be used to create value for individuals, organisations and society
but …
5. Hype?
“The ‘big’ there is purely marketing,” Mr. Reed said. “This is all fear … This is about you buying big expensive servers and whatnot.”
“The exciting thing is you can get a lot of this stuff done just in Excel,” he said. “You don’t need these big platforms. You don’t need all this big fancy stuff. If anyone says ‘big’ in front of it, you should look at them very skeptically … You can tell charlatans when they say ‘big’ in front of everything.”
http://chronicle.com/blogs/wiredcampus/big-data-is-bunk-obama-campaigns-tech-guru-tells-university-leaders/47885
6. Inter-connectedness
Big data is not just a technical problem – it is part of a
complex sociotechnical entanglement …
• Regulatory and legal aspects
• Technologies
• Ethical implications
• Stakeholders
• Problems and “solutions”
• Socio-political-economic factors
… with unintended consequences
8. Big data – what’s special about it?
• Zikopoulos et al. (2012), in an IBM publication,
describe ‘Big Data’ as consisting of:
– Volume - increasing amounts of data over
traditional settings.
– Velocity - information is being generated at a rate
that exceeds that of traditional systems.
– Variety - multiple emerging forms of data that are
of interest to enterprises, such as social media data
Zikopoulos P, Eaton C, DeRoos D, Deutsch T, Lapis G. 2012. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill.
9. A technical challenge
• “As data is increasingly becoming more varied, more
complex and less structured, it has become imperative
to process it quickly. Meeting such demanding
requirements poses an enormous challenge for
traditional databases and scale-up infrastructures. . . .
Big Data refers to new scale-out architectures that
address these needs. Big Data is fundamentally about
massively distributed architectures and massively
parallel processing using commodity building blocks to
manage and analyze data.”
EMC. 2012. Big data-as-a-service: a market and technology perspective, http://www.emc.com/collateral/software/white-papers/h10839-big-data-as-a-service-perspt.pdf, July (accessed January 2013).
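The scale-out idea in the EMC quote (split the work, process the parts in parallel, combine the results) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not how any particular Big Data platform works: threads on one machine stand in for separate commodity nodes, Python threads do not give real CPU parallelism, and the function names are invented for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def count_words(partition):
    """Map step: each worker counts words in its own partition only."""
    return Counter(word for line in partition for word in line.lower().split())

def merge(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right

def word_count(lines, workers=3):
    # Split the input into roughly equal partitions, one per worker.
    partitions = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, partitions))
    return reduce(merge, partials, Counter())

lines = ["big data is big", "data science", "big decisions"]
totals = word_count(lines)
print(totals["big"], totals["data"])  # 3 2
```

Adding workers changes only how the input is partitioned, not the result, which is the property that lets this style of processing scale out rather than up.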
10. Solution - the cloud
• Cloud computing is a general term for anything that involves
delivering hosted services over the Internet
• A cloud service has three distinct characteristics that differentiate
it from traditional hosting:
– It is sold on demand, typically by the minute or the hour
– It is elastic -- a user can have as much or as little of a service as
they want at any given time
– The service is fully managed by the provider (the consumer
needs nothing but a personal computer and Internet access)
• These services are broadly divided into three categories:
– Infrastructure-as-a-Service (IaaS)
– Platform-as-a-Service (PaaS)
– Software-as-a-Service (SaaS)
• The cloud can be public or private
http://searchcloudcomputing.techtarget.com/definition/cloud-computing
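The "on demand" and "elastic" characteristics above can be made concrete with a small cost comparison. This is a sketch with made-up numbers: the demand profile and the price per server-hour are hypothetical, and it assumes, for simplicity, the same unit price under both models.

```python
def on_demand_cost(hourly_demand, price_per_server_hour):
    """Elastic, on-demand model: pay only for the servers used in each hour."""
    return sum(hourly_demand) * price_per_server_hour

def fixed_cost(hourly_demand, price_per_server_hour):
    """Traditional hosting model: provision for the peak and pay for it every hour."""
    return max(hourly_demand) * len(hourly_demand) * price_per_server_hour

# Made-up demand: servers needed in each of six hours, with one spike.
demand = [2, 2, 3, 10, 3, 2]
price = 0.5  # hypothetical price per server-hour

print(on_demand_cost(demand, price))  # 11.0
print(fixed_cost(demand, price))      # 30.0
```

The gap between the two figures comes entirely from the spike: the more uneven the demand, the more elasticity matters.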
11. “IBM believes the cloud services market could be worth $200bn by 2020. Businesses are increasingly leasing data storage, computing power and web hosting services from a growing number of specialist cloud companies - effectively outsourcing their IT needs to cut costs and improve efficiency.”
http://www.bbc.co.uk/news/business-25773266
12. Internet of Things (IoT)
• Although the concept wasn't named until 1999, the
Internet of Things has been in development for
decades
• The first Internet appliance was a Coke machine at
Carnegie Mellon University in the early 1980s. The
programmers could connect to the machine over the
Internet, check the status of the machine and
determine whether or not there would be a cold drink
awaiting them, should they decide to make the trip
down to the machine
http://whatis.techtarget.com/definition/Internet-of-Things
13. Internet of Things (IoT)
• The Internet of Things (IoT) is a scenario in which
objects, animals or people are provided with unique
identifiers and the ability to automatically transfer
data over a network without requiring human-to-
human or human-to-computer interaction
• So far, the Internet of Things has been most closely
associated with machine-to-machine (M2M)
communication in manufacturing and power, oil and
gas utilities. Products built with M2M communication
capabilities are often referred to as being smart (e.g.,
a smart meter)
http://whatis.techtarget.com/definition/Internet-of-Things
14. Things
• A thing, in the Internet of Things, can be:
– a person with a heart monitor implant (physio
sensing)
– a person with a brain scanner (neuro sensing)
– a farm animal with a biochip transponder
– an automobile that has built-in sensors to alert the
driver when tire pressure is low
– … or any other natural or man-made object that can
be assigned an IP address and provided with the
ability to transfer data over a network
http://whatis.techtarget.com/definition/Internet-of-Things
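As a rough sketch of the definition above, a "thing" needs only a unique identifier and a way to emit data without a person in the loop. The `Thing` class, its field names, the reading and the threshold below are all hypothetical, a UUID stands in for the unique identifier (the slide mentions an IP address), and the network transport is left out.

```python
import json
import uuid
from dataclasses import dataclass, field

@dataclass
class Thing:
    """A hypothetical 'thing': any object with a unique identifier and a sensor."""
    kind: str
    thing_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def report(self, measurement, value, unit):
        """Build the JSON message the thing would send over the network."""
        return json.dumps({
            "thing_id": self.thing_id,
            "kind": self.kind,
            "measurement": measurement,
            "value": value,
            "unit": unit,
        })

car = Thing(kind="automobile")
message = car.report("tire_pressure", 1.6, "bar")  # made-up reading
alert = json.loads(message)["value"] < 2.0          # made-up low-pressure threshold
print(message, alert)
```

A receiving service would parse the same JSON and decide whether to alert the driver, with no human-to-computer interaction required on the sending side.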
16. Mr Cameron said the UK and Germany could find themselves on the forefront of a new "industrial revolution".
"I see the internet of things as a huge transformative development - a way of boosting productivity, of keeping us healthier, making transport more efficient, reducing energy needs, tackling climate change," he said.
BBC NEWS 9 March 2014
17. Ubiquitous computing
• Ubiquitous computing is the growing trend towards
embedding microprocessors in everyday objects so they can
communicate information
• Ubiquitous means "existing everywhere" - ubiquitous
computing devices are completely connected and constantly
available
• Ubiquitous computing relies on the convergence of wireless
technologies, advanced electronics and the Internet
• The goal of researchers working in ubiquitous computing is
to create smart products that communicate unobtrusively
(e.g., wearable computers, Google Glass, smart meters)
http://searchnetworking.techtarget.com/definition/pervasive-computing
19. Analysis and outcomes
[Diagram. Labels: Big data; Data science; Data analysis; Data visualization; Data analysis and presentation; Better decisions]
Vidgen, R. (2014). Big data: an introduction. The BigDataScience blog. http://datasciencebusiness.wordpress.com/
21. Better decisions - predictive analytics
• A predictive model that calculates strawberry
purchases based on:
– Weather forecast
– Store temperature
– Freezer sensor data
– Remaining stock per shelf life
– Sales transaction point of sale feeds
– Web searches, social mentions
http://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do
22. Predictive analytics
• For example, what data might help us predict which students will drop out?
– Assessment grades at University
– Prior educational attainment
– Social background
– Distance of home from University
– Friendship circles and networks (e.g., sports club memberships)
– Attendance at lectures and tutorials
– Interaction in lectures and tutorials
– Time spent on campus
– Time spent in library
– Number of accesses to electronic learning resources
– Textbooks purchased
– Engagement in subject-related forums
– Sentiment of social media posts
– Etc.
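One way such a drop-out model could be built is a logistic regression over a few of the indicators listed above. The sketch below is an assumption-laden toy rather than a recommended method: it uses only two indicators (attendance and assessment grade), the six training records are made up, and the model is fitted with plain stochastic gradient descent instead of a statistics library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Estimated probability that a student drops out."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(rows, labels, epochs=2000, rate=0.5):
    """Fit a logistic regression by stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict(weights, bias, features) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
    return weights, bias

# Made-up, illustrative data: (attendance rate, average grade scaled to 0-1).
rows = [(0.9, 0.8), (0.8, 0.7), (0.95, 0.6), (0.3, 0.4), (0.2, 0.5), (0.4, 0.3)]
labels = [0, 0, 0, 1, 1, 1]  # 1 = dropped out

weights, bias = train(rows, labels)
at_risk = predict(weights, bias, (0.25, 0.45))
engaged = predict(weights, bias, (0.88, 0.7))
print(round(at_risk, 2), round(engaged, 2))
```

A real model would need far more records, more of the indicators on the list, and a held-out test set before its predictions could be trusted.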
24. Some of the techniques data scientists use
• Classification
• Clustering
• Association rules
• Decision trees
• Regression
• Genetic algorithms
• Neural networks and support vector machines
• Machine learning
• Natural language processing
• Sentiment analysis
• Artificial intelligence
• Time series analysis
• Simulations
• Social network analysis
25. Technologies for data analysis: usage rates
• R and Python programming languages come above Excel
• Enterprise products bottom of the heap
King, J., & R. Magoulas (2013). Data Science Salary Survey. O’Reilly Media.
26. Data visualization
Correlation matrix based on MPG, horsepower, engine size, number of cylinders, weight, etc.
(Maserati is like a Ferrari; Lotus is not like a Cadillac)
https://boraberan.wordpress.com/2013/12/09/creating-a-correlation-matrix-in-tableau-using-r-or-table-calculations/
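A correlation matrix of this kind can be computed with a few lines of code before it is ever visualized. The sketch below is a minimal illustration that correlates attributes with one another using Pearson's coefficient. The slide's chart appears to compare car models with each other instead, and the same functions would apply to per-car attribute profiles. The values are made up for four imaginary cars and are not taken from the linked post.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Correlate every named column with every other one."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up values for four imaginary cars, for illustration only.
cars = {
    "mpg":        [30.0, 24.0, 18.0, 12.0],
    "horsepower": [90.0, 130.0, 200.0, 320.0],
    "weight":     [2.0, 2.6, 3.3, 3.9],
}

matrix = correlation_matrix(cars)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near 1 or -1 indicate a strong positive or negative linear relationship; a visualization tool then maps these numbers to colour or size.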
27. Challenges of big data
“According to a recent Gartner report, 64% of enterprises surveyed indicate that they're deploying or planning Big Data projects. Yet even more acknowledge that they still don't know what to do with Big Data.”
Gartner On Big Data: Everyone's Doing It, No One Knows Why
http://readwrite.com/2013/09/18/gartner-on-big-data-everyones-doing-it-no-one-knows-why#awesm=~ost43oe8yXjDzr
28. Big data: it's about iteration
• Start small when tackling big data
• Go with open source software
• Train existing employees who know the business
rather than hunt for data talent
• Iterate on your project as you learn which data sources
are valuable, and which questions yield real insights
• You don't have to know the end from the beginning,
but you should have a clearer view of what you hope to
achieve with Big Data than the Gartner report seems to
indicate most have
http://readwrite.com/2013/09/18/gartner-on-big-data-everyones-doing-it-no-one-knows-why#awesm=~ost43oe8yXjDzr
29. Resources
McKinsey (2011). Big data: The next frontier for innovation, competition,
and productivity
http://www.mckinsey.com/insights/business_technology/
big_data_the_next_frontier_for_innovation
Sogeti. Various reports on data analytics, privacy, legal aspects, predicting
behaviour http://vint.sogeti.com/download-big-data-reports/
The Economist (2012). Big data: Lessons from the leaders
http://www.economistinsights.com/sites/default/files/downloads/
EIU_SAS_BigData_4.pdf