SMALL DATA
INFORMED DECISION MAKING
Neil Dunlop / Ben Foster
BI Boss, Leeds, September 2015
We have a PROBLEM
“Data, in the hands of the right people,
is the most important asset an
organisation has.”
Informatica.com
Too much DATA
Fire Hose, Exponential, Tsunami
It’s the WRONG SORT of data
It makes NO SENSE to me
(and telling me about yesterday, tomorrow does not help today!)
Recording EVERYTHING gives you NOTHING
Conclusions are DIFFICULT
(and we’re not entirely sure what to do about that)
BIG DATA (to the rescue)
< volume – velocity – variety >
use a TOOL you fool
Get yourself a DATA SCIENTIST
WAIT
< we – need – more >
Skill, Time, Money, Data
It’s all about the STRATEGY
Phase 1, Phase 2, Phase 3
It’s time for SMALL DATA
We want INFORMATION
Not DATA
Definition of SMALL DATA
“Connects people with timely,
meaningful information, organised and
packaged for human consumption”
- Allen Bonde
A better definition of SMALL DATA
“Connects people with timely, meaningful
information, organised and packaged for
human consumption and empowers them
to take action”
- Ben Foster
Having REAL-TIME INFORMATION makes all the difference
more SIGNAL
less NOISE
Packaged for HUMAN CONSUMPTION
a little less ANALYSIS and a little more ACTION please
Turning Data into Information
• Translation – “What language is this?”
• Detection – “What just happened?”
• Relevance – “Are you talking to me?”
• Importance – “Do I care?”
• Information – “I can make a decision!”
• Action – “Don’t just stand there, do something!”
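The six steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation shown in the demonstration: the `source|kind|value` record format, the `Event` type, the `process` function and the example names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class Event:
    source: str   # which system produced the record
    kind: str     # what happened
    value: float  # how much of it happened


def translate(raw: str) -> Optional[Event]:
    """Translation: parse a raw 'source|kind|value' record (hypothetical format)."""
    parts = raw.split("|")
    if len(parts) != 3:
        return None
    try:
        return Event(parts[0], parts[1], float(parts[2]))
    except ValueError:
        return None


def process(raw: str,
            relevant_sources: Set[str],
            thresholds: Dict[str, float],
            act: Callable[[str], None]) -> Optional[str]:
    event = translate(raw)                      # Translation: what language is this?
    if event is None or event.kind not in thresholds:
        return None                             # Detection: nothing we recognise happened
    if event.source not in relevant_sources:
        return None                             # Relevance: not talking to me
    limit = thresholds[event.kind]
    if event.value < limit:
        return None                             # Importance: I don't care (yet)
    # Information: one human-readable sentence, no digging required
    message = f"{event.kind} on {event.source} reached {event.value:g} (limit {limit:g})"
    act(message)                                # Action: prompt a person to do something
    return message
```

For example, `process("gate7|queue_length|42", {"gate7"}, {"queue_length": 30.0}, print)` would print a single sentence for a person to act on, while a record from another source, or one below the limit, produces nothing at all.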
Time for a DEMONSTRATION
Try it, it’s not exactly ROCKET SCIENCE
(but there might eventually be some data science)
DON’T FORGET BIG DATA
SMALL DATA is about people
BIG DATA is about machines
information > data
people > process
action > analysis
Thanks
We hope you enjoyed it!


It's about Small Data, stupid.

Editor's Notes

  1. I’m always willing to listen to the opinions of minds greater than my own, and often that’s quite a wide field. Informatica are pretty big players in the data and information world, so they should know a thing or two about the topic, and they have this to say: “……” I would agree with that statement, mostly. The way I see it, we have a problem. You could say it’s a *big* problem.
  2. If we want to get data into the hands of the right people, we need to consider what we are going to give them. In the ever more connected world of the ‘Internet of Things’, data is being produced at exponential rates. Just about everything is connected and everything produces data, and it’s getting worse. So those ‘right people’ are going to be faced with a tsunami of data: a huge, towering wall of data coming at them. “Drinking from the firehose” is often used to characterise the rate and volume of data available. I’m not sure that’s a good thing. It doesn’t sound like a lot of fun to me. Put quite simply, there is too much data for anyone to cope with.
  3. Our problems don’t stop there. Not only do we have masses of data being served up to us in a continuous stream, what we do get is the wrong sort of data, and it’s wrong in several ways. Unsuitable granularity: often the data is too fine-grained to be of any significance. No context: often we don’t know what business process, domain object or real ‘thing’ the data relates to, or when it was recorded. No relevance: without context, it’s hard to know whether the data is relevant to us. It could be somebody else’s data, or something totally unrelated to what we care about. Too hard to understand: it’s often in a very raw format that needs prior knowledge and some translation work. Not fit for human consumption: in this form it’s just a wave of data that is difficult to understand and interpret. The kind of data that the IoT and other connected systems provide is very fine-grained, with no context, no relevance and no meaning. It’s too hard to understand what it’s trying to tell us, and we certainly can’t deal with it at high speed and high volume. In short, this data isn’t for humans.
  4. Given that we have this slightly meaningless, hard-to-understand data, we need to spend time making it make sense. This processing and interpretation takes time, sometimes lots of time, and the delay between raw data and useful data that humans can deal with means the data often loses its real value. Being told why something happened yesterday isn’t much use, really. Sure, we can change processes after the fact and we can move thresholds, but ultimately the situation has passed: the crisis has either happened or been averted, no thanks to data.
  5. Thinking critically about the situation: because we perhaps don’t really know what we want from our data, we are tempted to ‘record everything and figure it out later’, with an added splash of ‘throw some computing power at it’. However, when we really think about it, recording everything actually gives us nothing of any real value. Sure, we have a LOT of it, but lots of nothing useful is still nothing useful.
  6. When we are overloaded with data, the wrong kind of data, at the wrong time, it’s very difficult to come to any kind of conclusion about anything. When you can’t reach a conclusion, you will struggle to make any kind of decision. And if we can’t make a decision, we probably won’t be able to DO very much of anything. After all, that’s what most businesses rely on employees doing: making decisions, preferably informed decisions.
  7. So, we need to tame the data deluge; we need to make sense of it all. Big Data seems a likely candidate! Big data is a term used to describe the collection, processing and availability of huge volumes of streaming data in real time: more data at your disposal, faster, with the silos removed and the walls broken down. Three Vs are used to characterise this change in approach: “volume, velocity, variety” (Doug Laney). Combine the streams to identify correlation, causation and statistically valid models, and make more accurate decisions. OK, so we have a way forward; it sounds like we can use big data to solve our problems… although I’m pretty sure the Ghostbusters said crossing the streams was a bad idea!
  8. Retitle to ‘Pick a Tool.. Any tool!’ or ‘Choose your Weapon!’ Many tools, crossed purposes, unclear application. There are perhaps as many big data tools as there are articles about big data. The array is dizzying, with too many to choose from. No one size fits all: some are complementary, some don’t play well together, some have overlapping features. So what should we do?
  9. When things get complicated, we used to call a consultant; now we need a Data Scientist. All the cool kids are doing it: add a splash of predictive analytics and good things happen. If all goes well, your big data project *might* bear fruit. “Leverage the hidden connections in your data for new competitive advantages”… maybe. In scientific analysis, running ALL the tests WILL allow you to find SOMETHING, but it probably wasn’t what you were looking for, and you won’t understand why it’s significant. Running ALL the tests is generally frowned upon. “It is not enough to do your best, you must know what to do, and then do your best”. Mention the Data Science Twitter account: it spews out masses of tweets about big data, data science and many related things all day, every day. It’s like it’s trying to become its own source of big data. Lots of noise, very little signal, and if these are the people who are supposed to champion Big Data, what hope do we actually have? There is so much hyperbole, and a lot of it is created by the people we look to for clarity! @BigDataScience: 95 tweets in 24 hours, 29000 tweets over all time… what!!!!!! If you DO get a Data Scientist, get a good one!
  10. Many things are needed for a successful big data project. Time to understand the problem and the potential uses for your business (no one size fits all!). Money to build the infrastructure and deploy the tooling. Skills to make use of the tooling you deploy and apply it to your data. And actually, you probably don’t have enough data to call it true ‘Big Data’…
  11. And this reminds me of the Big Data strategy of many companies: “Get the data and we’ll figure out what to do with it later.” “Right tools, right skills… what was the question again?” Big Data is overhyped. It’s confusing. It’s generating more data than we know what to do with. It’s about tech and machines, not people and doing. Right now, it’s abstract, not practical. It can tell you WHY something is happening, but usually after the fact, and for most people after the fact is near useless. Most businesses are not that mature. Maybe it’s time to take a breath and think about an alternative…
  12. In many ways, Small data is better defined by what we don’t want.
  13. I don’t WANT “everything”. I don’t want every possible piece of data about every conceivable thing. I’m not dealing with “might be interesting”, “may look at it later” or “could be related”. My questions are much more specific; I’m not really concerned with hypotheses.
  14. I don’t WANT to “interpret”. I don’t want to go dot to dot.
  15. I don’t want to infer context. I don’t want to guess about what is happening.
  16. I WANT bigger bits of USEFUL data. I want it digestible. I want it readable. I want it meaningful.
  17. I WANT to focus on WHAT, NOT why. Immediacy is important. I’m focussed on what is happening right now; I don’t want to consider why things are happening, at least not yet. There is much more value in knowing where I am right now than in understanding how I got here and where I might be going in the long term. I want to concentrate on the things that are important to me, the things that will support my decision making, now.
  18. In actual fact, we want INFORMATION, not data. We want to create a drinking fountain instead of a fire hose. We want to reduce the pressure. We want to control what we drink and how often we drink it. We want it accessible, timely, manageable and meaningful.
  19. Small Data, as maybe you would hope, started out as a couple of principles and practices that we found emerging as we worked on a couple of data-related projects. When we started, it didn’t have a name; it was just things we did that made our work more useful and our projects more meaningful. After a while we started to look for some kind of formal definition because, you know, everyone likes to have a name for their ideas and everyone loves a bit of validation around the way they work. After a bit of digging we found this definition by Allen Bonde, which seemed to fit the bill really nicely. It fits with a few things that I often mumble when I’m working.
  20. … and of course, there is always that one guy who is never happy. In my case that guy, quite regularly, is Ben Foster. When we are working, we often ask ourselves the question “So what?” We’ve got all this data, we’ve refined it, we’ve turned it into information. Now what? What’s the point? The point, as Ben so ably pointed out when I showed him the first draft of this slide, is that information is only useful if you do something with it. So we extended the definition a little. Being empowered to take action is very important to us. It’s the whole point, actually.
  21. Real-time information tells us what is happening, now, and empowers us to make a difference, now. Traditional BI implementations have reduced the time it takes to ETL data, the time it takes to get from data to information, but they are still essentially ‘after the fact’ systems. Big data originally sacrificed query speed for sheer richness of data. Query times are dropping all the time, even for ad-hoc queries, but there is still a way to go. Many of us want to know about a significant business event ‘as it happens’. We need to know what has happened, when it has happened, so we can take action to exploit opportunities and tackle problems. In this kind of business, 5 minutes ago is probably too late. In a small data system we work to narrow the focus down to critical items, the things that we really need to know, so that we can see them happen in real time with absolute clarity.
  22. Sorting the wheat from the chaff. Narrowing the focus. Highlighting the important and ignoring the irrelevant. Detect, prioritise, highlight. Being able to do the day job rather than spending time analysing and interpreting. Managing by exception. Meaningful means: more signal, less noise; understandable information; highlights business events; includes business context; relevant and actionable; no interpretation needed. We are actively informed when things we say are important happen. Each event contains enough business context for it to make sense to a human. Enough information is provided to allow us to take action on what it tells us.
  23. Alerts and notifications. Flexible definitions of what is important. Presented for conclusions, not investigations. Quickly accessible by the people that need it, in a form they can understand, with enough supporting information for it to make sense quickly. Organised and packaged: accessible, contextual, relevant, packaged and presented for humans, available when and where it’s needed.
  24. Intuitive. Less analysis, more action. Decide and do more. Move faster. Prevent problems. Close the decision loop faster. Decide… then DO!
  25. Overview of the technical steps. Ben to discuss as he demos? We will show one implementation, but the process of translation, detection, filtering, prioritisation and action is generic and can be applied to any business. Relevance: in many connected systems we see data for other systems, which we can choose to take or ignore; we may have a better source, so we need to know if it’s relevant to us. Importance is the application of context: for example, we can have high-volume betting on popular events (FA Cup), but there IS a threshold that makes the volume important, even for that event. Information: we put the bits together and present it as a human-readable piece of information, with no digging, a straight-up lump of information, and we can make an informed decision. Action: we have to do something about what we are told, otherwise what’s the point?! The computer maybe can’t help us with this, but it can prompt us!
  26. So, we’ve heard a lot of theory. Probably time to show this thing in action. In the past, we’ve implemented ‘Small Data’ systems for a couple of international airports, a couple of international telecoms operators and one national rail infrastructure provider. For reasons of international copyright I must say that what you are about to see… is none of those things…. The principles are the same, the technology is conceptually similar, but not the same ..and the names have been changed to protect the innocent…
  27. High Betting Volume. High Betting Value. Marker player bet – multiple marker player bets and betting patterns can also be important. Mention self-exclusions. It would be good to model them, but those customers simply wouldn’t be allowed to place a bet. (When the Fun stops, stop.) In the modern bookmaking world an awful lot of ‘trading’ activity, as it’s known, is automated. Data is supplied from a range of ‘feed providers’ who tell us what is happening at a sporting event in real time. This data is translated and fed into an algorithmic model that determines the odds (also known as the prices) that we offer to customers. These prices are then offered to sports books in a range of locations (shops in many countries, online and mobile) and customers can place their bets. Most of the time, we let the computers do their thing and all is well with the world. However, like most other businesses, we do prefer to make money rather than lose it. Our customers feel differently. In some situations we would like our traders to override what the automated systems do: (1) if we have suspect betting behaviour, such as high bet volumes or high bet amounts on any one event, we would want to trade manually; (2) if we have significant amounts bet by ‘marker’ customers, or by more than one marker customer, we may want to trade manually to limit our exposure (Racing Post for example); (3) if we have simply taken a lot of legitimate bets and created a high liability for the company, we may want to trade manually, either altering the odds or suspending trading. However, with some kind of sport happening in the world 24 hours a day and your average football match having approximately 200 ‘things’ you can bet on, keeping track of where the traders should focus their attention can be very difficult, and the picture changes every second. This seems like a good candidate for Small Data.
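The override conditions described in these notes could be expressed as a handful of rules evaluated per event. The sketch below is hypothetical: the `Bet` fields, the customer ID, every threshold, and the simplification of liability as the sum of potential payouts are assumptions for illustration, not the rules used in the demo.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bet:
    customer_id: str
    stake: float
    potential_payout: float


# Hypothetical configuration; real values would come from the traders.
MARKER_CUSTOMERS = {"c-007"}
MAX_BET_COUNT = 3
MAX_SINGLE_STAKE = 1000.0
MAX_LIABILITY = 10000.0


def reasons_to_trade_manually(bets: List[Bet]) -> List[str]:
    """Return the reasons a trader should look at one event (empty if none)."""
    reasons = []
    if len(bets) > MAX_BET_COUNT:
        reasons.append("high betting volume")
    if any(bet.stake > MAX_SINGLE_STAKE for bet in bets):
        reasons.append("high betting value")
    markers = {bet.customer_id for bet in bets
               if bet.customer_id in MARKER_CUSTOMERS}
    if markers:
        reasons.append(f"marker customer bets ({len(markers)} customer(s))")
    # Simplified liability: assumes every bet on the event could pay out.
    if sum(bet.potential_payout for bet in bets) > MAX_LIABILITY:
        reasons.append("high liability")
    return reasons
```

An empty list means the automated pricing can carry on; a non-empty list is the short, contextual message a trader would see, which keeps the 200 or so markets per match down to the few that need attention.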
  28. It’s Mature Technology. You can make it happen today. Take baby steps; focus on the smallest thing that will deliver value to your business. It’s not Rocket (or Data) Science. Focus on Business Problems, not Technology Solutions. Go try it now!
  29. Small Data and Big Data are not mutually exclusive. Small Data or Big Data is, in some regards, a question of maturity. Collect the data you need *now* and grow it organically. Go Big… later, but I’m betting that will be much later. You might want to consider a lambda architecture, where data is processed according to need. There is a speed layer for analysing and presenting the things you absolutely must know *now* (although accuracy is compromised). There is a batch layer for things you don’t mind waiting for but that must be right, and a serving layer that combines the speed and batch layers to present the most accurate, most up-to-date picture it can. It tries to be the best of both worlds, but it has inherent complexity.
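As a toy illustration of the three layers described above, the sketch below counts bets per event. The record shape, the integer timestamps and the function names are assumptions, and in a real system the speed layer would typically trade accuracy for latency rather than count exactly as it does here.

```python
from collections import Counter
from typing import Iterable, Tuple

BetRecord = Tuple[int, str]  # (timestamp, event name), an assumed shape


def batch_view(master_log: Iterable[BetRecord], cutoff: int) -> Counter:
    """Batch layer: slow but authoritative counts up to the last batch run."""
    return Counter(name for ts, name in master_log if ts <= cutoff)


def speed_view(recent: Iterable[BetRecord], cutoff: int) -> Counter:
    """Speed layer: covers only what arrived after the last batch run."""
    return Counter(name for ts, name in recent if ts > cutoff)


def serve(batch: Counter, speed: Counter, event_name: str) -> int:
    """Serving layer: merge both views into the most current answer."""
    return batch[event_name] + speed[event_name]


log = [(1, "Match A"), (2, "Match A"), (3, "Match B"), (4, "Match A")]
batch = batch_view(log, cutoff=2)
speed = speed_view(log, cutoff=2)
total_a = serve(batch, speed, "Match A")
```

The inherent complexity mentioned in the notes shows up even here: the same counting logic exists twice and the two views must agree on the cutoff.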
  30. Earlier we saw the definition of small data. If we think a little bit bigger than a strict definition, I think there are some guiding principles for small data, and taking a little inspiration from the Agile Manifesto I think we can get close to some really good ones. We’re all people; we all want to think we can make a difference in the world. By focusing on information, not data, we can make a difference. By focusing on interacting with people rather than slavishly following process, we can make a difference. By spending our time on taking appropriate action, not performing deep analysis, we can make a decision. Small Data can help you make informed decisions… try it.
  31. Definition of Small Data https://en.wikipedia.org/wiki/Small_data Structure ====== A. Illustration of the Problem ----------------------------------- Instead of introducing myself, provide a huge stream of data. Who I am – DNA information. Where I’m From – Stream of GPS tracking data (journey to Belgium). Where I’m Going – Stream of calendar information and GPS information about the coming weeks – meetings etc. What I do – Show a stream of code and build information. What I like – Stream of purchase information. *Maybe play that information as a sound at the start to get their attention* Then ask – did you get that? All the data is there? Did you miss it? I could play it again? So, despite having all the data, you missed the important stuff: Who I am, Where I’m From, Where I’m Going, What I do, What I like. You missed the *information* because you don’t have the skills, technology, time, money or, quite frankly, the motivation to do your own analysis. Where are we now – Data, Data Everywhere ---------------------------------------------------------- Stats about the number of devices, the amount of data (something nice and animated like the WW2 presentation) (they rarely fail due to technology). The situation isn’t going to get any better… how fast can you analyse? Big data tools are getting better but it’s an arms race. Categorise data analysis, business intelligence and what we have… What actually is that called??! https://www.promptcloud.com/blog/business-intelligence-Vs-data-analytics/ C. Big Data – Jam Tomorrow – Why it doesn’t work -------------------------------------------------------------- It’s overhyped. It’s confusing. It’s generating more data than we know what to do with. It’s about tech and machines, not people and doing. It can tell you WHY something is happening, but it’s usually after the fact, and for most people, after the fact is near useless.
Most businesses are not that mature. It’s abstract, not practical (how many talks on big data in practical use vs talks about Big Data tech are there?) – we’re always talking, not doing. Big Data appears to be generating its own big data – so many standards, so many products, so many articles, so much talk. But who is doing anything with it? The short answer is, some of the bigger companies are, but for your average company, Big Data is just a buzzword that has little practical impact. (Smaller companies fall back on traditional BI, not ideal and usually backwards-looking.) E. If not big data then what? - Small Data -------------------------------------------------- Take Small Steps. Aim at the important stuff. Big Data later, potato. F. Just Do It – How to do it -------------------------------- Demo – Flights, Football, Garden Centre (put our money where our mouth is… we need data and we need a problem and it needs to be dynamic). Discuss the stages of the demo – use Data, Data Everywhere content – Acquire, G. Summary ======== https://en.wikipedia.org/wiki/Small_data "Small data connects people with timely, meaningful insights (derived from big data and/or “local” sources), organized and packaged “Small Data is about people, Big data is about machines”. We have the means to do it now. You can make it happen. It’s not rocket science. It’s about business problems, not tech solutions. Doesn’t exclude Big Data later. Information, not Data. People, not Process. Action, not Analysis. Resources ====== https://www.promptcloud.com/blog/business-intelligence-Vs-data-analytics/