Australian CIO Summit 2014
28 – 30 July 2014
Bigger and Better: Employing a Holistic Strategy for
Big Data toward a Strong Value-Adding Proposition
Patrick Hadley
Chief Information Officer
Australian Bureau of Statistics
Not Another ‘Big Data’ Presentation
(‘V’ is not the only letter in the alphabet!)
Or, to put it another way………
The promise
Big data is at the foundation of all the megatrends that are happening today,
from social to mobile to the cloud to gaming. - Chris Lynch, ex Vertica CEO
“Big Data is a tidal wave, which in the next decade will create consumer –
and producer – value in almost every major sector of the economy” Philip
Evans
“….a tremendous wave of innovation, productivity and growth… all driven by
big data” McKinsey
“Big Data: A Revolution that Will Transform how We Live, Work, and Think”
Viktor Mayer-Schönberger and Kenneth Cukier. 2013.
Big data is like teenage sex: everyone talks about it, nobody really
knows how to do it, everyone thinks everyone else is doing it, so
everyone claims they are doing it...
Dan Ariely, 2013
In God we trust; all others must bring data.
W.E. Deming
Or, the reality…….
Agenda
• What is Big Data (3/4/5/6 v’s)
• Sources of Data
• Data as an asset
• Open Data
• Opportunities…..applications…..benefits
• Data Management
• Data Analytics; technologies
• Security
• Privacy
• Skills and capabilities
• …… and on
Agenda
• What is Big Data (3/4/5/6 v’s)
• Sources of Data
• Data as an asset
• Open Data
• Opportunities…..applications…..benefits
• Data Management
• Data Analytics; technologies
• Security
• Privacy
• Skills and capabilities
• …… and on
Today ………
• The use of Big Data in official statistics
• ABS initiatives, experiences and capabilities
• Learnings: Towards a strong value-adding proposition
Big Data in Official Statistics
The vision…..
A richer, more dynamic statistical picture of Australia;
Opportunity: reduce costs; improve quality
Sources of Data
• digital descriptions of the physical environment
• sensors and other devices
• communications networks
• individual behaviour and information
• digitisation of commerce and supply chains
High potential data sources
• Telecom
• Utilities
• Retailers
• Financial sector
• Satellite
• Other
Example: Telecom data applications
• small area population estimates
• service populations
• travel patterns
• seasonal population movements
• event populations
• internet use……
How do we:
o identify characteristics of handset owners?
o turn handset counts into people?
Initiate exploratory R&D
Targeted streams of investigation
• Use of satellite imagery to determine land utilisation
• Use of integrated demographic data for small area
modelling of unemployment
• Use of mobile device messaging records for real time
estimation of service populations
Progress the methodological framework and trial new
technology approaches
• Machine learning
• Multidimensional data visualisation
• Distributed computing
• Open linked data
Big Data challenges
• Data quality
• Data volatility and stability
• Data representativeness
• Data dimensionality
• Statistical modelling and inference
Data quality
Big Data sets/streams are generally noisy and often
unstructured – they need to undergo a non-trivial filtering and
cleaning process before they can be used
Balancing the complexity of the cleaning process with the
information value of the obtained results is a significant issue
What methods can be used for noise reduction?
How do we deal with missing data?
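The two questions above can be made concrete with a small sketch. It is illustrative only and is not a method described in the presentation: the function names, the sample readings and the choice of techniques (linear interpolation for gaps, a centred rolling median for noise) are assumptions.

```python
from statistics import median


def interpolate_missing(values):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either end take the nearest observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def rolling_median(values, window=3):
    """Centred rolling median; the window shrinks at the ends of the series."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical sensor readings: one missing value and one obvious spike.
readings = [10.0, 11.0, None, 13.0, 95.0, 12.0, 11.0]
cleaned = rolling_median(interpolate_missing(readings))
print(cleaned)
```

A median filter is used here because a single spike does not drag it the way it would a moving average. Whether smoothing removes noise or real signal is exactly the trade-off between cleaning complexity and information value that the slide raises.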
Data volatility and stability
Streaming data may fluctuate over short time frames
Data sources themselves may change or disappear
What becomes of time series in a world where data streams
and sources are transient?
Data representativeness
How representative are the data from emerging Big Data
sources of the phenomena we are trying to measure?
How do we determine whether there are hidden biases?
What methods can be used to reduce the volume of data while
retaining the information value of the data and statistical
validity of the analysis?
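One well-known way to cut the volume of a data stream while keeping a valid basis for estimation is reservoir sampling (Algorithm R), which keeps a uniform random sample of fixed size from a stream of unknown length. The sketch below is an illustration under that assumption and is not taken from the presentation; the function name and the example stream are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable
    whose length need not be known in advance (Algorithm R)."""
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k / n.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream of 100,000 records reduced to a sample of 1,000.
sample = reservoir_sample(range(100_000), k=1000, rng=random.Random(42))
estimate = sum(sample) / len(sample)  # sample mean as an estimate of the stream mean
print(len(sample), round(estimate))
```

A uniform sample only preserves what is already in the stream. If the source itself under-represents part of the population, the hidden bias asked about above carries through to the sample unchanged.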
Data dimensionality
Dimensionality is a significant and challenging aspect of
“bigness”
Dimension has an impact on
• Storage of data
• Processing and analysis of data
Existing storage and computational paradigms fail badly
Statistical modelling and inference
How can population characteristics be determined?
• What is the population? In many cases this is not known (e.g.
Twitter)
• Can we draw a sample and calculate descriptive statistics?
How do we avoid apophenia?
• Seeing meaningful patterns and connections where none exist
• The number of fake correlations grows with the number of
variables
“To understand is to perceive patterns.” – Isaiah Berlin
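The point about correlations growing with the number of variables can be checked with a little arithmetic. With p variables there are p(p-1)/2 pairs, so at a 5% significance level roughly 5% of those pairs will look correlated even when every variable is pure noise. The sketch below is an illustration, not part of the presentation; the function names, the 50-observation setup and the 0.28 cut-off (an approximate 5% two-sided critical value for 50 observations) are assumptions.

```python
import random
from math import comb, sqrt


def expected_false_positives(n_variables, alpha=0.05):
    """Expected number of 'significant' pairs among unrelated variables."""
    return comb(n_variables, 2) * alpha


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_spurious(n_variables, n_obs=50, threshold=0.28, seed=0):
    """Count pairs of independent random variables whose sample
    correlation exceeds the threshold in absolute value."""
    rng = random.Random(seed)
    cols = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_variables)]
    return sum(
        1
        for i in range(n_variables)
        for j in range(i + 1, n_variables)
        if abs(pearson(cols[i], cols[j])) > threshold
    )


for p in (10, 100):
    print(p, comb(p, 2), expected_false_positives(p), count_spurious(p))
```

Going from 10 to 100 variables raises the number of pairs from 45 to 4,950, so the expected count of chance "findings" rises about a hundredfold even though no real relationship exists in the simulated data.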
From ‘V’ (what) to ‘C’ (how)
‘What’ has changed about data?
Vs: Volume, Velocity, Variety, Veracity,
Volatility
‘How’ will we change?
Cs: Creating, Computing,
Comprehending, Competing,
Collaboration
Big Data ‘C’s and the ABS - CREATING
The world is CREATING data like never before, and every
individual, household and business we interact with will change
how they create data:
• The Internet of Things (M2M) becomes the ‘Internet of
Everything’
• Sometimes called the 4 internets: people, things, information and
places are all network addressable, and most have data
producing/collecting/transmitting capability
Big Data ‘C’s and the ABS - COMPUTING
COMPUTING data like never before. Some examples:
• new solutions emerged from Web-scale problems such as search
engines, e.g. Hadoop and key-value (NoSQL) databases
• advanced computation algorithms and approaches become
‘popularised’ e.g. machine learning approaches, automated
visualisation and explanations systems, data mining/discovery,
semantic (knowledge) representation and reasoning systems
requiring ‘search’
• statistical analysis-as-a-service e.g. auto-coding, confidentiality,
time series analysis, etc
• distributed/parallel computation for low-cost multi-core, multi-
socket, multi-computers, in-memory computation technologies
• embedded processors, sensors/RFIDs/GPS/SIM
• the ‘logical data warehouse’
Big Data ‘C’s and the ABS - COMPREHENDING
COMPREHENDING/CONSUMING data requiring new tools in the ABS kit bag:
• tables – static and data consumer dynamically defined (ABS.stat, REEM Table
Builder) in standard XML formats like SDMX
• visualisation – for internal ABS insight, for our ‘retail’ dissemination, ‘smart’ insight
where software suggests the best way to see data: ‘telling the story’
• narrative – table to text production (auto produce media release & part of main
features)
• voice – text to speech to read narrative & data for accessibility; speech to text for
NIRS analysis
• semantic data outputs in OWL/RDF
• hybrid of above – to add value to information, for ABS data consumers to enhance
comprehension
• data streams – data-as-a-service for M2M (the ABS public Web services library),
could be called ‘the embedded ABS’
and all this with adaptive/responsive design for multiple end-point device types!
Big Data ‘C’s and the ABS - COMPETING
COMPETING with data, to obtain it and use it for competitive
advantage
• In some subject-matter areas there is more competition. Who
can make a statistical index? Anyone with a spreadsheet;
• Who else wants to be influential in and/or monetise statistics?
• Everyone else starts to understand INFONOMICS
• More ‘agent’ data sources for ABS, as we may not have the
capability to collect (full) unit record ‘big data’?
Big Data ‘C’s and the ABS - COLLABORATING
In ABS
In Government
In Academia
Across the international statistical community
ABS Capabilities, expertise
• collect and process large quantities of data
• data ‘cleansing’
• data standards and framework
• data integration
• methodological techniques
• strong analytical capability
• sophisticated web based dissemination system
• data quality framework
ABS Big Data Challenges
Business Benefit
Validity of Statistical Inference
Privacy and Public Trust
Data Integrity
Data Ownership and Access
Computational Efficacy
Technology Infrastructure
(Source: “Big data and the ABS – from ideas to action”, ABS MM paper, Oct 2013)
Value explained?
Summary - considerations
• Value:
• what’s the proposition
• what’s the question
• Strategy; plan, investments
• Data sources & acquisition
• Eyes open – data challenges
• Build capabilities: V’s to C’s
Questions?

Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong Value-Adding Proposition

  • 1. Australian CIO Summit 2014 28 – 30 July 2014 Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong Value-Adding Proposition Patrick Hadley Chief Information Officer Australian Bureau of Statistics
  • 2. Not Another ‘Big Data’ Presentation (‘V’ is not the only letter in the alphabet!)
  • 3. Or, to put it another way………
  • 4. The promise Big data is at the foundation of all the megatrends that are happening today, from social to mobile to the cloud to gaming. - Chris Lynch, ex Vertica CEO “Big Data is a tidal wave, which in the next decade will create consumer – and producer – value in almost every major sector of the economy” Philip Evans “….a tremendous wave of innovation, productivity and growth… all driven by big data” McKinsey “Big Data: A Revolution that Will Transform how We Live, Work, and Think” Viktor Mayer-Schönberger and Kenneth Cukier. 2013.
  • 5. Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it... Dan Ariely, 2013 In God we trust; all others must bring data. W.E. Deming Or, the reality…….
  • 6. Agenda • What is Big Data (3/4/5/6 v’s) • Sources of Data • Data as an asset • Open Data • Opportunities…..applications…..benefits • Data Management • Data Analytics; technologies • Security • Privacy • Skills and capabilities • …… and on
  • 7. Agenda • What is Big Data (3/4/5/6 v’s) • Sources od Data • Data as an asset • Open Data • Opportunities…..applications…..benefits • Data Management • Data Analytics; technologies • Security • Privacy • Skills and capabilities • …… and on
  • 8. Today ……… • The use of Big Data in official statistics • ABS initiatives, experiences and capabilities • Learnings: Towards a strong value- adding proposition
  • 9. Big Data in Official Statistics The vision….. A richer, more dynamic statistical picture of Australia; Opportunity: reduce costs; improve quality
  • 10. Sources of Data • digital descriptions of the physical environment • sensors and other devices • communications networks • individual behaviour and information • digitisation of commerce and supply chains
  • 11. High potential data sources • Telecom • Utilities • Retailers • Financial sector • Satellite • Other
  • 12. Example: Telecom data applications • small area population estimates • service populations • travel patterns • seasonal population movements • event populations • internet use…… How do we: o identify characteristics of handset owners? o turn handset counts into people?
  • 13. Initiate exploratory R&D Targeted streams of investigation • Use of satellite imagery to determine land utilisation • Use of integrated demographic data for small area modelling of unemployment • Use of mobile device messaging records for real-time estimation of service populations Progress the methodological framework and trial new technology approaches • Machine learning • Multidimensional data visualisation • Distributed computing • Open linked data
  • 14. Big Data challenges • Data quality • Data volatility and stability • Data representativeness • Data dimensionality • Statistical modelling and inference
  • 15. Data quality Big Data sets/streams are generally noisy and often unstructured – they need to undergo a non-trivial filtering and cleaning process before they can be used. Balancing the complexity of the cleaning process with the information value of the obtained results is a significant issue. What methods can be used for noise reduction? How do we deal with missing data?
  • 16. Data volatility and stability Streaming data may fluctuate over short time frames Data sources themselves may change or disappear What becomes of time series in a world where data streams and sources are transient?
  • 17. Data representativeness How representative are the data from emerging Big Data sources of the phenomena we are trying to measure? How do we determine whether there are hidden biases? What methods can be used to reduce the volume of data while retaining the information value of the data and statistical validity of the analysis?
  • 18. Data dimensionality Dimensionality is a significant and challenging aspect of “bigness” Dimension has an impact on • Storage of data • Processing and analysis of data Existing storage and computational paradigms fail badly
  • 19. Statistical modelling and inference How can population characteristics be determined? • What is the population? In many cases this is not known (e.g. Twitter) • Can we draw a sample and calculate descriptive statistics? How do we avoid apophenia? • Seeing meaningful patterns and connections where none exist • The number of fake correlations grows with the number of variables “To understand is to perceive patterns.” – Isaiah Berlin
  • 20. From ‘V’ (what) to ‘C’ (how) ‘What’ has changed about data? Vs: Volume, Velocity, Variety, Veracity, Volatility ‘How’ will we change? Cs: Creating, Computing, Comprehending, Competing, Collaboration
  • 21. Big Data ‘C’s and the ABS - CREATING The world is CREATING data like never before, and every individual, household and business we interact with will change how they create data: • The Internet of Things (M2M) becomes the ‘Internet of Everything’ • Sometimes called the 4 internets: people, things, information and places are all network addressable, and most have data producing/collecting/transmitting capability
  • 22. Big Data ‘C’s and the ABS - COMPUTING COMPUTING data like never before. Some examples: • emerged from Web-scale problems such as search engines, with new solutions such as key-value databases (Hadoop, NoSQL DBs) • advanced computation algorithms and approaches become ‘popularised’ e.g. machine learning approaches, automated visualisation and explanation systems, data mining/discovery, semantic (knowledge) representation and reasoning systems requiring ‘search’ • statistical analysis-as-a-service e.g. auto-coding, confidentiality, time series analysis, etc • distributed/parallel computation for low-cost multi-core, multi-socket, multi-computers, in-memory computation technologies • embedded processors, sensors/RFIDs/GPS/SIM • the ‘logical data warehouse’
  • 23. Big Data ‘C’s and the ABS - COMPREHENDING COMPREHENDING/CONSUMING data requiring new tools in the ABS kit bag: • tables – static and data consumer dynamically defined (ABS.stat, REEM Table Builder) in standard XML formats like SDMX • visualisation – for internal ABS insight, for our ‘retail’ dissemination, ‘smart’ insight where software suggests the best way to see data: ‘telling the story’ • narrative – table to text production (auto produce media release & part of main features) • voice – text to speech to read narrative & data for accessibility; speech to text for NIRS analysis • semantic data outputs in OWL/RDF • hybrid of above – to add value to information, for ABS data consumers to enhance comprehension • data streams – data-as-a-service for M2M (the ABS public Web services library), which could be called ‘the embedded ABS’ and all this with adaptive/responsive design for multiple end-point device types!!!
  • 24. Big Data ‘C’s and the ABS - COMPETING COMPETING with data, to obtain it and use it for competitive advantage • In some subject-matter areas there is more competition. Who can make a statistical index? Anyone with a spreadsheet • Who else wants to be influential in and/or monetise statistics? • Everyone else starts to understand INFONOMICS • More ‘agent’ data sources for ABS, as we may not have the capability to collect (full) unit record ‘big data’?
  • 25. Big Data ‘C’s and the ABS - COLLABORATING • In ABS • In Government • In Academia • Across the international statistical community
  • 26. ABS Capabilities, expertise • collect and process large quantities of data • data ‘cleansing’ • data standards and framework • data integration • methodological techniques • strong analytical capability • sophisticated web based dissemination system • data quality framework
  • 27. ABS Big Data Challenges • Business Benefit • Validity of Statistical Inference • Privacy and Public Trust • Data Integrity • Data Ownership and Access • Computational Efficacy • Technology Infrastructure (Source: “Big data and the ABS – from ideas to action”, ABS MM paper, Oct 2013)
  • 29. Summary - considerations • Value: o what’s the proposition o what’s the question • Strategy: plan, investments • Data sources & acquisition • Eyes open – data challenges • Build capabilities: V’s to C’s
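
Slide 19 notes that the number of fake correlations grows with the number of variables. The short Python sketch below illustrates that point and is not part of the original presentation. It generates variables that are independent by construction, then counts how many pairs nonetheless show a sizeable sample correlation. The function names, the 30 observations per variable and the 0.4 threshold are all assumptions chosen for illustration only.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_spurious_pairs(n_variables, n_observations=30, threshold=0.4, seed=0):
    """Count pairs of independent random variables whose |r| exceeds threshold.

    Every variable is pure Gaussian noise, so any pair counted here is a
    'fake' correlation in the sense of slide 19. The default sample size
    and threshold are illustrative assumptions.
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(n_observations)]
        for _ in range(n_variables)
    ]
    count = 0
    for i in range(n_variables):
        for j in range(i + 1, n_variables):
            if abs(pearson(columns[i], columns[j])) > threshold:
                count += 1
    return count


if __name__ == "__main__":
    for k in (5, 20, 80):
        pairs = k * (k - 1) // 2  # number of variable pairs examined
        print(f"{k} variables, {pairs} pairs, "
              f"{count_spurious_pairs(k)} above threshold")
```

The number of pairs grows roughly with the square of the number of variables, so even a small chance of a spurious match per pair produces many matches in a wide data set. Because the seed is fixed, the smaller variable sets are prefixes of the larger ones, which means the count can only stay the same or rise as variables are added.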