Data and Analytics Essentials
Christine CHEONG & Brandon NG
#ISSLearningFest
The Essentials of Data and Analytics
• Data and Analytics
• Analytics Maturity Model
• Descriptive Analytics
• Predictive Analytics
• Prescriptive Analytics
Icons made by Vectors Market, http://www.flaticon.com/authors/vectors-market
is licensed by Creative Commons BY 3.0, http://creativecommons.org/licenses/by/3.0/
Data and Analytics
“Without data
we’re just another person with an
opinion.”
– W. Edwards Deming
http://www.meliorgroup.com/without-data-just-another-person-with-opinion/
Insights from Data
What you see
is not what you
get!
from
HiPPO = Highest Paid Person’s Opinion
to
Data Driven Decision Making Organization
Worlds Collide as Augmented Analytics Draws Analytics, BI and Data Science Together
Published: 10 March 2020 ID: G00463513, Analyst(s): Carlie Idoine
Descriptive/ Diagnostics/
Predictive/ Prescriptive
Analytics Models
Reports / Dashboards
Consumes
Produces
Descriptive/ Diagnostics/
Predictive/ Prescriptive
Analytics Models
Reports / Dashboards
Business Analyst /
Business Intelligence Analyst Data Analyst / Data Scientist
Analytics Role
Expanding,
Understanding &
Investigating
• Data scientist
• Data engineers
• Business analyst
Exploration &
Discovery
• Data scientist
• Data engineer
Foundational Core
(Core operational
processes)
• Business user
• Business analyst
Establishing Value
• Data engineer
• Business analyst
Data
Known Unknown
Business
Questions
Unknown
Known
Solve Your Data Challenges With the Data Management Infrastructure
Model, Refreshed: 3 April 2019 | Published: 19 October 2017 ID:
G00336474 Analyst(s): Adam Ronthal, Nick Heudecker
Data Analytics
Roles and Skills
Data Management
Infrastructure Model
Role of Analytics in Decision Management
Analytically
Assisted Decision
Making
Decision
Management
How Companies Succeed at Decision Management, Published: 19 October 2018 ID: G00341368, Analyst(s): W. Roy Schulte, Erick Brethenoux
Descriptive Analytics
- Explain what happened
Diagnostic Analytics
- Explore how it happened
Predictive Analytics
- Explore what is likely to happen
next or in the future
Prescriptive Analytics
- Specify what to do, or
automatically trigger a response
does not specify what to do
Worlds Collide as Augmented Analytics Draws Analytics, BI and Data Science Together
Published: 10 March 2020 ID: G00463513, Analyst(s): Carlie Idoine
Intersection & Interrelationship of
Data, Analytics and Decision
Decision
•Change, movement
Wisdom
•Understanding, integrated, actionable
Knowledge
•contextual, synthesized
Information
•Useful, organized, structured
Data
•Signals/know-nothing
Descriptive
Analytics
FUTURE
What Action?
- direction
What is the
best?
- Principles
PAST
Why?
- patterns
What?
- relationships
Diagnostics
Analytics
Predictive
Analytics
Prescriptive
Analytics Data Analysis
Artificial
Intelligence,
Machine
Learning,
Deep Learning,
etc
Data Integration
Big data, cloud
computing, etc
Data Collection
IoT, sensor
network, mobile
devices, etc
Descriptive Analytics
Hype Cycle for Analytics and Business Intelligence, 2022,
14 July 2022 - ID G00770971, By Analyst(s): Peter Krensky
Business Intelligence
and Analytics for
Decision-Making
Business
Intelligence
and Analytics
right
information
right person
right time
right
quantity
right
quality
right place
Effective Insights Communication and Persuasive
Presentation
with Dashboard Implementation using Tableau / Microsoft Power BI
Dashboard Design
https://www.tableau.com/resource/eye-tracking-study#HGmzx7C5EKRL3heV.99
Dashboard Design
https://www.tableau.com/resource/eye-tracking-study#HGmzx7C5EKRL3heV.99
Case Study:
Tackling Singapore’s Population
Challenges
DataLion Team,
Tackling Singapore’s Population Challenges,
2018 S2
Data Story Telling Framework Diagram
DataLion Team, Tackling Singapore’s Population Challenges, 2018 2 KE Unit 3,
Dashboard Design
System Architecture Diagram
Dashboard Design Mock-up
Dashboard Implementation (Tableau)
Predictive Analytics
Prediction, Forecasting
Hype Cycle for Data Science and Machine Learning, 2022,
29 June 2022 - ID G00770938, By Analyst(s): Farhan Choudhary, Peter Krensky
https://www.gartner.com/interactive/hc/4016149?ref=TypeAheadSearch
Marketing Decisions
Managerial decisions – whether to advertise, change prices, launch
a new product or service, assess the impact of marketing and
communication effectiveness, etc
• What factors or product formulation are important in driving
product/brand choice?
• What prices to charge for different range of products?
• Which advertising messages/campaigns are effective in
deepening engagement with stakeholders?
• Which customer/stakeholder segments should we target to drive
conversion and/or improve profitability?
Marketing Trends
• Marketing-mix decisions are increasingly made quantitatively instead
of qualitatively. Pricing decisions are routinely made using dynamic
quantitative models, so are assortment, channel, and location decisions.
• Customer Engagement - Machine learning algorithms extract consumer
preferences from massive online data, and help create engaging text
and images to attract attention; intelligent agents assist customer
engagement to improve experience.
• Search engines are where many customer journeys begin. While keyword search has been the dominant form
of online search, machine learning methods are bringing searches based on other content types within
reach, eg with voice recognition, natural language processing, and text-to-speech capabilities.
• Recommending the right products to the interested consumers can significantly improve marketing
performance. Deep neural networks and embedding methods have been leveraged to further
enhance performance.
Machine learning and AI in marketing – Connecting computing power to human insights, by Liye Ma and Baohong Sun
Marketing Trends
• Product Development - Rapid experimentation and simulation for product and process innovation
• Go-to-market/commercialization - Real time analytics, dynamic pricing optimization, connected product innovation
• From Big Data to Small and Wide Data - statistical/machine learning, AI techniques
https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/why-tech-enabled-go-to-market-innovation-is-critical-for-industrial-companies
• Machine Learning arose as a subfield of Artificial Intelligence while Statistical Learning arose as a subfield of
Statistics. While statistical and econometric models with increasing levels of sophistication are being developed,
researchers have also turned to machine learning methods as a valuable alternative
• Machine Learning has a greater emphasis on large-scale applications and prediction accuracy, while Statistical
Learning emphasizes models and their interpretability, and precision and uncertainty. But the distinction has become
more and more blurred, and there is a great deal of “cross-fertilization”.
• Balance a theory-driven perspective with a data-driven one by injecting human insights and domain
knowledge into the use of machine learning methods
Statistical and Machine Learning
Structured data is comprised of clearly defined data types with patterns that make
them easily searchable; while unstructured data is comprised of data that is usually not
as easily searchable, including formats like audio, video, and social media postings.
Statistical and Machine Learning Problems
Structured (eg quantitative demographic and behavioural data)
• Identify the effects of demographic and marketing data on insurance policy product purchase
• Predict housing prices based on sociodemographic and geospatial data
• Establish the relationship between marketing promotion (eg price, location, advertising etc) and store-level product sales
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
Statistical and Machine Learning Problems
Unstructured (eg image, text, speech/voice data etc)
• Customize an email spam detection system based on frequently occurring words (features).
• Predict user interaction and engagement on social media based on image / facial recognition
• Predict positive vs negative sentiments based on attributes of internet movie ratings
• data from 4601 emails sent to an individual (named George, at HP labs,
before 2000). Each is labeled as spam or email.
• goal: build a customized spam filter.
• input features: relative frequencies of 57 of the most commonly occurring
words and punctuation marks in these email messages.
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
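The input features described above (relative frequencies of selected words and punctuation marks) can be sketched as follows. This is only an illustration: the four-term `VOCABULARY`, the tokenising rule, and the sample message are hypothetical, whereas the original filter used 57 terms.

```python
import re
from collections import Counter

# Hypothetical vocabulary for illustration; the original filter used 57 terms.
VOCABULARY = ["free", "george", "meeting", "!"]

def relative_frequencies(text, vocabulary):
    """Return, for each vocabulary term, the percentage of tokens in `text` equal to it."""
    # Assumed tokenisation: lowercase alphanumeric runs, plus '!' and '$' as separate tokens.
    tokens = re.findall(r"[a-z0-9]+|[!$]", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0 for _ in vocabulary]
    return [100.0 * counts[term] / total for term in vocabulary]

# Hypothetical message; each email becomes one numeric feature vector.
features = relative_frequencies("Free prize! Claim your free gift now!", VOCABULARY)
print(dict(zip(VOCABULARY, features)))
```

A classifier would then be trained on these vectors together with the spam/email labels.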
Trends in Machine Learning
ML methods are well positioned to extract rich insights from rich data. While studies have frequently
analyzed text and image data, there are opportunities to focus on audio, video, and consumer tracking data, as
well as network data and data of hybrid formats.
Opportunities to broaden and extend usage of machine learning methods. While machine learning methods
have been used frequently for prediction and feature extraction, they can be harnessed for causal and
prescriptive analysis
Machine learning and AI in marketing – Connecting computing power to human insights, by Liye Ma and Baohong Sun
Trends in Machine Learning
Opportunities to broaden and extend ML usage in the entire customer purchase journey, to develop decision-
support capabilities covering all aspects of marketing functions, from more strategic areas like brand positioning and
competitive analysis to operational areas like customer satisfaction/service delivery
Machine learning and AI in marketing – Connecting computing power to human insights, by Liye Ma and Baohong Sun
Machine Learning Tasks
Supervised Learning (eg prediction and forecasting techniques)
• Outcome measurement Y (also called dependent variable, response, target) vs a set of predictors (features)
measured on a set of samples
• Regression vs Classification Problem
Unsupervised Learning (eg segmentation and association)
• No outcome variable, just a set of predictors (features) measured on a set of samples.
• Find groups of samples that behave similarly, find features that behave similarly, find linear combinations of
features with the most variation. Useful as a pre-processing step for supervised learning.
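The contrast between the two tasks can be sketched on a toy one-dimensional dataset. The data (`spend`, `bought`) and both helper functions are hypothetical and chosen for brevity: a nearest-centroid classifier stands in for supervised learning, and a two-group k-means for unsupervised learning.

```python
from statistics import mean

def fit_centroids(xs, ys):
    """Supervised: use the known labels ys to compute one centroid per class."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label)
            for label in sorted(set(ys))}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def two_means(xs, iterations=10):
    """Unsupervised: split xs into two groups without labels (1-D k-means, k = 2).

    Assumes xs contains at least two distinct values.
    """
    low, high = min(xs), max(xs)  # simple deterministic initialisation
    for _ in range(iterations):
        group_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        group_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low, high = mean(group_low), mean(group_high)
    return low, high

# Hypothetical data: monthly spend (predictor) and whether the customer bought (outcome Y).
spend = [10, 12, 11, 50, 55, 52]
bought = [0, 0, 0, 1, 1, 1]

centroids = fit_centroids(spend, bought)   # needs the outcome variable
print(predict(centroids, 48))              # classify a new customer
print(two_means(spend))                    # segment customers using spend only
```

The supervised step cannot run without `bought`, while the unsupervised step recovers two similar groups from `spend` alone.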
Machine Learning Tasks
Semi-supervised Learning and Transfer Learning
• Semi-supervised – Output is known for only a subset of the data. The instances in the training dataset for which the output is
not observed are nonetheless used to improve learning eg through label propagation
• Transfer learning – Researchers leverage an existing model, trained using a different dataset or for a different
purpose. For example, image analysis where an existing model trained using a large set of images is updated
using the specific images of the research project
Active Learning
• Only limited training instances are available at first. The goal is to maximize the predictive accuracy while minimizing the
data requirement. Determining the most important instances is a key focus of active learning
• Reinforcement learning : The learning agent continuously interacts with the surrounding environment by taking actions and
observing feedback. The learning algorithm needs to determine the actions to take to both learn the environment’s
characteristics and craft optimal policy given the states.
Supervised Learning - Feature Selection Methods
Subset selection
We identify a subset of p predictors that we believe to be related to the response. We then fit a model using least
squares on the reduced set of variables.
Dimension Reduction
We project the p predictors into an M-dimensional subspace, where M < p.
This is achieved by computing M different linear combinations or projections
of the variables. Then these M projections are used as predictors to fit a
linear regression model by least squares
Shrinkage (or Regularization) for large sparse data
We fit a model involving all p predictors, but the estimated coefficients
are shrunken towards zero relative to the least squares estimates. This
shrinkage (also known as regularization) has the effect of reducing
variance and can also perform variable selection
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
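The shrinkage idea can be illustrated with a minimal sketch, assuming a single predictor, no intercept, and a ridge (squared) penalty. The data and the function name `ridge_slope` are hypothetical. With a penalty of zero the result is the least squares estimate, and larger penalties pull the coefficient towards zero.

```python
def ridge_slope(xs, ys, penalty):
    """Slope b minimising sum((y - b*x)^2) + penalty * b^2 for one predictor, no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

# Hypothetical data for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

for penalty in (0.0, 1.0, 10.0, 100.0):
    # penalty = 0 gives the ordinary least squares slope
    print(penalty, round(ridge_slope(xs, ys, penalty), 3))
```

A ridge penalty shrinks coefficients but does not set them exactly to zero. Variable selection comes from the lasso variant, which penalises absolute values instead.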
Feature Selection (Flexibility vs Interpretability)
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
Example: Income prediction based on socio-demographic survey data (eg age, education, seniority etc)
Feature Selection (Dimension reduction)
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
Example: Income prediction on socio-demographic and geo-location data
Feature Selection (Regularization)
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
Example: Credit risk assessment on demographic and behavioural data
Big data vary in shape. These call for different approaches
Big Data Learning Problems
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
Big Data Learning Problems
Example: IMDB (internet movie database) ratings using machine/deep learning
RNN - https://web.cs.dal.ca/~shali/project2.html
Many data sources are sequential in nature, and call for special treatment when building predictive models. For example,
documents such as book and movie reviews, newspaper articles and tweets. We can use the sequence of words occurring in a
document to make predictions about the label for the entire document (eg positive or negative sentiment). Machine/deep
learning approaches eg recurrent neural networks can be used for classification, sentiment analysis, and language translation.
Big Data Learning Problems
Example: Image recognition in social media context using machine/deep learning
CNN - https://www.semanticscholar.org/paper/Toward-Large-Scale-Face-Recognition-Using-Social-Stone-Zickler/2f2d69bdfaca54eb3a6ede3e5eb2c76713bb8064
Neural networks rebounded around 2010 with big successes in image classification. Around that time, massive databases of
labeled images were being accumulated, with ever-increasing numbers of classes. A special family of convolutional neural
networks (CNNs) has evolved for classifying images on a wide range of problems. CNNs mimic to some degree how humans
classify images, by recognizing specific features or patterns anywhere in the image that distinguish each particular object class.
Example: Online shopping analysis using models on large sparse data (B2C)
From Big to Small and Wide Data
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
A marketing analyst interested in understanding people’s online shopping
patterns could treat as features all of the search terms entered by users of
a search engine. This is sometimes known as the “bag-of-words” model.
The same researcher might have access to the search histories of only a
few hundred or a few thousand search engine users who have consented to
share their information with the researcher. For a given user, each of the p
search terms is scored present (0) or absent (1), creating a large binary
feature vector. Then n ≈ 1,000 and p is much larger.
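A binary feature vector of this kind can be sketched as below. The vocabulary and search histories are hypothetical, and the sketch assumes the common coding of 1 for a term that is present and 0 for one that is absent. Swapping the two labels carries the same information.

```python
def binary_bag_of_words(user_terms, vocabulary):
    """Return one 0/1 indicator per vocabulary term (assumed coding: 1 = present)."""
    return [1 if term in user_terms else 0 for term in vocabulary]

# Hypothetical vocabulary of p = 5 search terms; in practice p is far larger than n.
vocabulary = ["running shoes", "laptop", "flight deals", "coffee maker", "yoga mat"]

# Hypothetical search histories of n = 2 consenting users.
search_histories = {
    "user_a": {"laptop", "coffee maker"},
    "user_b": {"yoga mat"},
}

feature_matrix = {
    user: binary_bag_of_words(terms, vocabulary)
    for user, terms in search_histories.items()
}
print(feature_matrix)
```

Most entries are zero, which is why such data are described as large and sparse.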
Example: Webpage browsing analytics using models on large sparse webpage session information (B2C)
Quantcast is a digital marketing company. Data are five-minute internet sessions. Binary target is type of family (≤ 2 adults vs
adults plus children). 7 million features of session info (web page indicators and descriptors). Divided into training set (54M),
validation (5M) and test (5M).
All but 1.1M features could be screened out because they had ≤ 3 nonzero values. Fit 100 models in 2 hours in R
Richest model had 42K nonzero coefficients, and explained 10% deviance (like R-squared).
From Big to Small and Wide Data
Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
Observational vs Experimental Studies
In observational studies, researchers are only observers. They measure
what people do, or say they would do in a situation not of their making
(eg surveys and focus groups)
In contrast, when conducting experiments, researchers control the
important variables that influence consumer behavior to more precisely
observe the effect.
https://www.youtube.com/watch?v=qwfd8cf3_UY&feature=youtu.be
Observational studies/data
Data : Then and now
Data in the 80s-90s
• Retail scanner data
• Survey data
• Transactional/behavioral data
Data now
+ Clickstream data
+ Social networking
+ Product review
+ Search data
+ Mobile
+ Text
Primary & secondary market research/trends (eg structured
vs unstructured data including social media)
Knowledge/Consumer immersion (eg observation
studies/ethnography, extracting value from connected products,
real-time analytics eg smart sensors etc)
Quantitative data (eg direct questioning, buy-response surveys,
transactional data)
Observational studies (ethnography)
https://www.youtube.com/watch?v=yjFkUqAeUq8
Ethnography is a type of observation research enabled by offline and online tools.
It is a convenient way for participants to share how they interact with products and services in their natural environment.
An Ethnographic Case Study of Ikea Shoppers
Observational studies (ethnography)
https://www.youtube.com/watch?v=yjFkUqAeUq8
Why experiments?
Experiments allow analysts to answer business questions related to
cause and effect.
It is important for the analyst to know whether she has an
“umbrella problem” or a “rain dance problem.” If all she wants
to know is whether or not she should carry an umbrella, then she
has a pure prediction problem and causal questions are of
secondary importance; she only needs to know whether the
probability of rain is high or low.
On the other hand, if there has been a long drought and she
wants to end it, prediction is of little value: causal questions are
of primary importance. If she wants to induce rainfall, she needs
to know what variables cause rain and then try to manipulate
those variables
Business Experiments
• A/B/n testing using hypothesis testing (eg compare landing pages
to see which one generates more sales)
• Multivariate analysis using predictive analytics (eg screening
designs and factorial designs, conjoint analysis)
https://www.youtube.com/watch?v=zFMgpxG-chM
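An A/B comparison of two landing pages can be sketched with a two-proportion z-test, one common hypothesis test for this setting. The visitor and conversion counts are hypothetical, and the function name `two_proportion_z_test` is introduced here for illustration only.

```python
import math

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for H0: both pages have the same conversion rate."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis of no difference.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: page A converts 100 of 2000 visitors, page B 130 of 2000.
z, p_value = two_proportion_z_test(100, 2000, 130, 2000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests that the difference in conversion rates is unlikely to be due to chance alone. The threshold for acting on it is a business choice fixed before the experiment runs.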
Business Experiments
Right Customers, Right Channels, Right Comms Messages
(Example : Alcon case study)
https://www.edenspiekermann.com/case-studies/alcon-wearlenses/
Prescriptive Analytics
Optimisation, Simulation
Optimal mix of Data Science and
Machine Learning Techniques
Predictive
Predictions
•Probability of a specific
outcome
Forecasting
•Predicting a series of
outcomes over time
(univariate vs multivariate)
Simulation
•Predicting multiple
outcomes and
highlighting uncertainties
Prescriptive
Rules
•Predefined framework for
choosing between
alternatives
Optimisation
•Outcome-driven, constraint-
based evaluation of an
interdependent set of options
Decision
Making
Greater
Business
Impact
When and How to Use Advanced Analytics Techniques to Solve Business Problems, Published 17 September
2021 - ID G00750951, By Analyst(s): Carlie Idoine, Erick Brethenoux
Three Emergent AI Technologies
Pre-trained
AI Model
Optimization
Solver
Generative
AI
Quick Answer: What Three Emergent AI Technologies Will Have an Impact in 2022?,
Published 11 March 2022 - ID G00752286, Owen Chen
Prescriptive Analytics
Optimisation Techniques
• Linear programming
• Non-linear programming
• Goal programming
• Dynamic programming
• Analytic Hierarchy Process
• Assignment Model
• Network model
• Monte Carlo Simulation
• Markov Chain
• Queuing Model
• Learning Curve
• Design of Experiment
• Scenario Analysis and Planning
• Game Theory
• Decision Tree
• Utility Theory
• Graph Theory
Application of Optimisation
Travelling Salesman Problem
Traffic and Shipment Routing: route (travel time, cost, distance) optimisation
Introduction to Genetic Algorithm & their application in data science
https://www.analyticsvidhya.com/blog/2017/07/introduction-to-genetic-algorithm/
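For a handful of stops, the travelling salesman problem can be solved by checking every possible route. The sketch below uses that exhaustive search, not the genetic algorithm referenced above, and the distance matrix is hypothetical.

```python
from itertools import permutations

def shortest_tour(distance):
    """Try every tour that starts and ends at stop 0 and return (best_tour, best_cost)."""
    n = len(distance)
    best_tour, best_cost = None, float("inf")
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)
        cost = sum(distance[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost

# Hypothetical symmetric distances (eg travel time in minutes) between 4 stops.
DISTANCE = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

tour, cost = shortest_tour(DISTANCE)
print(tour, cost)
```

For this matrix the best route costs 80. The number of routes grows factorially with the number of stops, which is why heuristics such as genetic algorithms are used for larger routing problems.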
Linear programming
(Genetic Algorithm)
Monte Carlo
Simulation
Inventory Optimisation and Simulation
Inventory Simulation using Monte Carlo Simulation
https://cloud.anylogic.com/model/b0156f6d-6c04-431b-b48d-1b875b2720e7?mode=SETTINGS
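A Monte Carlo inventory simulation can be sketched in a few lines, independently of the linked model. The sketch assumes a hypothetical policy in which stock is replenished to a fixed order-up-to level every morning, and daily demand is drawn uniformly between 0 and 20 units. It estimates the fill rate, the share of demand served from stock.

```python
import random

def simulate_fill_rate(order_up_to, days=30, runs=2000, seed=42):
    """Estimate the fraction of demand served when stock is reset to `order_up_to` daily."""
    rng = random.Random(seed)  # fixed seed so that results are reproducible
    total_demand = 0
    total_served = 0
    for _ in range(runs):
        for _ in range(days):
            demand = rng.randint(0, 20)      # hypothetical demand distribution
            served = min(demand, order_up_to)  # unmet demand is lost
            total_demand += demand
            total_served += served
    return total_served / total_demand

for level in (10, 15, 20):
    print(level, round(simulate_fill_rate(level), 3))
```

Comparing the estimated fill rate across candidate stock levels, alongside holding costs, is what turns the simulation into an optimisation exercise.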
Monte Carlo
Simulation
Visual
Analytics
Timestamp/
Temporal
Locational/
Spatial
Statistics/
Static
Data Visualisation for
Spatial and Non-Spatial Data
Locational Analytics
Sensemaking of Customer Locational Data for
Geomarketing
Retail Analytics
In-Store Operational Excellence via Real Time Streaming Analytics
IoT powered Intelligent Retail, https://www.youtube.com/watch?v=n-ouKu9tNPM
Operational Analytics
Foot Traffic Analytics for Demand Planning and Management
using Queuing Theory / Model
https://www.channelnewsasia.com/commentary/singapore-slow-reopening-seniors-elderly-strategy-covid-19-2230601
https://www.channelnewsasia.com/singapore/covid-singapore-vaccine-vaccination-centre-behind-the-scenes-1882811
Queuing
Model
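As a minimal sketch of a queuing model, the standard single-server M/M/1 formulas give average queue length and waiting time. They assume random (Poisson) arrivals and exponentially distributed service times. The rates below are hypothetical, for example 8 arrivals per hour at a counter that can serve 10 per hour.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state averages for an M/M/1 queue (rates in the same time unit)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    utilisation = arrival_rate / service_rate
    return {
        "utilisation": utilisation,
        "mean_in_system": utilisation / (1 - utilisation),
        "mean_in_queue": utilisation ** 2 / (1 - utilisation),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
        "mean_wait_in_queue": utilisation / (service_rate - arrival_rate),
    }

# Hypothetical rates: 8 arrivals per hour, 10 served per hour.
metrics = mm1_metrics(8, 10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these rates the server is busy 80% of the time, about 4 people are in the system on average, and each spends about half an hour there. Waiting times grow quickly as the arrival rate approaches the service rate.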
Operational Analytics
Foot Traffic Analytics for Demand Planning and Management using
Queuing Theory / Model
https://www.todayonline.com/singapore/long-queues-supermarkets-after-announcement-circuit-breakers-contain-covid-19
Learning Pathway https://www.iss.nus.edu.sg/stackable-certificate-programmes/business-analytics
Give Us Your Feedback
Day 3 Programme
Survey:
Data and Analytics Essentials
https://docs.google.com/forms/d/e/1FAIpQLScayAdYauu-SwTwzOgKQhpBpK8tsCrv-3cJhYycdlAWH9WThQ/viewform?usp=sf_link
Thank You!
christinecheong@nus.edu.sg
brandon.ng@nus.edu.sg

Contenu connexe

Tendances

Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...
Simplilearn
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
DATAVERSITY
 

Tendances (20)

Data Analytics course.pptx
Data Analytics course.pptxData Analytics course.pptx
Data Analytics course.pptx
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Become a Data Analyst
Become a Data Analyst Become a Data Analyst
Become a Data Analyst
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
Apply (Big) Data Analytics & Predictive Analytics to Business Application
Apply (Big) Data Analytics & Predictive Analytics to Business ApplicationApply (Big) Data Analytics & Predictive Analytics to Business Application
Apply (Big) Data Analytics & Predictive Analytics to Business Application
 
Analytics & Data Strategy 101 by Deko Dimeski
Analytics & Data Strategy 101 by Deko DimeskiAnalytics & Data Strategy 101 by Deko Dimeski
Analytics & Data Strategy 101 by Deko Dimeski
 
Standard Chartered- Threat Intelligence using Knowledge Graphs.pdf
Standard Chartered- Threat Intelligence using Knowledge Graphs.pdfStandard Chartered- Threat Intelligence using Knowledge Graphs.pdf
Standard Chartered- Threat Intelligence using Knowledge Graphs.pdf
 
Data strategy in a Big Data world
Data strategy in a Big Data worldData strategy in a Big Data world
Data strategy in a Big Data world
 
Data Management Services
Data Management ServicesData Management Services
Data Management Services
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Generative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfGenerative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdf
 
ChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptx
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture Deliverables
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at Scale
 
Generative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptxGenerative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptx
 
Real-World Data Governance Webinar: Data Governance Framework Components
Real-World Data Governance Webinar: Data Governance Framework ComponentsReal-World Data Governance Webinar: Data Governance Framework Components
Real-World Data Governance Webinar: Data Governance Framework Components
 
Introduction to Business Anlytics and Strategic Landscape
Introduction to Business Anlytics and Strategic LandscapeIntroduction to Business Anlytics and Strategic Landscape
Introduction to Business Anlytics and Strategic Landscape
 

Similaire à Overview of Data and Analytics Essentials and Foundations

Difference b/w DataScience, Data Analyst
Difference b/w DataScience, Data AnalystDifference b/w DataScience, Data Analyst
Difference b/w DataScience, Data Analyst
3RI Technologies Pvt Ltd
 
Big Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the MarketspaceBig Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the Marketspace
Bala Iyer
 

Similaire à Overview of Data and Analytics Essentials and Foundations (20)

Data Science - Part I - Sustaining Predictive Analytics Capabilities
Data Science - Part I - Sustaining Predictive Analytics CapabilitiesData Science - Part I - Sustaining Predictive Analytics Capabilities
Data Science - Part I - Sustaining Predictive Analytics Capabilities
 
What is data science ?
What is data science ?What is data science ?
What is data science ?
 
Big Data Customer Experience Analytics -- The Next Big Opportunity for You
Big Data Customer Experience Analytics -- The Next Big Opportunity for You Big Data Customer Experience Analytics -- The Next Big Opportunity for You
Big Data Customer Experience Analytics -- The Next Big Opportunity for You
 
The Future of Applied Marketing Research
The Future of Applied Marketing ResearchThe Future of Applied Marketing Research
The Future of Applied Marketing Research
 
3.BITOOLS - DIGITAL TRANSFORMATION AND STRATEGY
3.BITOOLS - DIGITAL TRANSFORMATION AND STRATEGY3.BITOOLS - DIGITAL TRANSFORMATION AND STRATEGY
3.BITOOLS - DIGITAL TRANSFORMATION AND STRATEGY
 
Big data and Marketing by Edward Chenard
Big data and Marketing by Edward ChenardBig data and Marketing by Edward Chenard
Big data and Marketing by Edward Chenard
 
#MarketingShake - Edward Chenard - Descubrí el poder del Big Data para Transf...
#MarketingShake - Edward Chenard - Descubrí el poder del Big Data para Transf...#MarketingShake - Edward Chenard - Descubrí el poder del Big Data para Transf...
#MarketingShake - Edward Chenard - Descubrí el poder del Big Data para Transf...
 
Data storytelling neptune digital space dubai
Data storytelling   neptune digital space dubaiData storytelling   neptune digital space dubai
Data storytelling neptune digital space dubai
 
Enabling Success With Big Data - Driven Talent Acquisition
Enabling Success With Big Data - Driven Talent AcquisitionEnabling Success With Big Data - Driven Talent Acquisition
Enabling Success With Big Data - Driven Talent Acquisition
 
Capturing Marketing Information to Fuel Growth
Capturing Marketing Information to Fuel GrowthCapturing Marketing Information to Fuel Growth
Capturing Marketing Information to Fuel Growth
 
Difference b/w DataScience, Data Analyst
Difference b/w DataScience, Data AnalystDifference b/w DataScience, Data Analyst
Difference b/w DataScience, Data Analyst
 
Artificial Intelligence: Evolution and its Impact on Marketing
Artificial Intelligence: Evolution and its Impact on MarketingArtificial Intelligence: Evolution and its Impact on Marketing
Artificial Intelligence: Evolution and its Impact on Marketing
 
Big Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the MarketspaceBig Data & Business Analytics: Understanding the Marketspace
Big Data & Business Analytics: Understanding the Marketspace
 
Pistoia Alliance Demystifying AI & ML part 2
Pistoia Alliance Demystifying AI & ML part 2Pistoia Alliance Demystifying AI & ML part 2
Pistoia Alliance Demystifying AI & ML part 2
 
data analytics lecture2.pptx
data analytics lecture2.pptxdata analytics lecture2.pptx
data analytics lecture2.pptx
 
Lecture3 business intelligence
Lecture3 business intelligenceLecture3 business intelligence
Lecture3 business intelligence
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
 
Intro to Artificial Intelligence w/ Target's Director of PM
 Intro to Artificial Intelligence w/ Target's Director of PM Intro to Artificial Intelligence w/ Target's Director of PM
Intro to Artificial Intelligence w/ Target's Director of PM
 
What MBA Students Need to Know about CX, Data Science and Surveys
What MBA Students Need to Know about CX, Data Science and SurveysWhat MBA Students Need to Know about CX, Data Science and Surveys
What MBA Students Need to Know about CX, Data Science and Surveys
 



Overview of Data and Analytics Essentials and Foundations

  • 1. Data and Analytics Essentials Christine CHEONG & Brandon NG #ISSLearningFest
  • 2. The Essential of Data and Analytics • Data and Analytics • Analytics Maturity Model • Descriptive Analytics • Predictive Analytics • Prescriptive Analytics #ISSLearningFest Icons made by Vectors Market, http://www.flaticon.com/authors/vectors-market is licensed by Creative Commons BY 3.0, http://creativecommons.org/licenses/by/3.0/
  • 4. “Without data we’re just another person with an opinion.” – W. Edwards Deming http://www.meliorgroup.com/without-data-just-another-person-with-opinion/
  • 5. Insights from Data What you see is not what you get!
  • 6. from HiPPO = Highest Paid Personnel’s Opinion to Data Driven Decision Making Organization
  • 7. Worlds Collide as Augmented Analytics Draws Analytics, BI and Data Science Together Published: 10 March 2020 ID: G00463513, Analyst(s): Carlie Idoine Descriptive/ Diagnostics/ Predictive/ Prescriptive Analytics Models Reports / Dashboards Consumes Produces Descriptive/ Diagnostics/ Predictive/ Prescriptive Analytics Models Reports / Dashboards Business Analyst / Business Intelligence Analyst Data Analyst / Data Scientist Analytics Role
  • 8. Expanding, Understanding & Investigating • Data scientist • Data engineers • Business analyst Exploration & Discovery • Data scientist • Data engineer Foundational Core (Core operational processes) • Business user • Business analyst Establishing Value • Data engineer • Business analyst Data Known Unknown Business Questions Unknown Known Solve Your Data Challenges With the Data Management Infrastructure Model, Refreshed: 3 April 2019 | Published: 19 October 2017 ID: G00336474 Analyst(s): Adam Ronthal, Nick Heudecker Data Analytics Roles and Skills Data Management Infrastructure Model
  • 9. Role of Analytics in Decision Management Analytically Assisted Decision Making (does not specify what to do): Descriptive Analytics - Explain what happened; Diagnostic Analytics - Explore how it happened; Predictive Analytics - Explore what is likely to happen next or in the future. Decision Management: Prescriptive Analytics - Specify what to do, or automatically trigger a response. Source: How Companies Succeed at Decision Management, Published: 19 October 2018 ID: G00341368, Analyst(s): W. Roy Schulte, Erick Brethenoux
  • 10. Worlds Collide as Augmented Analytics Draws Analytics, BI and Data Science Together Published: 10 March 2020 ID: G00463513, Analyst(s): Carlie Idoine
  • 11. Intersection & Interrelationship of Data, Analytics and Decision. Decision: change, movement. Wisdom: understanding, integrated, actionable. Knowledge: contextual, synthesized. Information: useful, organized, structured. Data: signals/know-nothing. PAST: What? - relationships; Why? - patterns. FUTURE: What is the best? - principles; What action? - direction. Analytics types: Descriptive Analytics, Diagnostic Analytics, Predictive Analytics, Prescriptive Analytics. Data Analysis: Artificial Intelligence, Machine Learning, Deep Learning, etc. Data Integration: big data, cloud computing, etc. Data Collection: IoT, sensor network, mobile devices, etc.
  • 13. Hype Cycle for Analytics and Business Intelligence, 2022, 14 July 2022 - ID G00770971, By Analyst(s): Peter Krensky
  • 14. Business Intelligence and Analytics for Decision-Making: right information, right person, right time, right quantity, right quality, right place
  • 15. Effective Insights Communication and Persuasive Presentation with Dashboard Implementation using Tableau / Microsoft Power BI
  • 18. Case Study: Tackling Singapore’s Population Challenges DataLion Team, Tackling Singapore’s Population Challenges, 2018 S2
  • 19. Data Story Telling Framework Diagram DataLion Team, Tackling Singapore’s Population Challenges, 2018 2 KE Unit 3,
  • 25. Hype Cycle for Data Science and Machine Learning, 2022, 29 June 2022 - ID G00770938, By Analyst(s): Farhan Choudhary, Peter Krensky
  • 27. Marketing Decisions Managerial decisions: whether to advertise, change prices, launch a new product or service, assess the impact of marketing and communication effectiveness etc • What factors or product formulations are important in driving product/brand choice? • What prices to charge for different ranges of products? • Which advertising messages/campaigns are effective in deepening engagement with stakeholders? • Which customer/stakeholder segments should we target to drive conversion and/or improve profitability?
  • 28. Marketing Trends • Marketing-mix decisions are increasingly made quantitatively instead of qualitatively. Pricing decisions are routinely made using dynamic quantitative models, as are assortment, channel, and location decisions. • Customer engagement: machine learning algorithms extract consumer preferences from massive online data and help create engaging text and images to attract attention; intelligent agents assist customer engagement to improve experience. • Search engines are where many customer journeys begin. While keywords have been the dominant form of online search, machine learning methods are bringing searches based on other content types within reach, eg with voice recognition, natural language processing, and text-to-speech capabilities. • Recommending the right products to interested consumers can significantly improve marketing performance. Deep neural networks and embedding methods have been leveraged to further enhance performance. Source: Machine learning and AI in marketing – Connecting computing power to human insights, by Liye Ma and Baohong Sun
  • 29. Marketing Trends • Product Development - Rapid experimentation and simulation for product and process innovation • Go-to-market/commercialization - Real time analytics, dynamic pricing optimization, connected product innovation • From Big Data to Small and Wide Data - statistical/machine learning, AI techniques https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/why-tech-enabled-go-to-market-innovation-is-critical-for-industrial-companies
  • 30. Statistical and Machine Learning • Machine Learning arose as a subfield of Artificial Intelligence, while Statistical Learning arose as a subfield of Statistics. While statistical and econometric models of increasing sophistication are being developed, researchers have also turned to machine learning methods as a valuable alternative. • Machine Learning places greater emphasis on large-scale applications and prediction accuracy, while Statistical Learning emphasizes models and their interpretability, precision and uncertainty. The distinction has become increasingly blurred, and there is a great deal of “cross-fertilization”. • Balance a theory-driven perspective with a data-driven one by injecting human insights and domain knowledge into the use of machine learning methods. • Structured data comprises clearly defined data types with patterns that make them easily searchable, while unstructured data is usually not as easily searchable and includes formats like audio, video, and social media postings.
  • 31. Statistical and Machine Learning Problems Structured (eg quantitative demographic and behavioural data) • Identify the effects of demographic and marketing data on insurance policy product purchase • Predict housing prices based on sociodemographic and geospatial data • Establish the relationship between marketing promotion (eg price, location, advertising etc on store level product sales) Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
  • 32. Statistical and Machine Learning Problems Unstructured (eg image, text, speech/voice data etc) • Customize an email spam detection system based on frequently occurring words (features). • Predict user interaction and engagement on social media based on image / facial recognition • Predict positive vs negative sentiments based on attributes of internet movie ratings Spam example • Data: 4601 emails sent to an individual (named George, at HP labs, before 2000). Each is labeled as spam or email. • Goal: build a customized spam filter. • Input features: relative frequencies of 57 of the most commonly occurring words and punctuation marks in these email messages. Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
  • 33. Trends in Machine Learning • ML methods are well positioned to extract rich insights from rich data. While studies have frequently analyzed text and image data, there are opportunities to focus on audio, video, and consumer tracking data, as well as network data and data of hybrid formats. • Opportunities to broaden and extend usage of machine learning methods: while they have been used frequently for prediction and feature extraction, they can also be harnessed for causal and prescriptive analysis. Source: Machine learning and AI in marketing – Connecting computing power to human insights, by Liye Ma and Baohong Sun
  • 34. Trends in Machine Learning • Opportunities to broaden and extend ML usage across the entire customer purchase journey, and to develop decision-support capabilities covering all aspects of marketing functions, from more strategic areas like brand positioning and competitive analysis to operational areas like customer satisfaction/service delivery. Source: Machine learning and AI in marketing – Connecting computing power to human insights, by Liye Ma and Baohong Sun
  • 35. Machine Learning Tasks Supervised Learning (eg prediction and forecasting techniques) • Outcome measurement Y (also called dependent variable, response, target) vs a set of predictors (features) measured on a set of samples • Regression vs Classification Problem Unsupervised Learning (eg segmentation and association) • No outcome variable, just a set of predictors (features) measured on a set of samples. • Find groups of samples that behave similarly, find features that behave similarly, find linear combinations of features with the most variation. Useful as a pre-processing step for supervised learning.
  • 36. Machine Learning Tasks Semi-supervised Learning and Transfer Learning • Semi-supervised: output is known for only a subset of the data. The instances in the training dataset for which the output is not observed are nonetheless used to improve learning, eg through label propagation. • Transfer learning: researchers leverage an existing model, trained using a different dataset or for a different purpose. For example, in image analysis an existing model trained using a large set of images is updated using the specific images of the research project. Active Learning and Reinforcement Learning • Active learning: only limited training instances are available at first. The goal is to maximize the predictive accuracy while minimizing the data requirement. Determining the most important instances is a key focus of active learning. • Reinforcement learning: the learning agent continuously interacts with the surrounding environment by taking actions and observing feedback. The learning algorithm needs to determine the actions to take both to learn the environment’s characteristics and to craft an optimal policy given the states.
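The supervised/unsupervised split on slides 35 and 36 can be illustrated with a small sketch. It is not taken from the slides: the function names `fit_line` and `two_means_1d` and the toy data are hypothetical, and only the Python standard library is used. The first function is a regression (an outcome Y is available); the second groups samples with no outcome variable.

```python
def fit_line(x, y):
    """Supervised (regression): least-squares slope and intercept for y ~ x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_means_1d(values, max_iter=100):
    """Unsupervised: split 1-D values into two groups (k-means with k=2)."""
    centers = [min(values), max(values)]  # simple initialisation (assumption)
    for _ in range(max_iter):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearest centre; ties go to the first.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Recompute centres; keep the old centre if a group is empty.
        new_centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


# Toy data (hypothetical): an outcome is known for the regression only.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
centers, groups = two_means_1d([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])
```

In the regression the response guides the fit, whereas the clustering step only sees the features, which is why it is often used as a pre-processing step before a supervised model.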
  • 37. Supervised Learning - Feature Selection Methods Subset selection: we identify a subset of the p predictors that we believe to be related to the response. We then fit a model using least squares on the reduced set of variables. Dimension reduction: we project the p predictors into an M-dimensional subspace, where M < p. This is achieved by computing M different linear combinations, or projections, of the variables. These M projections are then used as predictors to fit a linear regression model by least squares. Shrinkage (or regularization) for large sparse data: we fit a model involving all p predictors, but the estimated coefficients are shrunken towards zero relative to the least squares estimates. This shrinkage (also known as regularization) has the effect of reducing variance and can also perform variable selection. Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
  • 38. Feature Selection (Flexibility vs Interpretability) Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021 Example: Income prediction based on socio-demographic survey data (eg age, education, seniority etc)
  • 39. Feature Selection (Dimension reduction) Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021 Example: Income prediction on socio-demographic and geo-location data
  • 40. Feature Selection (Regularization) Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021 Example: Credit risk assessment on demographic and behavioural data
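The shrinkage idea from slides 37 and 40 can be seen in the simplest possible case: one centred predictor and no intercept. The sketch below is an illustration under that assumption, not the slides' own example; the helper names and data are hypothetical. Ridge minimises Σ(y − bx)² + λb² and lasso minimises Σ(y − bx)² + λ|b|, which give the closed forms used in the code.

```python
def _sums(x, y):
    """Return (Sxy, Sxx) for a single predictor with no intercept."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy, sxx


def ols_coef(x, y):
    """Least squares estimate: Sxy / Sxx."""
    sxy, sxx = _sums(x, y)
    return sxy / sxx


def ridge_coef(x, y, lam):
    """Ridge estimate: shrinks towards zero but never reaches it exactly."""
    sxy, sxx = _sums(x, y)
    return sxy / (sxx + lam)


def lasso_coef(x, y, lam):
    """Lasso estimate (soft-thresholding): can be exactly zero."""
    sxy, sxx = _sums(x, y)
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * max(abs(sxy) - lam / 2.0, 0.0) / sxx


# Hypothetical centred data with a true slope of roughly 2.
x = [-2, -1, 0, 1, 2]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

b_ols = ols_coef(x, y)
b_ridge = ridge_coef(x, y, lam=10)
b_lasso = lasso_coef(x, y, lam=10)
b_lasso_big = lasso_coef(x, y, lam=50)  # penalty large enough to drop x
```

With a large enough penalty the lasso coefficient becomes exactly zero, which is the sense in which shrinkage "can also perform variable selection"; ridge only shrinks.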
  • 41. Big data vary in shape. These call for different approaches Big Data Learning Problems Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
  • 42. Big Data Learning Problems Example: IMDB (internet movie database) ratings using machine/deep learning RNN - https://web.cs.dal.ca/~shali/project2.html Many data sources are sequential in nature, and call for special treatment when building predictive models. For example, documents such as book and movie reviews, newspaper articles and tweets. We can use the sequence of words occurring in a document to make predictions about the label for the entire document (eg positive or negative sentiment). Machine/deep learning approaches eg recurrent neural networks can be used for classification, sentiment analysis, and language translation.
  • 43. Big Data Learning Problems Example: Image recognition in social media context using machine/deep learning CNN - https://www.semanticscholar.org/paper/Toward-Large-Scale-Face-Recognition-Using-Social-Stone-Zickler/2f2d69bdfaca54eb3a6ede3e5eb2c76713bb8064 Neural networks rebounded around 2010 with big successes in image classification. Around that time, massive databases of labeled images were being accumulated, with ever-increasing numbers of classes. A special family of convolutional neural networks (CNNs) has evolved for classifying images on a wide range of problems. CNNs mimic to some degree how humans classify images, by recognizing specific features or patterns anywhere in the image that distinguish each particular object class.
  • 44. From Big to Small and Wide Data Example: Online shopping analysis using models on large sparse data (B2C) A marketing analyst interested in understanding people’s online shopping patterns could treat as features all of the search terms entered by users of a search engine. This is sometimes known as the “bag-of-words” model. The same researcher might have access to the search histories of only a few hundred or a few thousand search engine users who have consented to share their information with the researcher. For a given user, each of the p search terms is scored present (1) or absent (0), creating a large binary feature vector. Then n ≈ 1,000 and p is much larger. Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
  • 45. From Big to Small and Wide Data Example: Webpage browsing analytics using models on large sparse webpage session information (B2C) Quantcast is a digital marketing company. Data are five-minute internet sessions. Binary target is type of family (≤ 2 adults vs adults plus children). 7 million features of session info (web page indicators and descriptors). Divided into training set (54M), validation (5M) and test (5M). All but 1.1M features could be screened out because they had ≤ 3 nonzero values. Fit 100 models in 2 hours in R. Richest model had 42K nonzero coefficients, and explained 10% deviance (like R-squared). Source : Introduction to Statistical Learning by Trevor Hastie and Robert Tibshirani, Second Edition 2021
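As a rough illustration of the bag-of-words encoding on slide 44, the sketch below turns each user's search history into a binary vector over the full vocabulary. It assumes the common convention of 1 for a term the user searched and 0 otherwise; the function name `binary_bag_of_words` and the sample histories are hypothetical.

```python
def binary_bag_of_words(histories):
    """Encode each search history as a 0/1 vector over all observed terms."""
    # The vocabulary is every distinct term seen across all users (p terms).
    vocabulary = sorted({term for history in histories for term in history})
    column = {term: i for i, term in enumerate(vocabulary)}

    rows = []
    for history in histories:
        row = [0] * len(vocabulary)
        for term in history:
            row[column[term]] = 1  # 1 = the user searched this term
        rows.append(row)
    return vocabulary, rows


# Hypothetical search histories for n = 3 consenting users.
histories = [
    ["running shoes", "protein bar"],
    ["laptop", "running shoes"],
    ["laptop"],
]
vocabulary, rows = binary_bag_of_words(histories)
```

With real search logs the vocabulary grows far faster than the number of users, which is how the n ≈ 1,000 and p much larger setting arises; most entries in each row are then zero, so a sparse representation would normally be preferred over dense lists.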
  • 46. Observational vs Experimental Studies In observational studies, researchers are only observers. They measure what people do, or say they would do in a situation not of their making (eg surveys and focus groups) In contrast, when conducting experiments, researchers control the important variables that influence consumer behavior to more precisely observe the effect. https://www.youtube.com/watch?v=qwfd8cf3_UY&feature=youtu.be
  • 47. Observational studies/data Data: then and now. Data in the 80s-90s: retail scanner data, survey data, transactional/behavioral data. Data now: + clickstream data + social networking + product review + search data + mobile + text. • Primary & secondary market research/trends (eg structured vs unstructured data including social media) • Knowledge/Consumer immersion (eg observation studies/ethnography, extracting value from connected products, real-time analytics eg smart sensors etc) • Quantitative data (eg direct questioning, buy-response surveys, transactional data)
  • 48. Observational studies (ethnography) https://www.youtube.com/watch?v=yjFkUqAeUq8 Ethnography is a type of observation research enabled by offline and online tools. It is a convenient way for participants to share how they interact with products and services in their natural environment. An Ethnographic Case Study of Ikea Shoppers
  • 50. Why experiments? Experiments allow analysts to answer business questions related to cause and effect. It is important for the analyst to know whether she has an “umbrella problem” or a “rain dance problem.” If all she wants to know is whether or not she should carry an umbrella, then she has a pure prediction problem and causal questions are of secondary importance; she only needs to know whether the probability of rain is high or low. On the other hand, if there has been a long drought and she wants to end it, prediction is of little value: causal questions are of primary importance. If she wants to induce rainfall, she needs to know what variables cause rain and then try to manipulate those variables
  • 51. Business Experiments • A/B/n testing using hypothesis testing (eg compare landing pages to see which one generates more sales) • Multivariate analysis using predictive analytics (eg screening designs and factorial designs, conjoint analysis) https://www.youtube.com/watch?v=zFMgpxG-chM
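For the landing-page comparison mentioned on slide 51, one common hypothesis test is the two-proportion z-test. The sketch below is a minimal version using only the standard library; the slide does not say which test the authors use, so the choice of test, the function names and the conversion counts are all assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both pages convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 5,000 visitors per landing page.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
z_same, p_same = two_proportion_z_test(200, 5000, 200, 5000)
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone. For an A/B/n test with more than two variants, the pairwise comparisons would need a multiple-testing correction, which this sketch does not cover.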
  • 52. Business Experiments Right Customers, Right Channels, Right Comms Messages (Example : Alcon case study) https://www.edenspiekermann.com/case-studies/alcon-wearlenses/
  • 54. Optimal mix of Data Science and Machine Learning Techniques. Predictive: Predictions (probability of a specific outcome); Forecasting (predicting a series of outcomes over time, univariate vs multivariate); Simulation (predicting multiple outcomes and highlighting uncertainties). Prescriptive: Rules (predefined framework for choosing between alternatives); Optimisation (outcome-driven, constraint-based evaluation of an interdependent set of options). Diagram labels: Decision Making; Greater Business Impact. Source: When and How to Use Advanced Analytics Techniques to Solve Business Problems, Published 17 September 2021 - ID G00750951, By Analyst(s): Carlie Idoine, Erick Brethenoux
  • 55. Three Emergent AI Technologies: Pre-trained AI Model, Optimization Solver, Generative AI. Source: Quick Answer: What Three Emergent AI Technologies Will Have an Impact in 2022?, Published 11 March 2022 - ID G00752286, Owen Chen
  • 56. Prescriptive Analytics Optimisation Techniques: Linear programming, Non-linear programming, Goal programming, Dynamic programming, Analytic Hierarchy Process, Assignment Model, Network model, Monte Carlo Simulation, Markov Chain, Queuing Model, Learning Curve, Design of Experiment, Scenario Analysis and Planning, Game Theory, Decision Tree, Utility Theory, Graph Theory
  • 57. Application of Optimisation: Travelling Salesman Problem; Traffic and Shipment Routing: route (travel time, cost, distance) optimisation. Introduction to Genetic Algorithm & their application in data science https://www.analyticsvidhya.com/blog/2017/07/introduction-to-genetic-algorithm/ Techniques: Linear programming (Genetic Algorithm), Monte Carlo Simulation
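To show what the Travelling Salesman Problem on slide 57 asks for, the sketch below solves a tiny instance by exhaustive search. This is deliberately not the genetic algorithm the slide points to: brute force is only feasible for a handful of stops, which is why heuristics are used in practice. The function name and the distance matrix are hypothetical.

```python
from itertools import permutations


def shortest_tour(dist):
    """Try every tour that starts and ends at stop 0; return the shortest."""
    n = len(dist)
    best_tour, best_length = None, float("inf")
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)
        # Sum the leg distances between consecutive stops in the tour.
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_tour, best_length = tour, length
    return best_tour, best_length


# Hypothetical symmetric travel-time matrix for 4 stops (row/column = stop).
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
best_tour, best_length = shortest_tour(dist)
```

The number of candidate tours grows factorially with the number of stops, so the same matrix could represent cost or distance instead of travel time, but the search strategy would have to change long before a realistic delivery route is reached.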
  • 58. Inventory Optimisation and Simulation Inventory Simulation using Monte Carlo Simulation https://cloud.anylogic.com/model/b0156f6d-6c04-431b-b48d-1b875b2720e7?mode=SETTINGS Monte Carlo Simulation
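A Monte Carlo inventory simulation of the kind named on slide 58 can be sketched in a few lines. The model here is an assumption chosen for brevity and is not the linked AnyLogic model: stock is topped up to a fixed level every day, daily demand is drawn uniformly from a range, and we estimate how often demand exceeds the stock on hand. All names and parameter values are hypothetical.

```python
import random


def simulate_stockout_rate(stock_level, n_days=10_000,
                           demand_low=0, demand_high=20, seed=42):
    """Estimate the share of days on which demand exceeds the stock level."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    stockouts = 0
    for _ in range(n_days):
        # Assumed demand model: uniform integer demand each day.
        demand = rng.randint(demand_low, demand_high)
        if demand > stock_level:
            stockouts += 1
    return stockouts / n_days


# Compare candidate stock levels under the same simulated demand stream.
rate_10 = simulate_stockout_rate(stock_level=10)
rate_15 = simulate_stockout_rate(stock_level=15)
rate_20 = simulate_stockout_rate(stock_level=20)
```

Because each call reuses the same seed, the candidate stock levels face identical demand, so the differences in stock-out rate reflect the policy rather than sampling noise. A fuller model would add lead times, holding costs and a fitted demand distribution.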
  • 60. Locational Analytics Sensemaking of Customer Locational Data for Geomarketing
  • 61. Retail Analytics In-Store Operational Excellence via Real Time Streaming Analytics IoT powered Intelligent Retail, https://www.youtube.com/watch?v=n-ouKu9tNPM
  • 62. Operational Analytics Foot Traffic Analytics for Demand Planning and Management using Queuing Theory / Model https://www.channelnewsasia.com/commentary/singapore-slow-reopening-seniors-elderly-strategy-covid-19-2230601 https://www.channelnewsasia.com/singapore/covid-singapore-vaccine-vaccination-centre-behind-the-scenes-1882811 Queuing Model
  • 63. Operational Analytics Foot Traffic Analytics for Demand Planning and Management using Queuing Theory / Model https://www.todayonline.com/singapore/long-queues-supermarkets-after-announcement-circuit-breakers-contain-covid-19
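The queuing model referenced on slides 62 and 63 is not specified, so the sketch below assumes the simplest textbook case, an M/M/1 queue: a single server, Poisson arrivals at rate λ and exponential service at rate μ. Under those assumptions the steady-state metrics have closed forms. The function name and the example rates are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for an M/M/1 queue (rates per the same time unit)."""
    if arrival_rate >= service_rate:
        raise ValueError("Queue is unstable: arrival rate must be below service rate")
    utilisation = arrival_rate / service_rate
    return {
        "utilisation": utilisation,
        # Average number of customers in the system and waiting in the queue.
        "avg_in_system": utilisation / (1.0 - utilisation),
        "avg_in_queue": utilisation ** 2 / (1.0 - utilisation),
        # Average time in the system and time spent waiting before service.
        "avg_time_in_system": 1.0 / (service_rate - arrival_rate),
        "avg_wait_in_queue": utilisation / (service_rate - arrival_rate),
    }


# Hypothetical counter: 8 arrivals per hour, able to serve 10 per hour.
metrics = mm1_metrics(arrival_rate=8, service_rate=10)
```

Waiting time rises sharply as utilisation approaches 1, which is why small increases in foot traffic can produce long queues. A site with several counters would need a multi-server model instead.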
  • 65. Give Us Your Feedback #ISSLearningFest Day 3 Programme
  • 66. Survey: Data and Analytics Essentials #ISSLearningFest https://docs.google.com/forms/d/e/1FAIpQLScayAdYauu-SwTwzOgKQhpBpK8tsCrv-3cJhYycdlAWH9WThQ/viewform?usp=sf_link
  • 67. Thank You! christinecheong@nus.edu.sg brandon.ng@nus.edu.sg #ISSLearningFest Icons made by Vectors Market, http://www.flaticon.com/authors/vectors-market is licensed by Creative Commons BY 3.0, http://creativecommons.org/licenses/by/3.0/