1. www.neosperience.com | blog.neosperience.com | info@neosperience.com
September 2017
Neosperience.
The Digital Customer Experience.
Artificial Intelligence:
Hype, Threats and promises of the upcoming revolution
2. An unclear definition
During the last century, many attempts were made to define Artificial Intelligence, following our increasing
knowledge of the processes of the human mind. Since the boundaries of our understanding shift from day to day, the
definition changes over time.
“I, Robot” provides a great picture of this problem:
Detective Del Spooner: Human beings have dreams. Even dogs have dreams, but not you, you are just a machine.
An imitation of life. Can a robot write a symphony? Can a robot turn a... canvas into a beautiful masterpiece?
Sonny: Can *you*?
What is Artificial Intelligence?
References:
Brooks - Elephants don't play chess
Stanisław Lem - The Cyberiad
3. The Law of Accelerated Growth
An analysis of the history of technology shows that
technological change is exponential, contrary to the common-
sense “intuitive linear” view.
Technology growth throughout history has been exponential, and it is
not going to stop until it reaches a point where innovation
happens at a seemingly infinite pace. Kurzweil called this
event the singularity.
After the singularity, something completely new will shape our
world.
Artificial Narrow Intelligence is evolving into Artificial
General Intelligence, then into Artificial Super Intelligence.
Why is it happening now?
Everyone wants to live forever
An AGI would be able to fix our world, since many illnesses can be framed as Artificial Intelligence problems. Computational
biology is moving fast towards self-learning medications, and these are just the beginning of the adoption of nano-machines.
References:
https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
4. Artificial Narrow Intelligence
Which AI?
Artificial Intelligence constrained to a narrow task (Siri, Google Search, Google Car, ESB, etc.).
Artificial General Intelligence
The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the
same sense human beings have minds.
Artificial Super Intelligence
An AGI capable of outsmarting human brains, performing thinking and learning tasks at an unprecedented pace.
5. Different approaches
How?
• Filling gaps in Existing Knowledge
• Understand and apply Knowledge
• Semantically reduce uncertainty
• Notice similarity between old/new
Human intelligence relies on a powerful supercomputer, our brain, which is much more powerful than
current HPC servers and is capable of switching between many different “algorithms” to
understand reality, each of them context-independent.
The most powerful capability of our brain, and the common denominator of all these features, is the
ability to learn from experience. Learning is the key.
6. Tribes of Artificial Intelligence
Symbolists
Inverse deduction/Logical deduction - Google Cat
Connectionists
Neural Network - back propagation - Classifiers / NLP
Evolutionaries
Genetic Algorithms - Fit function - Path finding / solution discovery
Bayesians
Probabilistic inference - Markov Chains - Pattern Recognition
Analogizers
Nearest Neighbor / Kernel Machines - Recommendation systems
8. Are AI and Machine Learning the same thing?
Almost...
Machine Learning is the most promising
field of Artificial Intelligence, so it is often
used as a synonym for AI, even though the latter is
broader and also includes knowledge generation
algorithms such as path finding and solution
discovery.
Deep Learning is a subset of Machine
Learning algorithms related to pattern
recognition and reinforcement learning.
9. An operational definition
Machine Learning
“A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P if its performance at tasks in T, as measured by P, improves with experience E”
Machine Learning (ML) algorithms take data as input, because data represents the experience.
This is a focal point of Machine Learning: a large amount of data is needed to achieve good performance.
The ML equivalent of a program is called an ML model; it improves over time as
more data is provided, through a process called training.
Data must be prepared (or filtered) to be suitable for the training process. Generally, input data must be
arranged into an n-dimensional array in which every item represents a sample.
ML performance is measured in probabilistic terms, with metrics such as accuracy or precision.
The importance of Experience
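The definition above can be made concrete with a toy learner. The following is a minimal sketch, not taken from the original slides: the task T is deciding whether a number lies above a hidden threshold, the experience E is a set of labelled samples, and the performance measure P is accuracy on fixed test points. The function names (`label`, `train`, `accuracy`) and the threshold value 0.6 are assumptions chosen for illustration.

```python
def label(x, threshold=0.6):
    """Ground truth for task T: is x at or above a hidden threshold?"""
    return x >= threshold


def train(samples):
    """Learn a threshold from labelled samples (the experience E)."""
    negatives = [x for x in samples if not label(x)]
    positives = [x for x in samples if label(x)]
    highest_negative = max(negatives, default=0.0)
    lowest_positive = min(positives, default=1.0)
    # Place the learned threshold halfway between the two classes.
    return (highest_negative + lowest_positive) / 2


def accuracy(learned_threshold, test_points):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum((x >= learned_threshold) == label(x) for x in test_points)
    return correct / len(test_points)


test_points = [(j + 0.5) / 100 for j in range(100)]
results = {}
for n in (4, 16, 64, 256):
    # n evenly spaced training samples in the interval (0, 1)
    samples = [(i + 0.5) / n for i in range(n)]
    results[n] = accuracy(train(samples), test_points)
    print(n, results[n])
```

With these inputs the accuracy grows from 0.90 with 4 samples to 0.98, 0.99 and finally 1.00 with 256 samples: performance at T, measured by P, improves with E.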
10. Types of Machine Learning
Machine Learning - Taxonomy
Machine learning tasks are typically classified into three broad categories, depending on the nature of the
learning "signal" or "feedback" available to a learning system.
Input-based classification
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Output-based classification
• Regression
• Classification
• Clustering
• Density estimation
• Dimensionality reduction
11. Regression is an instance of Supervised Learning
Regression
Regression analysis helps one understand how the
typical value of the dependent variable (or 'criterion
variable') changes when any one of the independent
variables is varied, while the other independent
variables are held fixed.
It is a statistical method of data analysis. The most
common algorithm is the least squares method, which
provides an estimate of the regression parameters.
When the dataset is not trivial, the estimate is obtained
through gradient descent.
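As an illustration of the two estimation methods mentioned above, the sketch below fits a straight line to a handful of points, first with the closed-form least squares solution and then with gradient descent on the mean squared error. The data, the learning rate and the number of steps are assumptions made for this example only.

```python
def least_squares(xs, ys):
    """Closed-form least squares estimate of slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Iteratively minimise the mean squared error of slope * x + intercept."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical observations that roughly follow y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

ls_slope, ls_intercept = least_squares(xs, ys)
gd_slope, gd_intercept = gradient_descent(xs, ys)
print(ls_slope, ls_intercept)
print(gd_slope, gd_intercept)
```

Both methods should agree on a slope of about 1.97 and an intercept of about 1.06. On a dataset this small the closed form is the obvious choice; gradient descent becomes useful when the data no longer fits a direct solution.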
12. Statistical regression is used to make predictions about data, based on historical series
Use cases - Regression
Regression, even in its simplest form (Linear Regression), is a good tool to learn from data and make
predictions based on data trends.
Common scenarios for Regression are:
• Stock price value
• Product Price Estimation
• Age estimation
• Customer satisfaction rate: by defining variables such as response time and resolution ratio, we can forecast
satisfaction level or churn
• Customer Conversion rate estimation (based on click data, origin, timestamp, …)
13. Classification is an instance of Supervised Learning
Classification
Classification is the problem of identifying to which of a set
of categories (sub-populations) a new observation belongs,
on the basis of a training set of data containing observations
(or instances) whose category membership is known.
The most commonly used algorithms for classification are:
• Logistic (Logit) Regression
• Decision Trees
• Random Forest
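To give an idea of how the first algorithm in the list works, here is a minimal sketch of binary logistic regression trained with gradient descent on a single feature. The training data (a centred, hypothetical "score" with a known class for each observation), the learning rate and the step count are assumptions for illustration; a real project would normally rely on an existing library.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, steps=2000):
    """Fit weight and bias by gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        bias -= learning_rate * sum(errors) / n
    return weight, bias


def predict(weight, bias, x):
    """Return class 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(weight * x + bias) >= 0.5 else 0


# Hypothetical training set: observations with known category membership
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

weight, bias = train_logistic(xs, ys)
print(predict(weight, bias, 1.2), predict(weight, bias, -1.2))
```

New observations are then assigned to a category by calling `predict` with the learned parameters.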
14. Classification is used to detect the binary outcome of a variable
Use cases - Classification
Classification is often used to assign people to pre-defined classes (good-payer/bad-payer, in/out of
target, etc.).
Common scenarios for Classification are:
• Credit scoring
• Human Activity Recognition
• Spam/Not Spam classification
• Customer conversion prediction
• Customer churn prediction
• Customer personas classification
15. Clustering is an instance of Unsupervised Learning
Clustering
is the task of grouping a set of objects in such a way
that objects in the same group (called a cluster) are
more similar (in some sense or another) to each
other than to those in other groups (clusters).
Algorithms differ in the
similarity function they use:
• Centroid-based clusters
• Density-based clusters
• Random Forest
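A centroid-based approach can be sketched with the classic k-means procedure, which alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. The points and the initial centroids below are hypothetical, and the function names are chosen for this example only.

```python
def squared_distance(p, q):
    """Squared Euclidean distance, used here as the similarity function."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iterations=100):
    """Group points around centroids; returns (labels, centroids)."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

No category is given in advance: the two groups emerge only from the distances between points, which is what makes this an unsupervised technique.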
16. Clustering is used to segment data
Use cases - Clustering
Clustering labels each sample with a name representing the cluster it belongs to. Labelling can be exclusive or
multiple. Clusters are dynamic structures: they adapt to new samples coming into the model as soon as they
are labelled.
Common scenarios for Clustering are:
• Similar interests recognition
• Shape detection
• Similarity analysis
• Customer base segmentation
17. An operational definition
Deep Learning
“A class of machine learning techniques that exploit many layers of non-linear information processing for
supervised or unsupervised feature extraction and transformation, and for pattern analysis and
classification.”
Deep Learning (DL) is based on non-linear structures that process information. The “deep” in the name comes
from the contrast with “traditional” ML algorithms, which usually use only one layer. What is a layer?
A parameterized function that receives data as input and outputs a transformed representation; its weights are
adjusted during training to minimize a cost function.
The more complex the data you want to learn from, the more layers are usually needed. The number
of layers is called the depth of the DL algorithm.
How “deep” is your deep learning?
18. An operational definition
Neural Networks (ANN)
“computing systems inspired by the biological neural networks that constitute animal brains. Such systems
learn (progressively improve performance) to do tasks by considering examples, generally without task-
specific programming”
An ANN is based on a collection of connected units called artificial neurons (analogous to neurons in a
biological brain). Each connection (synapse) between neurons can transmit a signal to another neuron. The
receiving (postsynaptic) neuron can process the signal(s) and then signal downstream neurons connected
to it. Neurons may have state, generally represented by real numbers, typically between 0 and 1. Neurons
and synapses may also have a weight that varies as learning proceeds, which can increase or decrease the
strength of the signal that it sends downstream. Further, they may have a threshold such that only if the
aggregate signal is below (or above) that level is the downstream signal sent.
Typically, neurons are organized in layers. Different layers may perform different kinds of transformations on
their inputs. Signals travel from the first (input), to the last (output) layer, possibly after traversing the layers
multiple times.
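The flow of a signal through layers of weighted neurons can be sketched in a few lines. In this illustrative example the weights and biases are picked by hand (an assumption, no training takes place) so that a network with one hidden layer computes the XOR of two binary inputs; the bias plays the role of the threshold described above.

```python
import math


def sigmoid(z):
    """Squash the aggregate signal into a state between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias and applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


# Hand-picked parameters (hypothetical, not learned):
# hidden neuron 1 behaves like OR, hidden neuron 2 like NAND,
# and the output neuron like AND of the two hidden states.
HIDDEN_WEIGHTS = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_BIASES = [-10.0, 30.0]
OUTPUT_WEIGHTS = [[20.0, 20.0]]
OUTPUT_BIASES = [-30.0]


def network_forward(inputs):
    """Propagate a signal from the input layer to the output layer."""
    hidden = layer_forward(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer_forward(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(network_forward([a, b])))
```

During learning, an algorithm such as back propagation would adjust these weights automatically instead of having them written by hand.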
20. Neural Networks are a type of Deep Learning.
Neural Networks - Classification
Neural Networks can be used for supervised, unsupervised or reinforcement learning. They are the
best-known expression of Deep Learning (even if not the only one).
Neural Networks can be classified by signal propagation method:
• Convolutional Neural Networks (CNN): the most promising networks, with a wide range of
applications from Computer Vision to NLP
• Recurrent Neural Networks (RNN): used for language translation, content-based image processing, word
prediction, language detection, text summarization
• Feed-Forward Neural Networks (FFNN): used for standard prediction (a powerful alternative to
regression/classification)
21. Practical applications
Use cases - Neural Networks
• Voice Search/Recognition: Automotive, Messaging, IoT
• Sentiment Analysis: CRM
• Text Recognition/Translation, Language Detection, Text Summarization: Messaging, Personalization,
Profile analysis
• Image Recognition, Classification, Motion Detection: Computer Vision, Image Classification, Face
Detection, Face Tracking
• Time Series: behavioural analysis, Pattern matching, Trading
• Video detection/Object Tracking: Vehicle Tracking, Path detection, Object Classification, Face
Recognition
22. Data pre-processing is the key.
Training a Machine Learning model
Data must be prepared before being fed into the model for training. From an abstract perspective, data
must be reduced to a single value for each sample (in each dimension) and made available to NN processing.
Data pre-processing (filtering) is a key moment in every Machine/Deep Learning application, since it can have
an impact on performance and, if not done correctly, could introduce bias
(e.g. filtering out relevant dimensions, thus making the model less accurate).
Data filtering is a critical step whose complexity can undermine the result of the whole process.
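A minimal sketch of this preparation step follows, assuming hypothetical customer records with two numeric fields and one categorical field (the field names `age`, `visits` and `channel` are invented for the example). Numeric columns are rescaled to the range 0 to 1, and the category is one-hot encoded, so that every sample ends up as a list with a single number per dimension.

```python
def min_max_scale(column):
    """Rescale a numeric column to the range 0..1."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]  # constant column carries no information
    return [(value - low) / (high - low) for value in column]


def one_hot(value, categories):
    """Encode a categorical value as one number per known category."""
    return [1.0 if value == category else 0.0 for category in categories]


# Hypothetical raw records, as they might come from a CRM export
raw_records = [
    {"age": 20, "visits": 2, "channel": "web"},
    {"age": 40, "visits": 10, "channel": "store"},
    {"age": 60, "visits": 6, "channel": "web"},
]

channels = sorted({record["channel"] for record in raw_records})
ages = min_max_scale([record["age"] for record in raw_records])
visits = min_max_scale([record["visits"] for record in raw_records])

# One row per sample, one number per dimension
samples = [
    [age, visit] + one_hot(record["channel"], channels)
    for age, visit, record in zip(ages, visits, raw_records)
]
print(samples)
```

Dropping a column at this stage (for example `channel`) is exactly the kind of filtering choice that can remove a relevant dimension and lower the accuracy of the trained model.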
23. Models are evaluated probabilistically
Evaluating models
Each model is a statistical object, so it comes with an error estimate and a variance.
Model accuracy is measured as the difference between 100% and the error rate of the predictions the model
makes over a hypothetically infinite number of evaluations.
Good or bad accuracy is closely related to the state of the art (e.g. for numeric digit recognition an accuracy
of less than 90% is considered a bad result).
As a rule of thumb, an accuracy of less than 75% is not considered acceptable, since it would be too close to
50%, which is the result of tossing a coin (a random result) in a binary problem.
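In practice the evaluation is done on a finite set of labelled examples. The sketch below computes accuracy and precision for a binary model, assuming hypothetical lists of true labels and predictions; the function names are chosen for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    truths_of_predicted_positives = [
        t for t, p in zip(y_true, y_pred) if p == positive
    ]
    if not truths_of_predicted_positives:
        return 0.0  # the model never predicted the positive class
    hits = sum(t == positive for t in truths_of_predicted_positives)
    return hits / len(truths_of_predicted_positives)


# Hypothetical evaluation set: true labels and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

model_accuracy = accuracy(y_true, y_pred)
model_precision = precision(y_true, y_pred)
print(model_accuracy, model_precision)
```

Here the model reaches an accuracy of 0.625 and a precision of 0.6, which under the rule of thumb above would not be considered acceptable.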
25. Push messaging
Fields of application
A large hotel chain wanted to solve this problem: 90k travelers are stranded every day in America across 5,145
airports. How can the hotel ensure that they show up at the right moment for all these people? The solution was to
leverage real-time signals like bad weather, flight delays at 5,145 airports, and other such data, combine that with ML
powered algorithms to automate ads and messaging in the proximity of local airports. All sans human-control. Result?
60% increase in bookings in targeted areas. ML + Automation = Profit.
Marketing budget forecast
Leverage ML to do the above campaign across Search, Display and Video. AND (are you sitting down?) rather than
guessing how much credit to give each marketing strategy from hundreds of thousands of customer touch points, ML
powered attribution can magically analyze every individual’s journey and automatically recommend shifts to your
budget.
26. Customer behaviour patterns
Fields of application
Leverage pattern recognition to understand common interests between customers belonging to different clusters and
push personalized messages.
Customer tracking in store
Customers can be tracked by extracting their movements in store. This makes it possible to infer their interests,
identify returning customers (through face recognition) and gauge sentiment (through face analysis),
e.g. tracking persons of interest with Amazon Rekognition.
Generate smart content
Smart Agents such as Persado generate personalized wording, starting from a few words, depending on the profile of
the customer landing on a site.
27. Amazon Machine Learning
Tools
A good tool for Supervised Learning using only one layer; it implements logistic regression, multiclass
classification and linear regression. Unfortunately, support for SVM or Decision Trees has yet to come.
Elastic MapReduce (Hadoop / Spark)
Implements a MapReduce cluster that can be exploited for the gradient descent method, which is a fundamental
building block of almost every machine and deep learning algorithm. Spark can implement SVM and Random Forest.
TensorFlow
The de facto state of the art for Machine/Deep Learning; it provides a set of built-in functions for training an
ML model and building a Neural Network. It is natively optimized for available resources such as GPU and CPU.
Apache MXNet
An open-source deep learning framework, comparable to TensorFlow and Caffe, designed to scale across multiple GPUs and clusters.
28. A Master Algorithm
Where are we going?
A Master Algorithm is an algorithm capable of solving a narrow problem without needing to be developed specifically for
a given context.
“The” Master Algorithm
The master algorithm is the ultimate learning algorithm. It's an algorithm that can learn anything from data and
it's the holy grail of machine learning. It's what we're working towards and the book is about both how we're
going to get there and the implications that it's going to have for life. Also, the impact that machine learning is
already having.
29. Capabilities of a Universal Learner (UL)
Real Personal Assistants
A Universal Learner could easily outperform in-store agents, understanding customer requests and providing
feedback or suggestions.
World Wide Brains
It’s the final step of search engines: a wide intelligence capable of filtering fake news, understanding the best and
trending topics, and finding the most relevant answer.
Cancer Cures
Cancer and degenerative illnesses can be re-framed as a machine learning problem, and a cure is often impossible
due to constant changes in biological parameters. A UL could apply the exact cure and adjust it over time,
learning from the body's response.
360° recommenders
A UL would be able to suggest the right product to a customer, proposing the best fit for her needs, emotions and
wishes based on historical data.
30. Where to go from here?
Learning Machine Learning
An Introduction to Statistical Learning
http://www-bcf.usc.edu/~gareth/ISL/
Artificial Intelligence for Humans
http://www.heatonresearch.com/aifh/
Learning Machine Learning applications
Marketing applications
https://martechtoday.com/future-ai-marketing-applications-retail-202081
Salesforce Studio Vision
https://martechtoday.com/salesforces-social-studio-can-now-see-202159
Image recognition in Marketing
https://martechtoday.com/ai-image-recognition-transforming-social-media-marketing-202838
31. Additional resources
Pedro Domingos on AI
https://www.youtube.com/watch?v=UPsYGzln-Ys
Immortality or Extinction
http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
Dan Simmons
The Hyperion Cycle
Pedro Domingos
The Master Algorithm