AI101 Guide

  1. 1. AI 101: Everything That You Are Curious to Know, by Sajan Mathew
  2. 2. Hello! What is this Guide? This guide demystifies AI and democratizes knowledge of how AI creates and delivers value. It provides an essential understanding of AI to anyone, whatever their technical knowledge, curiosity, and interest in the technology. Why was this book written? Thanks to the internet, knowledge is available in abundance today at the click of a button. This applies even to niche domains such as AI. There are numerous articles, books and other materials on AI easily available on the World Wide Web; the challenge is that they are either highly technical or philosophical in nature, or they exist in fragments in the form of long posts and blogs. Being an AI enthusiast, I believe that to build a preferable AI future, it is important that we all have a uniform and shared understanding of AI. I spent time educating myself on AI concepts through several resources and captured here some of the finest and simplest explanations of AI technology, which can help anyone learn AI and explain it to others. I understand AI; how do I find opportunities to apply AI technology? I am also creating an AI Strategy Canvas, similar to the Business Model Canvas, which helps you think critically and creatively about opportunities for applying AI: why a problem is a fit for AI, what building blocks of AI are needed, how to apply them, and how this will benefit the business and its customers. If you are interested in trying the AI Strategy Canvas, you can sign up here https://bit.ly/2OGl6BC or if you have feedback on the AI 101 guide, send it to i.sajanmathew@outlook.com
  3. 3. ANDREW NG: "Just as electricity transformed almost everything 100 years ago, today I actually have a hard time thinking of an industry that I don’t think AI will transform in the next several years."
  4. 4. Contents: 5 AI Today for the Future; 7 Artificial Intelligence (Categories of Artificial Intelligence, Types of Artificial Intelligence); 11 Machine Learning (Machine Learning Algorithm Types); 22 Deep Learning (How Deep Learning Works, Deep Learning Models); 27 AI-enabled companies; 28 AI case studies: AI at India’s top eCom firms (Flipkart's Project Mira, Myntra’s Rapid platform and AI initiatives, Amazon India's AI-enabled Smart eCom). Description source: https://bit.ly/2I5pSpQ
  5. 5. AI Today for the Future: AI is not a technology that was developed in the last decade; it has been taking shape in laboratories since the 1950s. It remained with scientists and researchers in the lab until recent advances in computing power and growth in data fueled the development and application of AI for commercial use. AI can bring tremendous economic value to organizations because it can perform actions and tasks that were earlier unthinkable for a computer, including some that are beyond human ability. In the last 5 years, AI has garnered a lot of attention and has become an industry buzzword, much as big data did, attracting enormous investments from angels, VCs, and corporations. AI has developed faster than other similarly promising technologies. Unlike AR/VR, which is still on the hype curve, AI has started to move into the mainstream. It now lives everywhere, from smartphones to refrigerators. It has evolved significantly in the last few years but is still far from achieving human-level intelligence or the state of Singularity. As AI technology matures and evolves, it is also unlocking several possibilities. From AlphaGo beating the world Go champion to systems developing superhuman skills by playing against themselves over and over, AI keeps demonstrating abilities that resemble human thinking and, on specific tasks, performance better than humans. Companies are getting closer to creating general-purpose AI that can intelligently tackle various challenges in science, ranging from designing new drugs to helping farmers increase their crop yield. With every successful AI experiment, a better future is promised. However, the growing ability of AI to succeed at human skills is also raising concerns and worries, the most common fear being that AI and robots will take jobs away from humans.
Michio Kaku says the jobs that are going to be safe from automation are the ones that robots can't do, such as non-repetitive jobs and jobs that require common sense, creativity, and imagination. Some proponents of AI believe that if robots and automation take away jobs, they will also create different kinds of jobs, as the industrial revolution did in the past. Hope you enjoy the read! - Sajan Mathew
  6. 6. What are Artificial Intelligence, Machine Learning, and Deep Learning? AI, Machine Learning, and Deep Learning are very hot buzzwords today, creating both excitement and fear. Even though they are not quite the same thing, they are often used interchangeably or in the same context. To leverage the benefits of AI technology, it is important to understand what AI, ML and DL mean, what they can do and what they can’t do.
  7. 7. Artificial Intelligence: The term "artificial intelligence" was coined in 1956 by Dartmouth Assistant Professor John McCarthy. At a high level, AI can be defined as "an ability of a machine/computer to think intelligently like humans." Basic ‘AI’ has existed for decades, via rules-based programs that deliver rudimentary displays of ‘intelligence’ in specific contexts. Progress, however, has been limited, because algorithms to tackle many real-world problems, such as predicting machine failures or identifying objects in images, are too complex for people to program by hand. What if we could transfer these complex activities from the programmer to the program? This is the promise of modern artificial intelligence. AI research has focused on five fields of enquiry: 1. Reasoning: the ability to solve problems through logical deduction, e.g. legal assessment; financial asset management. 2. Knowledge: the ability to represent knowledge about the world, e.g. medical diagnosis; drug creation; media recommendation. 3. Planning: the ability to set and achieve goals, e.g. logistics; scheduling; navigation; predictive maintenance. 4. Communication: the ability to understand written and spoken language, e.g. voice control; intelligent agents, assistants and customer support. 5. Perception: the ability to deduce things about the world from visual images, sounds and other sensory inputs, e.g. autonomous vehicles; medical diagnosis; surveillance. Why is AI rising now? Improved algorithms: There has been a huge evolution in the algorithms that are used to provide intelligence to machines. Specialised hardware: Advances in computational and processing hardware have slashed the time required to train a machine. Extensive data: Data creation and availability have grown exponentially in the last two decades, which is fueling the development of AI systems.
Interest and entrepreneurship: Interest in and awareness of AI have increased in the last five years among big companies and startups alike, which has attracted investment and talent, catalyzing progress. Description source: https://bit.ly/2hfiauX
  8. 8. Categories of AI: AI is the branch of computer science that gives machines/computers the ability to mimic human decision-making processes or intelligence and carry out tasks in human ways. Even though AI is a very broad term, experts categorize AI development into Artificial Narrow Intelligence, Artificial General Intelligence and Artificial Super Intelligence. ANI (Artificial Narrow Intelligence): AI that specializes in optimizing a specific area or task, such as playing chess or recommending songs on Spotify, and that is the only thing it does. AGI (Artificial General Intelligence): AI that has reached or passed human intelligence and has the ability to “reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience.” It can handle tasks from different areas and apply experience gathered in one area to a different area. In comparison to a narrow AI, a general AI has all the necessary knowledge and abilities to improve not only tomato growth in a greenhouse but also that of cucumbers, eggplants, peppers, radishes and kohlrabi. Thus, a general AI is a system that can handle more than just one specific task. ASI (Artificial Super Intelligence): AI that achieves a level of intelligence smarter than all of humanity combined, “ranging from just a little smarter … to one trillion times smarter.” To date, humans have been able to achieve ANI through hard-coded and sophisticated algorithms and programs, and it now exists everywhere, from Google search to airplanes. Currently these ANI systems don’t pose any existential threat, but badly programmed or poorly tested AI could cause losses. However, every advance in AI research is maturing ANI and bringing us closer to AGI/ASI.
  9. 9. Types of AI. Type I AI (Reactive machines): The most basic types of AI systems are purely reactive, and have the ability neither to form memories nor to use past experiences to inform current decisions, e.g. Deep Blue, IBM’s chess-playing supercomputer. Deep Blue can identify the pieces on a chess board and knows how each moves. It can make predictions about what moves might be next for it and its opponent, and it can choose the best move from among the possibilities. But it doesn’t remember past moves or what has happened before. Type II AI (Limited memory): This class contains machines that can look into the past, e.g. self-driving cars. They observe other cars’ speed and direction, identify specific objects and monitor them over time. These observations are added to the self-driving cars’ preprogrammed representations of the world, but aren’t saved as part of a library of experience the car can learn from, the way human drivers compile experience over years behind the wheel. Type III AI (Theory of mind): Machines in the next, more advanced, class form representations not only about the world, but also about other agents or entities in the world. In psychology, this is called “theory of mind”: the understanding that people, creatures and objects in the world can have thoughts and emotions that affect their own behavior. If AI systems are indeed ever to walk among us, they’ll have to be able to understand that each of us has thoughts and feelings and expectations for how we’ll be treated, and they’ll have to adjust their behavior accordingly. Type IV AI (Self-awareness): The final step of AI development is to build systems that can form representations about themselves. This is, in a sense, an extension of Type III artificial intelligence. Consciousness is also called “self-awareness”. Conscious beings are aware of themselves, know about their internal states, and are able to predict the feelings of others.
While we are probably far from creating machines that are self-aware, we should focus our efforts on understanding memory, learning and the ability to base decisions on past experiences. Description source: https://bit.ly/2fyeKmL
  10. 10. Summary. Artificial Intelligence: anything which enables computers to think and behave like a human. Machine Learning: a subset of Artificial Intelligence which deals with the extraction of patterns from data sets. Deep Learning: a specific class of Machine Learning algorithms which use complex neural networks.
  11. 11. Machine Learning: As teaching computers became challenging and complex, Arthur Samuel in 1959 developed the idea of teaching computers to learn for themselves and coined the term Machine Learning, a sub-field of AI. The emergence of the internet and of large streams of data made machine learning a more efficient way to teach computers to think like humans. At the basic level, machine learning is the practice of using algorithms to parse data, learn from it, and then make a recommendation or prediction about something in the world, find associations between events, or cluster items by a condition. Instead of hand-coding complex software routines to accomplish a particular task, the machine is “trained” using large amounts of data and algorithms that give it the ability to learn how to perform the task. Labeled data: a dataset in which each input has been tagged with one or more labels giving the desired output value. Unlabeled data: samples of natural or human-created artifacts that can be obtained relatively easily from the world but lack meaningful, informative tags, e.g. photos, audio recordings, videos, news articles, tweets etc. A machine learning algorithm is initially provided with examples whose outputs are known, notes the difference between its predictions and the correct outputs, and tunes the weightings of the inputs to improve the accuracy of its predictions until they are optimized, e.g. finding the probability of a person enjoying a film in the future based on the films the person has watched in the past. The defining characteristic of machine learning algorithms, therefore, is that the quality of their predictions improves with experience. The more data we provide, the better the prediction engines we can create. First, the "training data" must be labeled, and then it is "classified." When features of the object in question are labeled and put into the system with a set of rules, this leads to a prediction, e.g.
"red" and "round" are inputs into the system that lead to the output: apple. Similarly, a learning algorithm could also be left alone to create its own rules when it is provided with a large set of objects, such as a group of apples, and the machine figures out that they have properties like "round" and "red" in common. Description source: https://tek.io/2npWvp0
  12. 12. ML Algorithm Types: Machine learning algorithms can be fundamentally categorized in two ways: 1. By learning style 2. By similarity in form or function. Algorithms Grouped by Learning Style: There are different ways an algorithm can model a problem based on its interaction with the experience or environment. It is suggested to first consider the learning styles that an algorithm can adopt, because this forces you to think about the roles of the input data and the model preparation process and to select the style most appropriate for your problem in order to get the best result. The main learning styles in machine learning algorithms follow. Supervised Learning: Supervised learning is mainly used in predictive modeling. A predictive model is a model constructed by a machine learning algorithm from the features or attributes of training data, such that we can predict a value using the other values obtained from the input data. Supervised learning algorithms try to model relationships and dependencies between the target prediction output and the input features, such that we can predict the output values for new data based on the relationships learned from previous datasets. Most of the time we are not able to figure out the true function that always makes the correct predictions, so the algorithm relies upon assumptions made by humans about how the computer should learn, and these assumptions introduce bias. Here the human expert acts as the teacher: we feed the computer input data, called training data, which has a known label or result (output) such as spam/not-spam or a stock price at a time. A model is prepared through a training process in which it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data. Description source: https://bit.ly/2yIMqem
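The "predict, get corrected when wrong, repeat until accurate" loop described for supervised learning can be sketched with a perceptron, one of the simplest supervised learners. This is only an illustration: the perceptron is our choice of algorithm, not one the guide names here, and the tiny spam data set and its two features are hypothetical.

```python
def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Train on (features, label) pairs, where label is 0 or 1."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if activation > 0 else 0
            error = label - prediction
            if error != 0:
                # The model is corrected only when its prediction is wrong.
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:
            # Desired accuracy reached on the training data: stop training.
            break
    return weights, bias


def predict(weights, bias, features):
    """Return 1 (spam) or 0 (not spam) for one feature vector."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Hypothetical labeled training data:
# features = (has_suspicious_words, has_many_links), label 1 = spam.
training_data = [([0, 0], 0), ([1, 0], 0), ([0, 1], 0), ([1, 1], 1)]
weights, bias = train_perceptron(training_data)
predictions = [predict(weights, bias, features) for features, _ in training_data]
```

After training, `predictions` matches the known labels of the training data, which is the stopping condition the text describes.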
  13. 13. The main types of supervised learning algorithms include: Classification algorithms: These algorithms build predictive models from training data which have features and class labels. These predictive models in turn apply what was learned from the training data to new, previously unseen data to predict their class labels. The output classes are discrete. Types of classification algorithms include decision trees, random forests, Support Vector Machines (SVM), neural networks etc. Regression algorithms: These algorithms are used to predict output values based on some input features obtained from the data. To do this, the algorithm builds a model based on the features and output values of the training data, and this model is used to predict values for new data. The output values, in this case, are continuous and not discrete. Types of regression algorithms include linear regression, multivariate regression, regression trees, and lasso regression, among many others; logistic regression is usually listed with them, although despite its name it is mostly applied to classification. Examples of supervised learning applications: 1. Predict creditworthiness of credit card holders: Build an ML model to look for delinquency attributes by providing it with data on delinquent and non-delinquent customers. 2. Predict patient readmission rates: Build a regression model by providing data on the patient treatment regime and readmissions to show the variables that best correlate with readmission. 3. Analyze products customers buy together: Build a supervised learning model to identify frequent item sets and association rules from transactional data. Description source: https://bit.ly/2uoRZgr
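To make the regression case concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares: the model is built from the features and output values of the training data and then used to predict a continuous value for new data. The numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_value(slope, intercept, x):
    """Predict a continuous output for a new input."""
    return slope * x + intercept


# Hypothetical training data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
new_prediction = predict_value(slope, intercept, 5)
```

A classifier would instead map the same kind of input to one of a fixed set of discrete class labels.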
  14. 14. Unsupervised Learning: Unsupervised learning algorithms are the family of machine learning algorithms mainly used in pattern detection and descriptive modeling. There are no output categories or labels here on which the algorithm can try to model relationships. These algorithms apply techniques to the input data to mine for rules, detect patterns, and summarize and group the data points, which helps in deriving meaningful insights and describing the data better to users. The main types of unsupervised learning algorithms include: Clustering algorithms: The main objective of these algorithms is to cluster or group input data points into different classes or categories using just the features derived from the input data alone and no other external information. Unlike classification, the output labels are not known beforehand in clustering. There are different approaches to building clustering models, such as using means, medoids, hierarchies, and many more. Some popular clustering algorithms include k-means, k-medoids, and hierarchical clustering. Association rule learning algorithms: These algorithms are used to mine and extract rules and patterns from data sets. These rules explain relationships between different variables and attributes, and also depict frequent item sets and patterns which occur in the data. These rules in turn help any business or organization discover useful insights from its huge data repositories. Popular algorithms include Apriori and FP Growth. Examples of unsupervised learning applications: 1. Segment customers by behavioral characteristics: Survey prospects and customers to develop multiple segments using clustering. 2. Categorize MRI data by normal or abnormal images: Use deep learning techniques to build a model that learns different features of images to recognize different patterns. 3. Recommend products to customers based on past purchases: Build a collaborative filtering model based on past purchases by "customers like them".
Description source: https://bit.ly/2uoRZgr
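Clustering with means can be sketched as a bare-bones k-means loop: assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The points and starting centroids below are hypothetical; real implementations usually choose the starting centroids randomly or with a smarter seeding scheme.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iterations=10):
    """Group points around the given starting centroids (a list of tuples)."""
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: the assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (9, 8)])
```

No labels are supplied anywhere: the two groups emerge from the features alone.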
  15. 15. Semi-Supervised Learning: In the previous two types, labels are either present for all the observations in the dataset or absent for all of them. Semi-supervised learning falls in between these two. In many practical situations, the cost of labeling is quite high, since it requires skilled human experts. So, in cases where labels are absent for the majority of the data but present for a few observations, semi-supervised algorithms are the best candidates for model building. These methods exploit the idea that even though the group memberships of the unlabeled data are unknown, this data carries important information about the group parameters. The input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem, but the model must learn the structures to organize the data as well as make predictions. Semi-supervised learning can be applied to classification and regression problems. Example algorithms are extensions of other flexible methods that make assumptions about how to model the unlabeled data. Description source: https://bit.ly/2uoRZgr
  16. 16. Reinforcement Learning: This method aims at using observations gathered from interaction with the environment to take actions that maximize the reward or minimize the risk. It allows machines and software agents to automatically determine the ideal behavior within a specific context in order to maximize performance. Simple reward feedback is required for the agent to learn its behavior; this is known as the reinforcement signal. A reinforcement learning algorithm (called the agent) continuously learns from the environment in an iterative fashion. In the process, the agent learns from its experiences of the environment until it explores the full range of possible states. There are many different algorithms that tackle this issue. As a matter of fact, Reinforcement Learning is defined by a specific type of problem, and all of its solutions are classed as Reinforcement Learning algorithms. In order to produce intelligent programs (also called agents), reinforcement learning goes through the following steps: 1. The input state is observed by the agent. 2. A decision-making function is used to make the agent perform an action. 3. After the action is performed, the agent receives a reward or reinforcement from the environment. 4. The information about the reward for that state-action pair is stored. Example algorithms include: Q-Learning, Temporal Difference (TD), Deep Adversarial Networks. Examples of reinforcement learning applications: 1. Create a 'next best offer' model for the call center group: Build a predictive model that learns over time as users accept or reject offers made by the sales staff. 2. Allocate scarce medical resources to handle different types of ER cases: Build a Markov Decision Process that learns treatment strategies for each type of ER case. 3. Reduce excess stock with dynamic pricing: Build a dynamic pricing model that adjusts the price based on customers' response to offers. Description source: https://bit.ly/2uoRZgr
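The four steps above can be sketched with tabular Q-learning, one of the example algorithms listed. The environment is a hypothetical five-cell corridor invented for this illustration: the agent starts in cell 0 and receives a reward of 1 on reaching cell 4. For simplicity the decision function here explores purely at random; practical agents usually mix exploration with exploiting what they have learned.

```python
import random

N_STATES = 5        # cells 0..4 of a hypothetical corridor; cell 4 is the goal
ACTIONS = [-1, 1]   # move one cell left or one cell right


def step(state, action):
    """Environment: return (next_state, reward) for an action."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0                                    # 1. observe the input state
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)             # 2. decision function picks an action
            next_state, reward = step(state, action)  # 3. receive the reward
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # 4. store what was learned about this state-action pair
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-goal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)])
            for s in range(N_STATES - 1)}


q_table = q_learning()
policy = greedy_policy(q_table)
```

The learned `policy` moves right from every cell, because the stored values for "right" end up higher than those for "left".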
  17. 17. Algorithms Grouped by Similarity: Algorithms can be grouped by similarity in terms of their function (how they work). This is not an exhaustive list but a list of the popular machine learning algorithms. Regression Algorithms: Regression is concerned with modeling the relationship between variables, which is iteratively refined using a measure of error in the predictions made by the model. Regression methods are a workhorse of statistics and have been co-opted into statistical machine learning. The word "regression" can refer both to the class of problem and to the class of algorithm; really, regression is a process. E.g. Linear Regression, Logistic Regression, Stepwise Regression, Multivariate Adaptive Regression Splines (MARS), Locally Estimated Scatterplot Smoothing (LOESS). Instance-based Algorithms: An instance-based learning model is a decision problem with instances or examples of training data that are deemed important or required by the model. Such methods typically build up a database of example data and compare new data to the database using a similarity measure in order to find the best match and make a prediction. For this reason, instance-based methods are also called winner-take-all methods and memory-based learning. The focus is on the representation of the stored instances and the similarity measures used between instances. E.g. k-Nearest Neighbor (kNN), Learning Vector Quantization (LVQ), Locally Weighted Learning (LWL). Description source: https://bit.ly/2yIMqem
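As a sketch of the instance-based idea, the k-Nearest Neighbor classifier below keeps the training examples as its "database", compares a new point to them with a similarity measure (squared Euclidean distance, chosen here for simplicity), and lets the k closest examples vote. The data points and labels are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(training, query, k=3):
    """training: list of (point, label) pairs; returns the majority label
    among the k stored examples closest to the query point."""
    by_distance = sorted(training,
                         key=lambda item: squared_distance(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical stored examples: two groups of 2-D points.
training = [((1, 1), "A"), ((1, 2), "A"), ((2, 2), "A"),
            ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]

label_near_a = knn_predict(training, (2, 1))
label_near_b = knn_predict(training, (6, 5))
```

There is no separate training phase: all the work happens at prediction time, which is why these methods are also called memory-based.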
  18. 18. Decision Tree Algorithms: Decision tree methods construct a model of decisions made based on the actual values of attributes in the data. Decisions fork in tree structures until a prediction decision is made for a given record. Decision trees are trained on data for classification and regression problems. Decision trees are often fast and accurate and a big favorite in machine learning. E.g. Classification and Regression Tree (CART), Conditional Decision Trees. Bayesian Algorithms: Bayesian methods are those that explicitly apply Bayes’ Theorem to problems such as classification and regression. E.g. Naive Bayes, Gaussian Naive Bayes, Multinomial Naive Bayes, Bayesian Network (BN). Clustering Algorithms: Clustering, like regression, describes both the class of problem and the class of methods. Clustering methods are typically organized by modeling approach, such as centroid-based and hierarchical. All methods are concerned with using the inherent structures in the data to best organize the data into groups of maximum commonality. E.g. k-Means, k-Medians, Expectation Maximisation (EM), Hierarchical Clustering. Description source: https://bit.ly/2yIMqem
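A decision tree forks on attribute values until it reaches a prediction. The sketch below fits the smallest possible tree, a single fork (often called a decision stump), by trying every attribute and threshold and keeping the split with the fewest misclassified training records. Counting errors is a simplification chosen for this illustration; methods such as CART use impurity measures and keep splitting recursively. The customer data is hypothetical.

```python
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Return (feature_index, threshold, left_label, right_label) for the
    single split with the fewest training errors.
    Assumes at least one split separates the rows into two non-empty sides."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            left_label, right_label = majority(left), majority(right)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


# Hypothetical records: (age, income in thousands) -> purchase decision.
rows = [(22, 20), (25, 60), (30, 25), (35, 80), (40, 30), (45, 90)]
labels = ["skip", "buy", "skip", "buy", "skip", "buy"]
stump = fit_stump(rows, labels)
decision = predict_stump(stump, (28, 70))
```

On this data the stump forks on income (feature index 1) at a threshold of 30, since age alone cannot separate the two outcomes.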
  19. 19. Association Rule Learning Algorithms: Association rule learning methods extract rules that best explain observed relationships between variables in data. These rules can discover important and commercially useful associations in large multidimensional datasets that can be exploited by an organization. E.g. Apriori algorithm, Eclat algorithm. Dimensionality Reduction Algorithms: Like clustering methods, dimensionality reduction seeks and exploits the inherent structure of the data, in this case in an unsupervised manner, in order to summarize or describe data using less information. This can be useful to visualize high-dimensional data or to simplify data which can then be used in a supervised learning method. Many of these methods can be adapted for use in classification and regression. E.g. Principal Component Analysis (PCA), Principal Component Regression (PCR), Sammon Mapping. Ensemble Algorithms: Ensemble methods are models composed of multiple weaker models that are independently trained and whose predictions are combined in some way to make the overall prediction. Effort is put into what types of weak learners to combine and the ways in which to combine them. This is a very powerful class of techniques and as such is very popular. E.g. Boosting, Bootstrapped Aggregation (Bagging), AdaBoost, Stacked Generalization (blending), Gradient Boosting Machines (GBM), Random Forest. Description source: https://bit.ly/2yIMqem
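Association rules are usually judged by two measures: support (how often an item set appears) and confidence (how often the rule holds when its left-hand side appears). The sketch below computes both by brute force over a handful of hypothetical shopping baskets. It is not the Apriori or Eclat algorithm itself; those add clever pruning so that the same measures can be computed on large datasets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def frequent_pairs(transactions, min_support):
    """All item pairs whose support reaches min_support (brute force)."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        pair_support = support(set(pair), transactions)
        if pair_support >= min_support:
            result[pair] = pair_support
    return result


def confidence(antecedent, consequent, transactions):
    """How often the consequent appears in baskets containing the antecedent."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))


# Hypothetical shopping baskets.
transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk"}]

pairs = frequent_pairs(transactions, min_support=0.5)
bread_to_milk = confidence({"bread"}, {"milk"}, transactions)
```

Here the rule "bread implies milk" has a confidence of two thirds: of the three baskets containing bread, two also contain milk.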
  21. 21. Summary: The 7 Steps of Machine Learning (figure not reproduced). Image source: https://bit.ly/2uoRZgr
  22. 22. Deep Learning: Deep learning, a subset of machine learning, has revolutionized the world of artificial intelligence. All deep learning is machine learning, but not all machine learning is deep learning. Deep learning is useful because it spares the programmer the tasks of feature specification (defining the features to analyze from the data) and optimization (how to weigh the data to deliver an accurate prediction); the algorithm does both. The breakthrough in deep learning is to model the brain, not the world. Our own brains learn to do difficult things, including understanding speech and recognizing objects, not by processing exhaustive rules but through practice and feedback. Deep learning uses the same approach. Artificial, software-based calculators that approximate the function of neurons in a brain are connected together. They form a ‘neural network’ which receives an input, analyses it, makes a determination about it and is informed whether its determination is correct. If the output is wrong, the connections between the neurons are adjusted by the algorithm, which will change future predictions. Initially the network will be wrong many times. But as we feed in millions of examples, the connections between neurons will be tuned so that the neural network makes correct determinations on almost all occasions. Using this process, with increasing effectiveness, we can now recognize elements in pictures, translate between languages in real time, detect tumours in medical images, and more. Deep learning is not well suited to every problem, as it typically requires large datasets for training and extensive processing power to train and run a neural network. It also has an ‘explainability’ problem: it can be difficult to know how a neural network developed its predictions. But by freeing programmers from complex feature specification, deep learning has delivered successful prediction engines for a range of important problems.
As a result, it has become a powerful tool in the AI developer’s toolkit. Description source: https://bit.ly/2hfiauX
  23. 23. How DL works: Deep learning involves using an artificial ‘neural network’, a collection of ‘neurons’ (software-based calculators) connected together. An artificial neuron has one or more inputs. It performs a mathematical calculation based on these to deliver an output. The output will depend on both the ‘weights’ of each input and the configuration of the ‘input-output function’ in the neuron. The input-output function can vary. A neuron may be a linear unit (the output is proportional to the total weighted input), a threshold unit (the output is set to one of two levels, depending on whether the total input is above a specified value), or a sigmoid unit (the output varies continuously, but not linearly, as the input changes). A neural network is created when neurons are connected to one another; the output of one neuron becomes an input for another. Neural networks are organized into multiple layers of neurons (hence ‘deep’ learning). The ‘input layer’ receives the information the network will process, for example a set of pictures. The ‘output layer’ provides the results. Between the input and output layers are ‘hidden layers’ where most of the activity occurs. Typically, the output of each neuron in one layer of the neural network serves as one of the inputs for each of the neurons in the next layer. Typically, neural networks are trained by exposing them to a large number of labeled examples. Errors are detected and the weights of the connections between the neurons are tuned by the algorithm to improve results. The optimization process is repeated extensively, after which the system is deployed and unlabeled images are assessed. Description source: https://bit.ly/2hfiauX
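The three kinds of unit and the layer-to-layer flow described above can be sketched in a few lines. The weights and biases below are arbitrary illustrative numbers, not trained values, and the tiny two-layer network is a made-up example.

```python
import math


def linear(total):
    """Linear unit: output proportional to the total weighted input."""
    return total


def threshold(total, level=0.0):
    """Threshold unit: one of two output levels, depending on the input."""
    return 1.0 if total > level else 0.0


def sigmoid(total):
    """Sigmoid unit: output varies continuously but not linearly."""
    return 1.0 / (1.0 + math.exp(-total))


def neuron(inputs, weights, bias, activation):
    """Weight each input, add them up, and apply the input-output function."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)


def layer(inputs, weight_rows, biases, activation):
    """Every neuron in the layer receives all outputs of the previous layer."""
    return [neuron(inputs, weights, bias, activation)
            for weights, bias in zip(weight_rows, biases)]


def forward(inputs):
    """Input layer -> one hidden layer (2 sigmoid units) -> output layer (1 linear unit)."""
    hidden = layer(inputs, [[1.0, -1.0], [0.5, 0.5]], [0.0, -0.5], sigmoid)
    return layer(hidden, [[1.0, 1.0]], [0.0], linear)


output = forward([1.0, 1.0])
```

Training would consist of comparing `output` with the known label and nudging the weights to shrink the error; that step is left out of this sketch.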
  24. 24. Deep Learning Models Convolutional Neural Networks ConvNets, or CNNs, are a category of neural networks with a different architecture from regular neural networks. Regular neural networks transform unstructured input, e.g. an image, by passing it through a series of connected hidden layers, each made up of a set of neurons, and return a prediction as the output. CNNs organize the layers in 3 dimensions: width, height and depth. Further, the neurons in one layer do not connect to all the neurons in the next layer but only to a small region of it. Lastly, the final output is reduced to a single vector of probability scores, organized along the depth dimension. CNNs have two components: The Hidden layers/Feature extraction part In this part, the network performs a series of convolution and pooling operations during which the features are detected. If you had a picture of a zebra, this is the part where the network would recognise its stripes, two ears, and four legs. The Classification part Here, the fully connected layers serve as a classifier on top of these extracted features. They assign a probability that the object in the image is what the algorithm predicts it is. Examples: 1. Diagnose diseases from medical images 2. Understand customer brand perception and usage through images 3. Detect a defective product on a production line through images Description source: https://bit.ly/2KGHgFT
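To make the feature extraction part more concrete, here is a minimal sketch of one convolution followed by one max-pooling step, written in plain Python. The tiny image and the hand-written edge-detecting kernel are assumptions made for illustration; in a real CNN the kernel values are learned during training and there are many kernels per layer.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    As in most deep learning libraries, the kernel is applied without flipping.
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical 5x5 image with a bright vertical stripe in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]
# Hand-written 2x2 kernel that responds to a dark-to-bright edge, left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, kernel)  # 4x4 map of edge responses
pooled = max_pool(feature_map)           # 2x2 summary of the strongest responses
```

Each value in `feature_map` depends only on a small 2x2 region of the image, which is the local connectivity the slide describes; pooling then shrinks the map while keeping the strongest responses.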
  25. 25. Recurrent Neural Network (RNN) A recurrent neural network (RNN) is a class of artificial neural network that stores information in context nodes in order to process sequences of inputs and produce an output based on the whole input sequence. Unlike in other neural networks, the inputs to an RNN are treated as related to each other. For example, to predict the next word in a sentence, the relations among all the previous words help in predicting a better output. The RNN learns these relations during training. To achieve this, the RNN contains loops, which allow information to persist from one step to the next. This loop structure lets the network take a sequence as input, which is easiest to see when the loop is unrolled. First, the network takes x(0) from the input sequence and outputs h(0). Then h(0), together with x(1), is the input for the next step, which outputs h(1). Similarly, h(1) together with x(2) is the input for the step after that, and so on. This way, the network keeps track of the context as it processes the sequence. The following are a few applications of RNNs: 1. Next word prediction 2. Music composition 3. Image captioning 4. Speech recognition 5. Time series anomaly detection 6. Stock market prediction Description source: https://bit.ly/2xfR4NK
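The unrolled loop described on this slide, where h(t) is computed from x(t) and h(t-1), can be sketched as follows. For readability the hidden state is a single number rather than a vector, and the weights are fixed, hypothetical values instead of learned ones; the tanh activation is a common choice, not something the slide specifies.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, bias):
    """One step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + bias)


def run_rnn(sequence, w_x=0.5, w_h=0.8, bias=0.0):
    """Unrolled loop: h(t) depends on x(t) and on h(t-1)."""
    h = 0.0  # initial hidden state, before any input has been seen
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, bias)
        states.append(h)
    return states


states_a = run_rnn([1.0, 0.0, 0.0])  # a single pulse at the first step
states_b = run_rnn([0.0, 0.0, 0.0])  # no signal at all
```

In `states_a` the later hidden states stay above zero even though the later inputs are zero, because the first input is still carried forward through the loop; in `states_b` every state is zero.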
  26. 26. Summary Description source: https://bit.ly/2I5pSpQ
  27. 27. AI enabled Companies Today, organizations are adopting AI at different levels depending on the organization’s vision, understanding, structure, agility, and ability to create opportunities for both creativity and disruption. These organizations can be classified into three broad categories: Applied AI companies Applied AI companies use AI to optimize, personalize and/or automate existing processes, products and services to make people, businesses and organizations more productive. Today, most companies adopting AI would fit in this category. AI First AI First companies distinguish themselves by developing applications, services and products that are built from the ground up with AI at their core. They use every interaction with customers or users to feed and train the AI algorithms, so the quality of the application, product or service improves with each interaction, enabling new experiences and business models and increasingly stronger competitive moats. AI First solutions will change the way we interact with software/machines from a master-slave relationship (where we tell the machine what to do and the machine executes) to more of a peer-to-peer relationship (where the machine anticipates our needs and makes suggestions as we interact with it) and eventually to a slave-master relationship (where the machine tells us what to do, when and how, based on a series of inputs and desired outcomes). The AI Stack/Machine As mentioned above, the world’s most dominant companies over the past five years, including Google, Facebook, Amazon, Microsoft, Apple and Baidu, are making massive investments in AI with the ambition of becoming the platform (the Machine) that gets used to run our lives, our businesses and our societies. Companies that are building the different components of the Machine and understand how to best interact with it to create value for themselves and their customers are also of great interest. 
This includes AI technology such as new frameworks, software infrastructure (distributed, centralized, local/edge, hybrid) required to run AI powered solutions, hardware platforms specialized for AI (distributed, centralized, local/edge, hybrid), connected devices, etc. Description source: https://bit.ly/2xAqI9k
  28. 28. AI CASE STUDIES Artificial Intelligence at India’s Top eCommerce Firms
  29. 29. Flipkart's Project Mira Flipkart's Project Mira is an artificial intelligence initiative focused on understanding customers better in order to improve the product search experience and reduce product returns. Project Mira was piloted through a conversational search experience that guides users with relevant questions, conversational filters, shopping ideas, offers, and trending collections. Flipkart's marketplace processes more than 400,000 shipments a day, of which customers return 10-11%. One in four fashion products, such as clothing and accessories, is returned for reasons such as incorrect fit or because customers change their minds about a particular style. When Flipkart's team reviewed product returns data on shoes and lifestyle categories, they observed a mismatch in customer expectations regarding size and fit. Flipkart’s team of experts started brainstorming attributes that could be prompted to buyers instead of having them narrow the search results using filters. For example, if a Flipkart customer is searching for an air conditioner, Mira would ask what kind of AC they want, the tonnage, room size, brand, etc. Flipkart also uses Mira to streamline backend processes such as accurate classification of products, accurate product descriptions and avoiding duplication. Flipkart adds more than 10 million products from around 20,000 sellers every month. Because the data provided by sellers is unstructured, it is often very difficult to classify a product accurately in an automated catalog. Mira can classify a product into the specific vertical based on the provided image. For verticals with similar images (say shampoos and body lotions), it uses the product description to classify products with 95% accuracy. 
Mira can also detect incorrect and morphed images and identify duplicate products, since sellers intentionally or unintentionally post duplicates, which increases the effort users spend scanning for the products they want. Project Mira is still in its infancy, but it has been expanded to several verticals to address issues such as product returns and delivery speed. Goals include anticipating whether a delivery is likely to lead to a return, and for what reasons, and estimating whether a product can be delivered within two days to avoid midway customer cancellations. Description source: https://bit.ly/2xfGQgV
  30. 30. Myntra’s Rapid platform and Other AI initiatives What's Rapid Platform? Fashion e-tailer Myntra’s AI initiatives are centered around three key pillars: product, experience, and logistics. Product Myntra is focussing on building intelligent fast fashion through its AI platform known as Rapid. “Fast fashion” is a term used by retailers to describe the speeding up of production processes to get new trends to the market as quickly and cheaply as possible. This can dramatically reduce the time taken to create a fashion product, from the typical 9-14 month lifecycle to a few weeks. By leveraging the available sales data, best-selling attributes can be identified, and designers can start producing fashion items based on them. This has helped Myntra to quickly uncover fashion trends. For design, Myntra is using a technique called Generative Adversarial Networks (GANs), which creates products that are similar but not the same. Myntra has also launched T-shirts with fully machine-generated designs. It is also using Rapid to intelligently select what to sell on the platform. Experience Myntra is using machine learning to improve the payment acceptance rates of online transactions, an issue that is particularly prevalent in India, where failure rates are high. Online payment transactions typically fail in India for two distinct sets of reasons: 1. User abandonment, which could be due to anything from a patchy internet connection or a clunky interface to simple loss of interest. 2. Banks, which provide acquiring services in any payment transaction, tend to have poor IT systems. Using machine learning, the system figures out the best payment gateway through which to route the payment. This is done by detecting and analysing thousands of success and failure patterns and then sending the payment through the most optimized route. Myntra also enhances the user experience by giving the right recommendations based on what a customer has seen or bought in the past. 
It uses “collaborative filtering”, which recommends products to one person based on what another person has just bought and also helps match which fashion goes well with what.
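The idea behind collaborative filtering mentioned on this slide, recommending products to one person based on what similar shoppers bought, can be sketched with a simple overlap count. This is a toy illustration under stated assumptions, not Myntra's actual system: the customers, products and scoring rule are all made up, and production recommenders use far richer signals.

```python
from collections import Counter


def recommend(purchases, customer, top_n=2):
    """Suggest items bought by customers whose baskets overlap with this one."""
    own = purchases[customer]
    scores = Counter()
    for other, basket in purchases.items():
        if other == customer:
            continue
        overlap = len(own & basket)  # number of shared items acts as a similarity weight
        for item in basket - own:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical purchase histories; customer and product names are invented.
purchases = {
    "asha": {"denim jacket", "white sneakers"},
    "ravi": {"denim jacket", "white sneakers", "graphic tee"},
    "meera": {"white sneakers", "graphic tee", "canvas belt"},
    "john": {"silk saree"},
}

suggestions = recommend(purchases, "asha")
```

Here the graphic tee ranks first for "asha" because both shoppers who share items with her bought it, while a shopper with no overlapping purchases contributes nothing to the ranking.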
  31. 31. Logistics As customers often complain about late refunds, Myntra created ‘Sabre’, an AI-based returns system that enables faster refunds for customers who have demonstrated good buying and return behaviour. Myntra wants to make its returns policy more efficient, as it believes that returns are an integral part of the fashion industry, which relies on sizes, fits, and tastes that make returns more common than in other sectors (differentiated goods like apparel tend to have higher return rates than undifferentiated goods). By analysing a customer’s past return patterns, Myntra claims, ‘Sabre’ is able to detect which customers are genuinely returning shipments and which are attempting fraudulent activities. Myntra is also aiming to reduce its rate of return to origin (RTO). A higher RTO translates into higher losses, as many cash on delivery (COD) orders are not delivered for reasons such as customers not being present or not having cash at that point in time. Myntra can now, to a great extent, predict whether an order is going to result in an RTO. Description source: https://bit.ly/2KQLY0a
  32. 32. Amazon India's AI enabled Smart eCommerce Amazon India has used machine learning and AI in a number of areas. Correcting Addresses Addresses in India are not well structured, and users often enter incorrect addresses (e.g. wrong pin code or city name) or addresses with missing information (e.g. missing street name). Wrong addresses cause packages to miss delivery dates and lead to failed deliveries. The company has been using machine learning techniques to detect junk addresses, compute address quality scores, correct city-pin code mismatches, and suggest corrections to users. Catalog Quality Product catalog defects, such as missing attributes like brand or color, or poor-quality images and titles, can adversely impact customer experience. The company is using AI and machine learning to extract missing attribute information like brand or color from product titles and images. Product Size Recommendations In categories such as shoes and apparel, different brands often have different size conventions. For example, a catalog size 6 may correspond to a physical size of 15 cm for Reebok, while for Nike a catalog size 6 may correspond to a physical size of 16 cm. Amazon uses machine learning to recommend the product size that would best fit a customer when the customer visits a product page, based on past customer purchase and returns data (e.g. product size was too large/small). Deals for Events Machine learning is used to identify relevant products for specific events such as Diwali, Christmas, etc. These are typically products that are in high demand during the event or get high volumes of search queries and review mentions during the event period. Machine learning algorithms also predict the deals and discounts to offer on the products to achieve a certain sales forecast, which helps in better planning. 
By training a machine learning system on past holiday purchase data and current purchase activity, a system may be able to calibrate demand more accurately in order to sell products at the right prices to either (a) move certain items at high volume, or (b) maximize profit margins by matching the highest margin products to users during the holiday season. Description source: https://bit.ly/2KQLY0a
  33. 33. THANK YOU! I hope you enjoyed reading this guide and that it met your expectations for learning about AI. If you are interested in trying the AI Strategy Canvas, please sign up here https://bit.ly/2OGl6BC or, if you have feedback on the AI 101 guide, send it to i.sajanmathew@outlook.com or simply connect at https://www.linkedin.com/in/mathewsajan/ And feel free to share this with others https://bit.ly/2xvzKFf There is one more thing....
  34. 34. Check out some cool AI experiments by coders here experiments.withgoogle.com