Reinforcement learning is the next revolution in artificial intelligence (AI). As a feedback-driven, agent-based learning approach suited to dynamic environments, reinforcement learning leverages self-learning capabilities and multi-agent potential to address problems that other AI techniques leave unaddressed. In contrast, machine learning techniques such as supervised and unsupervised learning are limited to handling one task at a time.
As the industry moves toward Artificial General Intelligence (AGI), reinforcement learning is becoming important in addressing challenges such as the multi-tasking of intelligent applications across different ecosystems. The technology appears set to drive the adoption of AGI, with companies future-proofing their AGI roadmaps by leveraging reinforcement learning techniques.
This report provides an analysis of the startups focused on reinforcement learning techniques across industries. To purchase the complete report visit https://www.researchonglobalmarkets.com/reinforcement-learning-startup-ecosystem-analysis.html.
Netscribes offers customizations to this report depending on your specific needs. To request a customized report, contact info@netscribes.com.
To purchase the full report, write to us at info@netscribes.com
2. Executive Summary
Reinforcement learning is the next revolution in artificial intelligence (AI). It is an advanced
class of AI algorithms with the potential to form the basis of autonomous systems with
decision-making capabilities. Reinforcement learning is based on a feedback-driven
architecture, with agent-based learning mechanisms making it suitable for dynamic
environments. It is fundamentally different from supervised and unsupervised algorithms,
which require large datasets for training. In contrast to these algorithms, reinforcement
learning algorithms can continually learn from experience rather than only from data. This
self-learning ability is critical for exploring the real opportunities of AI and for expanding
these technologies into practical business use cases.
The initial illustrations of reinforcement learning were evident in a series of gaming
demonstrations where deep reinforcement learning-based computer models started
challenging expert human players in Atari video games and the board game Go. DeepMind's
reinforcement learning models were real-life instances of how such techniques can enable
self-learning capabilities through repeated iterations in a gaming environment. These
groundbreaking outcomes triggered research on use cases of reinforcement learning in
other sectors as well.
In order to move beyond the traditional machine learning algorithms, new AI entities are
targeting business models around reinforcement learning solutions. These startups are
innovating in the reinforcement learning space to develop products and solutions that can
meet the demands of next-generation applications.
Startups are concentrated mainly in the automotive, retail/e-commerce, and robotics sectors.
Osaro, Kindred, Micropsi Industries, Acutronic Robotics, and Covariant.ai are some of the
startups that are developing reinforcement learning-based solutions for robotic applications,
especially piece-picking in warehouses.
3. The automotive sector is benefitting from reinforcement learning solutions, given the
real-time decision-making requirements of self-driving vehicles and the push toward Level 5
autonomy. These emerging technologies can leverage reinforcement learning algorithms to
operate in the dynamic and unstructured environment of public roads. New entities such as Wayve, Latent Logic, Ascent
Robotics, AiGent-Tech, and CogitAI are developing reinforcement learning solutions to
understand and predict complex road scenarios. Most of these companies are using imitation
learning to train their reinforcement learning agents, with the algorithms learning from
human demonstrations through computer vision techniques. Wayve is following a disruptive
method to eliminate pre-mapping constraints in self-driving vehicles, and is challenging the
current set of sensing mechanisms that are being explored for autonomous vehicles.
Financial services and industrial plants are two other sectors that are tapping into the
advantages of reinforcement learning to obtain accurate predictions in their inherently
dynamic environments. Cerebri AI and hiHedge are using reinforcement learning techniques
for optimal trading and risk management, while Nnaisense and NeuDax are revolutionizing
industrial plants with efficient decision-making leveraging reinforcement learning algorithms.
The food retail industry is also benefitting from reinforcement learning. Wasteless is a startup
that is applying reinforcement learning for dynamic pricing strategies based on product
expiration dates, reducing food wastage, and effectively managing inventories. Logistics,
agriculture, education, and research are some other sectors that are also seeing active
deployment of reinforcement learning. Additionally, companies like Deeplite and DataOne
are aiming to accelerate the adoption of reinforcement learning techniques in the IoT
market.
Most of the startups operating in the reinforcement learning space are at an early stage of
development and are conducting tests with industry partners. In the future, these companies
are mostly going to follow software-as-a-service or robot-as-a-service business models in
their target domains. Licensing out technology is also an option, as servitization will be cost
effective for customers. These startups can also be potential collaborators across industries
for diversified applications, as self-learning and self-configuring abilities are going to define
the next generation of AI technologies.
The efforts of these startups are a clear indication of how the market is getting ready for
Artificial General Intelligence (AGI), with a vision towards the development of safe and secure
systems that would prepare us for the next AI revolution.
4. Content
Introduction
Taxonomy of Reinforcement Learning Algorithms and Other Important Concepts
Model-based and Model-free Methods
Imitation Learning
Artificial General Intelligence (AGI)
Reinforcement Learning Successes in Complex Gaming Environments
How Reinforcement Learning Benefits Different Industries
How Reinforcement Learning is Transforming the Present Scenario
Startup Ecosystem
Detailed Analysis of Prominent Startups
Osaro
OpenAI
Acutronic Robotics
Wayve
Kindred
Prowler.io
Cerebri AI
Micropsi Industries
CogitAI
InstaDeep
5. Other Startups That are Leveraging Reinforcement Learning Across Industries
Insights & Recommendations
Acronyms
References
6. Introduction

Figure 1: Types of ML Techniques (supervised learning maps labeled inputs to outputs; unsupervised learning derives outputs from unlabeled data; reinforcement learning has an agent act on an environment and learn from state and reward feedback)
Machine learning (ML), a subfield of AI, is based on algorithms that enable machines (systems)
to learn without being explicitly programmed. ML is not a new concept and has been
explored for some time now.
The explosion of data availability and the complexity of application requirements, however,
have led to innovations in the ML domain, with attempts being made to unleash the full
potential of the learning abilities of machines. In addition, the quest to develop autonomous
solutions, starting from prescriptive and predictive frameworks, has subdivided ML
technologies into three categories for better customization and use case suitability. These
include: Supervised Learning, Unsupervised Learning and Reinforcement Learning, as shown
in Figure 1.
Supervised learning deals with learning a mapping function from input variables to output
variables by utilizing labeled datasets. This labeling enables controlled training of the ML
algorithms for each piece of new data introduced into the system. Supervised learning techniques learn
from the tagging process, identify similar patterns in the new datasets, and perform the
required tasks. These techniques are best suited for making predictions and classification
tasks that require prior references for decision making. On the other hand, unsupervised
learning techniques involve training an algorithm on datasets that are neither labeled nor
categorized. These techniques can organize unlabeled data into similar groups, something
supervised learning is incapable of doing. Supervised and unsupervised algorithms are
already being used in multiple applications.
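The contrast drawn above can be illustrated with a minimal sketch in Python. The toy data, the 1-nearest-neighbour classifier, and the two-centre clustering routine below are all hypothetical illustrations, not drawn from the report:

```python
# --- Supervised: 1-nearest-neighbour classification on labeled data ---
labeled = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (8.5, "large")]

def classify(x):
    # Predict the label of the closest labeled training point.
    return min(labeled, key=lambda p: abs(p[0] - x))[1]

print(classify(1.1))  # a point near the "small" examples

# --- Unsupervised: group unlabeled points around two centres ---
unlabeled = [1.0, 1.2, 8.0, 8.5]

def cluster(points, c1, c2, steps=10):
    # One-dimensional k-means with k=2: assign points, then re-centre.
    for _ in range(steps):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

print(cluster(unlabeled, 0.0, 10.0))
```

The first half needs labels to make a prediction; the second half discovers the two groups with no labels at all, which is the distinction the paragraph above draws.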
Reinforcement Learning: Defining Next-Generation Artificial Intelligence Applications
7. Table 1: Comparison of ML Techniques

Parameters             | Supervised Learning                                     | Unsupervised Learning                                           | Reinforcement Learning
Datasets               | Large volumes                                           | Large volumes                                                   | Small dataset
Feedback               | Human-based labeling                                    | No                                                              | Reward system
Use cases              | Classification, prediction, and regression problems     | Clustering, dimensionality reduction, and autoencoding          | Optimization and learning through actions
Examples of algorithms | Linear Regression, Decision Tree, Naive Bayes           | K-means Clustering, Gaussian Mixture Model, Recommender System  | Q-learning, SARSA, DQN
Maturity               | Commercialized (most of the current ML implementations) | In research (expected to take more time for commercial success) | Initial adoption has begun for real-world business applications
Reinforcement learning techniques, however, can address the requirements for building a
completely self-learning framework. The core concept of reinforcement learning lies in
designing intelligent agents which interact with the environment in discrete steps. The main
difference between reinforcement learning and the other two ML techniques is that
reinforcement learning does not require a large volume of training data. The agents learn by
interacting with the environment based on a reward system that also acts as a feedback
mechanism. In a way, reinforcement learning techniques rely on trial and error for
continuously improving at carrying out the assigned task, making them suitable for dynamic
environments. The ultimate goal of reinforcement learning is to embed an optimal behavior
into the agents to maximize the reward function.
Table 1 shows a comparative view of the different types of ML techniques.
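The agent-environment reward loop described above can be sketched with tabular Q-learning, one of the algorithms listed in Table 1. The 5-cell corridor environment, the reward scheme, and all hyperparameter values below are illustrative assumptions, not drawn from the report:

```python
import random

# Toy environment: a 5-cell corridor. The agent starts in cell 0 and
# earns a reward only on reaching cell 4, so it must learn by trial
# and error which direction eventually pays off.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                 # step left or right
alpha, gamma, epsilon = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for _ in range(200):               # episodes of interaction
    state = 0
    while state != GOAL:
        # Explore occasionally; otherwise exploit the current estimate.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0
        # The reward acts as the feedback that updates the estimate.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# The learned greedy policy moves right from every non-goal cell.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
print(policy)
```

No training dataset is supplied anywhere: the Q-table is filled entirely from the rewards the agent collects while interacting, which is the contrast with supervised and unsupervised learning that Table 1 captures.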
8. Taxonomy of Reinforcement Learning Algorithms and Other Important Concepts
Figure 2 provides an overview of the classification
of the reinforcement learning algorithms.

Figure 2: Taxonomy of Reinforcement Learning Algorithms (Source: OpenAI)1

RL Algorithms
  Model-free RL
    Policy Optimization: Policy Gradient, A2C/A3C, PPO, TRPO
    Both policy optimization and Q-learning: DDPG, TD3, SAC
    Q-learning: DQN, C51, QR-DQN, HER
  Model-based RL
    Learn the Model: World Models, I2A, MBMF, MBVE
    Given the Model: AlphaZero
9. Model-based and Model-free Methods
Reinforcement learning
algorithms are broadly divided into
two types based on whether the agent
follows a pre-set model of the
environment defined by the user, or
starts learning about the environment
from scratch. Model-based
reinforcement learning has its own set
of advantages where the agent is
aware of various parameters and can
determine possibilities and predict
certain outcomes. This approach in
an ideal environment can help
improve sample efficiency (requires
less data to learn policy), which is a
major concern in reinforcement
learning. However, model-based
techniques do not guarantee an
accurate mapping to the real-world
environment, making model building
an intensive task with reduced
chances of success.
Also, these model-based
reinforcement learning algorithms
have been shown to be inefficient for
larger state and action spaces.
Model-free methods may not be as
effective in terms of improving sample
efficiency, but are easier to implement
and are more popular. Model-free
reinforcement learning algorithms
overcome the issues of larger state and
action spaces by updating their
estimates continuously, eliminating the
need to store every state and action
combination. In fact, model-based
methods also sometimes apply
model-free techniques for developing
a model of the environment.
The model-free methods are mainly divided into
two classes – policy-based and value-based
methods. A policy defines the suitable action to
be taken in a given state. Value is another
fundamental concept of reinforcement learning
that measures how desirable a state is, or how
desirable it is to take a certain action in a
certain state.
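A minimal sketch of these two concepts, using a hypothetical 3-state chain in which the terminal state is reached with a reward of 1 (all names and values below are illustrative assumptions):

```python
gamma = 0.9  # discount factor

# A policy maps each state to the action taken there.
policy = {0: "right", 1: "right"}

# A value function scores each state by the discounted reward the
# policy will collect from it: V(s) = r + gamma * V(next state).
V = {2: 0.0}                 # terminal state: nothing left to collect
V[1] = 1.0 + gamma * V[2]    # one step from the reward
V[0] = 0.0 + gamma * V[1]    # two steps away, so worth less
print(V)
```

The discounting shows why value measures "the way of reaching" a state: state 0 is worth less than state 1 purely because the reward is further away.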
10. Imitation Learning
In imitation learning, the learner
(machine or system) learns from
expert demonstrations that are
mostly provided by humans. In a
way, imitation learning broadly falls
under supervised learning
techniques. But what makes
imitation learning similar to
reinforcement learning is the way in
which both the methods are used
for performing sequential tasks
where the agent develops policies
for optimum performance. Imitation
of expert trajectories, policy-based
imitation, and optimized and
stabilized behaviour are some of the
prominent advantages of imitation
learning. However, the efficacy of
this method is largely dependent on
the accuracy of human
demonstrations and how well the
course of learning is set up in
dynamic environments. Therefore,
there is a growing trend of
combining imitation learning with
reinforcement learning for
improved performance.
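The expert-demonstration idea described above can be sketched as behavioural cloning, the simplest form of imitation learning. The toy driving demonstrations and the nearest-neighbour learner below are hypothetical illustrations, not any company's actual method:

```python
# Expert demonstrations as (state, action) pairs: here, a hypothetical
# distance-to-obstacle reading and the action the expert took.
expert_demos = [
    (0.2, "brake"), (0.3, "brake"),
    (5.0, "drive"), (6.0, "drive"),
]

def imitate(state):
    # Behavioural cloning in miniature: reproduce the expert action
    # recorded for the most similar demonstrated state.
    return min(expert_demos, key=lambda d: abs(d[0] - state))[1]

print(imitate(0.25))  # "brake"
print(imitate(4.0))   # "drive"
```

Because the learner only fits recorded state-action pairs, this is effectively a supervised problem, which is why the text notes that imitation learning broadly falls under supervised techniques and depends heavily on the quality of the human demonstrations.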
Artificial General
Intelligence (AGI)
Currently, most AI implementations
are based on artificial narrow
intelligence (ANI) that is restricted to
carrying out a programmed set of
tasks. The common applications of ANI
include customer preference services,
personalized marketing,
recommendation-based music and
video streaming, virtual assistants and
chatbots, robotic process automation,
intelligent vision systems, predictive
analytics, high-frequency trading, and
cognitive capital schemes. Such
technological advancements are
automating everyday human life and
are changing the way in which
traditional businesses are conducted.
However, ANI is only limited to
performing everyday tasks that are
simple and do not involve dynamic
environments.
AI is at an inflection point, with the
next stage of its evolution, AGI, set to
produce advanced self-learning,
self-organizing, and adaptive systems.
These systems will be capable of
reasoning, planning, solving problems,
comprehending complex and abstract
ideas, and learning from experience,
thus driving an ecosystem
independent of large data
requirements. As a result, they will
possess human-like intelligence,
enabling them to autonomously
perform tasks with increased
productivity and reliability.
Reinforcement learning and continual
learning are going to play a major role
in shaping the future of AGI solutions.
11. Reinforcement learning techniques came into the limelight when companies such as Google's
DeepMind started experimenting with agent-based methods in gaming. Groundbreaking
results were obtained when reinforcement learning-trained agents surpassed human skills
and defeated the best players in Atari games, StarCraft, and Super Mario.
Reinforcement Learning Successes in
Complex Gaming Environments
2013
Atari games played with deep reinforcement learning.

2015
Google's DeepMind announced AlphaGo, the first computer Go program that defeated a human
player (reinforcement learning + supervised learning + tree search). The MarI/O program used
neural networks to play Super Mario World.

2017
OpenAI demonstrated self-play through a reinforcement learning model in Dota 2 (uses no
imitation learning or tree search). AlphaGo Zero learns completely from self-play, with no
human data used for training, and improves with each game.
Exploration of multi-agent reinforcement learning has been initiated for complex multi-player
games. A team of agents learns from parallel instances of a gaming environment. The individual
agents must act independently and also coordinate their behavior with the other agents.

Quake III Arena
DeepMind has demonstrated population-based reinforcement learning in the Quake III Arena
multi-player game. The population of agents is trained by playing with each other (as teammates
and competitors) and operates at fast and slow timescales in a two-tier optimization process
using reinforcement learning.

StarCraft
AlphaStar, developed by DeepMind, uses a multi-agent learning algorithm (supervised learning +
reinforcement learning) to address the complexity of StarCraft and challenge human intellect.
Off-policy actor-critic reinforcement learning with experience replay, self-imitation learning,
and policy distillation is used to update the neural network weights of each agent.

Sources: DeepMind, OpenAI
12. How Reinforcement Learning Benefits Different Industries
Reinforcement learning is
finding significance in several
use cases across sectors as
shown in Figure 3. The
technology is going
mainstream, helping
businesses develop
next-generation solutions and
driving the transformation of
conventional offerings to boost
efficiency.
Figure 3: Reinforcement Learning Impacting Varied Applications and Industries (automotive, retail, food industry, security, industrial sector, telecommunication, robotics, financial services, logistics, web research, gaming, energy, bots, healthcare, agriculture, education, and IoT)
13. Table 2 provides an overview of how reinforcement learning solutions are essential for some of
the key industry segments.

Table 2: Industry Challenges and Reinforcement Learning Solutions by Sector

Automotive
Industry challenges: The implementation of Level 4 and Level 5 autonomous vehicles requires the
prediction of complex scenarios and the intelligence to act in dynamic real-world situations.
Reinforcement learning solutions: Reinforcement learning, in combination with imitation learning,
is enabling end-to-end training solutions for developing the self-learning abilities that are crucial
for autonomous vehicles. Deep reinforcement learning models provide the capability to learn from
real-world environments and improve driving skills based on past experience, without the need to
feed data for all scenarios.

Healthcare
Industry challenges: The medical sector requires intelligent computational methods that can
predict native protein structures from amino acid sequences.
Reinforcement learning solutions: Reinforcement learning is helping pharmaceutical companies
develop tools for drug discovery and pursue novel strategies for the rapid validation of drugs. It is
helping accelerate the development of techniques for protein generation for therapeutic
applications. The adoption of these models is, however, dependent on stringent regulatory
approvals and standards.

Retail
Industry challenges: Consumer retail depends on a manual workforce for efficiency and fast
product delivery, and these units already face shrinking labor forces and a lack of autonomous
infrastructure. For example, the food retail sector depends on manual effort for inventory
management, and there is a need for tools that can autonomously monitor perishable products
and prevent food wastage.
Reinforcement learning solutions: Reinforcement learning algorithms in retail technology are
being used to develop inventory management tools, learn customer behavior, and refine price
modeling. Reinforcement learning models can be used to build robots that manage warehouse
inventory by handling stock and piece-picking objects of varied shape, size, and material.
Reinforcement learning is also helping grocery chains build differentiated methods to boost
revenue, manage waste effectively, and transition from printed food labels to electronic shelf
labels.

Industrial Sector
Industry challenges: The traditional complex industrial environment faces challenges in areas like
process control systems, defect classification, and equipment maintenance, leading to high
operational costs. Additionally, progress toward intelligent systems in such infrastructure is slow
due to the lack of historical datasets and slow progress in obtaining accurate AI models for
decision-making purposes.
Reinforcement learning solutions: Reinforcement learning can help uplift productivity levels in
production processes by deriving insights from smaller datasets to control complex environments.
Such solutions will have an important role to play in the Industry 4.0 ecosystem.

Telecommunication
Industry challenges: Telecom companies are looking for ML frameworks that can help build the
self-organizing networks of the future. Automatic network planning, configuration, control, and
self-healing properties depend on algorithms that do not require human supervision and can
bring full autonomy to the core telecom infrastructure.
Reinforcement learning solutions: Reinforcement learning is emerging as a tool capable of
addressing the dynamic requirements of the telecom sector. It is being used for intelligent
operations like power adjustment, antenna tilting, cell channel selection, handover, interference
management, and churn prediction, and will be crucial in enabling cell clusters to self-heal in the
future. Reinforcement learning is also paving the way for intelligent scheduling techniques in 5G
networks.
14. The demand for reinforcement learning
methods is evident from the number of
startups operating in the domain. These new
entities are exploring the usage of
reinforcement learning models in different
areas and are addressing the existing gaps. The
research of these companies can be expected
to advance AGI development with defined
security features, standards, and control
policies. Most of these companies are initially
deploying reinforcement learning in robotic
platforms to perform varied tasks and resolve
business challenges.
This report covers 39 startups focused on
reinforcement learning techniques, including
detailed analysis of 10 prominent startups in
the domain.
How Reinforcement Learning is Transforming the Present Scenario
16. Startup Ecosystem
Startups Focusing on Reinforcement Learning
Across Industries
19. Detailed Analysis of Prominent Startups
Current Target Markets: Retail, Food Industry
Osaro is an early-stage ML startup founded in 2015. The San Francisco,
USA-headquartered company focuses on developing vision and control
software solutions for industrial robotics, leveraging its proprietary deep
reinforcement learning technology. It has received USD 13.3 million
(Mn) in funding, and is backed by high-profile investors including Peter
Thiel (American entrepreneur, cofounder of PayPal), Scott Banister
(board member at PayPal), Sean Parker (cofounder of Napster), Darian
Shirazi (cofounder of Radius) and several other entrepreneurs and
investment firms such as Morado Ventures, and AME Cloud Ventures.2
Osaro builds automation solutions for e-commerce
fulfillment, automotive and advanced manufacturing, and food
assembly, and its potential application areas include drones,
autonomous vehicles, internet of things (IoT), and digital advertising.3
20. Technology Stack
Osaro’s core technology blends deep learning,
reinforcement learning, sensor fusion, and
motion planning into hardware to enable new
applications and products. The company is one
of the early adopters of deep reinforcement
learning techniques that combine deep
learning architectures with reinforcement
learning algorithms. Osaro’s technology is
capable of training neural networks with large
and relevant datasets. These neural networks
are trained to make inferences on new data by
employing imitation learning techniques.
In addition, the deep reinforcement learning
technique addresses the issues of high
dimensional input spaces for a host of
applications ranging from robotics to
autonomous driving systems and drones. The
company employs the imitation learning method
over traditional methods, such as DQN,
A3C, DDPG, Bootstrapped DQN, etc., for
accelerating its deep reinforcement learning
technique.4
Solutions and Offerings
Currently, Osaro offers two software solutions,
OsaroPick and Osaro FoodPick. The first
product, OsaroPick, was deployed in Japan in
early 2018. OsaroPick is an orientation and
placement solution that enables fully
automated distribution centers and integrates
with automated storage and retrieval systems
(ASRS) in an e-commerce environment. It has
an accuracy of 99.99% at speed and can
manipulate transparent or reflective packaging.
Recently, the company released its second
product, Osaro FoodPick, which can be
employed in the food industry for automated
food assembly tasks. With this product, Osaro
is solving the long-standing problem of
consistently assembling non-uniform food
items without compromising on speed and
accuracy.5
The company also offers software solutions for
other use cases like sorter induction, kitting,
packing, and assembly tasks.
Patenting Activities
Osaro has filed two patent publications related
to reinforcement learning. These patents focus
on determining the method to be selected by
the agent to interact with the environment. Its
patent focus areas include:6, 7
- Partitioning a reinforcement learning
input state space
- Specifications for selecting an action
to be performed by the agent, or a
computer-implemented agent, that
interacts with the environment
- A determination method for estimating
value functions in accordance with the
actions performed by the agent in response
to current observations
Partnerships and Alliances
The company unveiled FoodPick in
collaboration with Denso Wave, a producer of
automated data capturing products and
industrial robots. The collaboration includes
the integration of DENSO Wave’s robotics and
Osaro’s AI, designed for non-uniform food
assembly. Osaro deployed its services with
Innotech Corporation, an electronic design
automation (EDA) software company.8
ABB
has also partnered with Osaro to incorporate
ML techniques in ABB’s products.9
Key Personnel
Derik Pridmore, CEO and cofounder of Osaro,
is an MIT graduate who started off as a
technical analyst for patents and infringement
opinions for various technologies at Wolf
Greenfield. He has worked as an associate at
JP Morgan Chase. An early technology
investor, Derik has expertise in ML, data
analytics, and finance. As a Principal at
Founders Fund, a venture capital firm led by
Peter Thiel, Derik drove investments for
DeepMind and various other ML startups such
as Vicarious, Prior Knowledge, and Palantir. He
is one of the active investors in Clarifai, a
computer vision and ML-based company for
21. Why Osaro?
Osaro’s technology differs from that of its
competitors due to its unique approach
towards reinforcement learning. In this
approach, the algorithm repeatedly takes
inputs until an optimized result is
obtained. It employs sensor fusion
techniques and object manipulation
strategies that are compatible with ASRS
systems and are scalable for high velocity
inventories. Currently, the robotics industry is
limited to highly structured environments
and systems that are considerably sensitive
to calibrations. This is where Osaro’s deep
reinforcement learning techniques can lift
the restrictions by offering adaptive, data
driven and closed loop software solutions,
which learn control via inputs like videos and
images. The advanced perception and
control solutions are enabling the company
to offer services for warehouse automation,
manufacturing, textiles, food, etc.
The company claims that its solutions are
unaffected by variations in lighting and are
compatible across all types of robots and end
effectors. Osaro's technology enables the use
of low-cost sensors and embeds intelligence
and training across its products.
Osaro aims to continue its partnership with
Innotech Corporation for further improving its
solutions. This partnership will add value to the
service portfolio of Osaro by improving error
instances. Advisors and investors on board
with Osaro also indicate the potential for the
company’s accelerated growth.
Future Roadmap
Currently, Osaro is targeting the
piece-picking warehouse applications for
e-commerce fulfillment centers as their first
target market. In the future, the company
aims to improve precision and adaptability
of its solution and is aiming to expand into
other sectors, including automotive, food, and
electronics manufacturing.11
Limitations
Osaro develops automated solutions using
deep reinforcement learning techniques. These
techniques still require a lot of data, and as a
result, the networks can become unstable in
training, and generalization across
environments can be challenging. The company
currently focuses on limited areas of
application.
Michael Kahane, CTO, and Itamar Arel, Advisor
at Osaro, have worked closely on developing
Osaro's proprietary technology and are the
inventors of both patents assigned to the
company. Michael Kahane has extensive
experience in the design and implementation
of hardware and software solutions. As an
application engineer at Samsung
Semiconductors, he worked on technologies
like CMOS image sensors and post-silicon
projects. Itamar Arel has worked as a
chief technology officer at the startup Binatix
and is currently the principal investigator of
the Machine Intelligence Laboratory at the
University of Tennessee, Knoxville.
image and video analysis. Derik serves as an
advisor to 1517 Fund, Planet Labs, and Sliced
Investing, Inc., and was a cofounder and
partner at Arda Capital Management, a
quantitative equity fund that applies ML to
global equity markets. After DeepMind was
acquired by Google in 2014, Derik recognized
the potential of reinforcement learning
outside research. He founded Osaro to take
reinforcement learning to a commercial level,
with robot picking as the starting point.
22. Current Target Markets: Robotics
OpenAI was started as a non-profit AI research organization in 2015 by Elon
Musk, Sam Altman, and other founders with a vision to build safe artificial
general intelligence systems. The company is headquartered in California, USA.
In February 2018, Elon Musk parted ways with OpenAI, and the company
created a capped-profit entity known as OpenAI LP in the following year to raise
investment and implement its mission. This structural change was
undertaken to perform cost-intensive AI research requiring high computational
power and specialized AI scientists. OpenAI LP is working on reinforcement learning,
robotics, and language models for a wider adoption of safe AGI systems. Reid
Hoffman’s charitable foundation and Khosla Ventures are two of the prime
investors. The company was initially backed by Peter Thiel as well.
Technology Stack
OpenAI is developing reinforcement learning algorithms for high performance and ease of use.
A few of the company’s initial reinforcement learning explorations are Actor Critic using
Kronecker-Factored Trust Region (ACKTR), used for bringing trust-region optimization into deep
reinforcement learning, and Asynchronous Advantage Actor Critic (A3C), used for optimizing
deep neural network controllers by leveraging asynchronous gradient descent. OpenAI has
designated Proximal Policy Optimization (PPO) as its default reinforcement learning algorithm
because of its ease of use and good performance.14
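The clipped surrogate objective at the heart of PPO can be sketched in a few lines of NumPy. The function name and toy values below are illustrative assumptions, not OpenAI's implementation:

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Clipped surrogate objective from the PPO paper (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and behavior policies; advantages: estimated advantages.
    """
    ratio = np.exp(new_logp - old_logp)          # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    # The elementwise minimum makes the objective pessimistic, discouraging
    # policy updates that push the ratio outside [1 - eps, 1 + eps].
    return np.minimum(unclipped, clipped).mean()

# Toy check: a large ratio with positive advantage is capped at (1 + eps) * A.
obj = ppo_clip_objective(np.log([2.0]), np.log([1.0]), np.array([1.0]))
```

This clipping is what gives PPO its reputation for stability: a single oversized policy update cannot gain more than the clipped bound allows.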
23. Solutions and Offerings
OpenAI is continuously investigating and
improving reinforcement learning techniques
to train agents for advancing AGI applications.
The company has developed Gym, a toolkit for
developing and comparing reinforcement
learning algorithms.
For robotic environments, the company has
addressed the issue of sparse rewards in
reinforcement learning. It has released eight
simulated robotics environments and a baseline
implementation of Hindsight Experience
Replay (HER). Of these eight research
environments, four are used for Fetch
robotics and the other four for Shadow hand
robotics. HER is a reinforcement learning
algorithm that can learn from failures.15
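The core idea of HER, replaying a failed episode as if the goal had been the state it actually reached, can be sketched as follows. The transition format and sparse reward function are illustrative assumptions, not OpenAI's baseline code:

```python
def her_relabel(episode, reward_fn):
    """Hindsight Experience Replay ("final" strategy): replay each transition
    as if the goal had been the state the episode actually ended in.

    episode: list of dicts with keys 'achieved' and 'goal';
    reward_fn(achieved, goal) -> reward under the substituted goal.
    """
    hindsight_goal = episode[-1]["achieved"]  # pretend this was the goal all along
    relabeled = []
    for t in episode:
        new_t = dict(t, goal=hindsight_goal)
        new_t["reward"] = reward_fn(t["achieved"], hindsight_goal)
        relabeled.append(new_t)
    return relabeled

# Sparse reward: 0 when the goal is reached, -1 otherwise.
sparse = lambda achieved, goal: 0.0 if achieved == goal else -1.0

episode = [{"achieved": (1, 0), "goal": (5, 5)},
           {"achieved": (2, 1), "goal": (5, 5)}]   # the goal (5, 5) was never reached
relabeled = her_relabel(episode, sparse)
```

After relabeling, the final transition counts as a success under the substituted goal, so even a failed episode yields a useful learning signal despite the sparse reward.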
OpenAI Five is a reinforcement learning-based
project that learns using a massively scaled
version of Proximal Policy Optimization (PPO).
It runs on 256 GPUs and 128,000 CPU cores.
The framework has been tested in complex
video games like Dota, where a team of five
bots competes against human players. OpenAI
Five is deployed on a general-purpose
reinforcement learning infrastructure called
Rapid, which helps solve challenges related to
competitive self-play.16
The company is also using the OpenAI Five
code for Dactyl, a system for manipulating
objects using a Shadow Dexterous Hand.
Dactyl learns from scratch using reinforcement
learning and can perform vision-based object
reorientation on a physical Shadow Dexterous
Hand.17
Apart from these projects, OpenAI has released
Spinning Up in Deep RL, an open educational
resource designed for practitioners in deep
reinforcement learning.18
Patenting Activities
The company has multiple research
publications on reinforcement learning
algorithms for AGI advancement. However,
there are no patent publications assigned to
OpenAI.
Partnerships and Alliances
OpenAI is currently focusing on research
partnerships with academic institutions,
non-profits, and industry labs to enhance
societal preparedness for large language
models. These partnerships will support
decision-making on larger models.19
The company is partnering with
institutions such as the Center for
Human-Compatible AI (CHAI) at the University
of California, Berkeley, the University of
Toronto, and New York University.20
Key Personnel
Greg Brockman, co-founder, studied
Computer Science at Harvard and then
transferred to MIT, which he left to co-found
Stripe, an online payment platform. He served
as Stripe’s CTO for five years before starting
his exploration of AI at OpenAI. He has
authored many publications on reinforcement
learning topics and worked on diverse coding
projects.21
Ilya Sutskever, cofounder and chief scientist,
is a pioneer in ML, with a PhD in Computer
Science from the University of Toronto and a
postdoctorate from Stanford University. Ilya
has contributed immensely to the field of
deep learning. Prior to OpenAI, he founded
DNNResearch, a voice and image recognition
startup that was later acquired by Google,
after which he worked as a Research Scientist
at Google. He has helped drive major advances
in computer vision and natural language
processing and has contributed to AlexNet,
AlphaGo, TensorFlow, and Sequence to
Sequence Learning.
Sam Altman, cofounder, holds an honorary
degree from the University of Waterloo. Altman
was also the cofounder and CEO of Loopt, a
location-based mobile networking company
that was later acquired by Green Dot
Corporation. He has been the President of
Y Combinator and is a personal investor in
companies like Airbnb, Reddit, and Pinterest.
24. Why OpenAI?
OpenAI is a startup that was formed and
backed by some radical thinkers with a
mission of advancing AGI for the benefit of
humanity. The company is now run by a team
of AI experts who have mentored multiple
startups and have the potential to make a
difference in the present AGI environment
with their core expertise. The team is focused
on shaping the intelligent age by distributing
AI capabilities evenly. As an open-source
company, OpenAI gives researchers and
companies free library resources for
developing and implementing reinforcement
learning algorithms. The company strives to
approach difficult real-world problems by
combining different reinforcement learning
algorithms, and its agents can now be trained
in simulation without physically accurate
modelling of the world.
Future Roadmap
The company is dedicated to advancing its
research toward safe AGI that drives broader
adoption of the technology. Its transformation
from a non-profit research entity to a
capped-profit firm suggests that it is preparing
to compete with other AI companies like
Google and Facebook.
Limitations
The company is testing its algorithms in
complex gaming scenarios but has yet to
demonstrate reinforcement learning
algorithms in industrial-grade applications.
The real implementation of the company’s
vision will be proven only when it deploys a
safe and secure AGI ecosystem.
Wojciech Zaremba, cofounder, heads the
robotics department. Zaremba has been
working on teaching general-purpose robots
through deep reinforcement learning and
meta-learning. He holds a doctorate in deep
learning from New York University and has
served as a research scientist at Google Brain
and Facebook AI Research. Zaremba’s area of
interest lies in large neural networks.
25. Current Target Markets: Robotics
Acutronic Robotics is a robotics startup headquartered in Zurich, Switzerland.
The company was founded in 2016 after the acquisition of another robotics
startup, Erle Robotics, which was initially funded by DARPA. Sony Innovation
Fund, an investor in Acutronic Robotics, plans to integrate the company’s
solutions into Sony’s robotics division. Acutronic Robotics is focused on
addressing the hardware incompatibility issues plaguing the modular
robotics market.
Technology Stack
The company’s technologies range across robotic hardware, software, and a physical channel to
permit real-time data communication between different robotic components. Hardware
Robot Operating System (H-ROS) is the company’s robotic bus that addresses the hardware
incompatibility issues in robot manufacturing.
H-ROS aims to serve as a standardized hardware infrastructure for robots. It is based on a
hierarchical API that utilizes deep reinforcement learning algorithms. The first level of the
layered API is built on top of a Hardware Robot Information Model (HRIM) and is powered by
the de facto robot API standards ROS 2 and Gazebo for module enhancement. Level 2 uses
reinforcement learning models such as PPO, TRPO, DQN, and DDPG that connect with the
underlying layer and interoperate with ROS. At the top level, the company is incorporating
imitation learning using generative adversarial techniques. Acutronic Robotics is thus proposing
a ROS- and Gazebo-based reinforcement learning toolkit that is compatible with OpenAI’s Gym.23
Additionally, the company is deploying H-ROS on a system on module (SoM) that integrates
several sensors and power mechanisms for hardware and software management in the robot
modules.
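Compatibility with OpenAI's Gym means an environment exposes `reset()` and `step(action)` so that any Gym-style agent can drive it. A minimal sketch of such an environment (purely illustrative, not gym-gazebo2 code) looks like:

```python
import random

class ToyArmEnv:
    """Minimal Gym-style environment: a 1-D 'joint' that must reach a target.

    Any RL algorithm written against the Gym API (reset/step returning
    observation, reward, done, info) can interact with it unchanged.
    """
    def __init__(self, target=5, limit=20):
        self.target, self.limit = target, limit

    def reset(self):
        self.pos, self.steps = 0, 0
        return self.pos                       # initial observation

    def step(self, action):                   # action: -1 or +1
        self.pos += action
        self.steps += 1
        done = self.pos == self.target or self.steps >= self.limit
        reward = 1.0 if self.pos == self.target else -0.1
        return self.pos, reward, done, {}

# A random agent interacting through the standard Gym loop.
env = ToyArmEnv()
obs, done = env.reset(), False
while not done:
    obs, reward, done, info = env.step(random.choice([-1, 1]))
```

This shared interface is precisely what lets toolkits such as gym-gazebo2 swap a simulated robot in behind algorithms originally written for Gym's built-in environments.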
26. Solutions and Offerings
Acutronic’s product portfolio includes
AI-powered robotic components and a
modular robot. The entire hardware offering is
based on the H-ROS SoM, which runs ROS 2
natively. The component offerings include an
open robot controller, ......................24
MARA is a fully modular robot: every
module in the robot can be replaced and
configured easily. The company is using MARA
for its research on reinforcement learning
algorithms to obtain millimeter-scale accuracy
and upgrade the ROS 2 version for real-world
robotic applications.
Patenting Activities
Víctor Mayoral Vilches, the CEO of
Acutronic Robotics, has patent filings focused
on power management, configuration, and
control of modular robots. These patent
publications are assigned to his earlier venture,
Erle Robotics, which is now part of Acutronic
Robotics.25 26 27
Partnerships and Alliances
The company’s partnership route has enabled
it to accelerate development of its hardware
offerings. Acutronic Robotics’ strategic
collaborations with robot manufacturers like
Hebi Robotics, H.....................
.................... to leverage programmable SoC and
Gigabit Ethernet Time Sensitive Networking
(TSN) subsystem IP Core for Acutronic’s SoM.29
On the software front, Acutronic Robotics
is working on a European Union-funded
project known as OFERA. The company is
working with participants like Bosch,
Key Personnel
Víctor Mayoral Vilches, CEO, is a specialist in
robotics and has been part of multiple
projects related to AI, security, reinforcement
learning, and communication stacks for robots.
He was the founder of Erle Robotics and has
held advisory roles at other robotics firms as
well. Víctor holds a doctorate in the field of
biorobotics and has worked with the Open
Source Robotics Foundation, contributing to
bringing ROS 2.0 to embedded platforms. His
work has been recognized and awarded at
multiple events and by key organizations like
MIT, Google, and ABB.31
Why Acutronic Robotics?
Acutronic Robotics’ unique proposition of
merging modularity and reinforcement
learning frameworks has the ability to disrupt
the future of robot manufacturing and
configuration techniques. It is one of the
few companies eliminating the system
integration effort that is currently
a major bottleneck for robot developers.
The company’s hardware, robot bus, and
reinforcement learning-focused software
framework are creating a common
integration platform for modular robotics
systems. The company’s complete H-ROS
robot bus package offers a cost-effective
and market-ready solution in the presently
fragmented modular robot market of
different hardware vendors, integrators, and
developers. The company’s evaluation of deep
reinforcement learning to extend the
capabilities of the existing ROS 2 framework
will lead to the realization of self-adaptable
robots. In the coming years, Acutronic Robotics
can pave the path for a Robots as a Service
business model that will enable SMEs to
deploy customizable robots across sectors.
eProsima, PIAP, and FIWARE Foundation to
bridge the technological gap between
robotic software frameworks and
microcontroller libraries.30
27. InstaDeep’s decision-making AI systems
offer multiple advantages for various use
cases across industries. The company’s
solutions give clients insight into delivery
efficiency across the supply chain and help
them maintain margins and pricing to stay
competitive with evolving market demands.
Furthermore, InstaDeep’s mobility
implementations permit solution providers
to make computer-aided decisions on fleet
sizes and effective deployment for reduced
passenger delays and increased efficiency.
The company’s AI-enabled manufacturing
solution improves system reliability by
minimizing downtime and accurately
predicting machine failures. Additionally,
the company ensures smooth robotic
operations and can automate visual
inspection.
Future Roadmap
InstaDeep is currently working on the
development of algorithms for
decision-making processes and the
optimization and generalization of AI
applications. The company plans to use its
funding to improve the operating efficiency
and ROI of proprietary enterprise AI
applications.
Limitations
InstaDeep’s current research implementation
for the bin packing problem has been done
without considering sparse spaces. The
company aims to apply its R2 algorithm
to a wider range of problems and will
consider other optimization problems, such
as the Traveling Salesman Problem, to further
evaluate its effectiveness in real-world
environments.
28. Latent Logic, formerly Morpheus Labs, is a spin-out of Oxford University focused on combining
state-of-the-art computer vision and reinforcement learning (with imitation learning-based
demonstrations) to accelerate the development of autonomous vehicles.
The company’s software is platform-agnostic and is based on standard APIs used by existing
agent model interfaces. The imitation learning incorporates variations in human behavior,
making the model more flexible.
Other application areas are telepresence robots and automated video analytics, including
evaluating sports player performance, monitoring industrial processes, profiling wildlife
behaviour, and understanding crowd dynamics.81
Overview
The company is working on the OmniCAV and VeriCAV projects, two multimillion-pound
projects funded by Innovate UK, the UK government’s innovation agency. Latent Logic is testing
and working on verification of autonomous vehicles. It will be partnering with other
companies to build AI-based simulations of real Oxfordshire roads to safely test autonomous
cars.82
The company is planning to sell its technology to automotive and insurance companies.
Additionally, it is advising the government on creating standards for autonomous vehicles.
Founded: 2017
Headquarters: Oxford, England
Funding: Not Available
Investors: Oxford Capital, Oxford Sciences Innovation
Growth
Opportunities
Other Startups That Are Leveraging Reinforcement Learning Across Industries
29. Founded: 2017
Headquarters: Colorado, USA
Funding: Not Available
Investors: Undisclosed
NeuDax is focused on bringing deep learning, reinforcement learning, and ML to the upstream
oil and gas (O&G) industry.
The company’s value proposition for O&G operators is faster decisions and recommendations
related to well design and completion tasks, leading to higher savings. For O&G investment
firms, the company’s solution offers faster valuation with prediction of varied scenarios.
FracDax™ is an inverse reinforcement learning-based platform and full-stack AI solution that
can analyse more than 10,000 field development scenarios in a few hours.
Overview
The company is offering its AI solution through Software as a Service (SaaS) and Analytics as a
Service (AaaS) models. NeuDax is planning to expand the AI platform to basins like the
Permian, Eagle Ford, Bakken, and Marcellus.
NeuDax’s current technology is limited to pre-drilling AI applications only; the company is
targeting post-drilling challenges as well.83
Growth
Opportunities
Founded: 2016
Headquarters: Tokyo, Japan
Funding: USD 17.9 Mn
Investors: SBI Investment
Ascent Robotics is using deep reinforcement learning, stochastic control, probabilistic models,
and neuroscience to develop intelligent solutions for Level 4 autonomous vehicles and
industrial robotics.84
Atlas is the company’s beta AI learning architecture for integrating a virtual reality human
interface and a 3D simulation environment with deep reinforcement learning algorithms.85
Ascent Robotics, in collaboration with Kawasaki Heavy Industries, has developed a flow-based
generative model for real-time industrial applications.
Overview
The company is targeting Level 4 autonomy in vehicles by 2020. Ascent will license its
technology to automakers, primarily in the Japanese market, and plans to charge at least
USD 1,000 per vehicle per license.86
Ascent plans to expand its sales footprint in the US and Europe and open an R&D center in
Hawaii.
Growth
Opportunities
30. Incelligent provides ML-based network analytics for network operators. It advances processes
like network management, customer retention, and monetization of big data. Incelligent’s
software framework can analyze heterogeneous data and has the potential to maximize
telecom operators’ business value with intelligent insights. The company is working on use
cases such as spectrum management, RAN optimization, traffic management, and mobility
predictions.127
The company’s patent filings cover the use of deep neural network models and reinforcement
learning (Q-learning-based algorithms) to improve predictions regarding network configurations.128
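The tabular Q-learning update underlying such Q-based approaches is compact. The toy network-configuration states and actions below are illustrative assumptions, not Incelligent's model:

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[s_next].values())       # greedy value of the next state
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]

# Toy example: two network states, two configuration actions.
Q = {"congested": {"reroute": 0.0, "hold": 0.0},
     "clear":     {"reroute": 0.0, "hold": 0.0}}
# Rerouting traffic from a congested state to a clear one earns a reward of 1.
q_update(Q, "congested", "reroute", 1.0, "clear")
```

Repeated updates of this kind let the operator's policy converge toward configurations that maximize long-run reward rather than one-step gains.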
Overview
Incelligent is aiming to deliver ML solutions for the next generation 5G intelligent orchestration
framework. It is actively participating in 5G projects like MATILDA, focused on developing a
holistic framework for 5G-ready applications.129
Future/Growth
Opportunities
Cogent Labs offers intelligent, real-world software solutions. Its offerings, such as Tegaki,
Kaidoku, and Time Series Forecasting, leverage reinforcement learning algorithms along with
other techniques like natural language understanding, OCR, and data extraction. These services
are being used by financial institutions to improve business efficiency. The solutions can be
applied to multi-dimensional data and can be used on edge devices.
The company has recently partnered with SoftBank to develop combined technology
offerings like Robotic Process Automation (RPA) and OCR.132
Overview
Cogent Labs is targeting different industries like manufacturing, sales, healthcare, and
education. Leveraging its partnership with SoftBank, the company will focus on combining
SynchRoid (an RPA solution) with its proprietary solution Tegaki.
Founded: 2016
Headquarters: Ahmedabad, India
Funding: Not Available
Investors: Undisclosed
Founded: 2016
Headquarters: New York, USA
Funding: USD 100K
Investors: Right Side Capital Management, Techstars
32. Reinforcement learning is at an early stage of exploration yet has already gained prominence
in multiple applications across sectors. Going forward, with more research and development,
reinforcement learning algorithms are set to drive a major transformation in real-world
applications.
Startups focusing on reinforcement learning algorithms are attracting large investments
from key investors. This is evident from XXX’s acquisition of two reinforcement learning
startups, XX and XX, in the last two years. The investment scenario highlights that large
companies are betting on reinforcement learning for advanced AI capabilities and to
complement traditional ML solutions.
With a goal to pursue AGI, the startups in the domain have begun to venture................
Reinforcement learning has the potential to bring the next stage of innovation in these
sectors. In the near term, the deployment of reinforcement learning algorithms will be seen
mostly in XXXX platforms for applications catering to fulfillment XXXXXX.
Reinforcement learning will help in the development of effective industrial-grade AI solutions
that will benefit industries with high production capabilities, reduce costs, improve worker
productivity, and automate distribution and logistics with greater accuracy and speed. KXXX
and XXXX are a few examples of startups targeting the retail industry with their
reinforcement learning-based solutions.
It appears that reinforcement learning is going to be a game-changing technology for sectors
like automotive, financial services, industrial plants, medical, telecommunications, and others
that require critical decision-making and meta-learning abilities. Startups catering to these
verticals are working in collaboration with industry partners and universities to improve the
algorithms and make them market-ready.
Companies looking to enter the advanced AI market or implement autonomous
capabilities in their businesses or product offerings could consider ................. These
collaborative strategies will help companies scale up their solutions with AI capabilities.
Automotive Sector: Automakers should invest in or collaborate with startups like ......................
to advance toward Level 4 and Level 5 self-driving cars.
Insights and Recommendations
33. Retail Sector: Startups incorporating reinforcement learning algorithms in robots for
piece-picking applications can help build automated distribution centers. Ecommerce
fulfillment centers, clothing retailers, and other warehouse retail entities can collaborate
with startups like............., .................................. Industries to maximize efficiency and reduce costs.
Food retailers can also improve product profitability by collaborating with startups like ............
and ................... that are introducing reinforcement learning solutions in the food industry.
Financial Services: Financial firms can target startups .......................... or ................ for
investment opportunities or collaboration, as these entities are approaching higher
accuracy in financial applications with their reinforcement learning models.
Telecommunications: The self-healing network of the future can be realized through
reinforcement learning techniques. Telecom companies can pursue ............. or partnerships
with companies like Incelligent to gain reinforcement learning capabilities.
Industrial Sector: Industrial companies can pursue collaboration with startups ............,
Intelligent Layer, .................and ................. that are providing reinforcement learning solutions for
complex industrial processes to make accurate operational and business decisions.
Other Sectors: Education, research, healthcare, and agriculture are some of the other sectors
that can benefit from collaboration with startups ......................
...................................................................................................
........................................................................................................................................................................
........................................................................................................................................................................, respectively.
35. Acronyms
Abbreviation Explanation
ML Machine Learning
AI Artificial Intelligence
AGI Artificial General Intelligence
ANI Artificial Narrow Intelligence
SARSA State-Action-Reward-State-Action
DQN Deep Q Network
IoT Internet of Things
A3C Asynchronous Advantage Actor Critic
DDPG Deep Deterministic Policy Gradient
ASRS Automated Storage and Retrieval Systems
EDA Electronic Design Automation
ACKTR Actor Critic using Kronecker-Factored Trust Region
PPO Proximal Policy Optimization
HER Hindsight Experience Replay
DARPA Defense Advanced Research Projects Agency
H-ROS Hardware Robot Operating System
MARA Modular Articulated Robotic Arm
API Application Program Interface
TRPO Trust Region Policy Optimization
SoM System on Module
SoC System on Chip
DoF Degrees of Freedom
LIDAR Light Detection and Ranging
RADAR Radio Detection and Ranging
ARP Autoregressive Policy
POMDP Partially Observable Markov Decision Process
KPIs Key Performance Indicators
SaaS Software as a Service
PPaC Process Prediction and Control
ROI Return on Investment
AaaS Analytics as a Service
OCR Optical Character Recognition
RAN Radio Access Network
GPCR G-Protein-Coupled Receptors
OPEX Operating Expenditure
CNN Convolutional Neural Network
LSTM Long Short-Term Memory
OBD On-board Diagnostics
37. References
1
OpenAI. “Part 2: Kinds of RL Algorithms.” Spinning Up Documentation, spinningup.openai.com/en/latest/spinningup/rl_intro2.html.
2
“Osaro.” Crunchbase, www.crunchbase.com/organization/osaro#section-investors.
3
“About.” Osaro, www.osaro.com/about.
4
“Technology.” Osaro, www.osaro.com/technology.
5
“Fooma.” Osaro, www.osaro.com/fooma.
6
Arel, Itamar, et al. “US20170213150A1 - Reinforcement Learning Using a Partitioned Input State Space.” Google Patents,
Google, 2016, patents.google.com/patent/US20170213150A1/en?oq=US20170213150A1.
7
Arel, Itamar, et al. “US9536191B1 - Reinforcement Learning Using Confidence Scores.” Google Patents, Google, patents.google.com/patent/US9536191B1/en?oq=US9536191B1.
8
“Fooma.” Osaro, www.osaro.com/fooma.
9
Artificial Intelligence. ABB Group R&D and Technology, www.controlglobal.com/assets/knowledge_centers/abb/assets/1808/ABB-Review-4Q17-Buzzword-Demystifier-Artificial-Intelligence.pdf.
10
“Derik Pridmore.” Crunchbase, www.crunchbase.com/person/derik-pridmore#section-overview.
11
Shaw, Keith. “Osaro Powers the Brains Behind Smarter Picking Robots.” Robotics Business Review, 22 July 2019, www.roboticsbusinessreview.com/sponsored-content/osaro-powers-the-brains-behind-smarter-picking-robots/.
12
Brockman, Greg. “OpenAI LP.” OpenAI, OpenAI, 20 July 2019, openai.com/blog/openai-lp/.
13
Wu, Yuhuai. “OpenAI Baselines: ACKTR & A2C.” OpenAI, OpenAI, 9 Mar. 2019, openai.com/blog/baselines-acktr-a2c/.
14
Schulman, John. “Proximal Policy Optimization.” OpenAI, OpenAI, 9 Mar. 2019, openai.com/blog/openai-baselines-ppo/.
15
Plappert, Matthias. “Ingredients for Robotics Research.” OpenAI, 9 Mar. 2019, openai.com/blog/ingredients-for-robotics-research/.
16
Chan, Brooke. “OpenAI Five.” OpenAI, OpenAI, 7 June 2019, openai.com/blog/openai-five/.
17
Andrychowicz, Marcin. “Learning Dexterity.” OpenAI, OpenAI, 6 June 2019, openai.com/blog/learning-dexterity/.
18
Andrychowicz, Marcin, et al. Learning Dexterous In-Hand Manipulation. ArXiv, 18 Jan. 2019, arxiv.org/pdf/1808.00177.pdf.
19
Radford, Alec. “Better Language Models and Their Implications.” OpenAI, 3 July 2019, openai.com/blog/better-language-models/.
20
Wu, Yuhuai. “OpenAI Baselines: ACKTR & A2C.” OpenAI, OpenAI, 9 Mar. 2019, openai.com/blog/baselines-acktr-a2c/.
21
“Greg Brockman's Home Page.” Greg Brockman's Home Page, gregbrockman.com/.
22
“Imitation Learning (IL) for Training Robots.” ACUTRONIC ROBOTICS, acutronicrobotics.com/docs/technology/h-ros/api/level3/il.
23
Acutronic Robotics. “Gym-gazebo2, a Toolkit for RL Using ROS 2 and Gazebo.” Latest Modular Robotics News | Acutronic Robotics, 18 Mar. 2019, acutronicrobotics.com/news/acutronic-robotics-launches-gym-gazebo2-a-toolkit-for-reinforcement-learning-using-ros-2-and-gazebo/.
24
MARA. Acutronics Robotics, 14 May 2019, acutronicrobotics.com/products/mara/files/mara-robotic-arm-datasheet-v1.pdf.
25
“EP3396598A2 - Method and User Interface for Managing and Controlling Power in Modular Robots and Apparatus Therefor.” Google Patents, Google, patents.google.com/patent/EP3396598A2/en?inventor=V%C3%ADctor%2BMayoral%2BVilches&oq=V%C3%ADctor%2BMayoral%2BVilches.
26
“WO2018172593A2 - Method for Integrating New Modules into Modular Robots, and Robot Component of Same.” Google Patents, Google, 2018, patents.google.com/patent/WO2018172593A2/en?inventor=V%C3%ADctor%2BMayoral%2BVilches&oq=V%C3%ADctor%2BMayoral%2BVilches.
27
Mayoral Vilches, Víctor. “ES2661067B1 - Method of Determination of Configuration of a Modular Robot.” Google Patents, Google, 2016, patents.google.com/patent/ES2661067B1/en.
113
“Turning Transportation Into Intelligent Transportation Services.” Aigent-Tech, 2017, www.aigent-tech.com/.
114
“Learnable, Inc.” About - Learnable.ai, learnable.ai/about.
115
“Startups.” MIT Asian Club, asianclub.mit.edu/startups.
116
“Learnable, Inc.-About.” Linkedin, www.linkedin.com/company/learnable-ai/about/.
117
“HiHedge, AI Trading with Machine Learning.” Hihedge, www.hihedge.com/.
118
Zhang, Tianhao, et al. Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation. 6 Mar.
2018, arxiv.org/pdf/1710.04615.pdf.
119
Ackerman, Evan. “AI Startup Embodied Intelligence Wants Robots to Learn From Humans in Virtual Reality.” IEEE Spectrum, 8 Nov. 2017, spectrum.ieee.org/automaton/robotics/artificial-intelligence/ai-startup-embodied-intelligence.
120
“SOLUTIONS.” Aidentify, www.aidentify.io/page/solutions/solutions.
121
“AIdentify.” Devpost, devpost.com/software/aidentify-tgxwes.
122
Alma Mundi Ventures. “Alma Mundi Ventures Invests In AI Startup Nnaisense, The Pioneers Of Very Deep Learning.” PR Newswire, 26 June 2018, www.prnewswire.com/news-releases/alma-mundi-ventures-invests-in-ai-startup-nnaisense-the-pioneers-of-very-deep-learning-300391576.html.
123
“NNAISENSE.” NNAISENSE, nnaisense.com/.
124
PerimeterX Bot Defender. “PerimeterX Raises $43M In Series C Funding To Fuel Expansion into New Markets and Accelerate Product Development.” PerimeterX Bot Defender, www.perimeterx.com/about/press/2019/2019-02-11-perimeterx-raises-$43m-series-c/.
125
PerimeterX Bot Defender. “Who We Are.” PerimeterX Bot Defender, www.perimeterx.com/about/who-we-are/.
126
Tseitlin, Ariel. “Bots Are Half of Internet Traffic. The Hard Part Is Knowing Which Half.” Scale Venture Partners, 11 Feb. 2019,
www.scalevp.com/blog/bots-are-half-of-internet-traffic-the-hard-part-is-knowing-which-half.
127
“About Incelligent.” Incelligent, www.incelligent.net/menu
128
Tsagkaris, Kostas, et al. “US9942085B2 - Early Warning and Recommendation System for the Proactive Management of Wireless Broadband Networks.” Google Patents, 2017, patents.google.com/patent/US9942085B2/en?assignee=Incelligent&oq=Incelligent%2B.
129
“5G-Ready Applications and Network Services Made Easy.” Incelligent, incelligent.net/news/251-5g-ready-applications-and-network-services-made-easy.
130
“Cogent Labs.” Pagan Research! Online B2B Lead Database Intelligence Website, paganresearch.io/details/cogent-labs.
131
GIG. “Cogent Labs Completes Series B Fundraising Round With Additional Investments by Samsung Venture Investment and Other Foreign Investors Raising Round Total to 1.2 Billion JPY.” Cogent Labs, www.cogent.co.jp/en/news/2019-5-23-cogent-labs-series-b-investment-2/.
132
“SoftBank and Cogent Labs Enter into Business Partnership in the Field of RPA×AI: Press Releases: News: About Us.”
SoftBank, 2018, www.softbank.jp/en/corp/news/press/sbkk/2018/20180129_01/.
133
“Pricemoov, the Start-up That Helps Set Prices in Real Time, Raises 3 Million Euros.” Usine, Floriane Leclerc , 7 Sept. 2018,
www.usine-digitale.fr/article/pricemoov-specialiste-de-l-optimisation-tarifaire-leve-3-millions-d-euros.N738494.
134
“[FW Radar] Pricemoov, L'outil De Variation De Prix Pour Les Entreprises.” FrenchWeb.fr, 16 Nov. 2017, www.frenchweb.fr/fw-radar-pricemoov-loutil-de-variation-de-prix-pour-les-entreprises/308502.
135
“PRICEMOOV, The Artificial Intelligence of the Price.” MYFrenchStartup, 16 Nov. 2017, www.myfrenchstartup.com/en/startup-france/198955/pricemoov.
136
Crochet-Damais, Antoine. “Les AI Paris Awards 2018 Décernés à Systran, Pricemoov Et Taqadam.” Journaldunet.com, Le JDN, 13 June 2018, www.journaldunet.com/solutions/dsi/1209950-les-ai-paris-awards-viennent-recompenser-systran-et-pricemoov/.
137
Luczak-Rougeaux, Julia. “Pricemoov, the Startup That Plays at the Right Price Thanks to Artificial Intelligence.” TOM, 19
Sept. 2018, www.tom.travel/2018/09/19/pricemoov-la-startup-qui-joue-au-juste-prix-grace-a-lintelligence-artificielle/.
138
“DataOne Innovation Labs.” Leaf GLS University Incubator, www.glsleaf.in/dataone.html.
139
“Jobs at DataOne Innovation Labs.” Dataone, angel.co/company/dataone/jobs.
140
“DataOne Innovation Labs.” Leaf GLS University Incubator, www.glsleaf.in/dataone.html.
141
“Stop Analysing Start Learning.” Intelligent Layer, intelligentlayer.com/.
142
Symcox, Jonathan. “THE REAL ALE DEVELOPED USING ARTIFICIAL INTELLIGENCE.” BusinessCloud.co.uk, 23 Dec. 2016,
www.businesscloud.co.uk/news/the-real-ale-developed-using-artificial-intelligence.
39. 144
“Artificial Intelligence & Machine Learning.” Omina Technologies, ominatechnologies.com/.
145
“Omina Technologies.” Omina Technologies, ominatechnologies.com/services/omina-core/.
146
“Omina Technologies.” Omina Technologies, ominatechnologies.com/our-cases/.
147
“Omina Technologies.” Omina Technologies, ominatechnologies.com/services/omina-consultancy/.
148
“Omina Technologies.” Omina Technologies, ominatechnologies.com/business/belgian-machine-learning-start-
looking-us/.
149
“Faster Neural Networks.” Deeplite, www.deeplite.ai/index.html.
150
“Second Order Acceleration: Making Faster Neural Networks, Faster.” Mc.ai, 1 May 2019, mc.ai/second-order-accelera
tion-making-faster-neural-networks-faster/.
151
“Faster Neural Networks.” Deeplite, www.deeplite.ai/index.html#neutrino.
152
“The Privacy-First AI Layer for e-Commerce.” Free Machines, free-machines.com/.
153
“MOBILE ROBOTS.” Dorabot.com, www.dorabot.com/solutions/mobile-robots.
154
“Dorabot Awarded as Technology Pioneer by World Economic Forum.” Dorabot.com, 7 Feb. 2019, dorabot.com/up
dates#/latestnews/en/0021.
155
“Speeding up Your Packages: China’s Dorabot Bets on Globalization & Diversity.” CGTN, 1 Jan. 2019,
news.cgtn.com/news/31636a4d30494464776c6d636a4e6e62684a4856/share_p.html.
156
“About Us.” Applied Brain Research, appliedbrainresearch.com/about-us/.
157
Applied Brain Research Inc. “Applied Brain Research Inc. Demonstrates Leading Edge Neuromorphic AI Stack at Ontario
Centres of Excellence Discovery 2018.” PR Newswire: Press Release Distribution, Targeting, Monitoring and Marketing, 27
June 2018, www.prnewswire.com/news-releases/applied-brain-re
search-inc-demonstrates-leading-edge-neuromorphic-ai-stack-at-ontario-centres-of- excellence-discov
ery-2018-300638221.html.
158
“Our Vision.” Neurocat, www.neurocat.ai/about/.
159
“Neurocat-Overview.” Crunchbase, www.crunchbase.com/organization/neurocat#section-overview.
About Netscribes
Netscribes is a global market intelligence and content services provider that helps corporations achieve strategic objectives through a wide range of offerings. Our solutions rely on a unique combination of qualitative and quantitative primary research, secondary/desk research, social media analytics, and IP research. For more than 15 years, we have helped clients across a range of industries, including technology, financial services, healthcare, retail, and CPG. Fortune 500 companies, as well as small- to mid-size firms, have benefited from our partnership, gaining relevant market and competitive insights that drive higher growth, faster customer acquisition, and a sustainable edge in their business.
Disclaimer
This report is prepared by Netscribes (India) Private Limited ("Netscribes"), a market intelligence and content services provider.
The content of this report is developed in accordance with Netscribes' professional standards. Accordingly, the information provided herein has been obtained from sources that are reasonably believed to be reliable. All information provided in this report is on an "as-is" and "as-available" basis, and no representations are made about the completeness, veracity, reliability, accuracy, or suitability of its content for any purpose whatsoever. All statements of opinion and all projections, forecasts, or statements relating to expectations regarding future events represent the assessment and interpretation of Research On Global Markets (ROGM) based on information available to it. All liabilities, however arising, in each of the foregoing respects are expressly disclaimed.
This report is intended for general information purposes only. This report does not constitute an offer to sell or issue securities, an invitation to purchase or subscribe for securities, or a recommendation to purchase, hold, sell, or abstain from purchasing any securities. This report is not intended to be used as a basis for making an investment in securities. This report does not form a fiduciary relationship or constitute investment advice. Nothing in this report constitutes legal advice.
The information and opinions contained in this report are provided as of the date of the report and are subject to change. Reports may or may not be revised in the future. Any liability to revise any out-of-date report, or to inform recipients about an updated version of such report, is expressly disclaimed.
A bona fide recipient is hereby granted a worldwide, royalty-free, enterprise-wide limited license to use the content of this report, subject to the condition that any citation from this report is properly referenced and credited to Research On Global Markets. Nothing herein conveys to the recipients, by implication or by way of estoppel, any intellectual property rights in the report (other than the foregoing limited license) or impairs Netscribes' intellectual property rights, including but not limited to any rights available to Netscribes under any law or contract.
To the maximum extent permitted by law, all liabilities in respect of this report and any related material are expressly disclaimed. Netscribes does not assume any liability or duty of care for any consequences of any person acting, or refraining from acting, in reliance on information contained in this report.
All disputes and claims arising in relation to this report will be submitted to arbitration, which shall be held in Mumbai, India, under the Indian Arbitration and Conciliation Act. The exclusive jurisdiction of the courts in Mumbai, India, applies to all disputes concerning this report and the interpretation of these terms, and the same shall be governed by and construed in accordance with Indian law without reference to the principles of conflict of laws.
Get in touch with us

USA
41 East 11th Street,
New York, NY 10003, USA
+1-917-885-5983

Singapore
Netscribes Global Pte. Ltd.,
10 Dover Rise, #20-11
Heritage View,
Singapore 138680

Mumbai
Office No. 504, 5th Floor,
Lodha Supremus, Lower Parel,
Mumbai 400013,
Maharashtra, India
+91-22-4098-7600

Kolkata
3rd Floor, Saberwal House,
55B Mirza Ghalib Street,
Kolkata 700016,
West Bengal, India
+91-33-4027-6200

Gurugram
806, 8th Floor, Unitech Cyber Park,
Tower B, Sector 39,
Gurugram 122001,
Haryana, India
+91-124-491-4800

US toll free: 1-888-448-4309 | India: +91-22-4098-7690 | subscription@netscribes.com