2. Game Theory,
Optimal Decisions in Games,
Heuristic Alpha–Beta Tree Search,
Monte Carlo Tree Search,
Stochastic Games, Partially Observable
Games,
Limitations of Game Search Algorithms,
Constraint Satisfaction Problems (CSP),
Constraint Propagation: Inference in CSPs,
Backtracking Search for CSPs.
3. Many applications for AI
Computer vision, natural language processing,
speech recognition, search …
But games are some of the more interesting:
Opponents that are challenging, or allies that are helpful
Units that appear to be acting on their own
Human-level intelligence is too hard in general
But under narrow circumstances a computer can do pretty well
(ex: chess and Deep Blue)
Many games are heavily constrained (by the game rules)
4. We cover competitive environments, in
which the agents’ goals are in conflict,
giving rise to adversarial search
problems—often known as games.
5. MinMax is at the heart of almost every computer board
game
Applies to games where:
Players take turns
Players have perfect information
Chess, Checkers, Tactics
But it can also work for games with imperfect information
or chance
Poker, Monopoly, Dice
Can work in real time (i.e., not turn-based) with a timer
(iterative deepening, later)
6. Search tree
Squares represent decision states (ie- after a move)
Branches are decisions (ie- the move)
Start at root
Nodes at end are leaf nodes
Ex: Tic-Tac-Toe (symmetrical positions removed)
• Unlike binary trees, nodes can have any number of children
– Depends on the game situation
• Levels are usually called plies (a ply is one level)
– At each ply the "turn" switches to the other player
• Players called Min and Max (next)
7. Named MinMax after the algorithm behind the data
structure
Assign points to the outcome of a game
Ex: Tic-Tac-Toe: X wins has value +1; O wins has value −1.
Max (X) tries to maximize point value, while Min
(O) tries to minimize point value
Assume both players play to best of their ability
Always make a move to minimize or maximize
points
So, in choosing, Max will choose best move to
get highest points, assuming Min will choose best
move to get lowest points
8. With the full tree, we can determine the best possible move
However, the full tree is impossible to build for some games! Ex: Chess
At a given time, chess has ~35 legal moves. Exponential
growth:
35 at one ply, 35^2 = 1,225 at two plies … 35^6 ≈ 2 billion and 35^10 ≈ 2.8
quadrillion
Games can last 40 moves (or more), so 35^40 … far more than the
stars in the universe (~10^22)
For large games (Chess) we can’t see the end of the game. Must estimate
winning or losing from the top portion
Evaluate() function to guess the end given a board
Returns a numeric value with magnitude much smaller than victory (i.e.,
checkmate for Max will be one million, for Min minus one million)
So, a computer’s strength at chess comes from:
How deep it can search
How well it can evaluate a board position
(In some sense, like a human – a chess grand master can
evaluate a board better and can look further ahead)
9. How do we search this tree to find the optimal move?
10. Search – no adversary
Solution is (heuristic) method for finding goal
Heuristics and CSP techniques can find optimal solution
Evaluation function: estimate of cost from start to goal through given node
Examples: path planning, scheduling activities
Games – adversary
Solution is strategy
strategy specifies move for every possible opponent reply.
Time limits force an approximate solution
Evaluation function: evaluate “goodness” of game position
Examples: chess, checkers, Othello, backgammon
11. Two players: MAX and MIN
MAX moves first and they take turns until the game is over
Winner gets reward, loser gets penalty.
“Zero sum” means the sum of the reward and the penalty is a constant.
Formal definition as a search problem:
Initial state: Set-up specified by the rules, e.g., initial board configuration
of chess.
Player(s): Defines which player has the move in a state.
Actions(s): Returns the set of legal moves in a state.
Result(s,a): Transition model defines the result of a move.
(2nd ed.: Successor function: list of (move,state) pairs specifying legal
moves.)
Terminal-Test(s): Is the game finished? True if finished, false otherwise.
Utility function(s,p): Gives numerical value of terminal state s for player p.
E.g., win (+1), lose (-1), and draw (0) in tic-tac-toe.
E.g., win (+1), lose (0), and draw (1/2) in chess.
MAX uses search tree to determine next move.
12. Designed to find the optimal strategy for Max and find
best move:
1. Generate the whole game tree, down to the
leaves.
2. Apply utility (payoff) function to each leaf.
3. Back-up values from leaves through branch nodes:
a Max node computes the Max of its child values
a Min node computes the Min of its child values
4. At root: choose the move leading to the child of
highest value.
16. The minimax algorithm is a recursive, backtracking algorithm
used in decision-making and game theory.
It provides an optimal move for the player, assuming that the
opponent also plays optimally.
The minimax algorithm uses recursion to search through the
game tree.
Minimax is mostly used for game playing in AI,
such as Chess, Checkers, Tic-Tac-Toe, Go, and various two-
player games. The algorithm computes the minimax decision
for the current state.
17. In this algorithm two players play the game; one is called MAX
and the other is called MIN.
The two players compete: each tries to hold the opponent to the
minimum benefit while securing the maximum benefit for itself.
The players are opponents of each other: MAX
selects the maximized value and MIN selects the
minimized value.
The minimax algorithm performs a depth-first search
for the exploration of the complete game tree.
The minimax algorithm proceeds all the way down to the
terminal nodes of the tree, then backs values up the tree as the
recursion unwinds.
18. function MINIMAX-DECISION(state) returns an action
  return argmax a ∈ ACTIONS(state) MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a)))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a)))
  return v
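A minimal runnable sketch of these functions, on a tiny hand-built tree. The encoding (leaves are utility values, internal nodes are lists of children, MAX to move at the root) is an illustrative assumption, not from the slides:

```python
# Leaves are utility values (ints); internal nodes are lists of children.
# MAX and MIN levels alternate, with MAX to move at the root.

def max_value(node):
    if isinstance(node, int):                 # terminal test
        return node
    return max(min_value(child) for child in node)

def min_value(node):
    if isinstance(node, int):
        return node
    return min(max_value(child) for child in node)

def minimax_decision(children):
    # Root is a MAX node: pick the move whose MIN-VALUE is largest.
    values = [min_value(c) for c in children]
    best = max(range(len(values)), key=lambda i: values[i])
    return best, values[best]

# Each root move leads to a MIN node over three leaves:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
move, value = minimax_decision(tree)   # branch minima are 3, 2, 2
```

Since the branch minima are 3, 2 and 2, MAX picks the first move with backed-up value 3.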
19. The main drawback of the minimax algorithm is that
it gets really slow for complex games such as Chess,
Go, etc.
These games have a huge branching factor, so
the player has many choices to consider at each move.
This limitation of the minimax algorithm can be
addressed by alpha-beta pruning.
22. •Alpha-beta pruning is a modified version of the
minimax algorithm.
• It is an optimization technique for the minimax
algorithm.
•By cutting off branches of the game tree that cannot
affect the final decision, we can still compute the correct
minimax decision; this technique is called pruning.
23. The two parameters can be defined as:
Alpha: The best (highest-value) choice we
have found so far at any point along the
path of Maximizer. The initial value of
alpha is -∞.
Beta: The best (lowest-value) choice we
have found so far at any point along the
path of Minimizer. The initial value of beta
is +∞.
24. Depth first search
only considers nodes along a single path from root at any time
a = highest-value choice found at any choice point of path for MAX
(initially, a = −infinity)
b = lowest-value choice found at any choice point of path for MIN
(initially, b = +infinity)
Pass current values of a and b down to child nodes during
search.
Update values of a and b during search:
MAX updates a at MAX nodes
MIN updates b at MIN nodes
Prune remaining branches at a node when a ≥ b
25. Prune whenever a ≥ b.
Prune below a Max node whose alpha value becomes greater
than or equal to the beta value of its ancestors.
Max nodes update alpha based on children’s returned
values.
Prune below a Min node whose beta value becomes less than or
equal to the alpha value of its ancestors.
Min nodes update beta based on children’s returned values.
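The update-and-prune rules above can be sketched as a compact alpha-beta search over the same kind of tiny tree (leaves are ints, internal nodes are lists of children; the encoding is an assumption for illustration):

```python
import math

# Leaves are ints; internal nodes are lists of children. `maximizing`
# says whether the current node is a MAX node or a MIN node.
def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, v)              # MAX updates alpha
            if alpha >= beta:                  # prune remaining children
                break
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True))
        beta = min(beta, v)                    # MIN updates beta
        if alpha >= beta:
            break
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = alphabeta(tree, -math.inf, math.inf, True)
```

The result matches plain minimax, but e.g. after the first leaf (2) of the middle branch, beta = 2 ≤ alpha = 3 and the remaining siblings are pruned.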
26. Initial values: a = −∞, b = +∞
Do DF-search until the first leaf
The current values of a and b are passed down to the child nodes
39. Worst-Case
branches are ordered so that no pruning takes place. In this case
alpha-beta gives no improvement over exhaustive search
Best-Case
each player’s best move is the left-most child (i.e., evaluated first)
in practice, performance is closer to best rather than worst-case
E.g., sort moves by the remembered move values found last time.
E.g., expand captures first, then threats, then forward moves, etc.
E.g., run Iterative Deepening search, sort by value last iteration.
In practice often get O(b^(d/2)) rather than O(b^d)
this is the same as having a branching factor of sqrt(b),
since (sqrt(b))^d = b^(d/2), i.e., we effectively go from b to sqrt(b)
e.g., in chess go from b ~ 35 to b ~ 6
this permits much deeper search in the same amount of time
40. Pruning does not affect final results
Entire subtrees can be pruned.
Good move ordering improves effectiveness of pruning
Repeated states are again possible.
Store them in memory = transposition table
42. Monte Carlo Tree Search (MCTS) is a search
technique in the field of Artificial Intelligence
(AI).
It is a probabilistic, heuristic-driven search
algorithm that combines classic tree
search with the machine-learning
principles of reinforcement learning.
43. The MCTS algorithm is useful because it
periodically evaluates other alternatives
during the learning phase by
executing them, instead of only the currently
perceived optimal strategy. This is known as
the "exploration–exploitation trade-off".
The search can be broken down into four distinct steps, viz.,
1. selection,
2. expansion,
3. simulation, and
4. backpropagation.
45. •the MCTS algorithm traverses the current
tree from the root node using a specific
strategy.
•The strategy uses an evaluation function to
optimally select nodes with the highest
estimated value.
•MCTS uses the Upper Confidence Bound
(UCB) formula applied to trees as the
strategy in the selection process to traverse
the tree.
46. S_i = x_i + C * sqrt(ln t / n_i)
where:
S_i = value of node i
x_i = empirical mean value of node i
C = an exploration constant
t = total number of simulations
n_i = number of visits of node i
When traversing the tree during the selection
process, the child node that returns the greatest
value from the above equation is the one that
gets selected.
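A sketch of this selection rule, assuming each child is summarized by its empirical mean value and visit count (the names `ucb_score` and `select_child` are illustrative):

```python
import math

def ucb_score(mean_value, visits, total_visits, c=1.41):
    # Unvisited children score +inf, so each child is tried at least once.
    if visits == 0:
        return math.inf
    return mean_value + c * math.sqrt(math.log(total_visits) / visits)

def select_child(children, total_visits):
    # children: list of (empirical mean, visit count) pairs
    return max(range(len(children)),
               key=lambda i: ucb_score(*children[i], total_visits))

# A well-explored strong child vs. a barely explored weaker one:
# the exploration term makes the second child the one selected.
chosen = select_child([(0.7, 100), (0.5, 2)], total_visits=102)
```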
47. Expansion: In this process, a new child node is
added to the tree, under the node that was optimally
reached during the selection process.
Simulation: In this process, a simulation (playout) is
performed by choosing moves or strategies until a
result or predefined state is achieved.
Backpropagation: After determining the value of
the newly added node, the remaining tree must be
updated. So the backpropagation process is
performed, propagating from the new
node back to the root node.
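The four steps can be put together in a minimal MCTS sketch. The game used here (Nim: take 1-3 stones, whoever takes the last stone wins) and all names are illustrative assumptions, not from the slides:

```python
import math, random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children, self.wins, self.visits = [], 0, 0

    def untried_moves(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2, 3) if m <= self.stones and m not in tried]

def uct(parent, child, c=1.41):
    # UCB applied to trees: exploitation term + exploration term.
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def mcts(root_stones, iterations=2000):
    random.seed(0)                 # deterministic for the example
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while node.stones > 0 and not node.untried_moves():
            node = max(node.children, key=lambda ch: uct(node, ch))
        # 2. Expansion: add one unexplored child.
        if node.stones > 0:
            m = random.choice(node.untried_moves())
            child = Node(node.stones - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout, tracking whose turn it is.
        stones, turn = node.stones, 0      # turn 0 = player to move at node
        while stones > 0:
            stones -= random.randint(1, min(3, stones))
            turn = 1 - turn
        won = (turn == 0)   # True iff the player who moved INTO node won
        # 4. Backpropagation: update stats up to the root, flipping
        #    the perspective at each level.
        while node is not None:
            node.visits += 1
            node.wins += won
            won = not won
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

best = mcts(5)   # from 5 stones, taking 1 leaves the opponent a lost position
```

With 5 stones, leaving a multiple of 4 loses for the player to move, so the search converges on taking one stone.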
50. These types of algorithms are particularly useful
in turn-based games where there is no element of
chance in the game mechanics, such as Tic-Tac-
Toe, Connect 4, Checkers, Chess, Go, etc.
53. Many games mirror this unpredictability by
including a random element, such as the
throwing of dice. We call these stochastic
games.
Backgammon is a typical game that combines luck and skill.
Dice are rolled at the beginning of a player’s turn to determine
the legal moves.
In backgammon, for example, White has rolled a 6–5 and has
four possible moves.
P(1,1) = 1/36 (there are 36 ways to roll two dice.)
The 15 distinct non-double rolls each have probability 1/18.
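Chance nodes like dice rolls are handled by taking a probability-weighted average over outcomes, the expectiminimax idea. A minimal sketch on a hand-built tree (the tuple encoding is an illustrative assumption):

```python
# Tree encoding (illustrative): a terminal is a number; otherwise a node
# is ("max", [children]), ("min", [children]), or
# ("chance", [(probability, child), ...]).

def expectiminimax(node):
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average over outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# MAX chooses between a sure 3 and a fair 50/50 gamble between 0 and 10:
tree = ("max", [3, ("chance", [(0.5, 0), (0.5, 10)])])
root_value = expectiminimax(tree)   # the gamble's expected value is 5
```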
56. Checkers:
Chinook ended the 40-year reign of human world champion Marion
Tinsley in 1994.
Chess:
Deep Blue defeated human world champion Garry Kasparov in a
six-game match in 1997.
Othello:
human champions refuse to compete against computers: they are
too good.
Go:
human champions refuse to compete against computers: they are
too bad
b > 300 (!)
See (e.g.) http://www.cs.ualberta.ca/~games/ for more information
58. 1957: Herbert Simon
“within 10 years a computer will beat the world chess
champion”
1997: Deep Blue beats Kasparov
Parallel machine with 30 processors for “software” and 480 VLSI
processors for “hardware search”
Searched 126 million nodes per second on average
Generated up to 30 billion positions per move
Reached depth 14 routinely
Uses iterative-deepening alpha-beta search with transposition tables
Can explore beyond the depth limit for interesting moves
60. Many problems in AI can be considered problems
of constraint satisfaction, in which the goal state
satisfies a given set of constraints.
Constraint satisfaction problems can be solved
using any of the search strategies.
A constraint satisfaction problem (CSP) is
a problem that requires its solution to be within
some limitations or conditions, also known
as constraints, consisting of a finite variable set, a
domain set, and a finite constraint set. The
solution must satisfy all constraints.
61.
Variables WA, NT, Q, NSW, V, SA, T
Domains Di = {red,green,blue}
Constraints: adjacent regions must have different colors
e.g., WA ≠ NT
62.
Solutions are complete and consistent
assignments, e.g., WA = red, NT = green,Q =
red,NSW = green,V = red,SA = blue,T = green
63.
Binary CSP: each constraint relates two variables
Constraint graph: nodes are variables, arcs are
constraints
68.
General-purpose methods can give huge
gains in speed:
Which variable should be assigned next?
In what order should its values be tried?
Can we detect inevitable failure early?
69.
Most constrained variable:
choose the variable with the fewest legal values
a.k.a. minimum remaining values (MRV)
heuristic
Picks a variable which will cause failure as
soon as possible, allowing the tree to be
pruned.
70.
Tie-breaker among most constrained
variables
Most constraining variable:
choose the variable with the most constraints on
remaining variables (most edges in graph)
71.
Given a variable, choose the least
constraining value:
the one that rules out the fewest values in the
remaining variables
Leaves maximal flexibility for a solution.
Combining these heuristics makes 1000
queens feasible
72. Idea (forward checking):
Keep track of remaining legal values for unassigned
variables
Terminate search when any variable has no legal values
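Backtracking search with the MRV heuristic and forward checking can be sketched for the map-colouring CSP above (names such as `backtrack` and `NEIGHBORS` are illustrative):

```python
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}

def backtrack(assignment, domains):
    if len(assignment) == len(NEIGHBORS):
        return assignment
    # MRV: choose the unassigned variable with the fewest legal values.
    var = min((v for v in NEIGHBORS if v not in assignment),
              key=lambda v: len(domains[v]))
    for value in domains[var]:
        # Forward checking: remove `value` from unassigned neighbours.
        pruned = {v: ([x for x in domains[v] if x != value]
                      if v in NEIGHBORS[var] and v not in assignment
                      else domains[v])
                  for v in domains}
        # Terminate early if any unassigned variable has no legal values.
        if all(pruned[v] for v in NEIGHBORS if v not in assignment and v != var):
            result = backtrack({**assignment, var: value},
                               {**pruned, var: [value]})
            if result:
                return result
    return None

domains = {v: ["red", "green", "blue"] for v in NEIGHBORS}
solution = backtrack({}, domains)
```

Because conflicting values are pruned from neighbours' domains at assignment time, no separate consistency check is needed when a value is chosen.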
76.
Forward checking propagates information from
assigned to unassigned variables, but doesn't
provide early detection for all failures:
NT and SA cannot both be blue!
Constraint propagation repeatedly enforces
constraints locally
78. Simplest form of propagation makes each arc
consistent
X → Y is consistent iff
for every value x of X there is some allowed value y of Y
If X loses a value, neighbors of X need to be
rechecked
Arc consistency detects failure earlier than forward
checking
Can be run as a preprocessor or after each assignment
Time complexity: O(n^2 d^3)
Constraint propagation repeatedly propagates arc consistency across the graph.
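A sketch of this propagation procedure (essentially AC-3) on a fragment of the map-colouring problem, where "allowed" simply means a different colour:

```python
from collections import deque

def ac3(domains, neighbors):
    queue = deque((x, y) for x in neighbors for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        # Revise: keep only values of x with some allowed value in y.
        revised = [vx for vx in domains[x]
                   if any(vx != vy for vy in domains[y])]
        if len(revised) < len(domains[x]):
            domains[x] = revised
            if not revised:
                return False           # empty domain: failure detected early
            # If X loses a value, neighbours of X need to be rechecked.
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))
    return True

neighbors = {"WA": ["NT", "SA"], "NT": ["WA", "SA"], "SA": ["WA", "NT"]}
domains = {"WA": ["red"], "NT": ["red", "green"],
           "SA": ["red", "green", "blue"]}
ok = ac3(domains, neighbors)
```

Starting from WA = red, propagation alone forces NT = green and SA = blue.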
84. Note: The path to the solution is unimportant, so
we can apply local search!
To apply to CSPs:
allow states with unsatisfied constraints
operators reassign variable values
Variable selection: randomly select any
conflicted variable
Value selection by min-conflicts heuristic:
choose value that violates the fewest constraints
i.e., hill-climb with h(n) = total number of violated
constraints
85. A cryptarithmetic problem is a type of
constraint satisfaction problem where the
puzzle is about digits and their unique
replacement with letters or other
symbols. In a cryptarithmetic problem, the
digits (0–9) get substituted by
letters or symbols.
86. The rules or constraints of a cryptarithmetic
problem are as follows:
Each letter should be replaced by a unique digit
(no two letters share a digit).
The result should satisfy the predefined
arithmetic rules, i.e., 2 + 2 = 4, nothing else.
Digits should be from 0–9 only.
There should be only one carry forward while
performing the addition operation on a problem.
The problem can be solved from both sides,
i.e., the left-hand side (L.H.S.) or the right-hand side
(R.H.S.)
87. Given a cryptarithmetic problem, i.e.,
starting from the left-hand side (L.H.S.), the terms
are S and M. Assign digits that could give a
satisfactory result. Let’s assign S→9 and M→1.
88. Now, move ahead to the next
terms, E and O, to get N as the output.
Adding E and O would mean 5 + 0 = 0, which is not possible
because, according to the cryptarithmetic constraints, we cannot assign the
same digit to two letters. So, we need to think more and assign some
other value.
89. Further, adding the next two
terms, N and R, we get:
But we have already assigned E→5. Thus, the above result does not
satisfy the values.
90. …where 1 will be carried forward to the next
term.
Let’s move ahead.
Again, on adding the last two terms, i.e., the
rightmost terms D and E, we get Y as the
result.
94. We decided to look at the value of O again.
If O = 0, then R would also be 0 so that doesn’t work
and O can’t be 1 because F = 1.
If O = 2,
TW2
+TW2
−−−−−−−
12UR
then R = 4 and T = 6 and we also know that W < 5
because there can’t be anything carried to the
hundreds column. The only possible value of W that
hasn’t already been used is 3 but this would mean
that U is 6 which is the same as T.
95. If O = 3,
TW3
+TW3
−−−−−−−
13UR
then R = 6 and T = 6 which doesn’t work.
96. If O = 4,
TW4
+TW4
−−−−−−−
14UR
then R = 8 and T = 7 and we also know that W < 5
because there can’t be anything carried to the
hundreds column. So W could be 0, 2 or 3.
W can’t be 0 because then U would be 0 and it
can’t be 2 because U would be 4.
If W = 3, U = 6 which works: 734 + 734 = 1468.
97. If O = 5,
TW5
+TW5
−−−−−−−
15UR
then R = 0 and T = 7, and we also know that W ≥ 5
because there has to be a 1 carried to the
hundreds column.
W can’t be 5 because O = 5.
If W = 6, U = 3, which works: 765 + 765 = 1530.
98. So there are seven possible answers:
938+938=1876
928+928=1856
867+867=1734
846+846=1692
836+836=1672
765+765=1530
734+734=1468
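The case analysis can be cross-checked by brute force over all distinct-digit assignments (leading digits T and F nonzero):

```python
from itertools import permutations

# Assign distinct digits to T, W, O, F, U, R; leading digits nonzero.
solutions = []
for t, w, o, f, u, r in permutations(range(10), 6):
    if t == 0 or f == 0:
        continue
    two = 100 * t + 10 * w + o
    four = 1000 * f + 100 * o + 10 * u + r
    if two + two == four:
        solutions.append((two, four))
```

The enumeration finds exactly the seven sums listed above.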
99. Game playing is best modeled as a search problem
Game trees represent alternate computer/opponent moves
Evaluation functions estimate the quality of a given board
configuration for the Max player.
Minimax is a procedure which chooses moves by assuming that the
opponent will always choose the move which is best for them
Alpha-Beta is a procedure which can prune large parts of the search
tree and allow search to go deeper
For many well-known games, computer algorithms based on heuristic
search match or out-perform human world experts.
100. Comment on backtracking and look-ahead
(forward checking) strategies in constraint
satisfaction problems. [6]
Apply cryptarithmetic to solve the problem
and represent the state search space:
TWO + TWO = FOUR (OCT 2019)