Guest presentation given to a mixed-discipline group at the University of the West of Scotland Research Students Society @ UWoS, 3rd March 2010.
Topics covered: a high-level overview of work on AI for Poker and Ms. Pac-Man, my own research on the I2 system, concluding with some of my opinions on the current state of academic and industrial Game AI.
4. Firstly
• This image has been used to advertise this talk.
‣ To clarify, this is not me...
5. About Me
• Undergrad in AI and CS at Edinburgh
• MSc in Bioinformatics, also at Edinburgh
• MRes in “Automated Planning for Autonomous Systems” at Strathclyde
• RA/PhD Student attached to the “Strathclyde Planning Group” and the “Strathclyde AI in Games Group”
• Staff Writer for AIGameDev.com
6. Summary
• Intro to AI for Games
• AI for Poker
• AI for Ms Pac-Man
• The Integrated Influence Architecture
• The Future of AI in Games
9. What is AI?
• Any time a computer makes any sort of decision between a number of options, it can be thought of as acting “intelligently”.
• Whether or not those decisions are the right ones determines how “good” the intelligence is.
23. Why Games?
• Games provide AI research with some interesting properties:
‣ We find predicting the behaviour of other agents (e.g. players) difficult.
‣ We need to be able to automatically generate long-term strategies in order to win the game.
‣ By working with existing games, we’re not biasing our results by designing games suited to AI.
‣ Why do extra work to build simulations and demos?
24. Summary
• Intro to AI for Games
• AI for Poker
• AI for Ms Pac-Man
• The Integrated Influence Architecture
• The Future of AI in Games
25. Limit Texas Hold ‘em
• Texas Hold ‘em is a popular variant of poker.
• Players are dealt two cards each and share 5 communal cards.
• The “Limit” part restricts betting to fixed increments - typically only 3 raises per round of betting, with the value of each raise determined by the “level” of the table.
30. Why Limit?
• Limit makes life a lot easier for us as decision makers - a raise is a raise is a raise.
• At each decision point in the game, a player can call/check, bet/raise or fold.
• This means that at the k-th decision point, the space of possible states is approximated by 3^k.
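To make that growth concrete, here is a minimal sketch (illustrative only, not StrathPoker code) of how quickly the 3^k space expands:

```python
# With three actions (call/check, bet/raise, fold) available at every
# decision point, the number of possible action histories after k
# decisions grows as roughly 3^k.
def decision_space(k: int) -> int:
    """Approximate number of action histories after k decision points."""
    return 3 ** k

for k in (1, 5, 10, 20):
    print(k, decision_space(k))   # 20 decisions -> ~3.5 billion histories
```

Even this modest branching factor makes exhaustive search impractical after a few betting rounds, which is why simulation-based approaches are attractive.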
31-32. [Diagram: game tree - from the Current State the 1st Player can Call, Fold or Raise, and after each action the 2nd Player can in turn Call, Fold or Raise.]
33. StrathPoker
• StrathPoker is a system under development at Strathclyde based around Opponent Modelling (OM) and “Monte Carlo Simulation” (MC).
• The core idea is that if we can categorise players into archetypes, we can predict their actions more accurately, and push those predictions into the MC to get a better understanding of how the game will go.
37. Categorising Players
• We have sample data of around a million hands of poker taken from online poker sites.
• Each player is identified by a unique ID so we can see what the individuals are up to.
• The data is not complete: players who fold do not show their cards.
41. Players as Datapoints
• We do have a lot of info about the actions a player took, and we can generate aggregate stats such as how much a player has won (that we’ve seen).
• We can generate a datapoint representing a player in about 30 dimensions.
• Contrast this with professional (human) categorisation based on just 3 dimensions.
48. Human Categorisation
• Pro poker players use stats software to monitor the flow of the game and track individual players’ performance.
• They typically classify players based on three stats:
‣ VPiP - Voluntarily Put in Pot
‣ WSD - percentage of showdowns won
‣ PFR - Pre-Flop Raise
• The software overlays a HUD with this info on the online game.
52. Categorisation
• We throw as much data into our categorisation as possible.
• We run “Principal Component Analysis” to find a new set of basis vectors for the data, compressing ~30 dimensions to 8 with minimal loss.
• We then cluster the datapoints in those 8 dimensions using Fuzzy c-Means.
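A minimal sketch of that pipeline, assuming nothing about the actual StrathPoker tooling: PCA done by hand via SVD, a bare-bones fuzzy c-means, and random data standing in for the ~30-dimensional player statistics.

```python
import numpy as np

def pca_reduce(X, n_components=8):
    """Project X onto its top principal components (the new basis vectors)."""
    Xc = X - X.mean(axis=0)                      # centre the data first
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # coordinates in reduced basis

def fuzzy_c_means(X, c=4, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: returns cluster centres and a membership
    matrix U where U[i, j] is how strongly point i belongs to cluster j."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1
    for _ in range(iters):
        W = U ** m
        centres = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None] - centres[None], axis=2) + 1e-12
        U = 1.0 / (dist ** (2 / (m - 1)))        # closer centre -> stronger
        U /= U.sum(axis=1, keepdims=True)        # renormalise memberships
    return centres, U

# Toy stand-in for the ~30-dimensional player statistics
players = np.random.default_rng(1).random((200, 30))
reduced = pca_reduce(players, 8)                 # ~30 -> 8 dimensions
centres, memberships = fuzzy_c_means(reduced, c=4)
print(reduced.shape, memberships.shape)
```

The fuzzy (rather than hard) memberships matter here: a player can be, say, 70% “tight-passive” and 30% “loose-aggressive”, which is a more honest model than forcing each player into exactly one archetype.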
56. Using Categorisation
• Monte Carlo plays out simulations of the game to estimate how well each of the currently possible decisions will turn out in the future.
• Our approach attempts to get a more accurate simulation by better modelling how each player acts.
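The shape of that idea can be sketched as follows. Everything here is a hypothetical stand-in - the archetype distributions, the payoff model and the play-out logic are illustrative, not the StrathPoker models:

```python
import random

# Each archetype is modelled as a probability distribution over actions;
# a candidate action is scored by averaging the payoff of many randomised
# play-outs in which opponents act according to their archetype.
ARCHETYPES = {
    "tight-passive": {"fold": 0.5, "call": 0.4, "raise": 0.1},
    "loose-aggressive": {"fold": 0.1, "call": 0.3, "raise": 0.6},
}

def sample_action(archetype, rng):
    actions, weights = zip(*ARCHETYPES[archetype].items())
    return rng.choices(actions, weights=weights)[0]

def simulate_hand(our_action, opponents, rng):
    """One randomised play-out; returns a toy payoff for our_action."""
    if our_action == "fold":
        return 0.0
    pot, stayers = 2.0, 0
    for archetype in opponents:
        action = sample_action(archetype, rng)
        if action != "fold":
            stayers += 1
            pot += 2.0 if action == "raise" else 1.0
    strength = rng.random()                        # proxy for our hand strength
    won = stayers == 0 or strength > 1 - 0.5 ** stayers
    stake = 2.0 if our_action == "raise" else 1.0  # raising risks more
    return pot if won else -stake

def evaluate(our_action, opponents, n=5000, seed=0):
    """Average payoff of our_action over n Monte Carlo play-outs."""
    rng = random.Random(seed)
    return sum(simulate_hand(our_action, opponents, rng) for _ in range(n)) / n

table = ["tight-passive", "loose-aggressive", "tight-passive"]
for act in ("fold", "call", "raise"):
    print(act, round(evaluate(act, table), 3))
```

The point of the opponent model is visible in `sample_action`: a table of tight-passive players produces very different play-outs, and therefore very different action values, than a table of loose-aggressive ones.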
61. Evaluation
• How do you evaluate an AI poker player?
• The problem is that random deals may or may not favour our player.
• This is true for real players too - how do they evaluate themselves?
• The trick lies in looking not at individual games, but at a total playing session.
65. Experiments
• We set up a sequence of different hands of Poker for which the deal is fixed.
‣ I.e. the deck of cards in hand 1 is set to:
- Ac, 10s, 2h, 4d ......
• We use 6-max games: 6 players gives us 12 cards dealt initially, then 5 for the table.
• We populate the table with bots that conform to the archetypes.
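The fixed-deal idea reduces to seeding the shuffle per hand, so every bot under test sees exactly the same cards in hand 1, hand 2, and so on. A small sketch (deck encoding and dealing order are illustrative):

```python
import random

RANKS = "23456789TJQKA"
SUITS = "chds"
DECK = [r + s for r in RANKS for s in SUITS]   # 52-card deck

def fixed_deal(hand_no, players=6):
    """Deterministic deal: the same hand_no always yields the same cards."""
    rng = random.Random(hand_no)               # per-hand fixed seed
    deck = DECK[:]
    rng.shuffle(deck)
    hole = [deck[2 * i:2 * i + 2] for i in range(players)]   # 12 hole cards
    board = deck[2 * players:2 * players + 5]                # 5 community cards
    return hole, board

h1, b1 = fixed_deal(1)
h2, b2 = fixed_deal(1)
assert (h1, b1) == (h2, b2)   # reproducible: deal 1 is always identical
```

Because the cards are held constant across runs, differences in a session's final result reflect differences between the bots rather than luck of the deal.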
70. Experiments (ii)
• This gives us the closest to a “controlled” environment we can get for comparing bots.
‣ Not entirely controlled, since opponents may still react differently to a different test-bot’s actions.
• We can compare bots by setting them loose at this “table” and comparing final results to see how well they each do.
• Experiments are running right now - no results yet.
71. Summary
• Intro to AI for Games
• AI for Poker
• AI for Ms Pac-Man
• The Integrated Influence Architecture
• The Future of AI in Games
75. Ms Pac-Man
• Pac-Man is a deterministic game.
‣ From the start of the game, if the player makes exactly the same set of moves, the result will always be the same.
• That’s not a very interesting problem for AI.
‣ Optimal solutions exist and can be found with enough horsepower and/or time.
• Ms Pac-Man is non-deterministic.
‣ The ghosts act in a reasonably unpredictable manner.
81. StrathPac
• StrathPac is the highly original name for our project to tackle writing AI systems to play Ms. Pac-Man.
• I’ve been working on this off and on, primarily with undergraduates, for a couple of years.
• The aim is to maximise score.
‣ Pill clearing is very much secondary.
• Different approaches to this have been tried, with limited success.
89. Why?
• Who cares if we make a good Ms. Pac-Man player?
• The aspects that make this game challenging make other things challenging too:
‣ Real-time operation
‣ Completing objectives with adversaries
‣ Contrasting objectives (e.g. staying alive vs killing ghosts)
• Good solutions to this will have other applications.
• Also, it’s part of many competition tracks!
92. How?
• The previous versions of StrathPac have been based on a “Screen Scraping” framework developed by Lucas (U. Essex).
• It’s an interesting challenge inasmuch as there is no interaction between the AI and the game except “seeing” and “acting”.
‣ This closely models the way actual intelligence is compartmentalised from the world.
98. One Approach
• Driven by three “motivations”:
‣ Hunger for pills
‣ Fear of ghosts
‣ Aggression towards blue ghosts
• These motivations generate “Influence Maps” that attract the agent towards, or repel it from, points of the game world.
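A minimal influence-map sketch of the three motivations (illustrative, not the StrathPac code): each motivation deposits influence that falls off with distance, the fields are summed, and the agent moves toward the neighbouring cell with the highest combined influence. The weights are exactly the kind of parameters tuned later by the GA.

```python
import numpy as np

def influence(grid_shape, sources, weight, falloff=1.0):
    """Sum of distance-decayed influence from each source cell."""
    h, w = grid_shape
    ys, xs = np.mgrid[0:h, 0:w]
    field = np.zeros(grid_shape)
    for (sy, sx) in sources:
        dist = np.abs(ys - sy) + np.abs(xs - sx)   # Manhattan distance
        field += weight / (1.0 + falloff * dist)
    return field

shape = (10, 10)
pills, ghosts, blue = [(2, 2), (7, 8)], [(5, 5)], []
combined = (influence(shape, pills, +1.0)      # hunger: attract
            + influence(shape, ghosts, -3.0)   # fear: repel
            + influence(shape, blue, +2.0))    # aggression: attract

# Move toward the best neighbouring cell (toroidal wrap for simplicity)
pos = (5, 2)
moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
best = max(moves, key=lambda d: combined[(pos[0] + d[0]) % 10,
                                         (pos[1] + d[1]) % 10])
print(best)
```

Note the agent never plans ahead: every step is a purely local reading of the landscape, which is precisely the weakness the later slides discuss.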
104. Balancing Motivations
• A lot of experimentation went into tuning the parameters governing how the influence is generated.
• Unsupervised learning using “Genetic Algorithms” - ideal for fiddling with multiple variables at once.
• 20 computers playing 20 games each per configuration, across 30 generations of evolution.
‣ About 12,000 games of Ms Pac-Man...
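The tuning loop can be sketched like this. Everything below is a hedged stand-in: a genome is one parameter vector for the influence maps, fitness averages a noisy score over several games, and `play_game` replaces running actual Ms Pac-Man with a toy function that rewards genomes near a hidden target so the loop is testable.

```python
import random

TARGET = [0.7, -0.3, 0.5]   # hidden "ideal" parameters (toy stand-in)

def play_game(genome, rng):
    """Stand-in for one game: noisy score, higher for genomes near TARGET."""
    err = sum((g - t) ** 2 for g, t in zip(genome, TARGET))
    return -err + rng.gauss(0, 0.01)

def fitness(genome, rng, games=20):
    """Average score over several games, smoothing out the noise."""
    return sum(play_game(genome, rng) for _ in range(games)) / games

def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, rng), reverse=True)
        elite = pop[: pop_size // 4]           # keep the top quarter
        pop = elite + [
            [g + rng.gauss(0, 0.1) for g in rng.choice(elite)]  # mutate elites
            for _ in range(pop_size - len(elite))
        ]
    return max(pop, key=lambda g: fitness(g, rng))

best = evolve()
```

The appeal for this problem is exactly as the slide says: the GA treats the whole parameter vector as one unit, so correlated parameters (e.g. fear vs hunger weights) get tuned together rather than one axis at a time.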
108. Results
• Not great.
• The GA gave a 100% increase in score over the initial configuration, but the final score is still not particularly competitive.
• The principal takeaway is that naive solutions that don’t do reasoning don’t do very well.
109. Summary
• Intro to AI for Games
• AI for Poker
• AI for Ms Pac-Man
• The Integrated Influence Architecture
• The Future of AI in Games
117. Motivation
• Work with Ms. Pac-Man highlighted deficiencies in current AI techniques:
‣ Fast, flexible long-term planning is currently impossible
‣ Fast techniques are too stupid
‣ Smart techniques are too slow
• My research is aimed at bridging the gap.
• Additionally, I’m looking at “real” environments:
‣ Dynamic, multi-agent, real-time etc.
122. Core Premise
• Searching state or trajectory spaces is a slow process. Most Automated Planning domains are rich enough to describe PSPACE-complete problems.
‣ Though human-solvable problems tend towards NP-hard and below...
• Evaluating functions is trivial by comparison.
• This extends the notion of Influence Maps into a conceptual space we call “Influence Landscapes”.
125. Architecture
• Most similar systems either choose to respond reactively or deliberatively to specific aspects of the environment, or can act reactively within certain parameters of the deliberative system.
• The I2 system aims to be continually influenced by input from both a purely reactive evaluator and a deliberative evaluator at all decision points.
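The shape of that idea, reduced to a few lines: at every decision point each candidate action is scored by both evaluators, and the decision is a weighted blend of the two rather than a hand-off from one to the other. All functions, weights and the toy action table here are hypothetical stand-ins, not the actual I2 mechanism.

```python
def choose(actions, reactive_eval, deliberative_eval, w_reactive=0.4):
    """Pick the action maximising a blend of reactive and deliberative scores."""
    def blended(a):
        return (w_reactive * reactive_eval(a)
                + (1 - w_reactive) * deliberative_eval(a))
    return max(actions, key=blended)

# Toy usage: reactive favours immediate gain, deliberative long-term value
actions = {"grab_pill": (5, 1), "dodge_ghost": (1, 9)}
pick = choose(actions,
              reactive_eval=lambda a: actions[a][0],
              deliberative_eval=lambda a: actions[a][1])
print(pick)   # with w_reactive=0.4 the long-term option wins
```

The contrast with the “hand-off” architectures on this slide is that neither evaluator is ever switched off: both contribute to every single decision, with the weighting controlling how much each voice is heard.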
129. To Be Continued...
• An in-depth discussion of exactly how this works is beyond the scope of this presentation.
• Tune in next week, when I’ll be presenting this work as part of the Computer Science Departmental Seminar series.
130. Summary
• Intro to AI for Games
• AI for Poker
• AI for Ms Pac-Man
• The Integrated Influence Architecture
• The Future of AI in Games
138. Where are we going?
• Game AI research is broken into two different factions:
‣ People using games to frame “serious” AI questions.
‣ People using AI to make “better” games.
• There’s a major dichotomy, as a very good enemy AI:
‣ Is frustrating to play against
‣ Takes resources to create
‣ Does not give a good player experience
139. Where are we going?
Does NOT increase sales!
140. Where are we going?
• At the same time though, there’s an increasing drive to use AI in more and more interesting ways.
• This means AI is no longer restricted to “controlling the enemies”, and no longer bound to a human-like (or lower) level of intelligence in order to be beatable.
142. Left 4 Dead
• Left 4 Dead is a Survival Horror shooter game.
• As one of four uninfected survivors, you kill waves of zombies as you push to escape the quarantine zone.
• It introduced the concept of an “AI Director”, which controls the pacing of the game.
• The aim is to replicate the horror film cycle of: Calm => Build-up => Frenzy => Relax => Calm
143. Galactic Arms Race
• GAR is a game created at the University of Central Florida. The project is led by Ken Stanley (who you may know as the creator of NERO).
• It uses an algorithm to produce content within the game - Content-Generating NeuroEvolution of Augmenting Topologies.
• It evolves new and unique weapons, unsupervised.
145. Batman: Arkham Asylum
• The new Batman game is a great example of a game where improving the NPC AI would be detrimental.
• Chosen by some as action Game of the Year 2009, it relies on the use of “thugs” to generate its iconic feel.
• Making the enemies smarter, or even slightly less predictable, would destroy the KAPOW! aspect.
146. Cinematic Games
• Designers are often loath to put true AI into their NPCs because this could detract from the cinematic experience.
• They want to stage-manage how the AI acts, and what the player experiences.
• They don’t want AI smart enough to realise that standing next to the big red exploding barrel might be bad!
147. The Sandbox
• Sandbox games are where AI techniques can come into their own from an NPC perspective.
• Imagine MMOs where small player populations were masked by imperceptible bots.
• Imagine games where NPCs interact with the world on an equal footing to players.
‣ Unpredictable, sometimes novel, solutions to problems
148. Final Remarks
• A* still ridiculously over-represented in Game AI
• Lots of interesting things happening in Academia
‣ Lots of unanswered AI questions still exist to work on
‣ We’re a long way from having the kind of General AI that you see in movies.
• Lots of techniques being adopted by Industry
149. Final Remarks (ii)
• Graphics has been pushed about as far as it can go.
• There is increasing emphasis on other aspects:
‣ Physics
‣ Animation
‣ AI
• AI has a lot more to offer in terms of content creation, experience management etc.
150. Shameless Plugs
• Me
‣ http://lukedicken.com
‣ luke@cis.strath.ac.uk
‣ Next talk - Next week, here.
• Strathclyde AI in Games Group
‣ johnl@cis.strath.ac.uk
151. Shameless Plugs (ii)
• AIGameDev.com
‣ One stop shop for all things Games and AI
‣ Regular posts about current AI techniques
‣ Interviews with Industry figures
‣ Masterclasses explaining techniques
• Paris Game AI Conference
‣ Organised by AIGameDev.com
‣ Heavy Industry focus rather than Academic
152. Shameless Plugs (iii)
• IEEE Symposium on Computational Intelligence and Games 2010
‣ Call for papers out now
‣ Submission deadline March 15th
‣ Conference in Copenhagen in August
‣ Excellent competition track