Using ASP for Game Playing
Date: May 25, 2015
From: Evgenii Balai
To: TAG
We are interested in using ASP for developing an agent to play in 2-player
games (e.g, checkers or nim). Is anyone aware of any methods and/or
published work in this direction?
=====
Date: May 25, 2015
From: Vladimir Lifschitz
To: Evgenii Balai
This paper was published long before the emergence of ASP, but you may
find it useful:
@INCOLLECTION(emd84,
AUTHOR="van Emden, Maarten and Clark, Keith",
TITLE="The logic of two-person games",
BOOKTITLE="Micro-PROLOG: Programming in Logic",
YEAR="1984",
PUBLISHER="Prentice-Hall",
PAGES="320--340"
)
=====
Date: May 26, 2015
From: Peter Schueller
To: Evgenii Balai
The dlvhex system was used to develop a player for Angry Birds.
Calimeri, F., Fink, M., Germano, S., Ianni, G., Redl, C., & Wimmer, A.
(2013, December). AngryHEX: an Artificial Player for Angry Birds Based
on Declarative Knowledge Bases. In PAI@ AI* IA (pp. 29-35).
This was done using the "acthex" plugin for creating acting agents in
the HEX language. It is also possible to do such things using only
dlvhex and its Python Plugin interface.
http://www.kr.tuwien.ac.at/research/systems/dlvhex/
http://www.kr.tuwien.ac.at/research/systems/dlvhex/actionplugin.html
=====
Date: May 26, 2015
From: Francesco Calimeri
To: Evgenii Balai
this might be of interest as well.
http://ceur-ws.org/Vol-1107/paper9.pdf
=====
Date: May 26, 2015
From: Son Tran
To: Evgenii Balai
If you consider negotiation as a game then we did have an implementation
as well. We did describe it in the paper "Ngoc-Hieu Nguyen, Tran Cao Son,
Enrico Pontelli, and Chiaki Sakama. ASP-Prolog for Negotiation Among
Dishonest Agents. LPNMR, 2011 and more detail can be found in the paper
"Tran Cao Son, Enrico Pontelli, Ngoc-Hieu Nguyen, Chiaki Sakama:
Formalizing Negotiations Using Logic Programming. ACM TOCL.
=====
Date: May 26, 2015
From: Alex Brik
To: Evgenii Balai
Prof. Remmel and I have a paper on using Hybrid ASP and dynamic programming
to compute a finite horizon optimal strategy for making decisions in a
dynamic domain (http://www.dbai.tuwien.ac.at/NMR12/proceedings/22_NMR_22.pdf).
The approach can probably be extended to computing a strategy for a player
in some 2-player games.
=====
Date: May 28, 2015
From: Evgenii Balai
To: Alex Brik
Your paper formalizes some of the thoughts we originally had.
I do believe the approach can be extended to strategies in 2-player games.
The conditions which change in the Merchant Story you describe (such as
storm, enacting a tax, etc) may correspond to opponents' moves in a game.
=====
Date: May 28, 2015
From: Ale Provetti
To: Evgenii Balai
I know it's inelegant to promote one's publications but at this point I
feel that my work at JELIA2004 with Luca Padovani on 'first-person'
video games could also help. This is our article at the publisher:
http://dx.doi.org/10.1007/978-3-540-30227-8_58
I also put it on the arXiv:
http://arxiv.org/abs/1505.07263
We applied ASP-based automated planning and execution to provide the
'AI' control to software agents acting inside Quake III arena, a (then)
popular "first-person shooter" video game.
While it is designed for multi-player games, there is also a
single-player mode which is played against computer-controlled 'bots',
i.e. opponents built inside the game platform.
Those bots were programmed via a finite-state machine ---often called
the 'AI component'.
While their actions, skills and vision over the playing field were the
same as those of human agents, they come across as less goal-driven.
In other words, whereas the human player tries to 'pass the level' by,
well, shooting down opponents,
bots often wander about the playing field and when they see you they may
engage in 'skirmishes,' low-distance combat.
Thanks to Luca's experience with video game programming, we put our
hands deep into the *spaghetti bowl* of Quake III arena source (in C++)
and were able to substitute the default 'AI' with a set of calls to our
external ASP planner, select the most urgent goals, generate a plan and
pass it to execution by our bot in the 'arena'.
The plans were short because on simple Quake III levels it didn't pay
off to plan longer than 4/5 actions at a time.
Also, the intervention of the human opponent, who could change course,
attack unexpectedly, turn back etc. would easily make the plan invalid
and send our bot 'back' to planning.
So of course there's a two-player dimension here.
We chose not to touch the 'reactive' part of the bot, i.e., low-level
functions that control body movements and rapid shooting and which take
control during skirmishes
Looking back to it, it has been a very nice modeling and programming
experience!
However, our solution is specific for Quake III arena and we would have
to rework the code (not the planner) for each platform and even for
changes to the same platform.
Also, Quake III Arena is a HUGE software, written and organized by
professionals but certainly not meant to host changes on the scale we
needed to attach an external AI component.
=====
Date: June 6, 2015
From: Mark Nelson
To: TAG
To contribute a specific thing we've often found useful in our work, but is
a bit awkward in ASP: we've found it useful to have quantification over a
series of possible actions of [player that isn't us] as a basic operation.
In multiplayer games this comes up when you want to ensure that an opponent
is prevented from doing something (or forced into doing something) over all
possible response sequences available to them. And it comes up in game
contexts other than game *playing* as well. For example in automatic level
generation, a common desire is to ensure that all possible playthroughs of
a generated level have a certain property.
Unfortunately this kind of quantification, as far as I'm aware, isn't
expressed very naturally in ASP, though it's possible to do. The paper
"Quantifying Over Play: Constraining Undesirable Solutions in Puzzle
Design" (https://www.adamsmith.as/papers/fdg2013_shortcuts.pdf) gives one
method of doing so with the Potassco tools (with application to level
generation). But it uses what is arguably a bit of a hack (the --reify
flag) that is available only in gringo 3.x, and was removed from 4.x. So
this specific method may not be the best for current use. But imo it would
be nice (and have a lot of application in games) if there were natural way
of expressing this type of constraint in ASP.
=====
Date: June 7, 2015
From: Adam Smith
To: TAG
I've recently ported the meta-encoding used in that FDG paper for use with
gringo-4 / ASP-Core-2. It's still about 5 lines, but a different 5 lines.
The "--reify" feature of gringo-3 is now available in a separate executable
"reify" distributed with recent versions of clingo-4.
The big challenge with multi-player games is that you want to arbitrarily
nest quantifiers to express minimax style relations. That is, beyond the
Sigma_2^P relations you can express with disjunctive ASP (or is it Delta_3^P
with optimization?), you want to be able to write down other problems in
PSPACE without being careful to count how many times you nest quantifiers.
Having done some thinking in this area, here's a sketch of how I'd proceed.
In some two-player game, we'll commit to representing one of the player's
strategies symbolically. We could represent it as an extensive form table
of exceptions from a fixed default strategy, as a finite-size binary
decision diagram, or maybe some simple deterministic finite state machine
with a constant number of states. When we want to solve for what strategy
the first player should take, our query now resides in Sigma_2^P. We ask:
show that there exists a compact strategy such that, for all reply moves
some number of moves ahead into the future, simulating our strategy with
those replies always puts the first player in a desirable state (or maybe
please optimize a bound on the quality of that state). The commitment to a
compactly representable strategy drops the alternation of quantifiers in
exchange for a single existential quantifier, reducing the complexity from
PSPACE to what we can express in ASP.
In a mathematical story problem generation project (for a single-player
educational puzzle game) recently, we applied a similar strategy to reduce
a problem naturally stated as Exists-Forall-Exists to the Exists-Forall
style problem we knew how to express in disjunctive ASP. Oversimplifying,
you might say we just applied skolemization to the inner quantifier.
However, the finesse was in the particular compact symbolic representation
of the Skolem function.
=====
Date: June 10, 2015
From: Evgenii Balai
To: Adam Smith
Can you give an example of a game and corresponding strategy which should
be modelled using arbitrary nested quantifiers?
I don't have an example where it is not sufficient to use one quantifier.
Consider a game between two players, a and b.
The game may be in finite number of states S1,...,Sk.
There is a successor relation between states, where successor(P, Si,Sj)
reads as "there is a move by P in the game which takes the game from
state Si to Sj."
The fact that player X wins in state S_i can be defined in English as:
"X wins in S_i if there exists a state S_j such that
- A can move to S_j and
- the opponent of X looses in S_j"
Similarly, X looses in state S_i can be written as:
"X looses in S_i if for all S_j such that
- X can move from S_i to S_j
- the opponent of X wins in S_j "
Strategy:
A player should move from S_i to S_j if his opponents looses in S_j.
Both statements for win and loose have only one quantifier.
Here is a [sorted] ASP implementation of the above approach for game called
NIM( http://en.wikipedia.org/wiki/Nim ): http://pastie.org/10234416.
The game has only 6^3 states, but I think the approach will work for games
with up to 10^9 states (the computation seems linear in terms of the number
of states).
Of course we need certain assumptions to make this approach work (e.g, the
game starting from a state always ends with loose or win, but I do not
think they are difficult to overcome). Am I missing/oversimplifying something?
=====
Date: June 11, 2015
From: Adam Smith
To: Evgenii Balai
If you reify all of the states in the game you want to analyze (i.e.,
construct objects from function terms that might be unified with the logic
variables S_i, S_j) in your example, then, yes, determining the win/loss
property of a state is a linear time operation -- linear in the number of
states. Further, it's an operation that can be completely be resolved
during the grounding phase of ASP. In a sense, we aren't using ASP at this
point, just the datalog-style front end language.
Consider, instead, a game like Go where it would be impractical to actually
reify all of the feasible board states. If each point on a traditional
19x19 board can be empty or occupied by a white or black stone, a direct
encoding of the board state gives 3^361 (about 10^172) combinations. Even
though the analysis requires time linear in the number of states, the time
to construct the possible states is exponential in the size of the board.
How can we reason about board states without paying the cost of an
exponential in board size? Let's develop a factored representation of the
board over time:
#const max_time=200.
#const width=19.
time(1..max_time).
point((1..width,1..width)).
color(white;black).
{ covers(C,P,T) : color(C) } :- time(T); point(P).
(constraints to model game rules or model player choices are ignored here)
This representation only requires grounding time O(max_time * (width^2)). I
have a hunch that modeling the game rules and player choices requires a
similar asymptotic cost during grounding. As a function of board size in Go
(or heap size in Nim), this is an exponential improvement.
With a setup like this, we can solve for possible sequences of play that
are valid according to the rules. However, we'll have a hard time writing
down the constraint that the sequence represents an optimal strategy for
either player. This is where expressing nested quantification matters.
Right now, we're just asking to "show that there exists at least one play
trace (where some quantifier free constraints over that trace hold)". To
express the constraint that the trace represents perfect play we want
something more like "show that there exists an initial move for White such
that for all moves for Black there exists a reply move for White such that
for all reply moves for Black .... White wins". We can't do this
practically or theoretically in ASP for the case of Go. (If a future
iterations of ASP were backed by QBF solvers, maybe it would be
theoretically possible...)
A more realistic query might be to ask for example traces where the moves
by White are the best they could possibly be given some evaluation of the
board some constant number of steps in the future (rather than the end of
the game). If this constant is above 2, it's also outside of what we can
directly express in ASP.
The trick I alluded to previously (about compactly representable
strategies) could be paraphrased as this: Transforming a two player game
into a single player game by forcing one of the players to play according
to a compact set of efficiently evaluable rules. After the transformation,
we don't need to quantify over all possible replies to our moves, only
those replies we can perfectly predict with access to the rules of our
opponent's strategy.
In our analysis of the single-player educational puzzle game Refraction (
https://adamsmith.as/papers/fdg2013_shortcuts.pdf -- previously linked in
this thread), we estimated that a given board configuration requires about
400 bits of information to uniquely define, suggesting there are about
2^400 (about 10^120) states we should consider in our search for puzzle
designs exhibiting certain properties. In the paper linked, we considered
the puzzle design problem, which is a restricted form of two player game
played between a puzzle designer and a quality assurance tester
(representing potentially adversarial behavior of an end-user). Although
this game is played for only two moves (the designer proposes a puzzle, and
then the tester attempts to show a weakness in it), the branching factors
are enormous (the designer has about 2^400 available "moves" and the tester
has about 2^100 replies). By using a factored board representation and
disjunctive rules to express the necessary nesting of quantification, we
devised an encoding of the high-assurance puzzle design problem that was
able to generate puzzles in a fraction of a second for a later interactive
design tool: https://adamsmith.as/papers/uist2013_progression.pdf
If we set 10^9 as a generous upper limit on the number of ground atoms and
rules in our answer-set programs, then representation strategies that reify
game states (introducing a number of atoms at least 1x the number of game
states) become untenable as we look at state space complexity of
traditional games just beyond Nim:
https://en.wikipedia.org/wiki/Game_complexity#Complexities_of_some_well-known_games
(try sorting the table by State Space Complexity). Moving computation from
the grounder into the solver is what makes quantification tricky.
=====
Date: June 11, 2015
From: Peter Schueller
To: Evgenii Balai
If you really want to win, you need to look more into the future:
X wins if there is SOME state where it can move and Y loses =
X wins if there is SOME state where it can move and FOR ALL states
where Y can move X wins =
X wins if there is SOME state where it can move and FOR ALL states
where Y can move there is SOME state where X can move and Y loses ...
This way you need arbitrarily deeply nested quantifiers. Because with
one quantifier you can only "think ahead one step".
=====
Date: June 11, 2015
From: Evgenii Balai
To: Adam Smith
Thank you for the detailed reply!
I see now where the nested quantification comes from.
I didn't realize that even grounding becomes infeasible when all the states
are reified!
"show that there exists an initial move for White such that for all moves
for Black there exists a reply move for White such that for all reply moves
for Black .... White wins". We can't do this practically or theoretically
in ASP.
I wonder what makes this "theoretically impossible". Is it, roughly
speaking, because answering queries about arbitrary nested quantification
has a complexity which lies beyond polynomial hierarchy (am I right at all?),
while answering queries in ASP is in it?
=====
Date: June 11, 2015
From: Evgenii Balai
To: Peter Schueller
Yes, indeed, I see now that it is possible to define an optimal strategy
using nested quantification. I even think your representation is more
natural than what I used for NIM with only one quantifier.
=====
Date: June 11, 2015
From: Marcello Balduccini
To: Adam Smith and Evgenii Balai
> "show that there exists an initial move for White such that for all moves for
> Black there exists a reply move for White such that for all reply moves for
> Black .... White wins". We can't do this practically or theoretically in ASP
If I understand correctly, this should be equivalent to finding a conditional
plan. In that case, I believe it is theoretically possible to do it in ASP.
Son Tran at some point (some 15 years ago :D) had developed a conditional
planner in ASP. The catch, if I remember correctly, was the growth of the
grounding. I cannot remember if he ever published a paper on it. Hopefully
Son can chime in.
> Consider, instead, a game like Go where it would be impractical to actually
> reify all of the feasible board states. If each point on a traditional 19x19 board
> can be empty or occupied by a white or black stone, a direct encoding of the
> board state gives 3^361 (about 10^172) combinations. Even though the
> analysis requires time linear in the number of states, the time to construct
> the possible states is exponential in the size of the board.
For the past few months, a colleague and I have been working on StarCraft,
which I am told has even higher complexity. For games of this kind, search
algorithms only consider a sampling of the search space. One algorithm
that is currently popular is Monte Carlo Tree Search (MCTS ). One of the
things we are interested in seeing is if one can use MCTS for the search
and ASP for the domain model, and how the resulting algorithm compares with
current instances of MCTS.
> The trick I alluded to previously (about compactly representable strategies)
> could be paraphrased as this: Transforming a two player game into a single
> player game by forcing one of the players to play according to a compact set
> of efficiently evaluable rules. After the transformation, we don't need to
> quantify over all possible replies to our moves, only those replies we can
> perfectly predict with access to the rules of our opponent's strategy.
I have done some work along these lines and I was quite satisfied with the
approach. You can read about it here: http://www.mbal.tk/papers/bal09e.pdf.