\documentstyle[fleqn,epsf,aima-slides]{article}
\begin{document}
\begin{huge}
\titleslide{Game playing}{Chapter 5, Sections 1--5}
\sf
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Outline}
\blob Perfect play
\blob Resource limits
\blob $\alpha$--$\beta$ pruning
\blob Games of chance
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Games vs.~search problems}
``Unpredictable'' opponent $\Rightarrow$ solution is a contingency plan
Time limits $\Rightarrow$ unlikely to find goal, must approximate
Plan of attack:
\begin{itemize}
\item algorithm for perfect play (Von Neumann, 1944)
\item finite horizon, approximate evaluation (Zuse, 1945; Shannon, 1950; Samuel, 1952--57)
\item pruning to reduce costs (McCarthy, 1956)
\end{itemize}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Types of games}
\vspace*{0.3in}
\epsfxsize=1.05\textwidth
\fig{\file{figures}{game-types.ps}}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Minimax}
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest {\em minimax value}\nl
= best achievable payoff against best play
E.g., 2-ply game:
\epsfxsize=0.8\textwidth
\fig{\file{figures}{minimax.ps}}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Minimax algorithm}
\input{\file{algorithms}{minimax-algorithm}}
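A minimal Python sketch of the minimax computation (the leaf-list tree encoding is a hypothetical stand-in for the book's game interface, not its pseudocode):

```python
# Minimax sketch: a node is either a terminal payoff (a number) or a
# list of child nodes; MAX and MIN alternate by ply.
def minimax_value(node, maximizing=True):
    if isinstance(node, (int, float)):   # terminal: return its utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The 2-ply example: MAX chooses among three MIN nodes whose
# backed-up values are 3, 2, and 2, so the root value is 3.
minimax_value([[3, 12, 8], [2, 4, 6], [14, 5, 2]])  # → 3
```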
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Properties of minimax}
\q{Complete}
\q{Optimal}
\q{Time complexity}
\q{Space complexity}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Properties of minimax}
\q{Complete} Yes, if tree is finite (chess has specific rules for this)
\q{Optimal} Yes, against an optimal opponent. Otherwise??
\q{Time complexity} $O(b^m)$
\q{Space complexity} $O(bm)$ (depth-first exploration)
For chess, $b\approx 35$, $m \approx 100$ for ``reasonable'' games\nl
$\Rightarrow$ exact solution completely infeasible
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Resource limits}
Suppose we have 100 seconds, explore $10^4$ nodes/second\nl
$\Rightarrow$ \u{$10^6$} nodes per move
Standard approach:
\begin{itemize}
\item {\em cutoff test}\nl
e.g., depth limit (perhaps add {\em quiescence search})
\item {\em evaluation function}\nl
= estimated desirability of position
\end{itemize}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Evaluation functions}
\vspace*{0.3in}
\epsfxsize=0.8\textwidth
\fig{\file{figures}{chess-evaluation-bc.ps}}
For chess, typically {\em linear} weighted sum of \u{features}
\[
{\sc Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \ldots + w_n f_n(s)
\]
e.g., $w_1 = 9$ with \\
$f_1(s)$ = (number of white queens) $-$ (number of black queens)\\
etc.
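The weighted linear sum can be sketched as follows (the weights are the standard chess material values; the piece-count encoding of a position is a hypothetical simplification):

```python
# Linear weighted evaluation: Eval(s) = w1*f1(s) + ... + wn*fn(s).
# Each feature f_i(s) is a white-minus-black piece-count difference.
WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

def material_eval(white_counts, black_counts):
    return sum(w * (white_counts.get(p, 0) - black_counts.get(p, 0))
               for p, w in WEIGHTS.items())

# White is a pawn up but a knight down: 1*(+1) + 3*(-1) = -2
material_eval({"P": 8, "N": 2}, {"P": 7, "N": 3})  # → -2
```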
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Digression: Exact values don't matter}
\vspace*{0.3in}
\epsfxsize=1.05\textwidth
\fig{\file{figures}{ordinal-utility.ps}}
Behaviour is preserved under any {\em monotonic} transformation of
{\sc Eval}
Only the order matters:\nl
payoff in deterministic games acts as an {\em ordinal utility} function
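A tiny demonstration that the minimax decision survives any monotonic transformation of the leaf values (the values themselves are made up for illustration):

```python
# MAX picks the child whose MIN-value (after transforming the leaf
# payoffs by f) is largest; only the *order* of payoffs matters.
def best_move(tree, f=lambda x: x):
    return max(range(len(tree)), key=lambda i: min(f(v) for v in tree[i]))

leaves = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
# Cubing and a positive affine map are both order-preserving,
# so the chosen move is unchanged:
assert best_move(leaves) == best_move(leaves, f=lambda x: x ** 3)
assert best_move(leaves) == best_move(leaves, f=lambda x: 2 * x + 7)
```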
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Cutting off search}
{\sc MinimaxCutoff} is identical to {\sc MinimaxValue} except\nl
1. {\sc Terminal?} is replaced by {\sc Cutoff?}\nl
2. {\sc Utility} is replaced by {\sc Eval}
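The two substitutions can be sketched as a depth-limited minimax (the `successors`/`evaluate` callables are a hypothetical simplification of the book's game interface):

```python
# Depth-limited minimax: Terminal? becomes Cutoff? (here a depth test)
# and Utility becomes Eval.
def minimax_cutoff(state, depth, successors, evaluate, maximizing=True):
    children = successors(state)
    if depth == 0 or not children:        # Cutoff? replaces Terminal?
        return evaluate(state)            # Eval replaces Utility
    values = [minimax_cutoff(c, depth - 1, successors, evaluate,
                             not maximizing)
              for c in children]
    return max(values) if maximizing else min(values)

# Toy tree: interior states are lists of successors, leaves are payoffs.
successors = lambda s: s if isinstance(s, list) else []
evaluate = lambda s: s if isinstance(s, (int, float)) else 0
full = minimax_cutoff([[3, 12, 8], [2, 4, 6], [14, 5, 2]], 2,
                      successors, evaluate)      # full depth: 3
shallow = minimax_cutoff([[3, 12, 8], [2, 4, 6], [14, 5, 2]], 1,
                         successors, evaluate)   # cut off early: Eval = 0
```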
Does it work in practice?
\[
b^m = 10^6,\quad b=35\quad \Rightarrow \quad m=4
\]
4-ply lookahead is a hopeless chess player!
4-ply $\approx$ human novice\\
8-ply $\approx$ typical PC, human master\\
12-ply $\approx$ Deep Blue, Kasparov
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{$\alpha$--$\beta$ pruning example}
\vspace*{0.3in}
\epsfxsize=0.9\textwidth
\fig{\sfile{figures}{alpha-beta-progress1.ps}}
%%%%%%%%%%%% Overlay %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pheading{$\alpha$--$\beta$ pruning example}
\vspace*{0.3in}
\epsfxsize=0.9\textwidth
\fig{\sfile{figures}{alpha-beta-progress2.ps}}
%%%%%%%%%%%% Overlay %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pheading{$\alpha$--$\beta$ pruning example}
\vspace*{0.3in}
\epsfxsize=0.9\textwidth
\fig{\sfile{figures}{alpha-beta-progress3.ps}}
%%%%%%%%%%%% Overlay %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pheading{$\alpha$--$\beta$ pruning example}
\vspace*{0.3in}
\epsfxsize=0.9\textwidth
\fig{\sfile{figures}{alpha-beta-progress4.ps}}
%%%%%%%%%%%% Overlay %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pheading{$\alpha$--$\beta$ pruning example}
\vspace*{0.3in}
\epsfxsize=0.9\textwidth
\fig{\sfile{figures}{alpha-beta-progress5.ps}}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Properties of $\alpha$--$\beta$}
Pruning {\em does not} affect final result
Good move ordering improves effectiveness of pruning
With ``perfect ordering,'' time complexity = $O(b^{m/2})$\nl
$\Rightarrow$ {\em doubles} depth of search\nl
$\Rightarrow$ can easily reach depth 8 and play good chess
A simple example of the value of reasoning about which
computations are relevant (a form of {\em metareasoning})
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Why is it called $\alpha$--$\beta$?}
\vspace*{0.1in}
\epsfxsize=0.4\textwidth
\fig{\file{figures}{alpha-beta-general.ps}}
$\alpha$ is the best value (to {\sc max}) found so far off the current path
If $V$ is worse than $\alpha$, {\sc max} will avoid it
$\Rightarrow$ prune that branch
Define $\beta$ similarly for {\sc min}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{The $\alpha$--$\beta$ algorithm}
Basically {\sc Minimax} + keep track of $\alpha$, $\beta$ + prune
\input{\file{algorithms}{alpha-beta-algorithm}}
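A compact Python sketch over a leaf-list tree encoding (a hypothetical stand-in for the book's pseudocode): $\alpha$ is the best value found so far for {\sc max}, $\beta$ for {\sc min}, and a branch is pruned as soon as they cross.

```python
# Alpha-beta search: identical value to minimax, but prunes branches
# that cannot affect the root decision.
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True):
    if isinstance(node, (int, float)):   # terminal payoff
        return node
    if maximizing:
        v = float("-inf")
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, v)
            if alpha >= beta:            # MIN will avoid this node: prune
                break
        return v
    v = float("inf")
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True))
        beta = min(beta, v)
        if alpha >= beta:                # MAX will avoid this node: prune
            break
    return v

# Same 2-ply example, same answer as plain minimax (3), but the
# second MIN node is abandoned after its first leaf.
alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]])  # → 3
```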
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Deterministic games in practice}
Checkers: Chinook ended the 40-year reign of human world champion Marion
Tinsley in 1994. Used an endgame database defining perfect play for
all positions involving 8 or fewer pieces on the board, a total of
443,748,401,247 positions.
Chess: Deep Blue defeated human world champion Garry Kasparov
in a six-game match in 1997. Deep Blue searches 200 million positions
per second, uses very sophisticated evaluation, and undisclosed methods for
extending some lines of search up to 40 ply.
Othello: human champions refuse to compete against computers, which are
too good.
Go: human champions refuse to compete against computers, which are too
bad. In go, $b > 300$, so most programs use pattern knowledge bases to
suggest plausible moves.
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Nondeterministic games}
E.g., in backgammon, the dice rolls determine the legal moves
Simplified example with coin-flipping instead of dice-rolling:
\vspace*{0.3in}
\epsfxsize=0.55\textwidth
\fig{\file{figures}{expectiminimax-simple.ps}}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Algorithm for nondeterministic games}
{\sc Expectiminimax} gives perfect play
Just like {\sc Minimax}, except we must also handle chance nodes:
$\ldots$\\
\key{if} \var{state} is a chance node \key{then}\nl
\key{return} average of {\sc ExpectiMinimax-Value} of {\sc Successors}(\var{state})\\
$\ldots$
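The chance-node case can be sketched as follows (the tagged-tuple tree encoding, with nodes labelled \texttt{"max"}, \texttt{"min"}, or \texttt{"chance"}, is a hypothetical simplification):

```python
# Expectiminimax: chance nodes return the probability-weighted
# average of their successors' values.
def expectiminimax(node):
    if isinstance(node, (int, float)):   # terminal payoff
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: children are (probability, successor) pairs
    return sum(p * expectiminimax(c) for p, c in children)

# Coin-flip example: a fair coin chooses between two MIN positions
# worth 2 and 0, so the chance node is worth 0.5*2 + 0.5*0 = 1.0.
tree = ("max", [("chance", [(0.5, ("min", [2, 4])),
                            (0.5, ("min", [0, 6]))])])
expectiminimax(tree)  # → 1.0
```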
A version of $\alpha$--$\beta$ pruning is possible\\
but only if the leaf values are bounded. \q{Why}
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Nondeterministic games in practice}
Dice rolls increase $b$: 21 possible rolls with 2 dice\\
Backgammon $\approx$ 20 legal moves (can be 6,000 with 1-1 roll)
\[
{\rm depth}\ 4 = 20 \times (21 \times 20)^3 \approx 1.5\times 10^9
\]
As depth increases, probability of reaching a given node shrinks\nl
$\Rightarrow$ value of lookahead is diminished
$\alpha$--$\beta$ pruning is much less effective
{\sc TDGammon} uses depth-2 search + very good {\sc Eval}\nl
$\approx$ world-champion level
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Digression: Exact values DO matter}
\vspace*{0.3in}
\epsfxsize=1.05\textwidth
\fig{\file{figures}{chance-evaluation.ps}}
Behaviour is preserved only by {\em positive linear} transformation of
{\sc Eval}
Hence {\sc Eval} should be proportional to the expected payoff
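A small demonstration that averaging at chance nodes is only preserved by positive linear transformations: a monotonic but non-linear transform of the payoffs can flip the decision (the payoff values are made up for illustration).

```python
# Expected value of equiprobable outcomes, after transforming payoffs by f.
def chance_value(outcomes, f=lambda x: x):
    return sum(f(v) for v in outcomes) / len(outcomes)

A, B = [3, 3], [0, 5]
# With the raw payoffs MAX prefers A (3.0 > 2.5); after squaring, which
# is order-preserving on non-negative payoffs, it prefers B (9.0 < 12.5).
assert chance_value(A) > chance_value(B)
assert chance_value(A, f=lambda x: x**2) < chance_value(B, f=lambda x: x**2)
```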
%%%%%%%%%%%% Slide %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\heading{Summary}
Games are fun to work on! (and dangerous)
They illustrate several important points about AI
\blob perfection is unattainable $\Rightarrow$ must approximate
\blob good idea to think about what to think about
\blob uncertainty constrains the assignment of values to states
Games are to AI as grand prix racing is to automobile design
\end{huge}
\end{document}