lesson plan - oct 31 - games

1. Kinds of games
2. Minimax
3. Alpha-Beta Pruning

kinds of games - create grid:

1 player   |  zero-sum /  |  deterministic  |  perfect info | Game
2 players  | cooperative  |  / chance       |  hidden info  | 
k players  |
teams      |

what games have people in the class played in the past year?
how do they fill in the grid?
what games fill in the blanks?

--> You cannot have a game that is deterministic but with hidden
    information, because the rules of the game must specify the
    starting state

what about:
games involving physical skills (eg soccer?)
puzzles (eg Rubik's Cube) == 1 player deterministic
single player + chance = games against nature

1 player deterministic game == A* search
1 player chance 

2 player zero-sum perfect info games: minimax
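
Minimal minimax sketch for the board. The tree encoding is an assumption
made for illustration (a leaf is a utility for max, an internal node is
a list of children); a real game would generate children from its rules.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of node from max's point of view."""
    if not isinstance(node, list):      # terminal state: a utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Hypothetical example: max picks one of three moves, then min replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))
```

Min's best replies are worth 3, 2, and 2, so max takes the first move.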

Theorem: expected utility of minimax against an imperfect opponent is no
less than the expected utility against a perfect opponent.

I.e.: you cannot beat a Vulcan by being irrational (Star Trek to the
contrary!)

Formal inductive proof due Monday!
Base case: 1 "ply" (player then opponent move).
What is it to be "irrational"?  Opponent does not min.
Inductive case: given true if k plies to end, prove for k+1.
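
Numeric illustration of the theorem (not a proof). Assumptions: the same
hypothetical list-of-lists tree encoding, and an "irrational" opponent
modeled as one that moves uniformly at random.

```python
def minimax(node, maximizing=True):
    """Minimax value of node from max's point of view."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def value_vs_random(node, maximizing=True):
    """Expected utility when max plays its minimax move and the
    opponent picks uniformly at random instead of minimizing."""
    if not isinstance(node, list):
        return node
    if maximizing:
        # Max commits to the minimax-optimal child (first one on ties).
        best = max(node, key=lambda child: minimax(child, False))
        return value_vs_random(best, False)
    return sum(value_vs_random(child, True) for child in node) / len(node)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
perfect = minimax(tree)             # opponent always minimizes
imperfect = value_vs_random(tree)   # opponent moves at random
print(perfect, imperfect)
```

On this tree the minimax player expects more against the random opponent
than against the perfect one, as the theorem predicts.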

What if you know in what way the opponent is irrational?
Come up with specific example from a game -- eg chess,
rock/paper/scissors, etc.
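
One possible example for discussion: rock/paper/scissors against an
opponent with a known bias. The bias numbers are made up for
illustration. Note that this game has simultaneous moves, so the "safe"
strategy is a mixed one (play each move 1/3 of the time, expected
payoff 0); knowing the bias lets us do better.

```python
MOVES = ["rock", "paper", "scissors"]

# Payoff to us for (our move, their move): +1 win, 0 tie, -1 loss.
PAYOFF = {
    "rock":     {"rock": 0, "paper": -1, "scissors": 1},
    "paper":    {"rock": 1, "paper": 0, "scissors": -1},
    "scissors": {"rock": -1, "paper": 1, "scissors": 0},
}

def expected_payoff(my_move, opponent_probs):
    """Expected payoff of my_move against a distribution over their moves."""
    return sum(p * PAYOFF[my_move][their] for their, p in opponent_probs.items())

def best_response(opponent_probs):
    """Move with the highest expected payoff against the known distribution."""
    return max(MOVES, key=lambda m: expected_payoff(m, opponent_probs))

# Assumed bias, for illustration only: opponent throws rock half the time.
biased = {"rock": 0.5, "paper": 0.25, "scissors": 0.25}
move = best_response(biased)
print(move, expected_payoff(move, biased))
```

Against this opponent the best response is always paper, with expected
payoff 0.25 per round instead of 0.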


------------------------------------------------------------------------
TURN ON SLIDES:

Alpha-beta pruning.  Idea: in theory could search all options in
parallel, in practice usually done sequentially.  Pass best values
found so far to siblings, so can prune paths that we know will never
be taken.

alpha = highest value found so far along path for max
beta = lowest value found so far along path for min

MaxValue(state,alpha,beta) 
- No longer returns the true value of the state
- Instead: returns a number such that the state is at *least* this good

MaxValue(state, alpha, beta){
	if (terminal(state)) return utility(state);
	best = - infinity;
	for (s in children(state)){
		best = max(best, MinValue(s,alpha,beta));
		if (best >= beta) return best;   // min will never let play reach here: prune
		alpha = max(alpha,best);         // raise the bound passed to later siblings
	}
	return best;
}
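
Runnable sketch of both halves of the MaxValue pseudocode above,
including the symmetric MinValue. Assumptions: the hypothetical
list-of-lists tree encoding (leaf = utility for max), and a `visited`
list added only to show which leaves get examined.

```python
import math

def max_value(node, alpha, beta, visited):
    if not isinstance(node, list):      # terminal state
        visited.append(node)
        return node
    best = -math.inf
    for child in node:
        best = max(best, min_value(child, alpha, beta, visited))
        if best >= beta:                # min has a better option elsewhere
            return best
        alpha = max(alpha, best)
    return best

def min_value(node, alpha, beta, visited):
    if not isinstance(node, list):      # terminal state
        visited.append(node)
        return node
    best = math.inf
    for child in node:
        best = min(best, max_value(child, alpha, beta, visited))
        if best <= alpha:               # max has a better option elsewhere
            return best
        beta = min(beta, best)
    return best

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
value = max_value(tree, -math.inf, math.inf, visited)
print(value, visited)
```

On this tree the search returns the same value as plain minimax but
skips the leaves 4 and 6: once min finds the 2 in the second branch, max
already has a 3 available and that branch is abandoned.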

-----------------------------------------------------------------

Evaluation functions.
Common form: weighted sum of features
Example: Othello
features: different classes of squares on board (edges, corners)
How are weights determined?
- historical experience (chess)
- learning: adjust weights to minimize error between true and
estimated value of a state
- case study: Othello
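
Sketch of a weighted-sum evaluation function plus one weight update.
Everything specific here is an assumption for illustration: the feature
names, the starting weights, and the choice of a least-mean-squares
step as the way to "minimize error".

```python
def evaluate(features, weights):
    """Weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())

def update_weights(weights, features, target, rate=0.01):
    """One gradient step on squared error between target and estimate."""
    error = target - evaluate(features, weights)
    return {name: w + rate * error * features.get(name, 0.0)
            for name, w in weights.items()}

# Hypothetical Othello features: my squares minus opponent's, per class.
weights = {"corners": 10.0, "edges": 3.0, "interior": 1.0}
features = {"corners": 1.0, "edges": -2.0, "interior": 4.0}

before = evaluate(features, weights)
new_weights = update_weights(weights, features, target=10.0)
after = evaluate(features, new_weights)
print(before, after)
```

After one step the estimate moves from 8.0 toward the target value of
10.0; repeating over many states is the learning loop.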

Other techniques:
- caching: transposition tables
- end game databases
 - Ken Thompson built all 5-piece chess endgame databases without pawns, and many 6-piece ones
 - Lewis Stiller computed 5- and 6-piece chess endgame databases on a massively parallel machine but could not save them. He recorded the position with the maximum number of moves to win, called "the monster"
   --> forced win in 255 moves!
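
Sketch of the caching idea behind transposition tables. The game is a
stand-in chosen for brevity (an assumption, not from the slides): a pile
of stones, each player removes 1 to 3, whoever takes the last stone
wins. Different move orders reach the same pile, so cached values get
reused.

```python
def solve(stones, maximizing, table, counter):
    """Minimax value of the stones game: +1 if max wins, -1 if min wins.
    table is a dict used as a transposition table, or None to disable it.
    counter[0] counts states that were actually expanded."""
    key = (stones, maximizing)
    if table is not None and key in table:
        return table[key]               # seen this position before
    counter[0] += 1
    if stones == 0:
        # The player who just moved took the last stone and wins.
        value = -1 if maximizing else 1
    else:
        results = [solve(stones - k, not maximizing, table, counter)
                   for k in (1, 2, 3) if k <= stones]
        value = max(results) if maximizing else min(results)
    if table is not None:
        table[key] = value
    return value

plain, cached = [0], [0]
v1 = solve(12, True, None, plain)
v2 = solve(12, True, {}, cached)
print(v1, v2, plain[0], cached[0])
```

Both searches agree on the value; the cached one expands far fewer
states because each (pile, player) pair is solved once.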

Kasparov versus Deep Blue and Deep Junior
* discuss matches
* photos and audio of Kasparov losing

------------

Mathematical Game Theory
- reducing a game to matrix form
- prisoner's dilemma
- economic games: auctions
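
Sketch of matrix form using the prisoner's dilemma. The payoff numbers
are a commonly used illustrative choice (negative years in prison), not
taken from these notes; the equilibrium check is included as one way to
read the matrix.

```python
ACTIONS = ["cooperate", "defect"]

# (row action, column action) -> (row payoff, column payoff)
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"):    (-3, 0),
    ("defect", "cooperate"):    (0, -3),
    ("defect", "defect"):       (-2, -2),
}

def row_best_response(col_action):
    """Row player's best action given the column player's action."""
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, col_action)][0])

def is_equilibrium(row_action, col_action):
    """True if neither player gains by changing only their own action."""
    row_pay, col_pay = PAYOFFS[(row_action, col_action)]
    row_ok = all(row_pay >= PAYOFFS[(a, col_action)][0] for a in ACTIONS)
    col_ok = all(col_pay >= PAYOFFS[(row_action, a)][1] for a in ACTIONS)
    return row_ok and col_ok

equilibria = [(r, c) for r in ACTIONS for c in ACTIONS if is_equilibrium(r, c)]
print(equilibria)
```

Defecting is the best response to either action, so mutual defection is
the only stable cell even though mutual cooperation pays both players
more.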

Games of Chance
Background on probability theory.
Expectations.
Exercise: roll dice to demonstrate that expected number of rolls until
an event with probability p occurs is 1/p.
--> Then give proof.
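
Simulated version of the dice exercise, in case real dice are scarce.
The function names, trial count, and seed are arbitrary choices for
this sketch.

```python
import random

def rolls_until(event, rng):
    """Count rolls of a fair six-sided die until event(roll) is true."""
    count = 0
    while True:
        count += 1
        if event(rng.randint(1, 6)):
            return count

def average_rolls(event, trials=20000, seed=0):
    """Average number of rolls needed over many repeated experiments."""
    rng = random.Random(seed)
    return sum(rolls_until(event, rng) for _ in range(trials)) / trials

six = average_rolls(lambda roll: roll == 6)        # p = 1/6
even = average_rolls(lambda roll: roll % 2 == 0)   # p = 1/2
print(six, even)
```

The averages should come out near 6 and 2, matching 1/p for p = 1/6 and
p = 1/2.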

- backgammon
    Tesauro's Neurogammon
- games with chance and hidden state: poker
- probabilistic planning with perfect information: Monkeys and Bananas