Assignment 4: Simple Game-Playing
CSE 415: Introduction to Artificial Intelligence
The University of Washington, Seattle, Autumn 2011
This assignment builds on Assignment 3 and has the same associated reading: Chapter 5 (Search) of Introduction to Artificial Intelligence Using Python.
Due Friday, October 28 through Catalyst CollectIt at 2:00 PM.

In this assignment we expand upon the game of Tic-Tac-Toe to a more interesting generalization of it: "K-in-Row with Forbidden Squares." At the same time, we put our agents into more serious competition, adding lookahead (with the Minimax technique) and pruning (with the alpha-beta method) to the search. In addition, we add an element of fun and creativity that could serve as an opportunity to beat the Turing Test; this takes place via a more elaborate version of the "utterance feature" introduced in Assignment 3. The assignment has two parts: creating your agent and engaging your agent in a first round of competition play.

You should turn in either 3 or 4 files (depending upon whether or not you are doing the extra-credit items in Part II): (1) a file minimaxAB.py that contains your Alpha-Beta search implementation for Part I, (3) a file [YourUWNetID]KInARow.py with your game playing agent described in Part II, and (2) a file [YourUWNetID]-[OtherUWNetID]-transcript.txt that provides a transcript of your agent playing against that of another classmate, described in Part III. If you are doing the extra credit items in Part II, then also turn in your file ExtraCredit.txt that explains those special features of your agent.
 

PART I: Basic Alpha-Beta Pruning Exercise (20 points).
For this exercise, you will implement a minimax function with alpha beta pruning. In order to focus on that algorithm, we will use a very limited search space with a pre-defined static evaluation function.

You can find pseudocode for the Alpha-Beta search algorithm on wikipedia.

Our search space for this algorithm is a pre-defined 31-node perfect binary tree. Normally, in game playing, the states of the search must be created by the program, but in this exercise, they are provided in advance. Because the tree is small, we can ignore Minimax (and Alpha-Beta) search's depth parameter for now. You will want to put it back in for your KinaRowPlayer implementation in Part II.

The algorithm has a condition "if ... node is a terminal node...." The tree will be implemented here as a 32 element array. We ignore the first element to make the arithmetic simpler, and then identify each node by its index in the array. Node one is at index 1. By "terminal node" we mean a leaf of this given search tree. The terminal nodes in our tree are nodes 16 to 31, located at indices 16 through 31. The test we need is very simple. The predicate isTerminal(node) will be true if 16 =< node <= 31.

The algorithm requires that we get a the "static value" of terminal nodes. For this exercise, we will take as input a 16 element array that gives the static values for the bottom row of the 31 element tree. These are the leaf nodes as shown in the illustration.
For example: [-4,-8,5,4,3,2,-3,-7,7,-6,1,-2,-5,-1,6,8]
The returned value for the program will be a 32 element array of tuples or None values. Again, we will ignore the element at position zero just to make things simpler. Each element will have a tuple if the node was processed, or None if it was not processed because of alpha beta pruning. Each tuple is of the form (alpha, beta, nodeValue), where alpha is the final value of the variable "alpha" before the (recursive) call to alphaBeta returns, beta is the corresponding value of the "beta" variable, and nodeValue is the value (static or backed-up) for the node being considered. The input shown above would produce the returned value shown below (without the line breaks, of course, and assuming no bugs in my code):

[None, (2, inf, 2), (-inf, 2, 2), (2, 2, 2), (4, inf, 4), (2, 4, 2), (2, inf, 2), 
None, (-inf, -8, -8), (-8, 4, 4), (-inf, 2, 2), (2, -3, -3), (2, -6, -6), (2, 1, 1), 
None, None, (-inf, inf, -4), (-inf, -4, -8), (-8, inf, 5), (-8, 5, 4), (-inf, 4, 3), 
(-inf, 3, 2), (2, 4, -3), None, (2, inf, 7), (2, 7, -6), (2, inf, 1), 
None, None, None, None, None]
Here -inf serves as a representation for negative infinity, and inf serves as a representation for positive infinity. A template file is available, giving the utility functions so all you have to do is implement minimaxAB().

To test your code, we will call getMiniMaxAB() with various values for the leaf nodes, such as [-4,-8,5,4,3,2,-3,-7,7,-6,1,-2,-5,-1,6,8]
 

PART II: Creating a K-In-A-Row Player (60 points).
Create a program implementing an agent that can participate in a game of K-in-a-Row with Forbidden Squares (defined below). Your program should consist of a single file, with a name of the form [UWNetID]KInARow.py, where [UWNetID] is your own UWNetID. For example, my file would have tanimotoKInARow.py for its name.

In this part, you will create a program -- consisting mainly of a collection of specific functions, for playing games of "K in a Row". We define a K in a Row game as a kind of generalized Tic-Tac-Toe with the following features: (a) Just as in Tic-Tac-Toe, there are two players: one plays X and the other plays O; (b) the board is rectangular, but is not necessarily 3 by 3; it is mRows by nColumns, where these numbers are chosen by the referee at the beginning of the game; (c) a player wins by getting K in a row, where K is not necessarily 3; K can be any integer greater than 1 and less than or equal to the maximum of mRows and nColumns; (d) there can be "forbidden squares" on the board; these are chosen at the beginning of the game by the "referee"; a square on the board that is available is represented by a zero, whereas a forbidden square is represented by a number 3; (e) there can be "handicaps" in the initial state, meaning that some X and/or O tokens can be set up on the board by the referee in order either shape the succeeding play or change the balance of advantage and disadvantage to the players.

In addition to being able to play the game, your program should have a well-defined "personality". Some examples of possible personalities are these: friendly; harmless joker; blunt joker; paranoid; wisecracker; sage; geek; wimp; competitive freak; fortune-teller (based on the state of the game). The personality will be revealed during games via the "utterances" made by the program. (For more details, see the description of the makeMove function below.)

Your program must include the following functions. You can have helper functions if you like. Please keep all the functions required by your player in just one Python file that follows the naming convention mentioned earlier. For example, my player would be in a file tanimotoKInARow.py. This will facilitate your player's being part of the class tournament.

  1. prepare(k, initialState, maxMillisecPerMove, isPlayingX, debugging). This function will accept five arguments, and it will remember these values for the game that is about to be played.

    The first parameter, k, is the number of pieces in a row (or column or diagonal) needed to win the game. The initialState parameter is a legal game state that can be used by your player to determine the dimensions of the board, the locations of forbidden squares, and even the locations of any handicap items.

    The parameter maxMillsecPerMove represents a time limit that your move-making function will be required to respect.

    The boolean parameter isPlayingX will be True if your player is to play the X pieces and False if your player is to play the O pieces.

    The final parameter, debugging, tells your program whether to print diagnostic information when it makes its moves. This is explained under makeMove. Your prepare function should return a string. The string should either be "OK" or a string that gives some reason why your program will not be able to play the specified game, such as "I require at least 17 milliseconds to make my moves.", "Let's quit right now because nobody can get 7 in a row on a 3 by 5 board.".

    Note that your program does not really have to do much at all when its prepare method is called. The main thing it should do is return "OK". You might wonder what the purpose of it is. It has three purposes: one is to set the debugging mode, which is relevant to the reporting of alpha and beta cutoffs as descrbed below. Another is to disqualify any program whose author believes that the program is NOT ready to play a particular version of the game. For example, if your program can play K in a row for K=3 but not K>3, for some reason, then your program might return "I can't handle K>3." instead of "OK". The third purpose is to offer your agent an opportunity to do any initialization of tables or other structures without the "clock running." If the match is a timed match, then it might be nice to have free time for setting up for, say, Zobrist hashing, if you are using that. Another kind of preprocessing would be to make, for each of the four directions in which a win can occur, a list of all the squares on the board where such a winning line could actually start. Having these lists can save time in your static evaluation function.

  2. introduce(). This function will return a multiline string that introduces your player, giving its full name (you get to make that up), the name and UWNetID of its creator (you), and some words to describe its character.

  3. nickname(). This function should return a short version of the playing agent's name (16 characters or fewer). This name will be used to identify the player's moves in game transcripts.

  4. makeMove(currentState, currentRemark, timeLimit=10000). This is probably your most important function. It should return a triple of the form [move, newState, newRemark, stats]. The move is a triple describing the chosen move, and its format is the same as that used in Assignment 3. The newState is the result of making the move from the given currentState. It must be a complete state and not just a board. The currentRemark argument is a string representing a remark from the opponent on its last move. The timeLimit represents the number of milliseconds available for computing and returning the move.

    The newRemark to be returned must be a string. During a game, the strings from your agent and its opponent comprise a dialog. Your agent should contribute to this dialog in three ways:

    1. by convincingly representing the character that you have chosen or designed for your agent,
    2. by showing awareness of the game state and game dynamics (changes in the game state), and
    3. by responding in a convincing way to the opponent's remarks.

    A portion of your grade (10 points) will be based on how well your agent's remarks meet the first criterion. Criteria 2 and 3 are for extra credit. To get the extra credit, implement one, the other, or both, and then describe, in a separate file called ExtraCredit.txt what the features are and how they work. Something working that adds in a noticeable way to the dialog and that fits one of the two criteria is worth 5. A more thorough implementation that shows considertion for multiple features related to one of these two criteria is worth 10. The maximum extra credit here is therefore 20 points.

    The fourth component of the returned list is the stats item. For this assignment, the stats item should be a string that reports the number of states examined in the course of determining the move.

    If debugging mode has been set to True in the call to the prepare() function, then your makeMove function should return, as the newRemark, a string that explains how many alpha and beta cutoffs occurred at each level of the search. This is instead of the character-oriented remark.

  5. staticEval(state). This function will perform a static evaluation of the given state. The value returned should be high if the state is good for X and low if the state is good for O.

    A portion (approximately 20 points) of your grade for Part I will depend on how well your staticEval function works.

Representations of moves and states:
As in Assignment 3, we will use the following shorthand notation for representing a move: (isXtoMove, row, column). Example: (True, 2, 1) means that the player puts an X in row 2, column 1. (Middle of the bottom row of the Tic-Tac-Toe board).
A state is represented in the form of a pair [board, isXtoMove]. A board is represented as a list of lists. The Tic-Tac-Toe (3 in a row on a 3 by 3 board with no forbidden squares and no handicaps) initial state is:

[[[0, 0, 0],
  [0, 0, 0],
  [0, 0, 0]], True]
Here 0=blank (vacant), 1='X', 2='O', and 3='-' (forbidden). You can ignore the case of a forbidden square in this part of the assignment.

Testing:
You should test your player in a game situation by downloading both a gameMaster program and an opponent agent. These are available here.

PART III: Game Transcript (20 points). Using the gameMaster.py program to run a match, create a transcript of a game between your agent and the agent of another student in the class. Use the following game instance for your match.
The following is the representation of the initial board in a KinARow game called "5 in a Row on 7-by-7 board with Corners Forbidden".
[[3, 0, 0, 0, 0, 0, 3],
 [0, 0, 0, 0, 0, 0, 0], 
 [0, 0, 0, 0, 0, 0, 0], 
 [0, 0, 0, 0, 0, 0, 0], 
 [0, 0, 0, 0, 0, 0, 0], 
 [0, 0, 0, 0, 0, 0, 0], 
 [3, 0, 0, 0, 0, 0, 3]]
Here 0=blank (vacant), 1='X', 2='O', and 3='-' (forbidden). For the complete initial state, wrap this in an outer list that includes the boolean value True, indicating that it is X's move: [initialBoard, True].
Updates and Corrections If necessary, updates and corrections will be posted here and mentioned in class or on the mailing list. This page was last updated Oct. 24, 2011 (with an explanation of what the triples represent in Part I).