|
Assignment 4: Advanced Game Agents
|
|
CSE 415: Introduction to Artificial Intelligence The University of Washington, Seattle, Spring 2026 |
|
|
|
Due Monday, May 4, at 11:59 PM. Partnerships are optional for
this assignment but strongly encouraged.
Complete the planning
questionnaire anytime between Wednesday, April 22 and Wednesday night, April 29, for 1 point of participation
credit. Earlier is better. For example, partnering can become more difficult as
time goes on. (See the link to the questionnaire at the bottom of this page.)
MODES OF PLAY: Your agent should support three modes of play:
(A) Demo: Demonstration Mode, in which time is not critical, but the agent should not take longer than 3 seconds per move. The agent will carry on a conversation during the game. Optionally, your code may import an API for an LLM to assist with utterance generation. However, the staff might be unable to run your agent in this mode if the API key is missing from the environment.
(B) Autograder: In this mode, the autograder will enable and disable specific features to test your agent's functionality and assign points.
(C) Competition: In this mode, the conversation will be turned off, except for each agent introducing itself. Timing will be important, because it may be used as a basis for elimination in some or all of the competition rounds. In this mode, your agent must not import any modules that require additional installations beyond the standard Python distribution from Python.org. |
|
Here is the starter code for this assignment. |
| A. Introduction K-in-a-Row In this assignment, you'll create a program that implements an agent that can participate in a game of K-in-a-Row with Forbidden Squares (defined below). Then you'll demonstrate your agent, providing one or more transcripts of the agent playing against another agent. |
PART I: Agent Code to Play K-in-a-Row (20 points).
Your program should consist of a single file with
a name like yourUWNetID_KInARow.py where the
first author's UWNetID is used.
Your program will consist mainly of a subclass OurAgent of KAgent, providing a collection of specific methods for playing games of "K in a Row". We define a K in a Row game as a kind of generalized Tic-Tac-Toe with the following features:
The starter code consists of several files: File you will modify and turn in:
File you will modify and not turn in:
Files you should read and understand:
Other files:
Key Methods to Implement
Some of the functionality you need to implement in your agent is templated or given in simplified form in the starter code. For example:
prepare(self, game_type, what_side_to_play, opponent_nickname,
expected_time_per_move=0.1, utterances_matter=True)
Called by the Game Master before play begins.
Use it to store the game type, your side ('X' or 'O'),
and any other information your agent needs across turns.
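As a rough illustration of the description above, prepare might simply copy its arguments onto the agent. This is a minimal sketch, not the starter code: the KAgent class below is a stand-in stub, the attribute names and the turn counter are hypothetical, and the "OK" return value is an assumption to check against the starter code.

```python
class KAgent:
    """Stand-in stub for the starter code's base class (assumption)."""
    pass


class OurAgent(KAgent):
    def prepare(self, game_type, what_side_to_play, opponent_nickname,
                expected_time_per_move=0.1, utterances_matter=True):
        # Remember everything the agent will need on later turns.
        self.game_type = game_type                 # passed in by the Game Master
        self.my_side = what_side_to_play           # 'X' or 'O'
        self.opponent_nickname = opponent_nickname
        self.time_per_move = expected_time_per_move
        self.utterances_matter = utterances_matter
        self.turn_count = 0                        # hypothetical bookkeeping
        return "OK"  # assumed acknowledgement; check the starter code


# Usage: the Game Master would make a call like this before play begins.
agent = OurAgent()
agent.prepare("some game type", "X", "Opponent")
```

Storing the side to play here lets later methods decide whether the agent is maximizing (X) or minimizing (O) without being told again.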
Your introduce(self) This function should return a multi-line string to introduce your agent. This should include your agent's name (you get to make this up), your name and netid, and anything else you'd like the agent to say at the start of the game.
make_move(self, state, current_utterance, time_limit,
use_alpha_beta=True,
use_zobrist_hashing=False, max_ply=3,
special_static_eval_fn=None)
This function determines which move to make given a game board (state). It is the main entry point into your game playing algorithm, and is called by the Game Master.
If the
You may assume that the In DEMO and COMPETITION modes the method should return
In AUTOGRADER mode, append a statistics dictionary as a
third element:
Your minimax function and alpha-beta pruning will be graded, in part, based on the number of cutoffs they make in a controlled game-tree search. In AUTOGRADER mode, your agent must use the autograder's "special" static evaluation function. In this mode, your move generator should consider moves in a specific order: the same order in which the RandomPlayer agent generates them (not the order in which it selects them, which is where the randomness comes in). Also in AUTOGRADER mode, it is important to respect the max_ply value that is passed in. The parameter use_zobrist_hashing can be used by the
autograder to specify whether this optionally implemented feature
is turned on or off, for a comparative test of its
effects. If you implement Zobrist hashing and it works well, you
should have it on by default and, whether it is turned on or off, return
valid statistics. If you do not implement Zobrist hashing, your agent
can ignore the parameter and just return the default stats (-1, for
example) that are specified in the agent_base starter code.
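For readers who choose to implement the optional Zobrist hashing feature, the core idea can be sketched as follows. This is an illustration under assumptions, not the starter code's interface: the board is taken to be a list of lists holding 'X', 'O', or other characters for empty and forbidden squares, and the function names are hypothetical.

```python
import random


def make_zobrist_table(n_rows, n_cols, pieces=("X", "O"), seed=0):
    """One random 64-bit number per (row, column, piece) combination."""
    rng = random.Random(seed)
    return {(r, c, p): rng.getrandbits(64)
            for r in range(n_rows) for c in range(n_cols) for p in pieces}


def zobrist_hash(board, table):
    """Hash a whole board by XOR-ing the numbers of all occupied squares."""
    h = 0
    for r, row in enumerate(board):
        for c, cell in enumerate(row):
            if (r, c, cell) in table:   # empty/forbidden squares contribute nothing
                h ^= table[(r, c, cell)]
    return h


def update_hash(h, table, r, c, piece):
    """Incremental update: XOR is its own inverse, so the same call
    both places a piece on a square and removes it again."""
    return h ^ table[(r, c, piece)]


# Usage: the incremental hash agrees with a full recomputation.
table = make_zobrist_table(3, 3)
board = [[" ", " ", " "], [" ", "X", " "], [" ", " ", " "]]
h = update_hash(0, table, 1, 1, "X")
print(h == zobrist_hash(board, table))
```

The hash can then serve as a dictionary key for caching evaluated states, and counts of cache writes and hits are the kind of numbers a statistics report could draw on.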
minimax(self, state, depth_remaining, time_limit, alpha, beta)
This function should be called by
The parameters The minimax function must return a move and a float value that corresponds to the evaluation of that move.
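The following sketch shows one way minimax with alpha-beta pruning can return a (move, value) pair while counting cutoffs. It is not the required method: the signature differs from the assignment's (it takes hypothetical successors and static_eval callbacks and a stats dictionary, and omits self and time_limit) so that it can run on a toy tree of nested lists.

```python
import math


def minimax(state, depth_remaining, alpha, beta, maximizing,
            successors, static_eval, stats):
    """Return (best_move, value); X-style maximizer when maximizing is True."""
    moves = successors(state)
    if depth_remaining == 0 or not moves:
        stats["static_evals"] += 1
        return None, float(static_eval(state))
    best_move = None
    best_val = -math.inf if maximizing else math.inf
    for move, child in moves:
        _, val = minimax(child, depth_remaining - 1, alpha, beta,
                         not maximizing, successors, static_eval, stats)
        if maximizing and val > best_val:
            best_val, best_move = val, move
            alpha = max(alpha, best_val)
        elif not maximizing and val < best_val:
            best_val, best_move = val, move
            beta = min(beta, best_val)
        if alpha >= beta:          # remaining siblings cannot change the result
            stats["cutoffs"] += 1
            break
    return best_move, best_val


# Toy game tree: inner lists are positions, numbers are leaf evaluations.
def toy_successors(node):
    return list(enumerate(node)) if isinstance(node, list) else []


def toy_eval(node):
    return node if not isinstance(node, list) else 0


tree = [[3, 5], [2, 9], [0, 1]]
stats = {"static_evals": 0, "cutoffs": 0}
move, value = minimax(tree, 2, -math.inf, math.inf, True,
                      toy_successors, toy_eval, stats)
print(move, value, stats)
```

Because the number of cutoffs depends on the order in which children are visited, a fixed move order matters whenever cutoff counts are compared.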
This is your static evaluation function. Given a game state, it should return a numeric evaluation of that state where a higher value means that the state is better for X and a lower value is a state better for O. Note that this is regardless of which piece your agent is playing. For this assignment we define that X will always be the maximizing player and O always the minimizing player. In normal game-play, your agent will know what type of game is being played, because
the game master will provide that information via the call to your agent's Your static evaluation function should be non-trivial and should return different evaluations for states that are clearly different. We will grade based on comparing your evaluations across states that are easily distinguished by a human as significantly better or worse, so don't worry too much about determining small differences between mostly even states. The amount of work done by your static evaluation function should depend on the type of game being played, which will be stored as a property of your agent. The game type has the value of k, which is the number of tokens in a line required to win the game. Your static evaluation function will probably need to make use of this value k. For example, if there is a line of k-1 Xs and O has not blocked the completion of k in a row, then this state should probably have a very high value. We provide little guidance for this function beyond the lecture slides. This will be the main opportunity for you to set your agent apart from the rest of the class for the tournament, so making sure that this function gives good results and runs quickly will go a long way towards improving your agent's performance. When a call to make_move happens in AUTOGRADER mode, your code should use the special static evaluation function given in the call instead of your own function. When not in AUTOGRADER mode, no special static evaluation function will be provided in the call.
For our testing, you may assume that the |
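One simple way to act on the k-1 example above is to score every window of k consecutive squares. The sketch below is only an illustration and rests on assumptions: the board is a list of lists containing 'X', 'O', ' ' for an empty square and '-' for a forbidden square, and the powers-of-ten weighting is an arbitrary choice, not something the assignment prescribes.

```python
def static_eval(board, k):
    """Positive values favor X, negative values favor O."""
    n_rows, n_cols = len(board), len(board[0])
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]  # row, column, two diagonals
    score = 0
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in directions:
                end_r, end_c = r + (k - 1) * dr, c + (k - 1) * dc
                if not (0 <= end_r < n_rows and 0 <= end_c < n_cols):
                    continue  # window would run off the board
                window = [board[r + i * dr][c + i * dc] for i in range(k)]
                if "-" in window:
                    continue  # a forbidden square makes this line unwinnable
                xs, os_ = window.count("X"), window.count("O")
                if xs and not os_:
                    score += 10 ** xs   # line still open for X
                elif os_ and not xs:
                    score -= 10 ** os_  # line still open for O
    return score


# Usage: an unblocked pair of Xs should score higher than a blocked pair.
open_pair = [["X", "X", " "], [" ", " ", " "], [" ", " ", " "]]
blocked_pair = [["X", "X", "O"], [" ", " ", " "], [" ", " ", " "]]
print(static_eval(open_pair, 3) > static_eval(blocked_pair, 3))
```

Windows that contain both X and O contribute nothing, which is what makes a blocked line worth less than an open one.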
| PART II: Agent Code for Dialog (20 points). |
| Utterances:
Each player participates not only in the game being played, but also in a conversation that takes place at the same time. In each turn, the player returns both a move (and corresponding new game state) and a text string we call the utterance. The utterance should be "in character" with your agent's persona. Ideally, it is also somewhat intelligent in that it represents comments that are appropriate. The utterances may exhibit any subset of the following qualities: persona-specific, game-specific, game-state-specific, funny, serious, observant, didactic (teaching something about the agent technology or search techniques), or responsive to the utterances of the opponent. We expect each team or individual to make some effort to incorporate interesting technology or creativity into the production of the utterances. Possibilities include: (a) using an LLM with well-designed prompting to achieve both the desired persona and appropriate comments on the progress of the game; (b) designing a set of IF-THEN rules that trigger utterance patterns based on current conditions in the game or conversation; (c) instrumenting the game search process (numbers of states evaluated, alpha/beta cutoffs, counts of reuses of hashed state values using Zobrist hashing, etc.) as the basis for insightful comments on the game. Describe how you designed your utterance-related features in your Learning Diary entry. Any of these approaches qualifies for the base Part II grade. The LLM-based approach is listed separately as an extra-credit option because of the additional effort involved in designing effective prompts and handling API integration. |
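Options (b) and (c) above can be combined in a few lines. The sketch below is purely illustrative: the function name, its parameters, the thresholds, and the canned remarks are all hypothetical, and it assumes evaluations follow the convention that higher values favor X.

```python
def choose_utterance(eval_before, eval_after, my_side, cutoffs,
                     opponent_remark=""):
    """Pick a remark from IF-THEN rules over game and search conditions."""
    # Orient the change in evaluation so that positive means "good for me".
    sign = 1 if my_side == "X" else -1
    gain = sign * (eval_after - eval_before)

    # Rule 1: respond to what the opponent just said.
    if "luck" in opponent_remark.lower():
        return "Luck? That was alpha-beta pruning at work."
    # Rules 2 and 3: comment on how the position changed.
    if gain > 50:
        return "That move improved my position considerably."
    if gain < -50:
        return "Hmm, I may come to regret that one."
    # Rule 4: didactic comment based on search instrumentation.
    if cutoffs > 0:
        return f"I pruned {cutoffs} branches while thinking about that."
    return "A quiet move. Your turn."


# Usage: a quiet position where the search made three cutoffs.
print(choose_utterance(0, 0, "X", 3))
```

Ordering the rules from most to least specific keeps the agent from repeating the generic fallback when something more interesting can be said.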
| PART III: Written Component (60 points). |
| HTML Transcript |
|
Create an HTML transcript of a match between your agent and that of another person or partnership in the class. (Do not have your agent play against itself for this transcript, and do not have it play against an agent from the starter code.) It's easy to create the transcript. Simply check that the variable UseHTML is set to True in Game_Master_Offline.py. Then whatever agents and game it runs will result in an HTML file being created. The title will be generated in a way that identifies the agents and the game type. Please convert the .html file to .pdf for submission. The game type for this transcript should be Five-in-a-Row On Seven-by-Seven Board with Corners Forbidden. The transcript will be one of the inputs used in grading the dialog ability of your agent. |
| Running a Game
You will normally use the file |
| Learning Diary Entry
Complete the
A4 Learning Diary Entry, using the
special template (included in the starter code),
and submit it as |
| Extra Credit Options
There are two ways to get extra credit in Part I and three ways to get extra credit in Part II. There are four more ways to obtain extra credit during the tournament: making the top 50 percent of agents, making it into the top 5, and being the best agent in terms of playing the game. Agents developed with gen-AI assistance are fully eligible for the tournament. Tournament performance reflects the quality of your design decisions (choice of static evaluation features, search depth management, and prompting strategy), not the authorship of the code. Note that if there are too many tied games in any round using a particular time limit per move, then the round will be repeated with a shorter time limit, so that the stronger agents get a better chance to show their strength. We will also award 5 points of extra credit to 5 of the agents that, in the opinion of the staff, are the best in terms of dialog, with features such as relevance of comments to the game, including possible comments on technical issues during the search, clarity of persona, degree of apparent engagement in the dialog with the opponent, etc. Each item is worth 5 points, so it's conceivable that some agent could not only win the tournament, but also score in all the other extra credit items, for a whopping 45 points of extra credit. If you do any extra credit items, document them in Section 11 (Extra Credit) of your Learning Diary entry, identifying what options you completed and describing them.
in Part I:
in Part II:
|
| What to Turn In.
Turn in the following files on Gradescope:
|
| For 1 point of participation credit, fill out the A4 planning questionnaire anytime between Wednesday, April 22 and Wednesday night (midnight) April 29. It's here. |
| Updates and Corrections
If necessary, updates and corrections will be posted here and/or mentioned in class or in
ED.
|