CSE logo University of Washington Computer Science & Engineering
 490G Project 1
  CSE Home   About Us    Search    Contact Info 

Project 1 - Arithmetic Coder

Objectives

Overview

Arithmetic coding is a method of generating variable-length codes for lossless compression. It works well with small alphabets (like the binary images you’ll be using) and with alphabets with skewed probabilities. You’ll be writing the integer implementation (see section 4.4.3 of Sayood) and it will be adaptive, so the probabilities of getting a white pixel or a black pixel will be changing as the image is encoded and more information about it becomes known. You’ll also be using contexts. If a pixel has lots of black pixels surrounding it then the probability that it will be black is high. Pixels are put into contexts based on the color of their neighbors to take advantage of that high probability. The more skewed the probabilities are, the better arithmetic encoding works.

Assignment

You will be working on this assignemnt in teams of two. The assignment has three parts:

1. The code

We’ve provided an arithmetic coder for bitmaps that doesn’t compress yet. The encoder copies the input file to the compressed file and then decoder copies the compressed file into the output file. So when you run the code on an image you’ll notice that the compression rate it prints out is very close to 1 bpp, which is no compression at all. It’s your job to fill in the encode and decode methods (and any helper methods you might need like initializeTag()) in the ArithmeticCoder class. You’ll only be turning in ArithmeticCoder.java so don’t modify anything else!

The files you need

What we've provided:
You’ll only need to look at and understand some of these and they’re all fairly short:

Context Files:
We’ve provided a sample context file for you. When you run the code you need to specify the path to the input image, the compressed file, the context file (optional), and the output file. If you do not specify a context file the default context will be used and it looks like this:
AA
A ?

If this context file is used there will be 2^3 = 8 contexts. B=black W=white
BBBBBWBWWBWBWW WW
BWBWBWBW

You can make your own context files by editing a text file. ContextReader assumes that only 3 symbols are present in the file. A,?, and X. ? is the pixel being encoded. A’s are all the neighbors you want to include in the context. X is used if you want to skip a neighbor. If your context file looks like this: AX? Then the pixel 2 pixels to the left (the A) will be used in the context but NOT the pixel directly to the left (the X). The ? must be the last character in the context file. We’re encoding the pixels in raster order so we only know the color of neighboring pixels that are to the left and above the ? pixel since they’ve already been encoded, so only those pixels can be used in the contexts.

We’ve also provided a few bitmap images for you to test with. There is a page of text, a large picture, and small 9x11 pixel bitmap of the letter “B”. The small one should help with debugging.

Here is an overview of how the code works:
Encoder:

  1. Read image file into a double array of pixels
  2. Read in context file and create an array of all possible contexts
  3. Copy the bitmap header from the image file to the compressed file
  4. Also copy some information about the contexts being used to the compressed file
  5. Iterate through the double array, find each pixel’s context and encode that pixel

Decoder:

  1. Copy the bitmap header from the compressed file to the output file
  2. Read the context information from the compressed file and create the context array
  3. Make a double array of pixels that is initially all 0’s.
  4. Iterate through the double array, find each pixel’s context and decode that pixel
  5. Write the pixel array to the output file.

2. The report

When you have a working arithmetic coder experiment by using different context files Try different shapes and sizes. What happens to the compression rate? Write a 1-2 page report on your findings. Because you are using an adaptive coder, if there are very few pixels in a particular context then there are not enough statistics for the arithmetic coder to do a good job with that context. Thus, if there are too many contexts, then there is a chance that most of the contexts will have only a few pixels in them. This effect is called context dilution. Develop a study that tests whether or nor context dilution occurs with your coder. One product of your study is a plot showing on the x-axis the number of contexts, and on the y-axis the compression ratio.

3. The contest

When you think you’ve found the best context configuration submit your best context file as well. Your coder will be run with that context file on an image and the group with the best compression rate will receive a prize.

Due Date

The project is due at 11 pm on Monday, February 6. Go to the turn-in page to electronically send us ArithmeticCoder.java and your best context file. Please enter A as your quiz section. Hand in the written report at the beginning of class on Wednesday, February 8.


CSE logo Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to Jenny Liu]