CSE logo University of Washington Computer Science & Engineering
 CSEP 590 A Project 1
  CSE Home   About Us    Search    Contact Info 

Project 1 - Arithmetic Coder

Objectives

Overview

Arithmetic coding is a method of generating variable-length codes for lossless compression. It works well with small alphabets (like the binary images you’ll be using) and with alphabets with skewed probabilities. You’ll be writing the integer implementation (see section 4.4.3 of Sayood) and it will be adaptive, so the probabilities of getting a white pixel or a black pixel will be changing as the image is encoded and more information about it becomes known. You’ll also be using contexts. If a pixel has lots of black pixels surrounding it then the probability that it will be black is high. Pixels are put into contexts based on the color of their neighbors to take advantage of that high probability. The more skewed the probabilities are, the better arithmetic encoding works.

Assignment

The assignment has three parts:

1. The code

We’ve provided an arithmetic coder for bitmaps that doesn’t compress yet. The encoder copies the input file to the compressed file and then decoder copies the compressed file into the output file. So when you run the code on an image you’ll notice that the compression rate it prints out is very close to 1 bpp, which is no compression at all. It’s your job to fill in the encode and decode methods (and any helper methods you might need like initializeTag()) in the ArithmeticCoder class. You’ll only be turning in ArithmeticCoder.java so don’t modify anything else!

The Java files for the project

The C# files for the project (written by Frank Fan)

What we've provided:
You’ll only need to look at and understand some of these and they’re all fairly short:

Context Files:
We’ve provided a sample context file for you. When you run the code you need to specify the path to the input image, the compressed file, the context file (optional), and the output file. If you do not specify a context file the default context will be used and it looks like this:
AA
A ?

If this context file is used there will be 2^3 = 8 contexts. B=black W=white
BB BB BW BW WB WB WW WW
B W B W B W B W

You can make your own context files by editing a text file. ContextReader assumes that only 3 symbols are present in the file. A,?, and X. ? is the pixel being encoded. A’s are all the neighbors you want to include in the context. X is used if you want to skip a neighbor. If your context file looks like this: AX? Then the pixel 2 pixels to the left (the A) will be used in the context but NOT the pixel directly to the left (the X). The ? must be the last character in the context file. We’re encoding the pixels in raster order so we only know the color of neighboring pixels that are to the left and above the ? pixel since they’ve already been encoded, so only those pixels can be used in the contexts.

We’ve also provided a few bitmap images for you to test with. There is a page of text, a large picture, and small 9x11 pixel bitmap of the letter “B”. The small one should help with debugging.

Here is an overview of how the code works:
Encoder:

  1. Read image file into a double array of pixels
  2. Read in context file and create an array of all possible contexts
  3. Copy the bitmap header from the image file to the compressed file
  4. Also copy some information about the contexts being used to the compressed file
  5. Iterate through the double array, find each pixel’s context and encode that pixel

Decoder:

  1. Copy the bitmap header from the compressed file to the output file
  2. Read the context information from the compressed file and create the context array
  3. Make a double array of pixels that is initially all 0’s.
  4. Iterate through the double array, find each pixel’s context and decode that pixel
  5. Write the pixel array to the output file.

2. The report

When you have a working arithmetic coder experiment by using different context files Try different shapes and sizes. What happens to the compression rate? Write a 1-2 page report on your findings. Because you are using an adaptive coder, if there are very few pixels in a particular context then there are not enough statistics for the arithmetic coder to do a good job with that context. Thus, if there are too many contexts, then there is a chance that most of the contexts will have only a few pixels in them. This effect is called context dilution. Develop a study that tests whether or nor context dilution occurs with your coder. One product of your study is a plot showing on the x-axis the number of contexts, and on the y-axis the compression ratio.

3. The contest

When you think you’ve found the best context configuration submit your best context file as well. Your coder will be run with that context file on an image and students who get the best compression rate will receive a prize.

4. Reference Compilers

Before you submit your project files, please make sure your code compiles on the following compilers:

  • Java files: Must compile on javac (version 1.6.0_03) on attu.cs.washington.edu
  • C# files: Must compile on Microsoft Visual Studio 2005 on aria.cs.washington.edu
  • Due Date

    The project is due on Wednesday, November 14.  Hand in the report at the beginning of the class on Thursday, November 15. 

  • Java project files submission
  • C# project files submission
  • Links

    For students who wish to program in C: You will need to develop a Java native interface that allows you to call C functions from Java methods.  The following tutorial  explains this:


    CSE logo Computer Science & Engineering
    University of Washington
    Box 352350
    Seattle, WA  98195-2350
    (206) 543-1695 voice, (206) 543-2969  [comments to Rahul (rahulv at u.washington.edu) ]