|
CSE Home | About Us | Search | Contact Info |
Arithmetic coding is a method of generating variable-length codes for lossless compression. It works well with small alphabets (like the binary images you’ll be using) and with alphabets with skewed probabilities. You’ll be writing the integer implementation (see section 4.4.3 of Sayood) and it will be adaptive, so the probabilities of getting a white pixel or a black pixel will be changing as the image is encoded and more information about it becomes known. You’ll also be using contexts. If a pixel has lots of black pixels surrounding it then the probability that it will be black is high. Pixels are put into contexts based on the color of their neighbors to take advantage of that high probability. The more skewed the probabilities are, the better arithmetic encoding works.
We’ve provided an arithmetic coder for bitmaps that doesn’t compress yet. The encoder copies the input file to the compressed file and then decoder copies the compressed file into the output file. So when you run the code on an image you’ll notice that the compression rate it prints out is very close to 1 bpp, which is no compression at all. It’s your job to fill in the encode and decode methods (and any helper methods you might need like initializeTag()) in the ArithmeticCoder class. You’ll only be turning in ArithmeticCoder.java so don’t modify anything else!
The Java files for the project
The C# files for the project (written by Frank Fan)
What we've provided:
You’ll only need to look at and understand some of these and they’re all fairly
short:
Context Files:
We’ve provided a sample context file for you. When you run the code you need to
specify the path to the input image, the compressed file, the context file
(optional), and the output file. If you do not specify a context file the
default context will be used and it looks like this:
AA
A ?
If this context file is used there will be 2^3 = 8 contexts. B=black W=white
BB | BB | BW | BW | WB | WB | WW | WW |
B | W | B | W | B | W | B | W |
You can make your own context files by editing a text file. ContextReader assumes that only 3 symbols are present in the file. A,?, and X. ? is the pixel being encoded. A’s are all the neighbors you want to include in the context. X is used if you want to skip a neighbor. If your context file looks like this: AX? Then the pixel 2 pixels to the left (the A) will be used in the context but NOT the pixel directly to the left (the X). The ? must be the last character in the context file. We’re encoding the pixels in raster order so we only know the color of neighboring pixels that are to the left and above the ? pixel since they’ve already been encoded, so only those pixels can be used in the contexts.
We’ve also provided a few bitmap images for you to test with. There is a page of text, a large picture, and small 9x11 pixel bitmap of the letter “B”. The small one should help with debugging.
Here is an overview of how the code works:
Encoder:
Decoder:
When you have a working arithmetic coder experiment by using different context files Try different shapes and sizes. What happens to the compression rate? Write a 1-2 page report on your findings. Because you are using an adaptive coder, if there are very few pixels in a particular context then there are not enough statistics for the arithmetic coder to do a good job with that context. Thus, if there are too many contexts, then there is a chance that most of the contexts will have only a few pixels in them. This effect is called context dilution. Develop a study that tests whether or nor context dilution occurs with your coder. One product of your study is a plot showing on the x-axis the number of contexts, and on the y-axis the compression ratio.
The project is due on Wednesday, November 14. Hand in the report at the beginning of the class on Thursday, November 15.
For students who wish to program in C: You will need to develop a Java native interface that allows you to call C functions from Java methods. The following tutorial explains this:
http://www.nag.co.uk/IndustryArticles/CallingCLibraryRoutinesfromJava.pdf
http://www.nag.com/doc/TechRep/html/Tr3_00/tr3_00.html
|
|
Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 [comments to Rahul (rahulv at u.washington.edu) ] |