Project 1
Cool Sound Manipulation!
Due Friday October 19th







I. Introduction

In this project we are going to use the Stack and Queue data structures to do some cool sound manipulation. We are going to use a Stack to reverse a sound wave (so we can hide Satanic messages in our rock songs), and we are going to use a Queue to create a guitar sound (so we can have a rock song). Cool dude! But wait, I don't even know...
 

II. How Digital Sound Works

We will view sound as a continuous function of time from the positive real numbers (time) to the interval [-1.0, 1.0] (amplitude). Since a computer can't "hold" a function defined on the reals, we have to approximate the function. We do this by measuring (or "sampling") the sound several thousand times per second.

This process is called "Audio to Digital Conversion" or ADC. The number of times per second the sound is sampled is called the sample rate and is measured in Hertz. For example CDs are recorded at 44100 samples per second, or 44.1kHz. Wait a minute! Is this the right class? I thought this was CSE326.
 

III. Sox

Yes, this is CSE326. We're now going to show you how to use Stacks and Queues to manipulate digital sound. The only thing that stands between you and your creative potential is a UNIX command-line utility called sox. sox stands for "SOund eXchange" and it allows you convert between many different sound formats including .wav, .au, etc... In particular, sox allows you to convert to and from .dat sound files. .dat files are cool because they are human-readable, text-based, sound files. The general strategy will be this:

  1. Take a .wav sound file of your choosing (e.g. my-mother-singing.wav) This sound shouldn't be longer than a couple seconds, or your program will run out of memory. If you can't find a sound, you can use the first 25 seconds of Break on Through by The Doors.
  2. Convert it to a .dat file: sox my-mother-singing.wav my-mother-singing.dat
  3. Manipulate it with the program you will write: ./reverse my-mother-singing.dat my-mother-singing-backwards.dat
  4. Convert it back to .wav file: sox my-mother-singing.dat my-mother-singing.wav
  5. Listen to it! (You can use any sound player.)
That's all there is to it! But before we get too excited, let's first explain...
 

IV. The .dat File Format

The .dat file format starts with one line describing the sample rate of the sound file. This line is required. The rest of the file is composed of two columns of numbers. The first column consists of the time (measured in seconds) when the sampe was recorded, and the value of the sample, between -1.0 and 1.0. Here is the beginning of a sample .dat file. Notice that the nubmers in the first column increase by 1/44100 each step. This is because the sample rate is 44.1kHz.

; Sample Rate 44100
0             0
2.2675737e-05 0
4.5351474e-05 0
6.8027211e-05 0
9.0702948e-05 0
0.00011337868 0
0.00013605442 0
0.00015873016 0
0.0001814059  0
0.00020408163 0

Here is the same file, a little deeper on:

0.22693878    -0.0062561035
0.22696145    -0.0043945312
0.22698413    -0.0068664551
0.2270068     -0.011566162
0.22702948    -0.014556885
0.22705215    -0.014541626
0.22707483    -0.012191772
0.22709751    -0.012390137
0.22712018    -0.014541626
0.22714286    -0.01449585
0.22716553    -0.014770508
0.22718821    -0.015701294
0.22721088    -0.012954712
0.22723356    -0.012710571
0.22725624    -0.018157959
0.22727891    -0.01914978
0.22730159    -0.014572144
0.22732426    -0.012237549
0.22734694    -0.012435913
0.22736961    -0.010818481
 

V. Reversing a Sound File

Ok. Here's where you come in. You are going to take a .dat sound file, and output another .dat sound file, reversing the numbers in the second column (and only the numbers in the second column). This is going to create the trippy effect of "playing the sound backwards." First, implement a Stack conforming to the Stack.hh definitions. You may implement the stack as you like. Now write a program called reverse that reads in .dat file, pushing the numbers in the second column (amplitude) onto an instance of your Stack, line by line. Then, the program should create a new .dat file. The numbers in the second column will ge generated by popping them off the stack. This will reverse the original sound file. For details on what to turn in for this assignemnt and how, read section VII.
 

VI. Things get Cooler: The Karplus-Strong Algorithm

This time around we are going to create the .dat file from scratch. We are going to synthesize the sound (create the second column of the .dat file) using a method called the "Karplus-Strong" algorithm. This algorithm depends on Queues. So before we get started, why don't you implement a Queue conforming to the definition in Queue.hh.

What we want to do is create two Queues, Q1 and Q2. Into Q1 we will enqueue n random numbers between -1.0 and 1.0. In Q2 we will enqueue a single 0.0. At every stage of the algorithm this is what we do (in pseudo-code):
 

a <- Q1.dequeue();
b <- Q2.dequeue();
c <- 0.99 * average(a, b);
Q1.enqueue(c);
Q2.enqueue(a);
output(c);


... and we repeat this over and over m times. The following block diagram of the algorithm may make what's going on clearer:

This program should be called ks.

Basically, the "idea" behind the algorithm is that Q1 is initiall filled with a "burst" of sound n samples long. We want to keep sending out the same burst of sound over and over again. That's why we enqueue the output back into Q1, creating a "feedback loop." But we want the sound to get softer each time it goes through the loop so we multiply it by that constant 0.99 each time it goes through. Ok. That's fine. Q1 is a big feedback loop through which we are putting an initial burst of sound over and over again, making the sound slightly softer each time, but what is the point of Q2? Well, if you think about it, Q2 makes it so that every time we output a number, it is averaged with the number that came before it. This "smooths out" the guitar string sound over time, reducing the timbre of the sound with each period.

One more question, what are n and m? Well, n is the number of samples in the loop. If S is the sample rate, then S / n is the number of times the sound repeats itself per second. This is the so called "frequency" of the sound, and it is what we hear as pitch. Double the size Q1 and you will halve the frequency, lowering the pitch by one octave. Similarly, halve the size of the Queue and you will increase the pitch of your guitar sound by one octave.  Finally, m is the total number of samples in the output file. This implies that m / S is the number of seconds in the output file.

After you create the .dat file, convert it to a .wav with sox and listen to it! There's a lot of cool stuff to play around with here. Change the size of Q1 (or Q2!). What happens? Change the dampening constant from 0.99 to something else. What happens?

When you are done, turn in a copy of your source code. Your compiled source code should create a file with 5 seconds worth of "guitar sound" at a sample rate of 44.1kHz , at a frequency of 220Hz, and with a dampening constant of 0.99 (as in the pseudo-code example). For details on what to turn in for this assignemnt and how, read section VII.

VII. Logistics for Project 1
 


VIII. Fun Things to Think About
 


IX. Acknowledgements

Adrien Treuille would like to thank his own Data Structures professor, Timothy Snyder, who gave him the same assigment (including a better description of the Karplus-Strong algorithm) not so long ago...