A Mini Research Project for CSE 370 Course

Done By:

*

Brian Hoxie                   Alexander Cho                Hamed Esfahani

Q:  How would the string "CSE370" be represented at the physical level on a CD-ROM?

Introduction

        In the new world of computers, the CD-ROM has become a common medium for storing digital information.  We routinely use CD-ROMs to store and retrieve data, and most of us have found several CDs in our mailbox from time to time.  One question that arises is: how is data stored on a CD-ROM?
Before looking at the intricate details of a CD-ROM, let's begin by going over the basic function of a CD-ROM player and how it reads digital data.
 

The Disk

    The physical layout of a CD-ROM is quite intricate.  Figure 1 shows a cross section of a CD-ROM.  The first thing one notices is the bumps on the aluminum surface.  These bumps are where the digital data is stored in binary form.  The areas where the aluminum is closer to the bottom (plastic) are called lands, and the opposite bumps are called pits.  Figure 2 shows the aluminum surface in detail, with the relative sizes of the pits and lands.  The features used to store the data are incredibly small (a micron is a millionth of a meter!).  To read data from such small bumps, the drive uses a precisely focused laser.
 
 

    Figure 1.          Figure 2.
 
 
 
 

Reading a CD.

    Data on a CD-ROM is read by spinning the disk and bouncing a laser off the aluminum layer.  When the laser light hits a land, it is reflected through a prism onto a detector and registered as a voltage.  When the laser hits a pit, most of the light is scattered and does not return to the detector, so no voltage is registered.  A change between a pit and a land is read as a 1, and no change is read as a 0.  In this way, data is represented as 1's and 0's, or binary numbers.
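The reading rule above (a pit/land change is a 1, no change is a 0) can be sketched in a few lines of Python.  This is only an illustration: the function name and the 'P'/'L' notation are our own assumptions, with one letter standing for one bit-length of track.

```python
def surface_to_channel_bits(surface: str) -> str:
    """Turn a pit/land sequence into the bits a detector would report.

    `surface` uses 'P' for pit and 'L' for land (notation assumed for
    this sketch).  Each position is compared with the one before it:
    a change is read as 1, no change is read as 0.
    """
    bits = []
    for previous, current in zip(surface, surface[1:]):
        bits.append("1" if previous != current else "0")
    return "".join(bits)


# Three lands, four pits, five lands, three pits.
print(surface_to_channel_bits("LLLPPPPLLLLLPPP"))  # 00100010000100
```

Note that the length of a pit or land carries the 0's, while only its edges carry the 1's.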

    CD-ROM drive speeds are commonly quoted as 10X, 20X, etc.  The speed at which the disk spins depends on the location of the data being read.  Like a hard drive, the disk is divided into sectors, which we discuss in the next section.  Unlike a hard drive, however, a CD-ROM stores its data on one continuous track (5 km long!) (Figure 3).  Using Constant Linear Velocity, the drive varies its rotation speed so that data can be read from different areas of the disk without wasting space as a hard drive does (Figure 4).  Some drives now use the spin technique of a hard drive (Constant Angular Velocity) because the difference between the 45X at the outer edge of the disk and the 50X at the inner is negligible and such drives are easier to manufacture.
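To see why Constant Linear Velocity forces the rotation speed to change, here is a rough Python sketch.  The numbers are assumptions and do not come from the text above: a 1X linear velocity of about 1.3 m/s, and a data area running from a radius of about 25 mm to about 58 mm.

```python
import math


def rpm_for_clv(linear_velocity_m_s: float, radius_m: float) -> float:
    """Revolutions per minute needed so the track at `radius_m`
    passes under the laser at `linear_velocity_m_s`."""
    angular_velocity = linear_velocity_m_s / radius_m  # radians per second
    return angular_velocity * 60 / (2 * math.pi)


LINEAR_VELOCITY = 1.3  # m/s, assumed 1X value
INNER_RADIUS = 0.025   # m, assumed start of the data area
OUTER_RADIUS = 0.058   # m, assumed end of the data area

print(f"inner edge: {rpm_for_clv(LINEAR_VELOCITY, INNER_RADIUS):.0f} rpm")
print(f"outer edge: {rpm_for_clv(LINEAR_VELOCITY, OUTER_RADIUS):.0f} rpm")
```

Under these assumptions the disk spins more than twice as fast when reading near the center as when reading near the edge.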

    Figure 3.          Figure 4.
 

Click on this link to view a great animation of this process of reading the disk. Also, the figures below demonstrate this.

Data Format

     The data format of a CD-ROM is where things get a bit complicated.  Most common CDs are encoded using the ISO 9660 format, an industry standard that allows CD-ROMs to be used on different machines and operating systems without installing special software or needing additional players.  First, let's look at how a CD is broken up at the smallest level.  The first level is called a channel.  This is a small section of the CD that stores 8 bits of user data and is 17 bits in total size.  The 8 bits of data are converted into a 14-bit representation using EFM, or Eight-to-Fourteen Modulation.  The extra bits are needed because of the limited resolution of a CD-ROM drive's optics.  A string of consecutive 1's would have to be represented by changing from pit to land and back at every bit, which is too fine for the optics to resolve.  Converting from 8 bits to 14 bits with EFM solves this problem.  The 17-bit total comes from 3 additional merge bits (0's) added to the end of the 14 bits so that two 14-bit symbols can follow one another.  This prevents a symbol that ends with a 1 and a following symbol that starts with a 1 from being merged together.
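The effect of the merge bits can be checked with a small Python sketch.  The two 14-bit patterns below are hypothetical examples picked so that one ends with a 1 and the other starts with a 1; they are not looked up in the EFM table.  Compact disc descriptions usually state the rule as "at least two 0's between any two 1's"; that figure is our assumption and is not from the text above.

```python
from typing import Optional

MERGE_BITS = "000"  # the three merge bits (0's) described above


def min_zeros_between_ones(channel_bits: str) -> Optional[int]:
    """Smallest number of 0's separating two neighbouring 1's,
    or None if the string has fewer than two 1's."""
    positions = [i for i, bit in enumerate(channel_bits) if bit == "1"]
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:])]
    return min(gaps) if gaps else None


# Hypothetical 14-bit symbols (illustration only, not real EFM codewords).
symbol_a = "00100100000001"  # ends with a 1
symbol_b = "10000100000000"  # starts with a 1

print(min_zeros_between_ones(symbol_a + symbol_b))               # 0
print(min_zeros_between_ones(symbol_a + MERGE_BITS + symbol_b))  # 2
```

Without the merge bits the two 1's touch (a gap of 0); with them, every pair of 1's is separated by at least two 0's.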
    The next level is called a frame.  Each frame consists of 588 channel bits and stores 24 bytes of user data.  A typical frame is shown in Figure 4.
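As a sanity check on the 588 figure, the Python sketch below adds up one commonly described breakdown of a frame.  Only the 17 bits per symbol, the 24 user bytes, and the 588 total come from the text above; the sync pattern size, the single subcode symbol, and the 8 parity symbols are assumptions we bring in from general descriptions of the CD frame.

```python
CHANNEL_BITS_PER_SYMBOL = 14 + 3  # EFM symbol plus merge bits (from the text)
USER_BYTES_PER_FRAME = 24         # from the text
FRAME_CHANNEL_BITS = 588          # from the text

# Assumed breakdown, not stated in the text above:
SYNC_BITS = 24 + 3                # sync pattern plus its own merge bits
SUBCODE_SYMBOLS = 1               # one control/subcode symbol per frame
PARITY_SYMBOLS = 8                # error-correction parity symbols

symbols = SUBCODE_SYMBOLS + USER_BYTES_PER_FRAME + PARITY_SYMBOLS
total_bits = SYNC_BITS + symbols * CHANNEL_BITS_PER_SYMBOL

print(symbols)     # 33
print(total_bits)  # 588
print(total_bits == FRAME_CHANNEL_BITS)  # True
```

Under these assumptions only 192 of the 588 channel bits (24 bytes times 8 bits) are user data; the rest is overhead for modulation, synchronization, and error correction.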

 

After that comes the largest level, the sector.  Each sector in Mode 1 (the format used to encode data, as opposed to Mode 2 for audio) holds 2352 bytes in total, of which 2048 bytes are user data.  A visual representation of a sector (block) is shown in Figure 5.
 
 

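+----+---------+----+-----+-----+--------+------+------------+-------------+
| 00 | FF x 10 | 00 | MIN | SEC | SECTOR | MODE |    DATA    | LAYERED ECC |
+----+---------+----+-----+-----+--------+------+------------+-------------+
| 12 bytes (synch)  |       4 bytes (ID)        | 2048 bytes |  288 bytes  |
+-------------------+---------------------------+------------+-------------+
<------------------------------- 2352 bytes ------------------------------->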
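The sector layout above can be written down as a short Python sketch that adds up the field sizes and shows where each field begins and ends.  The names used here (SECTOR_FIELDS, field_ranges) are our own and are only for illustration.

```python
# Field sizes in bytes, in the order given by the sector layout above.
SECTOR_FIELDS = [
    ("sync", 12),
    ("id", 4),             # MIN, SEC, SECTOR, MODE: one byte each
    ("data", 2048),
    ("layered_ecc", 288),
]

# Sync pattern from the layout: 00, then FF ten times, then 00.
SYNC_PATTERN = b"\x00" + b"\xff" * 10 + b"\x00"


def field_ranges(fields):
    """Return ({name: (start, end)}, total_size), with end exclusive."""
    ranges = {}
    position = 0
    for name, size in fields:
        ranges[name] = (position, position + size)
        position += size
    return ranges, position


ranges, total = field_ranges(SECTOR_FIELDS)
for name, (start, end) in ranges.items():
    print(f"{name:12s} bytes {start:4d} to {end - 1:4d}")
print("total bytes:", total)              # 2352
print("frames per sector:", total // 24)  # 98, at 24 user bytes per frame
```

Our six-character string would occupy only the first 6 of the 2048 bytes in the data field, starting at byte 16 of the sector.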

 
 

Encoding "CSE370" on a CD-ROM

    In order to convert our string into digital data, we must first convert the letters and numbers into a binary representation.  We will use the ASCII (American Standard Code for Information Interchange) system to encode the string.  The following codes were used for the conversion.

     Character                        ASCII Code
    C       :       0100 0011
    S       :       0101 0011
    E       :       0100 0101
    3       :       0011 0011
    7       :       0011 0111
    0       :       0011 0000
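The table above can be reproduced with a few lines of Python.  This is just a quick check of the conversion; the function name ascii_bits is our own.

```python
def ascii_bits(text):
    """Return the 8-bit ASCII code of each character as 'xxxx xxxx'."""
    rows = []
    for ch in text:
        byte = format(ord(ch), "08b")  # 8 binary digits, zero padded
        rows.append(f"{byte[:4]} {byte[4:]}")
    return rows


for ch, bits in zip("CSE370", ascii_bits("CSE370")):
    print(f"    {ch}       :       {bits}")
```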

Now that we have a binary representation, we need to use EFM to convert our ASCII bits into channel bits that a CD-ROM drive can read.  Using the EFM encoding table below, we obtain the following binary representations for our ASCII codes.


Note that this is only the "Data" section of a typical sector or block on a CD-ROM.
 
 
 

References:

http://www.disctronics.co.uk/cdref/cd-rom/iso9660.htm
http://www.usbyte.com/common/compact_disk.htm
http://whatis.techtarget.com/definition/0,,sid9_gci211759,00.html
http://www.site.uottawa.ca/~lucia/courses/2131/notes/lect06.pdf
http://www.angelfire.com/al/freemb/images/efm1.gif
http://www.angelfire.com/al/freemb/images/efm2.gif
http://www.angelfire.com/al/freemb/images/efm3.gif

http://www.howstuffworks.com/cd.htm
White, Ron. How Computers Work.  1997. Ziff-Davis Press, pp. 192-5.
 
 
*  The word "BAH" simply means GOOD or NICE in Persian.  So, please don't think that our group name is completely meaningless or in any way uncreative.