V Lecture 17 — integer and floating point representations
V hw5 reminders
V skeleton code due tomorrow
* mem.h
mem_impl.h
getmem.c
freemem.c
get_mem_stats.c
print_heap.c
check_heap.c
bench.c
makefile
git.log
V put files in tar archive
* tar -cvf hw5.tar FILES…
V casting in freemem
* passed a void* pointing to space in the block after the header
V offset backwards by the size of the header
* (char*) p - sizeof(FreeBlock)
V and cast to a block struct so we can read its members
* (FreeBlock*) ((char*) p - sizeof(FreeBlock))
V in total
* FreeBlock *block = (FreeBlock*) ((char*) p - sizeof(FreeBlock))
// now we can read the size with block->size
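A minimal sketch of the header arithmetic above as a pair of helpers. The `FreeBlock` layout here is hypothetical (the real one lives in mem_impl.h), and `header_of` / `payload_of` are names made up for illustration.

```c
#include <stddef.h>

/* Hypothetical header layout, for illustration only. */
typedef struct FreeBlock {
    size_t size;             /* bytes of payload that follow the header */
    struct FreeBlock *next;  /* next block on the free list */
} FreeBlock;

/* Given the payload pointer handed to the client, step back to the header.
   The cast to char* makes the subtraction count in bytes. */
FreeBlock *header_of(void *p) {
    return (FreeBlock *) ((char *) p - sizeof(FreeBlock));
}

/* Inverse direction: given a header, return the payload that follows it. */
void *payload_of(FreeBlock *block) {
    return (char *) block + sizeof(FreeBlock);
}
```

Subtracting from a `FreeBlock*` directly would step back by whole structs, and arithmetic on `void*` is not standard C, which is why the pointer goes through `char*` first.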
V hexadecimal
V how do we read a number like 42?
* 4 tens + 2 ones
* 452 is 4 hundreds + 5 tens + 2 ones
V written another way
* 452 is 4•10^2 + 5•10^1 + 2•10^0
* this is why our number system is called base 10
V base 10 is nice, but we can use numbers with different bases
V binary is base 2 (binary numbers will be prefixed with 0b)
* 0b1010 is 1•2^3 + 0•2^2 + 1•2^1 + 0•2^0 = 8 + 2 = 10
V hexadecimal is just base 16
* why use base 16?
* conveniently, one hex digit represents exactly 4 bits, so two hex digits make up one byte and everything lines up nicely
V base 16 presents a new problem — we need to be able to represent the numbers 0-15 each with a single digit
* 0-9 are already taken care of
V so we use a-f as 10-15
* counting in hexadecimal: 0x0, 0x1, 0x2, …, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf, 0x10, 0x11, …
V exercise: 0x8 + 0x5
* 0xd (8 + 5 = 13, which is d in hex)
V exercise: 0xb + 0xc
* 0x17 (b + c = 11 + 12 = 23 = 16 + 7 = 0x17)
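The positional reading of hex can be sketched in C as a small parser: each step multiplies the running value by 16 and adds the next digit. `hex_digit_value` and `parse_hex` are hypothetical helper names, and the letter ranges assume an ASCII-compatible character set.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit.
   Assumes 'a'..'f' and 'A'..'F' are contiguous (true in ASCII). */
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse hex digits (without the 0x prefix), stopping at the first
   character that is not a hex digit. No overflow checking. */
unsigned parse_hex(const char *s) {
    unsigned value = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        int d = hex_digit_value(s[i]);
        if (d < 0) break;
        value = value * 16 + (unsigned) d;  /* shift one hex place, add digit */
    }
    return value;
}
```

C also accepts hex literals directly, so the two exercises can be checked with `0x8 + 0x5 == 0xd` and `0xb + 0xc == 0x17`.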
V integer representation
V we need to represent integers with a fixed number of bits
V exercise: how many different values can we represent with 6 bits?
* 64 (each bit can be 0 or 1, so we have 6 bits with 2 values each = 2•2•2•2•2•2 = 2^6 = 64)
* we could represent the integers 0-63 (just let each bit pattern represent that binary integer)
V we could also represent a deck of cards (4 suits, 13 values)
V higher-order 2 bits for the suit (using 4 out of 4 values)
* higher-order means leftmost
V lower-order 4 bits for the value (using 13 out of 16 values)
* lower-order means rightmost
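One way the 6-bit card encoding might look in C, using shifts and masks. The function names and the numbering of suits (0-3) and values (0-12) are assumptions made for this sketch.

```c
/* Pack a card into the low 6 bits of a byte:
   suit (0-3) in the higher-order 2 bits, value (0-12) in the lower-order 4. */
unsigned char pack_card(unsigned suit, unsigned value) {
    return (unsigned char) (((suit & 0x3u) << 4) | (value & 0xFu));
}

/* Recover the suit: shift the lower-order 4 bits away, keep 2 bits. */
unsigned card_suit(unsigned char card) {
    return (card >> 4) & 0x3u;
}

/* Recover the value: mask off everything but the lower-order 4 bits. */
unsigned card_value(unsigned char card) {
    return card & 0xFu;
}
```

For example, suit 2 with value 11 packs to 0b101011 (0x2b).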
* unsigned integers are easy, just use the binary values
* signed integers are tricky
V sign-and-magnitude
* most significant bit (i.e., leftmost bit) indicates sign
* remaining bits indicate magnitude (value)
* given 4 bits, -1 would be 0b1001
* exercise: what would -3 be? 0b1011
V problems:
* 2 representations of 0 (+0 is 0b0000 and -0 is 0b1000)
V arithmetic is clumsy
* 4 - 3 != 4 + (-3)
0b0100 - 0b0011 = 0b0001 = 1
0b0100 + 0b1011 = 0b1111 = -7
V better solution: two’s complement
V instead of giving the sign, the most significant bit (MSB) will have its normal value, but negative weight
* 0b0110 = 0•(-2^3) + 1•2^2 + 1•2^1 + 0•2^0 = 4 + 2 = 6
* 0b1110 = 1•(-2^3) + 1•2^2 + 1•2^1 + 0•2^0 = -8 + 4 + 2 = -2
V much nicer
* all negative integers still have MSB = 1
* single zero
V arithmetic just works
* 4 + (-3)
0b0100 + 0b1101 = 0b10001 = 0b0001 (discard the carry out of the top bit) = 1
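The two 4-bit examples can be checked with a small sketch. It uses one plain 4-bit adder and two ways of reading the result; `sm4_value`, `tc4_value`, and `add4` are hypothetical names introduced here.

```c
/* Read the low 4 bits as sign-and-magnitude:
   bit 3 is the sign, bits 2..0 are the magnitude. */
int sm4_value(unsigned bits) {
    int magnitude = (int) (bits & 0x7u);
    return (bits & 0x8u) ? -magnitude : magnitude;
}

/* Read the low 4 bits as two's complement:
   bit 3 has weight -8, bits 2..0 have their usual weights. */
int tc4_value(unsigned bits) {
    int low = (int) (bits & 0x7u);
    return (bits & 0x8u) ? low - 8 : low;
}

/* Plain 4-bit binary addition; the carry out of bit 3 is discarded. */
unsigned add4(unsigned a, unsigned b) {
    return (a + b) & 0xFu;
}
```

With these, `sm4_value(add4(0x4, 0xB))` gives -7 (the wrong answer for 4 + (-3) in sign-and-magnitude), while `tc4_value(add4(0x4, 0xD))` gives 1.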
V exercise: unsigned value for two’s complement -1 in 8 bits
* 8-bit -1 is 0b11111111 (-1 is always all ones)
* unsigned value is 255 (all ones is always 2^n - 1)
* converting from signed to unsigned (or vice versa) can get you into trouble!
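A short sketch of the kind of trouble meant here. The helper names are made up for illustration, and the fixed-width types assume a platform that provides `int8_t` / `uint8_t`.

```c
#include <stdint.h>

/* Convert an 8-bit signed value to unsigned. The bit pattern is kept,
   so -1 (all ones) becomes 255. */
uint8_t as_unsigned8(int8_t x) {
    return (uint8_t) x;
}

/* When an int is compared against an unsigned, C converts the int to
   unsigned first. The cast below spells out that implicit conversion:
   a negative a turns into a very large unsigned value. */
int is_less_mixed(int a, unsigned b) {
    return (unsigned) a < b;
}
```

So `is_less_mixed(-1, 1u)` is 0 (false), even though -1 < 1 mathematically.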
V floating point representation
V fractional binary numbers work in the same fashion as fractional decimal numbers
* 1.25 = 1•10^0 + 2•10^-1 + 5•10^-2
* 0b1.01 = 1•2^0 + 0•2^-1 + 1•2^-2 = 1 + 1/4 = 1.25
V can have repeating just like decimal
* 1/10 = 0b0.0001100110011[0011]…
V floating point values only represent numbers that can be written x • 2^y
V like scientific notation
* not 0b0.000101 but 0b1.01 • 2^-4
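The fractional place values can be sketched as a small parser for strings such as "1.01". `parse_binary_fraction` is a hypothetical helper; it assumes the input contains only '0', '1', and at most one '.', with no 0b prefix.

```c
/* Value of a binary string such as "1.01".
   Integer digits: multiply by 2 and add the bit.
   Fractional digits: each one is worth half of the previous place. */
double parse_binary_fraction(const char *s) {
    double value = 0.0;
    double weight = 1.0;   /* place value of the most recent fractional digit */
    int seen_point = 0;
    for (; *s != '\0'; s++) {
        if (*s == '.') {
            seen_point = 1;
            continue;
        }
        int bit = *s - '0';
        if (!seen_point) {
            value = value * 2.0 + bit;
        } else {
            weight /= 2.0;
            value += bit * weight;
        }
    }
    return value;
}
```

`parse_binary_fraction("1.01")` is exactly 1.25 and `"0.000101"` is 1.25 / 16. The truncated expansion `"0.0001100110011"` comes out slightly below 0.1, since 1/10 repeats forever in binary.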
V prior to 1985, each line of processors had its own way of doing floating point
* often valued speed over accuracy
V in 1985, IEEE 754 floating point standard established
* now consistent across all processors
V values defined as (-1)^s • M • 2^E
* sign bit s determines if value is negative
* Significand (mantissa) M normally a fractional value in range [1.0,2.0)
* Exponent E weights value by a (possibly negative) power of two
V don’t have time to get into the details, but here are the key points
* for single precision (32 bits), we have s = 1 bit, E = 8 bits, M = 23 bits
* for double precision (64 bits), we have s = 1 bit, E = 11 bits, M = 52 bits
* since we have a finite number of bits, some values will have to be approximated
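A sketch of pulling the three single-precision fields out of a `float`. It assumes `float` is a 32-bit IEEE 754 value (true on most current platforms, but not guaranteed by C); `FloatFields` and `split_float` are names invented for this example. One detail not covered above: the 8-bit exponent field is stored with a bias of 127, so a field value of 127 means 2^0.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits; the leading 1 of M is not stored */
} FloatFields;

/* Assumes float is 32-bit IEEE 754 single precision. */
FloatFields split_float(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the raw bit pattern */
    FloatFields out;
    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}
```

For 1.25f (0b1.01 • 2^0) this gives sign 0, exponent 127, fraction 0x200000. For 0.1f the fraction is 0x4CCCCD: the repeating 0011 pattern, rounded to fit 23 bits.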
V special values
* zero: E == 0, M == 0 (s == 0 gives +0, s == 1 gives -0)
* +∞, -∞: E == all ones, M == 0
* NaN (not a number): E == all ones, M != 0
* special values can pollute numerical computation
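A sketch of building the special values from their bit patterns and watching them spread through arithmetic. It assumes 32-bit IEEE 754 `float` and default compiler settings (no fast-math style flags); the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Build a float from a raw 32-bit pattern (assumes IEEE 754 single precision). */
float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* +infinity: s = 0, E = all ones, M = 0. */
float make_inf(void) {
    return float_from_bits(0x7F800000u);
}

/* One possible NaN: E = all ones, M != 0. */
float make_nan(void) {
    return float_from_bits(0x7FC00000u);
}

/* NaN is the only value that compares unequal to itself. */
int is_nan(float x) {
    return x != x;
}
```

Once a NaN appears it sticks: `make_nan() + 1.0f` is still NaN, and so is `make_inf() - make_inf()`, so one bad intermediate result can end up in every later value.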