Memory contains bits (binary digits).
We identify operands (data to be operated on by the processor) by giving memory addresses - memory is an array.
The unit of addressing is the byte - memory is "byte addressable". An address is the index of a byte.
As well as the address, we need to specify the number of bits in the operand:
It is common for processors to require operand alignment - the address of an operand must be divisible by the operand's length. For example, word operands must be at addresses that are multiples of 4 (because a word is 4 bytes long).
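The alignment rule amounts to a divisibility test. A minimal sketch in C; the helper name is_aligned is hypothetical and not part of any library:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if `address` is a multiple of `operand_size`,
   where `operand_size` is the operand's length in bytes. */
bool is_aligned(uintptr_t address, size_t operand_size)
{
    assert(operand_size != 0);
    return address % operand_size == 0;
}
```

For instance, is_aligned(0x1004, 4) holds, while is_aligned(0x1006, 4) does not, so a word operand could not start at 0x1006.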
Processors may be either big-endian or little-endian, which indicates the byte order - is the byte at the address the high-order or the low-order byte?
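One way to see the byte order of the machine a program runs on is to store a known word and look at the byte at its address. A minimal sketch in C; the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Copy the four bytes of `word` into `out` in increasing address order. */
void bytes_in_memory_order(uint32_t word, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &word, sizeof word);
}

/* Hypothetical helper: true when the byte at the operand's address
   is the low-order byte of the operand. */
bool is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory_order(0x11223344u, b);
    return b[0] == 0x44;
}
```

On a little-endian processor the bytes come out as 44 33 22 11; on a big-endian one, as 11 22 33 44.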
A string of N bits has 2^N different possible values.
N | 2^N | Slang |
---|---|---|
8 | 256 | NA |
10 | 1024 | 1K |
20 | ~1,000,000 | 1M |
30 | ~1,000,000,000 | 1G |
32 | ~4,000,000,000 | 4G |
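The exact values behind the approximate entries in the table can be computed with a shift. A minimal sketch in C; count_values is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values an n-bit string can take, for 0 <= n <= 63. */
uint64_t count_values(unsigned n)
{
    assert(n <= 63);
    return (uint64_t)1 << n;   /* 2^n */
}
```

For example, count_values(20) is 1,048,576 and count_values(32) is 4,294,967,296.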
We often use hexadecimal notation ("hex") to write down long bit strings. Each hex digit represents 4 bits. The hex digits are 0, 1, ..., 9, A, B, C, D, E, and F. Thus 0xFF is a string of eight 1's, 0x10 is 00010000, and 0xfedcba98 is 11111110110111001011101010011000.
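The correspondence between hex and bit strings can be checked mechanically. A minimal sketch in C that turns a string of '0' and '1' characters into a number, which can then be compared with a hex constant; from_bit_string is a hypothetical name, and the sketch assumes at most 32 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Interpret `bits` (only '0' and '1' characters, high-order bit first,
   at most 32 of them) as an unsigned number. */
uint32_t from_bit_string(const char *bits)
{
    uint32_t value = 0;
    for (const char *p = bits; *p != '\0'; p++) {
        assert(*p == '0' || *p == '1');
        value = (value << 1) | (uint32_t)(*p - '0');
    }
    return value;
}
```

With this, from_bit_string("00010000") equals 0x10 and from_bit_string("11111111") equals 0xFF.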
A data encoding is a mapping from bit strings to values in the type of the encoding.
The processor "knows about" a few data encodings. Others are conventions used by the software running on the processor.
The processor can copy bit strings from one address to another. It can also perform logical (bit) operations on them, e.g., AND and XOR. (The result is the bit-wise result of the operation on bits in corresponding positions in the two operands.) It can also test them for equality, and can shift them left or right.
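To illustrate "corresponding positions", here is a sketch in C that computes AND one bit position at a time and compares it with the built-in operators. The function names and the operand values 0xCC and 0xAA are chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* AND computed one bit position at a time. */
uint8_t and_bit_by_bit(uint8_t a, uint8_t b)
{
    uint8_t result = 0;
    for (int pos = 0; pos < 8; pos++) {
        unsigned bit_a = (a >> pos) & 1u;
        unsigned bit_b = (b >> pos) & 1u;
        if (bit_a == 1u && bit_b == 1u)
            result |= (uint8_t)(1u << pos);
    }
    return result;
}

/* The examples written as checks that should hold. */
void check_bit_operations(void)
{
    /* 11001100 AND 10101010 = 10001000 */
    assert(and_bit_by_bit(0xCC, 0xAA) == 0x88);
    assert((0xCC & 0xAA) == 0x88);
    /* 11001100 XOR 10101010 = 01100110 */
    assert((0xCC ^ 0xAA) == 0x66);
    /* 00001111 shifted left by 4 = 11110000 */
    assert((uint8_t)(0x0F << 4) == 0xF0);
}
```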
Signed integers use 2's complement representation. For example, the signed byte corresponding to -2 is 11111110.
N bits can represent integers from -(2^(N-1)) to 2^(N-1)-1. (For example, 8 bits -> -128 to 127, 32 bits -> -2,147,483,648 to 2,147,483,647.)
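The mapping from an N-bit pattern to its 2's complement value can be sketched in C as follows. The function name is hypothetical; the rule used is that a pattern with its high-order bit set stands for its unsigned value minus 2^N:

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low-order `nbits` bits of `bits` as a 2's complement
   integer, for 1 <= nbits <= 32. */
int64_t twos_complement_value(uint32_t bits, int nbits)
{
    assert(nbits >= 1 && nbits <= 32);
    uint64_t mask = ((uint64_t)1 << nbits) - 1;
    int64_t value = (int64_t)(bits & mask);      /* unsigned reading */
    if (value >= ((int64_t)1 << (nbits - 1)))    /* sign bit is set */
        value -= (int64_t)1 << nbits;            /* subtract 2^nbits */
    return value;
}
```

For example, twos_complement_value(0xFE, 8) is -2, and the 8-bit patterns 0x80 and 0x7F give the extremes -128 and 127.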
Operations are arithmetic (add, subtract, etc.), comparison (less-than, equal, and greater-than), and sign-extending shifts. Overflow may occur on arithmetic instructions.
As unsigned integers, N bits can represent the integers from 0 to 2^N - 1. (For example, 32 bits -> 0 to 4,294,967,295.)
Operations are arithmetic, comparison, and zero-extending shifts. While the result of, say, adding one can cause the value to go from very large to zero, the processor does not indicate that overflow has occurred.
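The wrap-around on unsigned addition and the difference between the two kinds of right shift can be sketched in C. The function names are hypothetical. The shifts are written out on 8-bit patterns by hand, because C leaves the result of right-shifting a negative signed value up to the implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned add: the result wraps around modulo 2^32 with no overflow signal. */
uint32_t add_one(uint32_t x)
{
    return x + 1u;
}

/* Zero-extending right shift by one: a 0 enters at the high-order end. */
uint8_t shift_right_zero_extend(uint8_t bits)
{
    return (uint8_t)(bits >> 1);
}

/* Sign-extending right shift by one: the old sign bit is kept
   in the high-order position. */
uint8_t shift_right_sign_extend(uint8_t bits)
{
    return (uint8_t)((bits >> 1) | (bits & 0x80u));
}

void check_shifts(void)
{
    assert(add_one(0xFFFFFFFFu) == 0);
    /* 11111110 is -2 signed, 254 unsigned */
    assert(shift_right_sign_extend(0xFE) == 0xFF);   /* -2 -> -1  */
    assert(shift_right_zero_extend(0xFE) == 0x7F);   /* 254 -> 127 */
}
```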
Floating point numbers divide the available bits into three fields:
Use | Single precision | Double precision |
---|---|---|
Sign | 1 | 1 |
Exponent | 8 | 11 |
Significand | 23 | 52 |
Value is (-1)^S * 2^E * F, where S is the sign bit, E is the exponent, and F is the significand.
Normalize the number so that the significand is 1.xxxxx, then don't store the bit corresponding to the leading 1. (Why?)
Bias the encoding of the exponent so that its smallest value is represented by the bit string 00...0, and successively higher values are obtained by adding 1 to it as an unsigned integer. (Why?)
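The three fields of a single precision number can be pulled apart with shifts and masks. A minimal sketch in C, assuming the platform's float uses the 1/8/23 layout from the table (the IEEE 754 single precision format, whose exponent bias is 127); the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct float_fields {
    unsigned sign;         /* 1 bit */
    unsigned exponent;     /* 8 bits, stored with the bias added */
    uint32_t significand;  /* 23 bits, leading 1 not stored */
};

/* Split a float into its three stored fields. */
struct float_fields split_float(float f)
{
    uint32_t bits;
    struct float_fields r;

    assert(sizeof bits == sizeof f);   /* assumption: float is 32 bits */
    memcpy(&bits, &f, sizeof bits);    /* reinterpret the same 32 bits */

    r.sign = bits >> 31;
    r.exponent = (bits >> 23) & 0xFFu;
    r.significand = bits & 0x7FFFFFu;
    return r;
}
```

Under that assumption, 1.0 splits into sign 0, stored exponent 127 (true exponent 0), and significand bits all zero, since the leading 1 is implied.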
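ASCII | Hex | Symbol |
---|---|---|
48 | 30 | 0 |
49 | 31 | 1 |
50 | 32 | 2 |
51 | 33 | 3 |
52 | 34 | 4 |
53 | 35 | 5 |
54 | 36 | 6 |
55 | 37 | 7 |
56 | 38 | 8 |
57 | 39 | 9 |
58 | 3A | : |
59 | 3B | ; |
60 | 3C | < |
61 | 3D | = |
62 | 3E | > |
63 | 3F | ? |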
1. ASCII 32 (0x20) = Space.
2. ASCII 48 (0x30) = '0'. The decimal digits are consecutive codes.
3. ASCII 65 (0x41) = 'A'. The uppercase letters are consecutive codes.
4. Lowercase codes are uppercase codes + 32 (0x20).
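These properties make some common character manipulations simple arithmetic. A minimal sketch in C, assuming the compiler's character set is ASCII; the function names are hypothetical:

```c
#include <assert.h>

/* Numeric value of a decimal digit character: the digits are
   consecutive codes starting at '0' (48). */
int digit_value(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

/* Lowercase version of an uppercase letter: add 32 (0x20).
   Other characters are returned unchanged. */
char to_lower_ascii(char c)
{
    if (c >= 'A' && c <= 'Z')
        return (char)(c + 32);
    return c;
}
```

For example, digit_value('7') is 7 and to_lower_ascii('A') is 'a'.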
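Unicode allows representation of a much larger set of characters. It was originally designed as a 16-bit encoding, and the "wide characters" used by Microsoft are 16-bit units. The current standard defines codes up to 0x10FFFF, so some characters need more than one 16-bit unit. (See www.unicode.org.)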
Other languages implement strings in other ways. For instance, Pascal strings are an 8-bit integer indicating the string length followed by consecutive 8-bit characters.
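The Pascal layout can be sketched in C by building one from an ordinary C string. The function name is hypothetical, and the sketch assumes the caller supplies a buffer with room for the length byte plus the characters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build a Pascal-style string in `out`: one length byte followed by
   the characters of `text` (no terminating zero byte is stored).
   Returns the total number of bytes written. */
size_t make_pascal_string(const char *text, unsigned char *out)
{
    size_t len = strlen(text);
    assert(len <= 255);            /* the length must fit in 8 bits */
    out[0] = (unsigned char)len;
    memcpy(out + 1, text, len);
    return len + 1;
}
```

For the text "cat", the result is the four bytes 3, 'c', 'a', 't'. The 8-bit length field is what limits such strings to 255 characters.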
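Suppose a variable V is declared as struct { int num; char c; int total; } V; and it is allocated memory starting at address 0x1000. Then V.num would be in bytes 0x1000-0x1003, V.c would be in byte 0x1004, and V.total in bytes 0x1008-0x100B. Bytes 0x1005-0x1007 are left unused so that V.total starts at an address that is a multiple of 4.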
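The layout can be checked with the standard offsetof macro. A minimal sketch in C, assuming a compiler with 4-byte int aligned on 4-byte boundaries; the struct tag record and the helper member_address are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct record {    /* hypothetical tag for the struct in the example */
    int num;
    char c;
    int total;
};

/* Address a member at byte offset `offset` would have if the
   struct began at address `base`. */
uintptr_t member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* Checks that should hold under the stated assumption. */
void check_layout(void)
{
    assert(member_address(0x1000, offsetof(struct record, num)) == 0x1000);
    assert(member_address(0x1000, offsetof(struct record, c)) == 0x1004);
    assert(member_address(0x1000, offsetof(struct record, total)) == 0x1008);
    assert(sizeof(struct record) == 12);   /* includes 3 unused bytes */
}
```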
Each variable in the program is allocated memory locations to hold its current value. The number of bytes allocated depends on the variable type (e.g., one byte for char and four for int or unsigned int).
HLL statements modifying variables are translated into machine instructions that modify the memory locations the variables occupy. For instance, if int x has been allocated bytes 0x1000-0x1003, the statement x++; will be translated into machine instructions that add one to the 32 bits in that memory location, interpreting them as a signed integer as they do so.
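The effect of x++ on memory can be imitated in C as an explicit load, add, and store on the bytes the variable occupies. This is only a sketch of the idea, not the instructions any particular compiler would generate; increment_at is a hypothetical name:

```c
#include <assert.h>
#include <string.h>

/* Add one to the int stored in the bytes starting at `location`. */
void increment_at(unsigned char *location)
{
    int value;
    assert(location != NULL);
    memcpy(&value, location, sizeof value);   /* load the bytes as a signed integer */
    value = value + 1;                        /* add one */
    memcpy(location, &value, sizeof value);   /* store the result back */
}
```

Calling increment_at((unsigned char *)&x) on an int x holding 41 leaves x holding 42.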