#### x86-64 Programming I

CSE 351 Spring 2022 Instructor:

**Ruth Anderson** 

#### **Teaching Assistants:**

Melissa Birchfield, Jacob Christy, Alena Dickmann, **Kyrie Dowling**, Ellis Haker, Maggie Jiang, Diya Joy, Anirudh Kumar, Jim Limprasert, Armin Magness, Hamsa Shankar, Dara Stotland, **Jeffery Tian**, Assaf Vayner, Tom Wu, Angela Xu, Effie Zheng



http://xkcd.com/409/

#### **Relevant Course Information**

- hw6 due TONIGHT (4/13) @ 11:59 pm
- Lab 1a <u>closes</u> TONIGHT (4/13) @ 11:59 pm
  - Submit pointer.c and lab1Asynthesis.txt
  - Make sure you check the Gradescope autograder output!
  - Can use late day tokens to submit up until Wed 11:59 pm
- Lab 1b, due Monday 4/18 at 11:59pm
  - No major programming restrictions, but should avoid magic numbers by using C macros (#define)
  - For debugging, can use provided utility functions print\_binary\_short() and print\_binary\_long()
  - Pay attention to the output of aisle\_test and store\_test – failed tests will show you actual vs. expected

#### **Reading Review**

- Terminology:
  - Instruction Set Architecture (ISA): CISC vs. RISC
  - Instructions: data transfer, arithmetic/logical, control flow
    - Size specifiers: b, w, l, q
  - Operands: immediates, registers, memory
    - Memory operand: displacement, base register, index register, scale factor
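
The four parts of a memory operand combine into a single address. As a rough sketch in C, the general form D(Rb, Ri, S) refers to the address Reg[Rb] + Reg[Ri]*S + D. The function name `effective_address` and its parameter names are made up for illustration and do not come from the slides.

```c
#include <stdint.h>

/* Hypothetical helper: models how the address of a memory operand
 * written as D(Rb, Ri, S) is computed.
 * base and index stand for the current values of the two registers;
 * on x86-64 the scale factor must be 1, 2, 4, or 8. */
uint64_t effective_address(int64_t displacement, uint64_t base,
                           uint64_t index, uint64_t scale)
{
    return base + index * scale + (uint64_t)displacement;
}
```

For example, if %rdx holds 0xf000 and %rcx holds 0x0100, the operand 0x80(%rdx,%rcx,4) refers to address 0xf000 + 0x100*4 + 0x80 = 0xf480.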

#### **Review Questions**

- Assume that the register %rax currently holds the value 0x 01 02 03 04 05 06 07 08
- Answer the questions on Ed Lessons about the following instruction (<instr> <src> <dst>):

- Operation type:
- Operand types:
- Operation width:
- (extra) Result in %rax:

#### Roadmap



#### Definitions

- Architecture (ISA): The parts of a processor design that one needs to understand to write assembly code
  - What is directly visible to software
  - The "contract" or "blueprint" between hardware and software
- Microarchitecture: Implementation of the architecture
  - CSE/EE 469

### Instruction Set Architectures (Review)

- The ISA defines:
  - The system's state (e.g., registers, memory, program counter)
  - The instructions the CPU can execute
  - The effect that each of these instructions will have on the system state



### **General ISA Design Decisions**

- Instructions
  - What instructions are available? What do they do?
  - How are they encoded?
- Registers
  - How many registers are there?
  - How wide are they?
- Memory
  - How do you specify a memory location?

### **Instruction Set Philosophies (Review)**

- Complex Instruction Set Computing (CISC):
   Add more and more elaborate and specialized instructions as needed
  - Lots of tools for programmers to use, but hardware must be able to handle all instructions
  - x86-64 is CISC, but only a small subset of its instructions is encountered in Linux programs
- *Reduced Instruction Set Computing* (RISC):
   Keep instruction set small and regular
  - Easier to build fast hardware
  - Let software do the complicated operations by composing simpler ones

#### **Mainstream ISAs**

#### x86

| Designer   | Intel, AMD                                 |
|------------|--------------------------------------------|
| Bits       | 16-bit, 32-bit and 64-bit                  |
| Introduced | 1978 (16-bit), 1985 (32-bit), 2003 (64-bit) |
| Design     | CISC                                       |
| Type       | Register-memory                            |
| Encoding   | Variable (1 to 15 bytes)                   |
| Branching  | Condition code                             |
| Endianness | Little                                     |

Macbooks & PCs (Core i3, i5, i7, M) <u>x86-64 Instruction Set</u>

#### ARM

| Designer   | Arm Holdings                                                                                                                                                           |
|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Bits       | 32-bit, 64-bit                                                                                                                                                         |
| Introduced | 1985                                                                                                                                                                   |
| Design     | RISC                                                                                                                                                                   |
| Type       | Register-Register |
| Encoding   | AArch64/A64 and AArch32/A32 use 32-bit instructions, T32 (Thumb-2) uses mixed 16- and 32-bit instructions; ARMv7 user-space compatibility. |
| Branching  | Condition code, compare and branch                                                                                                                                     |
| Endianness | Bi (little as default)                                                                                                                                                 |

Smartphone-like devices (iPhone, iPad, Raspberry Pi) <u>ARM Instruction Set</u>

#### **RISC-V**

| Designer   | University of California, Berkeley |
|------------|------------------------------------|
| Bits       | 32, 64, 128                        |
| Introduced | 2010                               |
| Design     | RISC                               |
| Type       | Load-store                         |
| Encoding   | Variable                           |
| Endianness | Little                             |

Mostly research (some traction in embedded) <u>RISC-V Instruction Set</u>

#### **Architecture Sits at the Hardware Interface**



#### Writing Assembly Code? In 2022???

- Chances are, you'll never write a program in assembly, but understanding assembly is the key to the machine-level execution model:
  - Behavior of programs in the presence of bugs
    - When high-level language model breaks down
  - Tuning program performance
    - Understand optimizations done/not done by the compiler
    - Understanding sources of program inefficiency
  - Implementing systems software
    - What are the "states" of processes that the OS must manage
    - Using special units (timers, I/O co-processors, etc.) inside processor!
  - Fighting malicious software
    - Distributed software is in binary form

### **Assembly Programmer's View**



- Programmer-visible state
  - PC: the Program Counter (%rip in x86-64)
    - Address of next instruction
  - Named registers
    - Together in "register file"
    - Heavily used program data
  - Condition codes
    - Store status information about most recent arithmetic operation
    - Used for conditional branching

- Memory
  - Byte-addressable array
  - Code and user data
  - Includes the Stack (for supporting procedures)

#### x86-64 Assembly "Data Types"

- Integral data of 1, 2, 4, or 8 bytes
  - Data values
  - Addresses
- Floating point data of 4, 8, 10 or 2x8 or 4x4 or 8x2 bytes (not covered in 351)
  - Different registers for those (e.g. %xmm1, %ymm2)
  - Come from extensions to x86 (SSE, AVX, ...)
- No aggregate types such as arrays or structures
  - Just contiguously allocated bytes in memory
- Two common syntaxes
  - "AT&T": used by our course, slides, textbook, GNU tools, ...
  - "Intel": used by Intel documentation, Intel tools, ...
  - Must know which you're reading

#### What is a Register? (Review)

- A location in the CPU that stores a small amount of data, which can be accessed very quickly (once every clock cycle)
- Registers have *names*, not *addresses*
  - In assembly, they start with % (e.g. %rsi)
- Registers are at the heart of assembly programming
  - They are a precious commodity in all architectures, but especially x86

#### x86-64 Integer Registers – 64 bits wide

| 64-bit | Low 32 bits | 64-bit | Low 32 bits |
|--------|-------------|--------|-------------|
| %rax   | %eax        | %r8    | %r8d        |
| %rbx   | %ebx        | %r9    | %r9d        |
| %rcx   | %ecx        | %r10   | %r10d       |
| %rdx   | %edx        | %r11   | %r11d       |
| %rsi   | %esi        | %r12   | %r12d       |
| %rdi   | %edi        | %r13   | %r13d       |
| %rsp   | %esp        | %r14   | %r14d       |
| %rbp   | %ebp        | %r15   | %r15d       |

Can reference low-order 4 bytes (also low-order 2 & 1 bytes)
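
To make the "low-order bytes" idea concrete, here is a minimal sketch in C that treats a register as a 64-bit integer and reads narrower views of it by truncation. The function names are made up for this example; the 2-byte and 1-byte register names in the comments (%ax, %al) are the standard x86-64 names for those views of %rax, which the table above does not list.

```c
#include <stdint.h>

/* Each function models reading a narrower view of one 64-bit register:
 * the narrower names alias the low-order bytes of the same register. */
uint32_t low4(uint64_t reg) { return (uint32_t)reg; }  /* e.g. %eax */
uint16_t low2(uint64_t reg) { return (uint16_t)reg; }  /* e.g. %ax  */
uint8_t  low1(uint64_t reg) { return (uint8_t)reg;  }  /* e.g. %al  */
```

With %rax holding 0x0102030405060708, these views are 0x05060708, 0x0708, and 0x08 respectively.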

#### Some History: IA32 Registers – 32 bits wide



#### Memory vs. Registers

|                     | Memory                                  | Registers                |
|---------------------|-----------------------------------------|--------------------------|
| Addresses vs. Names | 0x7FFFD024C3DC                          | %rdi                     |
| Big vs. Small       | ~ 8 GiB                                 | (16 x 8 B) = 128 B       |
| Slow vs. Fast       | ~50-100 ns                              | sub-nanosecond timescale |
| Dynamic vs. Static  | Can "grow" as needed while program runs | fixed number in hardware |

#### Three Basic Kinds of Instructions (Review)

- 1) Transfer data between memory and register
  - Load data from memory into register
    - %reg = Mem[address]
  - Store register data into memory
    - Mem[address] = %reg

**Remember:** Memory is indexed just like an array of bytes!

- 2) Perform arithmetic operation on register or memory data
  - c = a + b; z = x << y; i = h & g;
- 3) Control flow: what instruction to execute next
  - Unconditional jumps to/from procedures
  - Conditional branches
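
The three kinds of instructions can be sketched with a toy model in C, where memory is a plain array of bytes and a "register" is a local 64-bit variable. Everything named here (`Mem`, `load8`, `store8`, `add_if_nonzero`) is invented for illustration; it is a model of the idea, not of any real instruction encoding.

```c
#include <stdint.h>
#include <string.h>

/* Toy memory: an array of bytes, indexed by address. */
static uint8_t Mem[64];

/* 1) Data transfer: load 8 bytes from memory into a register. */
uint64_t load8(size_t address)
{
    uint64_t reg;
    memcpy(&reg, &Mem[address], sizeof reg);
    return reg;
}

/* 1) Data transfer: store a register's 8 bytes into memory. */
void store8(size_t address, uint64_t reg)
{
    memcpy(&Mem[address], &reg, sizeof reg);
}

/* Uses all three kinds: load, conditional branch, add, store. */
uint64_t add_if_nonzero(size_t address, uint64_t amount)
{
    uint64_t reg = load8(address);   /* data transfer (load)       */
    if (reg != 0) {                  /* control flow (conditional) */
        reg = reg + amount;          /* arithmetic                 */
        store8(address, reg);        /* data transfer (store)      */
    }
    return reg;
}
```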

### Instruction Sizes and Operands (Review)

- Size specifiers
  - b = 1-byte "byte", w = 2-byte "word",
     l = 4-byte "long word", q = 8-byte "quad word"
  - Note that due to backwards-compatible support for 8086 programs (16-bit machines!), "word" means 16 bits = 2 bytes in x86 instruction names
- Operand types
  - Immediate: Constant integer data (\$)
  - Register: 1 of 16 integer registers (%)
  - Memory: Consecutive bytes of memory at a computed address (())
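
As a small illustration of the size specifiers, the sketch below maps each suffix letter to the number of bytes the instruction operates on. The helper name `suffix_width` is hypothetical.

```c
/* Hypothetical helper: maps an AT&T-syntax size suffix to the number
 * of bytes the instruction operates on. Returns 0 for anything else. */
int suffix_width(char suffix)
{
    switch (suffix) {
    case 'b': return 1;  /* byte                */
    case 'w': return 2;  /* word                */
    case 'l': return 4;  /* long word           */
    case 'q': return 8;  /* quad word           */
    default:  return 0;  /* not a size suffix   */
    }
}
```

So movq moves 8 bytes, while addl adds 4-byte values.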

#### x86-64 Introduction

- Data transfer instruction (mov)
- Arithmetic operations
- Memory addressing modes
  - swap example

#### **Moving Data**

- General form: mov\_ source, destination
  - Really more of a "copy" than a "move"
  - Like all instructions, missing letter (\_) is the size specifier
  - Lots of these in typical code

#### **Operand Combinations**



- Cannot do memory-memory transfer with a single instruction
  - How would you do it?
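
One possible answer, sketched in C: go through a register, so the copy takes two instructions (a load, then a store). The function name `copy_long` is made up, and the registers named in the comments are an assumption: they follow the usual Linux x86-64 convention of passing the first two arguments in %rdi and %rsi, with %rax picked as the scratch register.

```c
/* Copies the 8-byte value at *src to *dst in two steps, mirroring the
 * two instructions needed in assembly. The variable tmp plays the
 * role of a scratch register. */
void copy_long(const long *src, long *dst)
{
    long tmp = *src;   /* movq (%rdi), %rax    memory   -> register */
    *dst = tmp;        /* movq %rax, (%rsi)    register -> memory   */
}
```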

### **Some Arithmetic Operations**

- Binary (two-operand) Instructions:
  - Maximum of one memory operand
  - Beware argument order!
  - No distinction between signed and unsigned
    - Only arithmetic vs. logical shifts

| Format         | Computation      | Notes          |
|----------------|------------------|----------------|
| addq src, dst  | dst = dst + src  | (dst += src)   |
| subq src, dst  | dst = dst - src  |                |
| imulq src, dst | dst = dst * src  | signed mult    |
| sarq src, dst  | dst = dst >> src | Arithmetic     |
| shrq src, dst  | dst = dst >> src | Logical        |
| shlq src, dst  | dst = dst << src | (same as salq) |
| xorq src, dst  | dst = dst ^ src  |                |
| andq src, dst  | dst = dst & src  |                |
| orq src, dst   | dst = dst \| src |                |

The trailing q in each mnemonic is the operand size specifier.
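
The only place signedness shows up is in the right shifts. A minimal sketch in C of the difference, using made-up function names `shr` and `sar` and assuming a shift count between 0 and 63:

```c
#include <stdint.h>

/* shrq: logical right shift, vacated high bits are filled with 0. */
uint64_t shr(uint64_t dst, unsigned count)
{
    return dst >> count;
}

/* sarq: arithmetic right shift, vacated high bits copy the sign bit.
 * Written with unsigned arithmetic so the result does not depend on
 * how a given C compiler shifts negative signed values. */
uint64_t sar(uint64_t dst, unsigned count)
{
    uint64_t shifted = dst >> count;
    if (dst >> 63)                          /* sign bit set?            */
        shifted |= ~(UINT64_MAX >> count);  /* fill the vacated top bits */
    return shifted;
}
```

Shifting 0xF000000000000000 right by 4 gives 0x0F00000000000000 with the logical shift and 0xFF00000000000000 with the arithmetic shift. For a value whose top bit is 0, the two agree.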

#### **Practice Question**

- Which of the following are valid implementations of rcx = rax + rbx?
  - addq %rax, %rcx; addq %rbx, %rcx
  - movq %rax, %rcx; addq %rbx, %rcx
  - movq \$0, %rcx; addq %rbx, %rcx; addq %rax, %rcx
  - xorq %rax, %rax; addq %rax, %rcx; addq %rbx, %rcx
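
One way to check candidates like these is to model each sequence in C, one statement per instruction, and see what ends up in rcx. The sketch below does that for the four sequences in the question, in order; the names `option1` through `option4` are labels invented here. Each function takes the register values before the sequence runs and returns the final rcx.

```c
#include <stdint.h>

uint64_t option1(uint64_t rax, uint64_t rbx, uint64_t rcx)
{
    rcx += rax;        /* addq %rax, %rcx */
    rcx += rbx;        /* addq %rbx, %rcx */
    return rcx;        /* old rcx + rax + rbx */
}

uint64_t option2(uint64_t rax, uint64_t rbx, uint64_t rcx)
{
    rcx = rax;         /* movq %rax, %rcx */
    rcx += rbx;        /* addq %rbx, %rcx */
    return rcx;        /* rax + rbx */
}

uint64_t option3(uint64_t rax, uint64_t rbx, uint64_t rcx)
{
    rcx = 0;           /* movq $0, %rcx   */
    rcx += rbx;        /* addq %rbx, %rcx */
    rcx += rax;        /* addq %rax, %rcx */
    return rcx;        /* rax + rbx */
}

uint64_t option4(uint64_t rax, uint64_t rbx, uint64_t rcx)
{
    rax ^= rax;        /* xorq %rax, %rax (clears rax) */
    rcx += rax;        /* addq %rax, %rcx */
    rcx += rbx;        /* addq %rbx, %rcx */
    return rcx;        /* old rcx + rbx */
}
```

In this model, only the sequences that overwrite rcx before adding to it produce rax + rbx regardless of what rcx held beforehand.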



#### **Example of Basic Addressing Modes**

```
void swap(long* xp, long* yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
```

swap:

| Instruction | Operands     |
|-------------|--------------|
| movq        | (%rdi), %rax |
| movq        | (%rsi), %rdx |
| movq        | %rdx, (%rdi) |
| movq        | %rax, (%rsi) |
| ret         |              |

Compiler Explorer: https://godbolt.org/z/zc4Pcq
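
To connect the two versions, the sketch below rewrites swap in C with one statement per movq, naming each variable after the register that plays its role. The function name `swap_by_registers` is invented for this example. It assumes the usual Linux x86-64 calling convention, where the first argument (xp) arrives in %rdi and the second (yp) in %rsi.

```c
/* One C statement per instruction of the compiled swap. */
void swap_by_registers(long *rdi, long *rsi)
{
    long rax = *rdi;   /* movq (%rdi), %rax    t0 = *xp */
    long rdx = *rsi;   /* movq (%rsi), %rdx    t1 = *yp */
    *rdi = rdx;        /* movq %rdx, (%rdi)    *xp = t1 */
    *rsi = rax;        /* movq %rax, (%rsi)    *yp = t0 */
}
```

Both loads happen before either store, so the function still behaves correctly when both pointers refer to the same location.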

#### Summary

- x86-64 is a complex instruction set computing (CISC) architecture
  - There are 3 types of operands in x86-64
    - Immediate, Register, Memory
  - There are 3 types of instructions in x86-64
    - Data transfer, Arithmetic, Control Flow