

#### Review

- Dennard scaling: power dissipation / area ~ constant
- → parallel computers
- old-time architecture
- instructions, assembly

### Demos

Demo: intro/Assembly Reading Comprehension

Demo: Source-to-assembly mapping

Code to try:

```
int main()
{
  int y = 0, i;
  /* accumulate squares until the running sum reaches 100 */
  for (i = 0; y < 100; ++i)
    y += i * i;
  return y;
}
```

Also try <https://godbolt.org> for direct source-to-assembly mapping.

## Outline

#### Introduction

- Notes
- Notes (unfilled, with empty boxes)
- Notes (source code on GitHub)
- About This Class
- Why Bother with Parallel Computers?
- Lowest Accessible Abstraction: Assembly
- Architecture of an Execution Pipeline
- Architecture of a Memory System
- Shared-Memory Multiprocessors

Machine Abstractions

Performance: Expectation, Experiment, Observation

Performance-Oriented Languages and Abstractions

All of this can be built in about 4000 transistors. (e.g. MOS 6502 in Apple II, Commodore 64, Atari 2600)

So what exactly are Intel/ARM/AMD/Nvidia doing with the other billions of transistors?

## Execution in a Simple Processor



- [IF] Instruction Fetch
- [ID] Instruction Decode
- [EX] Execution
- [MEM] Memory Read/Write
- [WB] Result Writeback

[Wikipedia ©]

## Solution: Pipelining

| Instruction $i$ ↓ / Cycle $t$ → | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8   | 9  |
|---------------------------------|----|----|----|-----|-----|-----|-----|-----|----|
| 1                               | IF | ID | EX | MEM | WB  |     |     |     |    |
| 2                               |    | IF | ID | EX  | MEM | WB  |     |     |    |
| 3                               |    |    | IF | ID  | EX  | MEM | WB  |     |    |
| 4                               |    |    |    | IF  | ID  | EX  | MEM | WB  |    |
| 5                               |    |    |    |     | IF  | ID  | EX  | MEM | WB |

[Wikipedia ©]

## MIPS Pipeline: 110,000 transistors



## Hazards and Bubbles



Q: Types of Pipeline Hazards? (aka: what can go wrong?)

[Wikipedia ©]

#### Demo

- v1: 0.6
- v2: 1.4
- control flow: 0.06
- v2 (second variant, label illegible): 1.3

Demo: intro/Pipeline Performance Mysteries

## A Glimpse of a More Modern Processor



[David Kanter / Realworldtech.com]

## A Glimpse of a More Modern Processor: Frontend



## A Glimpse of a More Modern Processor: Backend



- New concept: Instruction-level parallelism ("ILP", "superscalar")
- Where does the IPC number from earlier come from?

[David Kanter / Realworldtech.com]

## A Glimpse of a More Modern Processor: Golden Cove





Demo: intro/More Pipeline Mysteries

## SMT/"Hyperthreading"





