Parallel Computing: Performance
From the idea on the previous slide, we understand that if our computer's CPU has $n$ cores and we can split a program's code into threads that run in parallel on those $n$ cores, the program will run close to $n$ times faster.
Why close to $\boldsymbol n$ and not exactly $\boldsymbol n$ times faster? Because:
- Every program has parts that can't run in parallel, so only a portion of the program can be split across several cores. The sequential parts must run on just one of the cores.
- The computer spends some extra time creating and managing each thread of execution on a core (preparing the cores, copying data around, etc.), which reduces the overall speedup.
A fairly simple formula that calculates the maximum possible speedup is called Amdahl's Law.
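As a small sketch of the idea, Amdahl's Law says that if a fraction $p$ of a program can be parallelized, the speedup on $n$ cores is at most $1 / \left((1 - p) + \frac{p}{n}\right)$. The function name below is our own choice for illustration:

```python
def amdahl_speedup(p, n):
    """Maximum speedup by Amdahl's Law, where p is the fraction of the
    program that can run in parallel and n is the number of cores."""
    return 1.0 / ((1.0 - p) + p / n)

# Even if 90% of the work is parallelizable, 8 cores give at most:
print(round(amdahl_speedup(0.9, 8), 2))  # → 4.71, not 8
```

Note how the sequential fraction $(1 - p)$ dominates: with $p = 0.9$, even infinitely many cores could never push the speedup past $1 / 0.1 = 10$.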