An interesting consequence of this formula is that the number of CPUs (processors) in the device is a limiting factor: creating more threads than there are CPUs won't make the code run any faster. To reach the maximum speedup, the program should create exactly as many threads as there are CPUs available to run that section of code, no more and no fewer.
Example: if a computer has 4 CPUs and 30% of the program can be parallelized, then the overall speedup of the whole program is:
$$ Speedup = \frac{1}{(1-0.3)+\frac{0.3}{4}} = \frac{1}{0.7+0.075} = \frac{1}{0.775} \approx 1.2903, $$ so the program will run about 1.3 times faster.
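The arithmetic above can be checked with a short Python sketch. The function name `amdahl_speedup` and its parameters are illustrative choices, not something defined in the text; the code only evaluates the formula and does not measure a real program.

```python
def amdahl_speedup(parallel_fraction: float, n_cpus: int) -> float:
    """Evaluate the speedup formula for a given parallel fraction and CPU count.

    parallel_fraction: share of the runtime that can run in parallel (0.0 to 1.0).
    n_cpus: number of CPUs that execute the parallel section.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


# The example from the text: 4 CPUs, 30% of the program is parallel.
print(f"{amdahl_speedup(0.3, 4):.4f}")  # prints 1.2903

# The gain from each extra CPU shrinks as the CPU count grows.
for n in (1, 2, 4, 8, 16):
    print(f"{n:>2} CPUs: {amdahl_speedup(0.3, n):.4f}x")
```

With 30% of the program parallel, the serial 70% dominates: however many CPUs are added, the speedup stays below $1/0.7 \approx 1.43$.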