3. Pipelining is a technique used in advanced
microprocessors where the microprocessor
begins executing a second instruction before
the first has been completed. That is, several
instructions are in the pipeline
simultaneously, each at a different
processing stage.
8. Pipelining doesn't decrease the time for a single
datum to be processed; it only increases the throughput
of the system when processing a stream of data.
A pipelined system typically requires more resources
(circuit elements, processing units, computer memory,
etc.) than one that executes one batch at a time, because
its stages cannot reuse the resources of a previous stage.
Moreover, pipelining may increase the time it takes for
an instruction to finish.
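The latency and throughput trade-off can be made concrete with a small calculation. The sketch below is illustrative only: the stage delays, the latch overhead, and the function names are assumptions, not figures from these slides. It models a pipeline whose clock period is set by the slowest stage plus a fixed per-stage overhead.

```python
def unpipelined_time(n_items, stage_delays):
    """Total time when each item passes through every stage before the next starts."""
    return n_items * sum(stage_delays)


def pipelined_time(n_items, stage_delays, latch_overhead=0.0):
    """Total time when a new item can enter the pipeline every clock cycle.

    The clock period is the slowest stage plus an assumed per-stage latch overhead.
    """
    cycle = max(stage_delays) + latch_overhead
    stages = len(stage_delays)
    # The first item needs `stages` cycles; each later item finishes one cycle later.
    return (stages + n_items - 1) * cycle


delays = [10, 8, 10, 10, 7]  # hypothetical stage delays in ns

single_unpipelined = unpipelined_time(1, delays)
single_pipelined = pipelined_time(1, delays, latch_overhead=1.0)
stream_unpipelined = unpipelined_time(1000, delays)
stream_pipelined = pipelined_time(1000, delays, latch_overhead=1.0)

print(single_unpipelined, single_pipelined)   # one instruction: pipelining is slower
print(stream_unpipelined, stream_pipelined)   # long stream: pipelining is much faster
```

With these assumed numbers a single instruction takes longer in the pipelined design (every stage is stretched to the slowest one), while a stream of 1000 instructions finishes roughly four times sooner.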
9. A superscalar CPU architecture implements a form of parallelism called
instruction level parallelism within a single processor. It therefore allows
faster CPU throughput than would otherwise be possible at a given clock rate.
A superscalar processor executes more than one instruction during a clock
cycle by simultaneously dispatching multiple instructions to redundant
functional units on the processor.
While a superscalar CPU is typically also pipelined, pipelining and
superscalar architecture are considered different performance enhancement
techniques.
The superscalar technique is traditionally associated with several identifying
characteristics (within a given CPU core):
• Instructions are issued from a sequential instruction stream.
• CPU hardware dynamically checks for data dependencies between instructions
at run time (versus software checking at compile time).
• The CPU accepts multiple instructions per clock cycle.
10. Simple superscalar pipeline. By fetching and
dispatching two instructions at a time, a maximum of
two instructions per cycle can be completed.
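As a rough illustration of that upper bound, the following sketch counts cycles for a stream of fully independent instructions. It assumes a five-stage pipeline and no hazards or stalls of any kind; the function name `ideal_cycles` and its parameters are hypothetical.

```python
import math


def ideal_cycles(n_instructions, stages=5, issue_width=1):
    """Cycles to complete n independent instructions with no hazards or stalls."""
    if n_instructions == 0:
        return 0
    # Instructions enter the pipeline in groups of `issue_width` per cycle.
    groups = math.ceil(n_instructions / issue_width)
    # The first group needs `stages` cycles; each later group finishes one cycle later.
    return stages + groups - 1


scalar = ideal_cycles(10, issue_width=1)
dual_issue = ideal_cycles(10, issue_width=2)
print(scalar, dual_issue)
```

Real programs rarely reach this bound, because dependent instructions cannot always be dispatched together.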
11. Now, what about
data hazards?
• Data hazards occur when the pipeline
changes the order of read/write accesses to
operands so that it differs from the normal
sequential order.
12. Cycle               1    2    3    4    5    6    7    8    9
    ADD R1, R2, R3      IF   ID   IE   MEM  WB
    SUB R4, R5, R1           IF   ID   IE   MEM  WB
    AND R6, R1, R7                IF   ID   IE   MEM  WB
    OR  R8, R1, R9                     IF   ID   IE   MEM  WB
    XOR R10, R1, R11                        IF   ID   IE   MEM  WB
13. All the instructions after the ADD use the result of the ADD instruction (in
R1). The ADD instruction writes the value of R1 in the WB stage (cycle 5), but
the SUB instruction reads the value during its ID stage (IDsub, cycle 3), before
it has been written. This problem is called a data hazard.
The AND instruction is also affected by this data hazard. The write of R1
does not complete until the end of cycle 5. Thus, the AND instruction, which
reads the registers during cycle 4 (IDand), will receive the wrong result.
The OR instruction can be made to operate without incurring a hazard by a
simple implementation technique: perform register file writes in the first half
of the cycle and reads in the second half. Because both WB for ADD and IDor
for OR fall in cycle 5, the write to the register file by ADD is performed in
the first half of the cycle, and the read of the registers by OR in the second
half.
The XOR instruction operates properly, because its register read occurs in cycle
6, after the register write by ADD.
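The cycle-by-cycle reasoning above can be checked mechanically. The sketch below assumes the five-stage pipeline from the diagram (IF, ID, IE, MEM, WB), one instruction fetched per cycle, no forwarding, and register reads in ID. The helper names are hypothetical.

```python
STAGES = ["IF", "ID", "IE", "MEM", "WB"]


def stage_cycle(fetch_cycle, stage):
    """Cycle in which an instruction fetched in `fetch_cycle` occupies `stage`."""
    return fetch_cycle + STAGES.index(stage)


def reads_stale_value(producer_fetch, consumer_fetch, split_cycle_regfile=True):
    """True if the consumer's ID read happens before the producer's WB write is visible."""
    write_cycle = stage_cycle(producer_fetch, "WB")
    read_cycle = stage_cycle(consumer_fetch, "ID")
    if split_cycle_regfile:
        # Write in the first half, read in the second half:
        # a read in the same cycle as the write already sees the new value.
        return read_cycle < write_cycle
    return read_cycle <= write_cycle


followers = ["SUB", "AND", "OR", "XOR"]
# ADD is fetched in cycle 1; each follower is fetched one cycle after the previous one.
results = {name: reads_stale_value(1, 2 + i) for i, name in enumerate(followers)}
print(results)

# Without the split-cycle register file, OR would also read the old R1.
or_without_split = reads_stale_value(1, 4, split_cycle_regfile=False)
print(or_without_split)
```

Under these assumptions SUB and AND read a stale R1, while OR and XOR read the correct value.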
15. Read After Write (RAW)
A read after write (RAW) data hazard refers to a situation where an instruction refers to a result
that has not yet been calculated or retrieved. This can occur because
even though an instruction is executed after a previous instruction, the
previous instruction has not been completely processed through the
pipeline.
Example
i1. R2 <- R1 + R3
i2. R4 <- R2 + R3
In a pipeline, when we fetch the operands for i2, the result of i1
will not yet have been saved, so i2 would read a stale value of R2.
We say that i2 has a data dependency on i1, because it
depends on the completion of i1.
Solution: forwarding.
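To see what forwarding buys, the sketch below counts the stall cycles a dependent ALU instruction would need. It assumes the five-stage pipeline used earlier in these slides (register read in ID, result written in WB), considers only ALU-to-ALU dependencies, and assumes the forwarding path feeds the ALU output straight back to the ALU input. The function `raw_stalls` is a hypothetical helper.

```python
def raw_stalls(distance, forwarding, split_cycle_regfile=True):
    """Stall cycles for an ALU consumer issued `distance` instructions after an ALU producer.

    distance=1 means the consumer immediately follows the producer (i1, i2 above).
    """
    if forwarding:
        # The ALU result is routed directly to the next instruction's ALU input,
        # so an ALU-to-ALU dependency needs no stall in this simple model.
        return 0
    # Without forwarding, the consumer's ID must not precede the producer's WB.
    # WB is 3 cycles after ID, so the consumer must trail by at least 3 cycles,
    # or 4 if a same-cycle read cannot see the value being written.
    needed_gap = 3 if split_cycle_regfile else 4
    return max(0, needed_gap - distance)


back_to_back_no_fwd = raw_stalls(1, forwarding=False)
back_to_back_fwd = raw_stalls(1, forwarding=True)
far_apart_no_fwd = raw_stalls(3, forwarding=False)
print(back_to_back_no_fwd, back_to_back_fwd, far_apart_no_fwd)
```

Loads are deliberately left out of this model; a value that only becomes available after the memory stage generally cannot be forwarded in time for the very next instruction.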
16. Write After Read (WAR)
A write after read (WAR) data hazard represents a problem
with concurrent execution.
Example
i1. R4 <- R1 + R3
i2. R3 <- R1 + R2
If there is a chance that i2 may be completed before i1
(i.e. with concurrent execution), we must ensure that
we do not store the result into register R3 before i1
has had a chance to fetch its operands.
17. Write After Write (WAW)
A write after write (WAW) data hazard may occur in a
concurrent execution environment.
For example:
i1. R2 <- R1 + R2
i2. R2 <- R4 + R7
If i2 writes R2 before it is written by i1, the
writes end up being performed in the wrong order, leaving the
value written by i1 rather than the value written by i2 in the
destination.
We must delay the WB (Write Back) of i2 until i1 has
written its result.
Now, why is RAR (read after read) not a data hazard?
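The three hazard types can be told apart purely from the registers each instruction reads and writes. The sketch below is a minimal illustration; the `(dest, sources)` tuple representation and the function name are assumptions made for this example. It reports potential hazards between two instructions in program order.

```python
def classify_hazards(first, second):
    """Return the potential hazard types between two instructions in program order.

    Each instruction is a (dest, sources) pair, e.g. ("R2", {"R1", "R3"}).
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    hazards = set()
    if dest1 in srcs2:
        hazards.add("RAW")  # second reads what first writes
    if dest2 in srcs1:
        hazards.add("WAR")  # second writes what first reads
    if dest1 == dest2:
        hazards.add("WAW")  # both write the same register
    return hazards


raw_example = classify_hazards(("R2", {"R1", "R3"}), ("R4", {"R2", "R3"}))
war_example = classify_hazards(("R4", {"R1", "R3"}), ("R3", {"R1", "R2"}))
waw_example = classify_hazards(("R2", {"R1", "R2"}), ("R2", {"R4", "R7"}))
rar_example = classify_hazards(("R4", {"R1", "R3"}), ("R5", {"R1", "R3"}))
print(raw_example, war_example, waw_example, rar_example)
```

Two details are worth noticing. The WAW example from the slide also contains a WAR relationship, because i1 reads R2 and i2 writes it. The read-after-read pair yields an empty set: both instructions only read R1 and R3, so no ordering of the two reads can change either result.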
18. There are several main solutions and algorithms used to resolve data
hazards:
• insert a pipeline bubble whenever a read after write (RAW)
dependency is encountered, which is guaranteed to increase latency, or
• utilize out-of-order execution to potentially prevent the need for
pipeline bubbles
• utilize register forwarding to use data from later stages in the pipeline
In the case of out-of-order execution, the algorithm used can be:
• scoreboarding, in which case a pipeline bubble will only be needed
when there is no functional unit available
• the Tomasulo algorithm, which utilizes register renaming, allowing the
continual issuing of instructions
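Register renaming on its own can be sketched in a few lines. The code below is not the Tomasulo algorithm (which also involves reservation stations and result broadcasting); it only illustrates the renaming idea, under the assumption of an unlimited supply of physical registers. The names `rename`, `P0`, `P1`, and so on are hypothetical.

```python
import itertools


def rename(instructions, arch_regs):
    """Rename every destination to a fresh physical register.

    `instructions` is a list of (dest, src1, src2) architectural register names.
    Returns the same program expressed in physical register names.
    """
    counter = itertools.count()
    # Initial mapping: each architectural register gets its own physical register.
    mapping = {reg: f"P{next(counter)}" for reg in arch_regs}
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources use the most recent mapping, so true (RAW) dependencies are kept.
        p1, p2 = mapping[src1], mapping[src2]
        # Each write gets a new physical register, removing WAR and WAW conflicts.
        mapping[dest] = f"P{next(counter)}"
        renamed.append((mapping[dest], p1, p2))
    return renamed


program = [
    ("R4", "R1", "R3"),  # i1
    ("R3", "R1", "R2"),  # i2: WAR on R3 with i1
    ("R3", "R4", "R2"),  # i3: WAW on R3 with i2, RAW on R4 with i1
]
renamed = rename(program, ["R1", "R2", "R3", "R4"])
for instruction in renamed:
    print(instruction)
```

After renaming, every instruction writes a distinct physical register, so i2 and i3 can complete in any order relative to i1's read of R3. The one remaining constraint is the true dependency: i3 still reads the register that i1 writes.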