3.2 Overview of the Instruction Pipeline of the DEC Alpha 21164

Table 3.1. Conditions and actions in the CDC 6600 scoreboard

Issue
  Condition checking: Unit free (Ua = 0) and no WAW (Pi = 0).
  Scoreboard setting: Unit busy (Ua = 1); record Fi, Fj, Fk; record Qj, Qk and Rj, Rk (e.g., Rj = 1 if Pj = 0, else if Pj = b then Rj = 0 and Qj = b); record Pi = Ua.
Dispatch
  Condition checking: Source registers ready (Rj and Rk = 1).
  Scoreboard setting: Send Fj and Fk to unit a.
Execute
  Condition checking: At end, ask for permission to write.
  Scoreboard setting: None.
Write result
  Condition checking: If some issued but not dispatched instruction before this instruction in program order is such that either its Qj or Qk = a, then stall.
  Scoreboard setting: Set Ua = 0 and Pi = 0.

1. Issue: If either of the issue conditions in Table 3.1 (unit free, no WAW hazard) is false, the instruction and its successors are stalled until both conditions become true.
2. Dispatch: When the instruction is issued, the execution unit is reserved (becomes busy). Operands are read in the execution unit when they are both ready (i.e., are not results of still-executing instructions). This prevents RAW hazards.
3. Execution: One or more cycles, depending on the functional unit's latency. When execution completes, the unit notifies the scoreboard that it is ready to write the result.
4. Write result: Before writing, the scoreboard checks for WAR hazards. If one exists, the unit is stalled until all WAR hazards are cleared (note that an instruction in progress, i.e., whose operands have been read, won't cause a WAR).
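The two stall conditions in the steps above (one at issue, one at write result) can be sketched as plain predicates. This is only a minimal illustration, not the CDC 6600 logic itself: the function names, the dictionary layout, and the choice to detect a WAR hazard by comparing register numbers are all assumptions made for the example.

```python
def can_issue(unit_busy, producer, unit, dest):
    """Issue check: the unit must be free, and no in-flight instruction may
    already be scheduled to write `dest` (that would be a WAW hazard)."""
    return not unit_busy[unit] and producer.get(dest) is None


def can_write(dest, earlier_waiting):
    """Write-result check: stall while an earlier instruction that has been
    issued but not yet dispatched still needs to read `dest` (WAR hazard).

    `earlier_waiting` holds the (Fj, Fk) source pairs of those instructions.
    """
    return all(dest not in sources for sources in earlier_waiting)


# Hypothetical state: Mul1 is busy producing R4; the other units are free.
unit_busy = {"Mul1": True, "Mul2": False, "Add": False}
producer = {4: "Mul1"}

issue_ok = can_issue(unit_busy, producer, "Mul2", 6)   # free unit, no WAW
issue_waw = can_issue(unit_busy, producer, "Add", 4)   # WAW on R4
write_war = can_write(8, [(4, 8)])                     # earlier reader of R8
write_ok = can_write(8, [])                            # no pending readers
```

Keeping the checks as side-effect-free predicates separates "condition checking" from "scoreboard setting", mirroring the two parts of each step in Table 3.1.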

Of course, there are opportunities for optimization such as buffering and forwarding (see the exercises), and also there are occurrences of extra stalls, for example, when two units in the same group want to store results in the same cycle.

In order to control the four steps in the back-end, the scoreboard needs to know:

- For each functional unit, whether it is free or busy: flag Ua for unit a.
- For each instruction in flight, including the one that wishes to issue:
  - The names of the result Fi and source Fj, Fk registers.
  - The names Qj, Qk of the units (if any) producing values for Fj, Fk.
  - Flags Rj, Rk indicating whether the source registers are ready.
  - Its status, that is, whether it is issued, dispatched, executing, or ready to write.
- For each result register Fi, the name of the unit, if any, that will produce its result, say Pi (this is slightly redundant, but facilitates the hardware check).

Then, the scoreboard functions as in Table 3.1.
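The bookkeeping listed above could be held in a structure along the following lines. This is a hedged sketch: the class names `Entry` and `Scoreboard`, the field names, and the use of a missing dictionary key to stand for "Pi = 0" are assumptions for illustration, and only the issue row of Table 3.1 is modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    """Per-instruction record kept by the scoreboard."""
    fi: int                      # result register Fi
    fj: int                      # source registers Fj, Fk
    fk: int
    qj: Optional[str] = None     # units producing Fj, Fk (None if ready)
    qk: Optional[str] = None
    rj: bool = True              # source-ready flags Rj, Rk
    rk: bool = True
    status: str = "issue"        # issue, dispatch, exec, or write


@dataclass
class Scoreboard:
    busy: Dict[str, bool]                                     # Ua per unit
    producer: Dict[int, str] = field(default_factory=dict)    # Pi per register
    entries: Dict[str, Entry] = field(default_factory=dict)   # keyed by unit

    def issue(self, unit: str, fi: int, fj: int, fk: int) -> bool:
        """Apply the issue row of Table 3.1; return False on a stall."""
        if self.busy[unit] or fi in self.producer:
            return False         # unit busy or WAW hazard
        qj, qk = self.producer.get(fj), self.producer.get(fk)
        self.entries[unit] = Entry(fi, fj, fk, qj, qk, qj is None, qk is None)
        self.busy[unit] = True
        self.producer[fi] = unit
        return True


sb = Scoreboard(busy={"Mul1": False, "Mul2": False, "Add": False})
sb.issue("Mul1", 4, 0, 2)   # multiply writing R4
sb.issue("Mul2", 6, 4, 8)   # multiply reading R4: must wait on Mul1
```

After the second call, the `Mul2` entry records `qj == "Mul1"` with `rj` false, which is the kind of RAW bookkeeping the Qj and Rj fields exist for. The `producer` map plays the role of Pi: it is redundant with the entries but makes the WAW check a single lookup.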

Note that the write result step requires an associative search.

Table 3.2. Snapshot of the scoreboard

Unit   Inst. status   U   Fi   Fj   Fk   Rj   Rk   Qj
Mul1   Exec           1   4    0    2    1    1
Mul2   Issue          1   6    4    8    0    1    Mul1
Add    Exec           1   8    2    12   0    0

Reg.   0   2   4      6      8     10   12   14   16
Unit           Mul1   Mul2   Add

EXAMPLE 3: Assume that the two multipliers have a latency of 6 cycles and the adder has a latency of 1 cycle. Only registers with even numbers are shown in Table 3.2, because it is assumed that these are floating-point operations using pairs of registers for operands. Assume all registers and units are free to start with:

i1: R4 ← R0 * R2       # will use multiplier 1
i2: R6 ← R4 * R8       # will use multiplier 2; RAW with i1
i3: R8 ← R2 + R12      # will use adder; WAR with i2
i4: R4 ← R14 + R16     # will use adder; WAW with i1

The progression of the instructions through the first few cycles is elaborated on below. Table 3.2 is a snapshot of the scoreboard after cycle 6.

- Cycle 1: i1 is issued, and the first row of the scoreboard is as shown in Table 3.2 except that the instruction status is issue. Note that the contents (i.e., the Pi) of register 4 indicate the Mul1 unit.
- Cycle 2: The status of i1 becomes dispatch, and instruction i2 is issued. This is indicated by the second row of the table and the contents of register 6.
- Cycle 3: Instruction i1 is in execute mode, but i2 cannot be dispatched, because of the RAW dependency (Rj = 0). Instruction i3 can issue, though, as shown in the third row and the contents of register 8.
- Cycle 4: Instruction i3 can dispatch, but i4 cannot issue, for two reasons: the adder is busy, and there is a WAW dependency with i1 (this latter condition is easily checked by looking at the contents of register 4).
- Cycle 6 (no change in the scoreboard during cycle 5): Instruction i1 is still executing, instruction i2 is still in the issue stage, and instruction i3 asks for permission to write. The permission will be refused because of the WAR dependency with instruction i2.
- The situation will remain the same until instruction i1 asks for permission to write, at cycle 8, which is granted immediately.
- At the next cycle (i.e., cycle 9), instruction i2 will be dispatched and instruction i3 will be able to write, allowing instruction i4 to issue, and so on.

Note that despite all these restrictions, most notably the requirement that instructions issue in order, instructions can complete out of order.
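The dependencies that drive the stalls in this example can be found mechanically from the register numbers alone. The sketch below is illustrative only: the tuple encoding of the four instructions and the helper name `dependencies` are assumptions, and the operands are taken from the example with both multiplies reading the two registers listed for them.

```python
# Example 3 as (name, destination, source 1, source 2) register numbers.
PROGRAM = [
    ("i1", 4, 0, 2),     # R4 <- R0 * R2
    ("i2", 6, 4, 8),     # R6 <- R4 * R8
    ("i3", 8, 2, 12),    # R8 <- R2 + R12
    ("i4", 4, 14, 16),   # R4 <- R14 + R16
]


def dependencies(program):
    """Return (kind, later, earlier) for every pair of instructions that
    conflict on a register, scanning pairs in program order."""
    found = []
    for n, (later, dest, src1, src2) in enumerate(program):
        for earlier, e_dest, e_src1, e_src2 in program[:n]:
            if e_dest in (src1, src2):
                found.append(("RAW", later, earlier))   # reads earlier result
            if dest in (e_src1, e_src2):
                found.append(("WAR", later, earlier))   # overwrites a source
            if dest == e_dest:
                found.append(("WAW", later, earlier))   # same destination
    return found


deps = dependencies(PROGRAM)
```

With these operands, `deps` contains the three dependencies annotated in the example (RAW of i2 on i1, WAR of i3 on i2, WAW of i4 on i1) plus a WAR of i4 on i2 through register 4, which the walkthrough does not need to mention because i4 is already held back at issue by the WAW check.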
