INSTRUCTION LEVEL PARALLELISM

Steps In Executing an Instruction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Fetch instruction from memory
2. Decode instruction
3. Compute addresses of memory operands and/or branch targets
4. Fetch operands from registers and/or memory to ALU
5. Execute ALU operation or compute branch condition
6. Store results from ALU to registers and/or memory
7. Adjust PC

Four-Stage Pipeline for Reg-Reg Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S1: Fetch instruction from memory; adjust PC
S2: Decode instruction; speculatively fetch operands from registers to ALU
S3: Execute ALU operation
S4: Store results from ALU to registers

Assumptions:
a) Fixed-length instructions
b) Fixed-format instructions (i.e., the same bits specify register
   addresses in all instructions that have them)
c) For bypass, the ALU saves the last two results computed

Four-Stage Pipeline for Reg-Reg and Load/Store Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S1: Fetch instruction from memory; adjust PC
S2: Decode instruction; speculatively fetch operands from registers to ALU;
    speculatively compute addresses of memory operands
S3: Execute ALU operation OR load/store operand from/to memory to/from ALU
S4: Store results from ALU to registers

Assumptions:
a) Fixed-length instructions
b) Fixed-format instructions (i.e., the same bits specify register and
   memory addresses in all instructions that have them)
c) For bypass, the ALU saves the last two results computed or loaded
d) No indirect addressing

Four-Stage Pipeline for Reg-Reg, Load/Store, and Branch Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S1: Fetch instruction from memory; predictively adjust PC
S2: Decode instruction; speculatively fetch operands from registers to ALU;
    speculatively compute addresses of memory operands;
    speculatively compute branch address
S3: Execute ALU operation OR load/store operand from/to memory to/from ALU
    OR compute branch condition
S4: Store results from ALU to registers; readjust PC and introduce stalls
    if predictive adjustment was incorrect

Assumptions:
a) Fixed-length instructions
b) Fixed-format instructions (i.e., the same bits specify register, memory,
   and branch addresses in all instructions that have them)
c) For bypass, the ALU saves the last two results computed or loaded
d) No indirect addressing

CS-323-12/03/18
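
Sketch: why the bypass keeps the last two ALU results

The reg-reg pipeline above can be illustrated with a small cycle-by-cycle
simulation. This is only a sketch under stated assumptions, not the notes'
own design: the tuple encoding (op, dst, src1, src2), the names run_pipeline
and bypass_depth, and the three-instruction demo program are all
hypothetical. It also assumes that a register written in S4 is not visible
to an S2 read in the same cycle. Under that assumption an instruction in S3
may hold stale copies of registers written by either of the two
instructions ahead of it, which is one way to see why two saved results are
needed.

```python
from collections import deque

# Hypothetical ALU operations for this sketch.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run_pipeline(program, regs, bypass_depth=2):
    """Simulate the four-stage reg-reg pipeline.

    program: list of (op, dst, src1, src2) tuples (hypothetical encoding).
    Returns (final register values, number of cycles).
    """
    regs = list(regs)
    bypass = deque(maxlen=bypass_depth)  # most recent (dst, value) ALU results
    s2 = s3 = s4 = None                  # latches feeding stages S2, S3, S4
    pc = 0
    cycles = 0
    while pc < len(program) or any(x is not None for x in (s2, s3, s4)):
        cycles += 1

        # S1: fetch instruction; adjust PC.
        next_s2 = None
        if pc < len(program):
            next_s2 = program[pc]
            pc += 1

        # S2: decode; speculatively read registers (values may be stale).
        next_s3 = None
        if s2 is not None:
            op, dst, r1, r2 = s2
            next_s3 = (op, dst, (r1, regs[r1]), (r2, regs[r2]))

        # S3: execute, replacing stale operands with bypassed results.
        next_s4 = None
        if s3 is not None:
            op, dst, (r1, v1), (r2, v2) = s3
            for bdst, bval in bypass:    # oldest first, so the newest wins
                if bdst == r1:
                    v1 = bval
                if bdst == r2:
                    v2 = bval
            value = OPS[op](v1, v2)
            bypass.append((dst, value))
            next_s4 = (dst, value)

        # S4: store result to the register file (after this cycle's S2 read).
        if s4 is not None:
            dst, value = s4
            regs[dst] = value

        s2, s3, s4 = next_s2, next_s3, next_s4
    return regs, cycles


# Demo: each instruction depends on the one or two instructions before it.
demo_program = [
    ("add", 3, 1, 2),   # r3 = r1 + r2
    ("sub", 4, 3, 1),   # r4 = r3 - r1  (needs the result of instruction 1)
    ("mul", 5, 3, 4),   # r5 = r3 * r4  (needs the results of 1 and 2)
]
demo_regs = [0, 5, 7, 0, 0, 0]

final_regs, total_cycles = run_pipeline(demo_program, demo_regs)
print(final_regs, total_cycles)
```

With bypass_depth=2 the three dependent instructions finish in 6 cycles
(3 instructions + 3 cycles to drain the pipeline) with r5 = 84 and no
stalls. Calling run_pipeline(demo_program, demo_regs, bypass_depth=1)
leaves r5 = 0, because the multiply then uses a stale copy of r3.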