Share on Facebook

In computer architecture, a delay slot is an instruction slot that gets executed without the effects of a preceding instruction. The most common form is a single arbitrary instruction located immediately after a branch instruction on a RISC or DSP architecture; this instruction will execute even if the preceding branch is taken. Thus, by design, the instructions appear to execute in an illogical or incorrect order. It is typical for assemblers to automatically reorder instructions by default, hiding the awkwardness from assembly developers and compilers.

Velib bike station Steven F. Udvar-Hazy Center: P-40 Warhawk with "sharktooth" nose Steven F. Udvar-Hazy Center: main hall panorama (P-40 et al) Steven F. Udvar-Hazy Center: P-40 Warhawk with "sharktooth" nose Steven F. Udvar-Hazy Center: Lockheed SR-71 Blackbird port panorama (Bowlus 1-S-2100 Senior Albatross "Falcon" overhead) Project 365 #57: 260209 Something is Missing... Steven F. Udvar-Hazy Center: B-29 Superfortress "Enola Gay" panorama Steven F. Udvar-Hazy Center: main hall panorama Marina Bay Steven F. Udvar-Hazy Center: main hall panorama (P-40 et al) Steven F. Udvar-Hazy Center: Profile view of the SR-71 Blackbird, F-4 Corsair, Peashooter, among others Steven F. Udvar-Hazy Center: Grumman F6F-3 Hellcat Steven F. Udvar-Hazy Center: P-40 Warhawk & F-4 Corsair hanging over the SR-71 Blackbird, among others Steven F. Udvar-Hazy Center: South hangar panorama, including Vought OS2U-3 Kingfisher seaplane, B-29 Enola Gay, among others Steven F. Udvar-Hazy Center: B-29 Superfortress "Enola Gay" panorama Steven F. Udvar-Hazy Center: P-40 Warhawk, SR-71 Blackbird, Naval Aircraft Factory N3N seaplane, Space Shuttle Enterprise Steven F. Udvar-Hazy Center: South hangar panorama, including Vought OS2U-3 Kingfisher seaplane, B-29 Enola Gay, among others 3 10 essentials RPSI Santa Special at Matt Goff Bridge Steven F. Udvar-Hazy Center: Vought F4U-1D Corsair, with P-40 Warhawk in background Steven F. Udvar-Hazy Center: Photomontage of  SR-71 on the port side Steven F. Udvar-Hazy Center: Vought F4U-1D Corsair, with P-40 Warhawk and SR-71 Blackbird in background Steven F. Udvar-Hazy Center: SR-71 Blackbird (tail view) Steven F. Udvar-Hazy Center: Lockheed SR-71 Blackbird port panorama (P-40 Warhawk & Bowlus 1-S-2100 Senior Albatross "Falcon" overhead) Steven F. Udvar-Hazy Center: Profile view of the SR-71 Blackbird, F-4 Corsair, P-40 Warhawk, among others Steven F. Udvar-Hazy Center: main hall panorama (SR-71, Space Shuttle, et al) 220/365 - The 365 Days Project Steven F. Udvar-Hazy Center: Photomontage of main entrance view, including P-40 Warhawk & F-4 Corsair up front, SR-71 Background below in the near distance, and the Space Shuttle Enterprise beyond PLAYED LIKE A FIDDLE (AND SCHOOLED LIKE A CHILD) Battle of the Bulge:  The 2nd Command Post of Major Desobry TSA-approved one-arm bandits vegas airport landing slots LAX To Vancouver LAX To Vancouver LAX To Vancouver LAX To Vancouver LAX To Vancouver The finishing touches Project 365(2) Day 29 Fun and Games at Didcot Scary Drilling Shear Line And Storm Drain Installation Debris Following Shear Line Blast Blasted Shear Line Visible And Drilling Production Holes Placing Mats Prior To Shear Line Blast Drilling Shear Line Debris Following Shear Line Blast Shoulder Reconstruction, Widening And Storm Drain Installation low ceiling in Sydney A Lazy Sunday
Images Source: Flickr. Images licensed under the Creative Commons CC-BY-SA
From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computer architecture, a delay slot is an instruction slot that gets executed without the effects of a preceding instruction. The most common form is a single arbitrary instruction located immediately after a branch instruction on a RISC or DSP architecture; this instruction will execute even if the preceding branch is taken. Thus, by design, the instructions appear to execute in an illogical or incorrect order. It is typical for assemblers to automatically reorder instructions by default, hiding the awkwardness from assembly developers and compilers.

Branch delay slots [edit]

When a branch instruction is involved, the location of the following delay slot instruction in the pipeline may be called a branch delay slot. Branch delay slots are found mainly in DSP architectures and older RISC architectures. MIPS, PA-RISC, ETRAX CRIS, SuperH, and SPARC are RISC architectures that each have a single branch delay slot; PowerPC, ARM, and the more recently designed Alpha do not have any. DSP architectures that each have a single branch delay slot include the VS DSP, µPD77230 and TMS320C3x. The SHARC DSP and MIPS-X use a double branch delay slot; such a processor will execute a pair of instructions following a branch instruction before the branch takes effect.

The following example shows delayed branches in assembly language for the SHARC DSP. Registers R0 through R9 are cleared to zero in order by number (the register cleared after R6 is R7, not R9). No instruction executes more than once.

     R0 = 0;
     CALL fn (DB);      /* call a function, below at label "fn" */
     R1 = 0;            /* first delay slot */
     R2 = 0;            /* second delay slot */
     /***** discontinuity here (the CALL takes effect) *****/

     R6 = 0;            /* the CALL/RTS comes back here, not at "R1 = 0" */
     JUMP end (DB);
     R7 = 0;            /* first delay slot */
     R8 = 0;            /* second delay slot */
     /***** discontinuity here (the JUMP takes effect) *****/

     /* next 4 instructions are called from above, as function "fn" */
fn:  R3 = 0;
     RTS (DB);          /* return to caller, past the caller's delay slots */
     R4 = 0;            /* first delay slot */
     R5 = 0;            /* second delay slot */
     /***** discontinuity here (the RTS takes effect) *****/

end: R9 = 0;

The goal of a pipelined architecture is to complete an instruction every clock cycle. To maintain this rate, the pipeline must be full of instructions at all times. The branch delay slot is a side effect of pipelined architectures due to the branch hazard, i.e. the fact that the branch would not be resolved until the instruction has worked its way through the pipeline. A simple design would insert stalls into the pipeline after a branch instruction until the new branch target address is computed and loaded into the program counter. Each cycle where a stall is inserted is considered one branch delay slot. A more sophisticated design would execute program instructions which are not dependent on the result of the branch instruction. This optimization can be performed in software at compile time by moving instructions into branch delay slots in the in-memory instruction stream, if the hardware supports this. Another side effect is that special handling should be taken care of managing breakpoint on instructions as well as stepping while debugging within branch delay slot.

The ideal number of branch delay slots in a particular pipeline implementation is dictated by the number of pipeline stages, the presence of register forwarding, what stage of the pipeline the branch conditions are computed, whether or not a branch target buffer (BTB) is used and many other factors. Software compatibility requirements dictate that an architecture may not change the number of delay slots from one generation to the next. This inevitably requires that newer hardware implementations contain extra hardware to ensure that the architectural behavior is followed despite no longer being relevant.

Load delay slot [edit]

A load delay slot is an instruction which executes immediately after a load (of a register from memory) but does not see the result of the load. Load delay slots are very uncommon because load delays are highly unpredictable on modern hardware. A load may be satisfied from RAM or from a cache, and may be slowed by resource contention. Load delays were seen on very early RISC processor designs. The MIPS I ISA (implemented in the R2000 and R3000 microprocessors) suffers from this problem.

The following example is MIPS I assembly code, showing both a load delay slot and a branch delay slot.

  lw   v0,4(v1)   # load word from address v1+4 into v0
  nop             # useless load delay slot
  jr   v0         # jump to the address specified by v0
  nop             # useless branch delay slot

External links [edit]

Wikipedia content is licensed under the GNU Free Document License or Creative Commons CC-BY-SA
Loading...
Loading...