HALICERY

free-time coding, hardware dev, articles


Last modified: Sun Jun 21 17:17:34 UTC+0200 2026 © A. Tarpai


The programming model of the 8086

Intro

We are at the end of the 1970s, in the era of new 16-bit microprocessors emerging from their 8-bit predecessors (Motorola 6800 to 68000, Zilog Z80 to Z8000, Intel 8080 to 8086).

In a generic 16-bit CPU, registers, ALU operations and the data bus are 16 bits wide. The microprocessor has 16 physical address lines, for a maximum of 64K of byte-addressable memory.

A memory address encoded in an instruction evaluates to a 16-bit value, and that value is the physical memory address: it drives the address bus directly.

      16-bit CPU
+----------------------+
|                      |
|                      |
| 16-bit registers     |
|                      |
|                      |                             +-----------+
| 16-bit instr address-|----------------------/----> |           |
|                      |                   16-bit    |   64 KB   |
|                      |                   address   |    RAM    |
+----------------------+                             +-----------+

The 8086 is an extended version of this 16-bit CPU.

The 8086 hardware model: Segment Registers (SR)

The 8086 is a 16-bit architecture CPU. Registers, ALU operations and data movements are 16 bits wide, and every memory reference encoded in an instruction evaluates to a 16-bit offset value: that is a 64K address space.

The segmented memory model and the segment registers (SR) are a hardware addition in the 8086 CPU, outside of the running program's scope.

They are a way to relocate this 64 KB address space anywhere within a larger, 1 MB space, using on-chip hardware adders and 4 additional physical address lines. Note that programs can still access at most 64 KB of contiguous memory at a time:

                                                        1MB RAM
                                                     +-----------+
                                                     |           |
                                                     |           |
                                                     +-----------+
                                                     |           |       16-bit
                                                     |   64 KB   | <---- EA OFFSET
                                        segment      |           |
                                        base  ---->  +-----------+
                                        SR x 16      |           |
                                                     |           |
    16-bit 8086 CPU                                  |           |
+----------------------+---------------+             |           |
|                      |   on-chip SR  |             |           |
|                      |    ________   |             |           |
| 16-bit registers     |   |___CS___|  |             |           |
|                      |   |___DS___|  |             |           |
|                      |   |___SS___|  |             |           |
| 16-bit instr OFFS  --|-> |___ES___|  | -----/----> |           |
|                      |     16-bit    |   20-bit    |           |
|                      |               |   address   |           |
+----------------------+---------------+             +-----------+
                                                       16 x 64KB

There are 4 segment registers for different types of memory access (code, data, stack, and an extra data segment, e.g. for shared memory). The memory segmentation scheme is mostly designed for modular programming.

For every memory reference that drives the memory address bus (including code fetches and implicit stack operations), the CPU implicitly adds one of the on-chip SR to the 16-bit segment offset to obtain a 20-bit address. On the 8086 this is the physical address.

The adder is 20 bits wide; the SR BASE enters the addition as a 20-bit value in the following way:

8086 PHYSICAL ADDRESS PATH

                +---------------+         +---------------+
                |  16-BIT OFFS  |         |   16-BIT SR   |
                +---------------+         +---------------+
                  |   |   |   |             |   |   |   |
            +---+---+---+---+---+         +---+---+---+---+---+
            | 0 |   |   |   |   |         |   |   |   |   | 0 |   "SR BASE"
            +---+---+---+---+---+         +---+---+---+---+---+
                      |                             |
                      |          20-bit adder       |
                      |             ___             |
                      +------------/ + \------------+
                                   \___/
                                     |
                           A19       |        A0
                           +---+---+---+---+---+
                           |                   |
                           +---+---+---+---+---+
                               20-BIT PHYSICAL
                                  ADDRESS

In this sense, every memory address is only a virtual address for the running program; i.e. the final physical address depends on the current value in the SR. Combined with several SR for different types of memory access, segmentation is a mini memory management system.

Segment registers are 16 bits wide and provide the base address as if shifted left by 4 bits. This gives 16-byte granularity within the 1 MB address space. Segments can overlap, and the same physical address has many possible SR:OFFS combinations.
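
As a minimal sketch of the address path above (Python is used here purely for illustration, and the function name physical_address is an assumption, not anything from Intel's documentation), the SR is shifted left by 4 bits and added to the offset, keeping 20 bits:

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit 8086 physical address for SR:OFFS."""
    base = (segment & 0xFFFF) << 4                 # "SR BASE": SR with 4 zero bits appended
    return (base + (offset & 0xFFFF)) & 0xFFFFF    # 20-bit adder, carry out is lost


# The same physical byte seen through two different SR:OFFS pairs
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```

Because the base moves in 16-byte steps while the offset spans 64 KB, most physical addresses can be reached through thousands of different SR:OFFS pairs.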

8086 CPU REGISTER MODEL

     15                0                          15                0
     +--------+--------+                          +-----------------+
AX   |   AH       AL   | ACCUMULATOR              |       CS        |  CODE SEGMENT
     +--------+--------+                          +-----------------+
BX   |   BH       BL   | BASE                     |       DS        |  DATA SEGMENT
     +--------+--------+                          +-----------------+
CX   |   CH       CL   | COUNT                    |       SS        |  STACK SEGMENT
     +--------+--------+                          +-----------------+
DX   |   DH       DL   | DATA                     |       ES        |  EXTRA SEGMENT
     +--------+--------+                          +-----------------+

     +-----------------+                          +-----------------+
     |       SP        | STACK POINTER            | FLAGH     FLAGL | STATUS FLAGS
     +-----------------+                          +-----------------+
     |       BP        | "STACK" BASE POINTER     |       IP        | INSTRUCTION POINTER
     +-----------------+                          +-----------------+
     |       SI        | SOURCE INDEX
     +-----------------+
     |       DI        | DESTINATION INDEX
     +-----------------+

   General Purpose Registers

There are 8 x 16-bit General Purpose Registers, 4 Segment Registers and the Flags. The Instruction Pointer is not directly accessible, but it is pushed by CALL, set by JMP, etc.

General Purpose Registers: any of them can be the source/destination operand of any instruction with the 3-bit REG field. When the operation size is 8-bit (W=0), the REG field encodes the 8 byte registers instead.

8086 Memory Addressing

8086 memory addressing modes can use up to 3 optional components: 2 registers and a constant displacement value following the instruction bytes. This base + index + displacement scheme was designed to facilitate access to complex data structures, resulting in quite compact code. The base register can be BX or BP, the index register SI or DI. So, more precisely, it is [BX/BP + SI/DI + DISP]. An 8-bit displacement byte is sign-extended to a 16-bit value before the addition.

             x x x x     (B) BASE REGISTER        BP or BX register (with implied SS:BP and DS:BX)
             x x x x     (I) INDEX REGISTER       SI or DI register
    +        x x x x     (D) DISPLACEMENT         8- or 16-bit value
_______________________________________________________________________________________________________

           0 X X X X     16-bit OFFSET            <-- 64K wrap around

    +      X X X X 0     (S) SEGMENT REGISTER     16-bit register: implied or overridden by prefix
_______________________________________________________________________________________________________

           X X X X X     20-bit physical address  <-- 1MB wrap-around

Instructions encode an unsigned, 16-bit OFFSET value.

The address component adder is 16-bit: any overflow bit is lost, resulting in a 64K wrap-around before this value is passed to the 20-bit BIU adder (this wrap-around is properly emulated on the 32-bit 386, see RealMode).
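
The offset calculation can be sketched as follows. This is only an illustration under stated assumptions: the names sign_extend8 and effective_offset and the disp_is_8bit parameter are made up for this example.

```python
def sign_extend8(value: int) -> int:
    """Interpret the low 8 bits of value as a signed byte (-128..127)."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def effective_offset(base: int = 0, index: int = 0, disp: int = 0,
                     disp_is_8bit: bool = False) -> int:
    """Add base + index + displacement in a 16-bit adder (64K wrap-around)."""
    if disp_is_8bit:
        disp = sign_extend8(disp)          # d8 is sign-extended before the addition
    return (base + index + disp) & 0xFFFF  # overflow bit is lost


print(hex(effective_offset(base=0xFFF0, index=0x0020)))                  # 0x10
print(hex(effective_offset(base=0x0100, disp=0xFE, disp_is_8bit=True)))  # 0xfe
print(hex(effective_offset(base=0x0100, disp=0x00FE)))                   # 0x1fe
```

The second and third calls show why the displacement size matters: the byte FEh means -2 as a d8, but +254 as a d16.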

Note that SR + OFFS wraps around the 20-bit physical address on 8086, wrapping back almost 64K to the beginning of the memory range:

     0 X X X X     16-bit OFFS                            8 0 0 0    OFFS
 +   X X X X 0     16-bit SEGMENT REGISTER          +   F F 0 0      SR
_____________________________________________      _________________________
     X X X X X     20-bit physical address            1 0 7 0 0 0    <-- overflow bit truncated, wrap-around
                                                        0 7 0 0 0    <-- 20-bit physical address

Later processors with a wider BIU adder did not emulate this behavior. A20 emulation was first used with the 286: external hardware (e.g. the 8042 UPI on the AT board) gates the 286 CPU's A20 line and simply inhibits driving A20, causing an apparent address wrap-around. See A20.
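
A rough model of the gate, assuming a hypothetical bus_address helper with an a20_enabled switch (on real hardware the gate sits outside the CPU; the function only mimics its effect on the address):

```python
def bus_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Model SR:OFFS on a CPU with a wider adder and an external A20 gate."""
    full = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)  # up to 10FFEFh, needs 21 bits
    if not a20_enabled:
        full &= ~(1 << 20)   # A20 forced low: looks like the 8086 wrap-around
    return full


# The FF00:8000 example from the diagram above
print(hex(bus_address(0xFF00, 0x8000, a20_enabled=True)))   # 0x107000
print(hex(bus_address(0xFF00, 0x8000, a20_enabled=False)))  # 0x7000
```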

I/O space

The I/O space is not segmented. The CPU drives only A[15..0], for a maximum 64 KB I/O address space.

8086 Dedicated and Reserved Memory Locations

16-bit Addressing Modes: the MODR/M byte

The 8086 is a two-operand machine: instructions can operate on two operands.

          DIRECTION
REG  <--------------->  REG/MEM
           WIDTH

Instructions work in either direction (D bit) and move or operate on byte or word data (W bit). One operand is always located in one of the registers (REG field); the second is another register or a memory address (MODR/M field). When MOD=11, R/M is a register.

All this is encoded in OPCODE and the following MODR/M byte:

+---+---+---+---+---+---+---+---+    +---+---+---+---+---+---+---+---+     +-------------+  +-------------+
|      OPCODE           | D | W |    |  MOD  |    REG    |    R/M    |     |  [DISP-LO]  |  |  [DISP-HI]  |
+---+---+---+---+---+---+---+---+    +---+---+---+---+---+---+---+---+     +-------------+  +-------------+

D = 0: the REG field identifies the source operand
D = 1: the REG field identifies the destination operand

W = 0: Instruction operates on BYTE data
W = 1: Instruction operates on WORD data


R/M Field Encoding for MEMORY ADDRESS CALCULATION:                    R/M Field Encoding for REGISTER (MOD=11)

R/M    MOD=00          MOD=01               MOD=10                    R/M    W=0   W=1

000   [ BX + SI ]     [ BX + SI + d8 ]     [ BX + SI + d16 ]          000    AL    AX
001   [ BX + DI ]     [ BX + DI + d8 ]     [ BX + DI + d16 ]          001    CL    CX
010   [ BP + SI ]     [ BP + SI + d8 ]     [ BP + SI + d16 ]          010    DL    DX
011   [ BP + DI ]     [ BP + DI + d8 ]     [ BP + DI + d16 ]          011    BL    BX
100   [ SI ]          [ SI + d8 ]          [ SI + d16 ]               100    AH    SP
101   [ DI ]          [ DI + d8 ]          [ DI + d16 ]               101    CH    BP
110   [ d16]*         [ BP + d8 ]          [ BP + d16 ]               110    DH    SI
111   [ BX ]          [ BX + d8 ]          [ BX + d16 ]               111    BH    DI

* = Direct Memory Addressing, 2 byte DISP follows and that is the OFFSET

MOD    DISPLACEMENT
00     0 byte DISP
01     1 byte DISP: sign-extended to 16-bit
10     2 byte DISP

The default, implied SR is DS, except when BP is part of the address: then it is SS.
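
The tables above translate fairly directly into a small decoder. The sketch below is illustrative only: decode_modrm and the table names are assumptions, and it returns text descriptions rather than fetching any displacement bytes.

```python
EA_TERMS = ["BX + SI", "BX + DI", "BP + SI", "BP + DI", "SI", "DI", "BP", "BX"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def decode_modrm(modrm: int, w: int = 1):
    """Split a MODR/M byte and describe the REG and R/M operands."""
    mod, reg, rm = (modrm >> 6) & 3, (modrm >> 3) & 7, modrm & 7
    regs = REG16 if w else REG8
    if mod == 3:                               # MOD=11: R/M is a register
        return regs[reg], regs[rm]
    if mod == 0 and rm == 6:                   # special case: direct address
        return regs[reg], "DS:[d16]"
    terms = EA_TERMS[rm]
    segment = "SS" if "BP" in terms else "DS"  # BP implies SS, otherwise DS
    disp = ["", " + d8", " + d16"][mod]
    return regs[reg], f"{segment}:[{terms}{disp}]"


print(decode_modrm(0x56))        # ('DX', 'SS:[BP + d8]')
print(decode_modrm(0x16))        # ('DX', 'DS:[d16]')
print(decode_modrm(0xD0, w=0))   # ('DL', 'AL')
```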

This gives the following addressing modes with implied SR:

MOD  R/M   Effective Address               eg. code bytes for MOV mem to DX:

00   000   DS:[BX + SI]                    8B 10            MOV  DX, [BX + SI]
00   001   DS:[BX + DI]                    8B 11            MOV  DX, [BX + DI]
00   010   SS:[BP + SI]                    8B 12            MOV  DX, [BP + SI]
00   011   SS:[BP + DI]                    8B 13            MOV  DX, [BP + DI]
00   100   DS:[SI]                         8B 14            MOV  DX, [SI]
00   101   DS:[DI]                         8B 15            MOV  DX, [DI]
00   110   DS:[d16]*                       8B 16 34 12      MOV  DX, [1234h]*
00   111   DS:[BX]                         8B 17            MOV  DX, [BX]

01   000   DS:[BX + SI + d8]               8B 50 12         MOV  DX, [BX + SI + 12h]
01   001   DS:[BX + DI + d8]               8B 51 12         MOV  DX, [BX + DI + 12h]
01   010   SS:[BP + SI + d8]               8B 52 12         MOV  DX, [BP + SI + 12h]
01   011   SS:[BP + DI + d8]               8B 53 12         MOV  DX, [BP + DI + 12h]
01   100   DS:[SI + d8]                    8B 54 12         MOV  DX, [SI + 12h]
01   101   DS:[DI + d8]                    8B 55 12         MOV  DX, [DI + 12h]
01   110   SS:[BP + d8]                    8B 56 12         MOV  DX, [BP + 12h]
01   111   DS:[BX + d8]                    8B 57 12         MOV  DX, [BX + 12h]

10   000   DS:[BX + SI + d16]              8B 90 34 12      MOV  DX, [BX + SI + 1234h]
10   001   DS:[BX + DI + d16]              8B 91 34 12      MOV  DX, [BX + DI + 1234h]
10   010   SS:[BP + SI + d16]              8B 92 34 12      MOV  DX, [BP + SI + 1234h]
10   011   SS:[BP + DI + d16]              8B 93 34 12      MOV  DX, [BP + DI + 1234h]
10   100   DS:[SI + d16]                   8B 94 34 12      MOV  DX, [SI + 1234h]
10   101   DS:[DI + d16]                   8B 95 34 12      MOV  DX, [DI + 1234h]
10   110   SS:[BP + d16]                   8B 96 34 12      MOV  DX, [BP + 1234h]
10   111   DS:[BX + d16]                   8B 97 34 12      MOV  DX, [BX + 1234h]

* DS:Direct Memory
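
The code bytes in the table can be reproduced with a few lines. This is a hedged sketch, not an assembler: encode_mov_reg_from_mem is a made-up helper that handles only opcode 8Bh (MOV with D=1, W=1).

```python
def encode_mov_reg_from_mem(reg: int, mod: int, rm: int, disp: int = 0) -> bytes:
    """Encode MOV reg16, [mem]: opcode 8Bh, MODR/M byte, optional displacement."""
    code = bytes([0x8B, (mod << 6) | (reg << 3) | rm])
    if mod == 1:
        code += bytes([disp & 0xFF])                   # d8
    elif mod == 2 or (mod == 0 and rm == 6):
        code += (disp & 0xFFFF).to_bytes(2, "little")  # d16: DISP-LO, DISP-HI
    return code


def show(code: bytes) -> str:
    return " ".join(f"{b:02X}" for b in code)


DX = 0b010  # REG field value for DX
print(show(encode_mov_reg_from_mem(DX, mod=0, rm=6, disp=0x1234)))  # 8B 16 34 12
print(show(encode_mov_reg_from_mem(DX, mod=1, rm=6, disp=0x12)))    # 8B 56 12
print(show(encode_mov_reg_from_mem(DX, mod=2, rm=0, disp=0x1234)))  # 8B 90 34 12
```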

Note that a register-to-register move has two possible encodings, depending on the direction bit (D), with the two register fields swapped. These are equivalent:

8B D0            MOV  DX, AX  (REG <-- R/M)
89 C2            MOV  DX, AX  (R/M <-- REG)
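
Working the two encodings out field by field, as a sketch (mov_reg_reg and its d parameter are hypothetical names for this example; W is fixed at 1):

```python
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}


def mov_reg_reg(dst: str, src: str, d: int) -> bytes:
    """Encode MOV dst, src for 16-bit registers with the given D bit."""
    opcode = 0x88 | (d << 1) | 1                  # 100010 D W, with W=1
    reg, rm = (dst, src) if d else (src, dst)     # D=1: REG is the destination
    modrm = 0xC0 | (REG16[reg] << 3) | REG16[rm]  # MOD=11: R/M is a register
    return bytes([opcode, modrm])


print(" ".join(f"{b:02X}" for b in mov_reg_reg("DX", "AX", d=1)))  # 8B D0
print(" ".join(f"{b:02X}" for b in mov_reg_reg("DX", "AX", d=0)))  # 89 C2
```

With D=0 the REG field holds the source (AX = 000) and R/M the destination (DX = 010), which gives the MODR/M byte 11 000 010 = C2h.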

Segment Override Prefix

Default segment register: for all data-manipulation instructions, specifying a segment register is optional, which keeps the code compact.
The default segment register is SS for effective addresses containing BP, and DS for all other effective addresses.

This can be overridden with a Segment Override Prefix:

+---+---+---+---+---+---+---+---+
| 0   0   1   R   R   1   1   0 |    OPCODE   ...
+---+---+---+---+---+---+---+---+

   Segment Override Prefix

26h  ES segment override prefix RR=00
2Eh  CS segment override prefix RR=01
36h  SS segment override prefix RR=10
3Eh  DS segment override prefix RR=11

The same RR field values are used in MOV to/from SR and PUSH/POP SR instructions.
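
The four prefix bytes follow from the bit pattern above. A small sketch, with SEGMENT_RR and segment_override_prefix as illustrative names only:

```python
SEGMENT_RR = {"ES": 0b00, "CS": 0b01, "SS": 0b10, "DS": 0b11}


def segment_override_prefix(sr: str) -> int:
    """Build the prefix byte 001RR110 for the given segment register."""
    return 0b00100110 | (SEGMENT_RR[sr] << 3)


for sr in ("ES", "CS", "SS", "DS"):
    print(sr, f"{segment_override_prefix(sr):02X}h")   # 26h, 2Eh, 36h, 3Eh

# MOV DX, ES:[BX] is the plain encoding 8B 17 with the ES prefix in front
code = bytes([segment_override_prefix("ES"), 0x8B, 0x17])
print(" ".join(f"{b:02X}" for b in code))              # 26 8B 17
```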

8086 Instruction bytes

+--------------+  +--------------+  +--------------+  +--------------+  +--------------+  +--------------+
| INSTRUCTION  |  |  SEGMENT     |  |    OPCODE    |  |    MODR/M    |  |    DISP      |  | IMMEDIATE    |
| PREFIX       |  |  OVERRIDE    |  |              |  |              |  |              |  |              |
+--------------+  +--------------+  +--------------+  +--------------+  +--------------+  +--------------+

   REP/LOCK                                                                   0..2             0..2

8086 FLAGS

6 Status Flags
3 Control Flags

* Note to Motorola programmers: MOV does not affect the Flags.

On RESET all flags are cleared.

  15   14   13   12   11   10    9    8    7    6    5    4    3    2    1    0
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
|    |    |    |    | OF | DF | IF | TF | SF | ZF |    | AF |    | PF |    | CF |  PUSHF/POPF
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
                  FLAGH                                   FLAGL
                                         <------------------------------------->
                                          8080/8085 FLAGS
                                          LAHF/SAHF implemented only for compatibility.
                                          Even IA-32/64 has it. One usage: move FPU Cn.

CF (carry flag)
ZF (zero flag)
SF (sign flag)
AF (auxiliary carry flag) tested only by decimal arithmetic adjust instructions (AAA, AAS, DAA, DAS)
PF (parity flag) set if the low 8 bits of the result contain an even number of 1-bits. Meant for checking data transmission errors.

OF (overflow flag) an 8086 addition for signed-arithmetic support
DF (direction flag) clearing DF causes string instructions to auto-increment
IF (interrupt-enable flag)
TF (trap flag) puts the processor into single-step mode
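
To make the six status flags concrete, here is a sketch of how they come out of an 8-bit ADD. The helper add8_flags is an assumption for illustration; it models only ADD, not the other instructions that update the flags.

```python
def add8_flags(a: int, b: int):
    """Return (result, flags) for an 8-bit ADD of a and b."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                              # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),           # even number of 1-bits
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),            # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                                    # copy of bit 7
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0), # signed overflow
    }
    return result, flags


print(add8_flags(0x7F, 0x01))
# (128, {'CF': 0, 'PF': 0, 'AF': 1, 'ZF': 0, 'SF': 1, 'OF': 1})
print(add8_flags(0xFF, 0x01))
# (0, {'CF': 1, 'PF': 1, 'AF': 1, 'ZF': 1, 'SF': 0, 'OF': 0})
```

The two calls contrast signed and unsigned overflow: 7Fh + 01h overflows as a signed value (OF) but not as an unsigned one, while FFh + 01h does the opposite (CF).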

There are a lot of compatibility decisions here. From the iAPX 86, 88 USER'S MANUAL AUGUST 1981: