Last modified: Sun Jun 21 17:17:34 UTC+0200 2026 © A. Tarpai
The programming model of the 8086
Intro
We are at the end of the 1970s, in the era when newer 16-bit microprocessors are emerging from earlier 8-bit CPUs (Motorola 6800 to 68000, Zilog Z80 to Z8000, Intel 8080 to 8086).
Generally, registers, ALU operations and data buses are 16 bits wide. The microprocessor has 16 physical address lines, for a maximum of 64 KB of byte-addressable memory.
A memory address encoded in an instruction evaluates to a 16-bit value, and that value is the physical memory address, driving the address bus directly.
16-bit CPU
+----------------------+
| |
| |
| 16-bit registers |
| |
| | +-----------+
| 16-bit instr address-|----------------------/----> | |
| | 16-bit | 64 KB |
| | address | RAM |
+----------------------+ +-----------+
The 8086 is an extended version of this 16-bit CPU.
The 8086 hardware model: Segment Registers (SR)
The 8086 is a 16-bit architecture CPU. Registers, ALU operations and data movements are 16 bits wide, and every memory reference encoded in an instruction evaluates to a 16-bit offset value: that is a 64K address space.
The segmented memory model and the segment registers (SR) are a hardware addition in the 8086 CPU, outside of the running program's scope.
They make it possible to relocate this 64 KB address space anywhere within a larger, 1 MB space by using an on-chip hardware adder and 4 additional physical address lines. Note that programs can still access at most 64 KB of contiguous memory per segment:
1MB RAM
+-----------+
| |
| |
+-----------+
| | 16-bit
| 64 KB | <---- EA OFFSET
segment | |
base ----> +-----------+
SR x 16 | |
| |
16-bit 8086 CPU | |
+----------------------+---------------+ | |
| | on-chip SR | | |
| | ________ | | |
| 16-bit registers | |___CS___| | | |
| | |___DS___| | | |
| | |___SS___| | | |
| 16-bit instr OFFS --|-> |___ES___| | -----/----> | |
| | 16-bit | 20-bit | |
| | | address | |
+----------------------+---------------+ +-----------+
16 x 64KB
There are 4 segment registers for different types of memory access (code, data, stack and an extra data segment, eg. for shared memory). The memory segmentation scheme is mostly designed for modular programming.
For every memory reference that drives the memory address bus (including code fetch and implicit stack operations), the CPU implicitly uses one of the on-chip SR and adds it to the 16-bit SEGMENT OFFSET to obtain a 20-bit address. For the 8086 this is the physical address.
The adder is 20 bits wide; the SR is extended to a 20-bit SR BASE for the addition in the following way:
8086 PHYSICAL ADDRESS PATH
+---------------+ +---------------+
| 16-BIT OFFS | | 16-BIT SR |
+---------------+ +---------------+
| | | | | | | |
+---+---+---+---+---+ +---+---+---+---+---+
| 0 | | | | | | | | | | 0 | "SR BASE"
+---+---+---+---+---+ +---+---+---+---+---+
| |
| 20-bit adder |
| ___ |
+------------/ + \------------+
\___/
|
A19 | A0
+---+---+---+---+---+
| |
+---+---+---+---+---+
20-BIT PHYSICAL
ADDRESS
In this sense, every memory address is only a virtual address for the running program; i.e. the final physical address depends on the current value in the SR. Combined with several SR for different types of memory access, segmentation is a mini memory management system.
Segment registers are 16 bits wide and provide the base address as if shifted left by 4 bits. This gives 16-byte granularity within the 1 MB address space. Segments can overlap, and the same physical address has many possible SR:OFFS combinations.
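The address formation described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of the actual hardware; the function names `physical_address` and `aliases` are made up for this example.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit 8086 physical address for segment:offset."""
    base = (segment & 0xFFFF) << 4               # SR BASE: 16-bit SR shifted left by 4
    return (base + (offset & 0xFFFF)) & 0xFFFFF  # 20-bit adder, carry out of A19 is lost


def aliases(physical: int, limit: int = 4):
    """Return up to `limit` (segment, offset) pairs that map to `physical`.

    Only the non-wrapping combinations are listed, starting from the
    highest possible segment value.
    """
    found = []
    segment = physical >> 4
    offset = physical & 0xF
    while segment >= 0 and offset <= 0xFFFF and len(found) < limit:
        found.append((segment, offset))
        segment -= 1      # one paragraph (16 bytes) lower base ...
        offset += 0x10    # ... needs a 16 bytes larger offset
    return found


if __name__ == "__main__":
    print(f"{physical_address(0x1234, 0x5678):05X}")
    for seg, off in aliases(0x179B8):
        print(f"{seg:04X}:{off:04X}")
```

Every pair returned by `aliases` resolves to the same physical address, which illustrates the 16-byte granularity and the overlap of segments.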
8086 CPU REGISTER MODEL
15 0 15 0
+--------+--------+ +-----------------+
AX | AH AL | ACCUMULATOR | CS | CODE SEGMENT
+--------+--------+ +-----------------+
BX | BH BL | BASE | DS | DATA SEGMENT
+--------+--------+ +-----------------+
CX | CH CL | COUNT | SS | STACK SEGMENT
+--------+--------+ +-----------------+
DX | DH DL | DATA | ES | EXTRA SEGMENT
+--------+--------+ +-----------------+
+-----------------+ +-----------------+
| SP | STACK POINTER | FLAGH FLAGL | STATUS FLAGS
+-----------------+ +-----------------+
| BP | "STACK" BASE POINTER | IP | INSTRUCTION POINTER
+-----------------+ +-----------------+
| SI | SOURCE INDEX
+-----------------+
| DI | DESTINATION INDEX
+-----------------+
General Purpose Registers
There are 8 x 16-bit General Purpose Registers, 4 Segment Registers and the Flags. The Instruction Pointer is not directly accessible, but it is pushed by CALL and INT, set by JMP and RET, etc.
- A = ACCUMULATOR. Several instructions implicitly use AX/AL for compact code.
- B = BASE. Used as Base register for the Data Segment. Implicit for eg. XLAT.
- C = COUNT. Implicit use for LOOP, string instructions.
- D = DATA. Implicit Accumulator extension for MUL/DIV (DX:AX), used for IN/OUT port addressing.
- SP = STACK POINTER. For Stack operations, implicit.
- BP = BASE POINTER (like BX but for the STACK). Fast and compact method to access all local variables on stack, with addressing modes using Stack Segment.
- SI/DI = SOURCE/DESTINATION INDEX. Addressing modes and Implicit for string instructions.
General Purpose Registers: any of them can be source/destination operand for any instruction with the 3-bit REG-field. When the operation size is 8-bit (W=0), the REG-field encodes the 8 byte-registers.
8086 Memory Addressing
8086 memory addressing modes can use up to 3 optional components: 2 registers and a constant displacement value that follows the MODR/M byte. This base + index + displacement scheme was designed to facilitate accessing complex data structures, and results in quite compact code. The base register can be BX or BP, the index register SI or DI. So more precisely, it is [BX/BP + SI/DI + DISP]. An 8-bit displacement byte is sign-extended to a 16-bit value before the addition.
x x x x (B) BASE REGISTER BP or BX register (with implied SS:BP and DS:BX)
x x x x (I) INDEX REGISTER SI or DI register
+ x x x x (D) DISPLACEMENT 8- or 16-bit value
_______________________________________________________________________________________________________
0 X X X X 16-bit OFFSET <-- 64K wrap around
+ X X X X 0 (S) SEGMENT REGISTER 16-bit register: implied or overridden by prefix
_______________________________________________________________________________________________________
X X X X X 20-bit physical address <-- 1MB wrap-around
Instructions encode an unsigned, 16-bit OFFSET value.
The address component adder is 16 bits wide: any carry out of bit 15 is lost, resulting in a 64K wrap-around before this value is passed to the 20-bit BIU adder (this wrap-around is properly emulated on the 32-bit 386, see RealMode).
Note that SR + OFFS also wraps around the 20-bit physical address on the 8086, wrapping back almost 64K to the beginning of the memory range:
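As a minimal sketch of the offset calculation, the following Python model adds the optional components and truncates the sum to 16 bits. The names `sign_extend8` and `effective_offset` and the `disp_bits` parameter are assumptions made for this example.

```python
def sign_extend8(value: int) -> int:
    """Sign-extend an 8-bit displacement byte to a 16-bit value."""
    value &= 0xFF
    return value | 0xFF00 if value & 0x80 else value


def effective_offset(base: int = 0, index: int = 0, disp: int = 0,
                     disp_bits: int = 16) -> int:
    """Return the 16-bit OFFSET for [base + index + disp].

    `base` is the value of BX or BP, `index` the value of SI or DI.
    Components that are not used are simply left at 0.
    """
    if disp_bits == 8:
        disp = sign_extend8(disp)
    # 16-bit adder: the carry out of bit 15 is dropped (64K wrap-around)
    return (base + index + disp) & 0xFFFF


if __name__ == "__main__":
    # [BX + SI] with BX=FFFFh, SI=0002h wraps to 0001h
    print(f"{effective_offset(base=0xFFFF, index=0x0002):04X}")
    # [BP + d8] with BP=0010h and d8=F0h (that is -16) gives 0000h
    print(f"{effective_offset(base=0x0010, disp=0xF0, disp_bits=8):04X}")
```

Because the displacement byte is sign-extended, a d8 of F0h acts as -16, so local variables below BP can be reached with a single displacement byte.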
0 X X X X 16-bit OFFS 8 0 0 0 OFFS
+ X X X X 0 16-bit SEGMENT REGISTER + F F 0 0 SR
_____________________________________________ _________________________
X X X X X 20-bit physical address 1 0 7 0 0 0 <-- overflow bit truncated, wrap-around
0 7 0 0 0 <-- 20-bit physical address
Later processors with a wider address adder do not wrap around this way by themselves. A20 emulation was first used with the 286: external hardware on the AT board (eg. the 8042 UPI) gates the A20 line of the 286 CPU and simply inhibits driving it, causing an apparent address wrap-around. See A20.
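The difference between the 8086 wrap-around and the A20 gate can be sketched as follows. This is a simplified model that only looks at the SR + OFFS sum; the function names and the `a20_enabled` parameter are hypothetical.

```python
def physical_8086(segment: int, offset: int) -> int:
    """8086: the sum is truncated to 20 bits (A19..A0)."""
    return ((segment << 4) + offset) & 0xFFFFF


def physical_with_a20_gate(segment: int, offset: int, a20_enabled: bool) -> int:
    """Model of a later CPU in real mode with an external A20 gate.

    The sum can reach 10FFEFh, so it needs 21 bits. When the gate is
    closed, address line A20 is forced low, which imitates the 8086
    wrap-around.
    """
    address = (segment << 4) + offset
    if not a20_enabled:
        address &= ~(1 << 20)   # A20 is not driven
    return address


if __name__ == "__main__":
    print(f"{physical_8086(0xFF00, 0x8000):05X}")
    print(f"{physical_with_a20_gate(0xFF00, 0x8000, a20_enabled=True):06X}")
    print(f"{physical_with_a20_gate(0xFF00, 0x8000, a20_enabled=False):06X}")
```

With FF00:8000 from the example above, the 8086 and the closed A20 gate both yield 07000h, while an open gate reaches 107000h, just above the 1 MB boundary.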
I/O space
The I/O space is not segmented. The CPU drives only A[15..0] with max 64 KB I/O address space.
8086 Dedicated and Reserved Memory Locations
- 0H through 7FH (128 bytes): interrupt processing (32 vectors)
- FFFF0H through FFFFFH (16 bytes): system reset processing
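A small sketch of how these locations are addressed, assuming the usual layout of 4 bytes per interrupt vector (16-bit offset followed by 16-bit segment, which matches 32 vectors in 128 bytes) and a reset start address of FFFF:0000. The helper names are made up for this example.

```python
VECTOR_SIZE = 4  # bytes per vector: 16-bit offset, then 16-bit segment


def vector_address(number: int) -> int:
    """Return the physical address of interrupt vector `number`."""
    return number * VECTOR_SIZE


def read_vector(memory: bytes, number: int):
    """Return (segment, offset) of a vector from a little-endian memory image."""
    base = vector_address(number)
    offset = int.from_bytes(memory[base:base + 2], "little")
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return segment, offset


# After RESET execution starts at CS:IP = FFFF:0000
RESET_CS, RESET_IP = 0xFFFF, 0x0000
reset_address = ((RESET_CS << 4) + RESET_IP) & 0xFFFFF

if __name__ == "__main__":
    print(f"last byte of vector 31: {vector_address(31) + VECTOR_SIZE - 1:02X}h")
    print(f"reset address: {reset_address:05X}h")
```

The 16 bytes from FFFF0H up to the end of memory are just enough room for a far jump to the real start-up code.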
16-bit Addressing Modes: the MODR/M byte
The 8086 is a two-operand machine: instructions can operate on two operands.
DIRECTION
REG <---------------> REG/MEM
WIDTH
In two directions (D-bit), moving/operating on byte or word data (W-bit). One operand is always located in one of the registers (REG field), the second is another register or memory address (MODR/M field). When MOD=11, R/M is a register.
All this is encoded in OPCODE and the following MODR/M byte:
+---+---+---+---+---+---+---+---+  +---+---+---+---+---+---+---+---+  +-------------+  +-------------+
|         OPCODE        | D | W |  |  MOD  |    REG    |    R/M    |  |  [DISP-LO]  |  |  [DISP-HI]  |
+---+---+---+---+---+---+---+---+  +---+---+---+---+---+---+---+---+  +-------------+  +-------------+

D = 0: the REG field identifies the source operand
D = 1: the REG field identifies the destination operand
W = 0: Instruction operates on BYTE data
W = 1: Instruction operates on WORD data

R/M Field Encoding for MEMORY ADDRESS CALCULATION:            R/M Field Encoding for REGISTER (MOD=11)

R/M   MOD=00        MOD=01              MOD=10                R/M   W=0   W=1
000   [ BX + SI ]   [ BX + SI + d8 ]    [ BX + SI + d16 ]     000   AL    AX
001   [ BX + DI ]   [ BX + DI + d8 ]    [ BX + DI + d16 ]     001   CL    CX
010   [ BP + SI ]   [ BP + SI + d8 ]    [ BP + SI + d16 ]     010   DL    DX
011   [ BP + DI ]   [ BP + DI + d8 ]    [ BP + DI + d16 ]     011   BL    BX
100   [ SI ]        [ SI + d8 ]         [ SI + d16 ]          100   AH    SP
101   [ DI ]        [ DI + d8 ]         [ DI + d16 ]          101   CH    BP
110   [ d16 ]*      [ BP + d8 ]         [ BP + d16 ]          110   DH    SI
111   [ BX ]        [ BX + d8 ]         [ BX + d16 ]          111   BH    DI

* = Direct Memory Addressing, 2 byte DISP follows and that is the OFFSET

MOD   DISPLACEMENT
00    0 byte DISP
01    1 byte DISP: sign-extended to 16-bit
10    2 byte DISP
The default and implied SR is DS except when BP is encoded: then SS.
This gives the following addressing modes with implied SR:
MOD R/M   Effective Address     eg. code bytes for MOV mem to DX:
00  000   DS:[BX + SI]          8B 10          MOV DX, [BX + SI]
00  001   DS:[BX + DI]          8B 11          MOV DX, [BX + DI]
00  010   SS:[BP + SI]          8B 12          MOV DX, [BP + SI]
00  011   SS:[BP + DI]          8B 13          MOV DX, [BP + DI]
00  100   DS:[SI]               8B 14          MOV DX, [SI]
00  101   DS:[DI]               8B 15          MOV DX, [DI]
00  110   DS:[d16]*             8B 16 34 12    MOV DX, [1234h]*
00  111   DS:[BX]               8B 17          MOV DX, [BX]
01  000   DS:[BX + SI + d8]     8B 50 12       MOV DX, [BX + SI + 12h]
01  001   DS:[BX + DI + d8]     8B 51 12       MOV DX, [BX + DI + 12h]
01  010   SS:[BP + SI + d8]     8B 52 12       MOV DX, [BP + SI + 12h]
01  011   SS:[BP + DI + d8]     8B 53 12       MOV DX, [BP + DI + 12h]
01  100   DS:[SI + d8]          8B 54 12       MOV DX, [SI + 12h]
01  101   DS:[DI + d8]          8B 55 12       MOV DX, [DI + 12h]
01  110   SS:[BP + d8]          8B 56 12       MOV DX, [BP + 12h]
01  111   DS:[BX + d8]          8B 57 12       MOV DX, [BX + 12h]
10  000   DS:[BX + SI + d16]    8B 90 34 12    MOV DX, [BX + SI + 1234h]
10  001   DS:[BX + DI + d16]    8B 91 34 12    MOV DX, [BX + DI + 1234h]
10  010   SS:[BP + SI + d16]    8B 92 34 12    MOV DX, [BP + SI + 1234h]
10  011   SS:[BP + DI + d16]    8B 93 34 12    MOV DX, [BP + DI + 1234h]
10  100   DS:[SI + d16]         8B 94 34 12    MOV DX, [SI + 1234h]
10  101   DS:[DI + d16]         8B 95 34 12    MOV DX, [DI + 1234h]
10  110   SS:[BP + d16]         8B 96 34 12    MOV DX, [BP + 1234h]
10  111   DS:[BX + d16]         8B 97 34 12    MOV DX, [BX + 1234h]

* DS:Direct Memory
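The encoding tables above can be turned into a small decoder. The following Python sketch is one possible way to do it, not a full disassembler: it only handles the MODR/M byte and its displacement, and the names `decode_modrm`, `REG16`, `REG8` and `RM_BASE` are chosen for this example.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
RM_BASE = ["BX + SI", "BX + DI", "BP + SI", "BP + DI", "SI", "DI", "BP", "BX"]


def decode_modrm(code: bytes, w: int = 1):
    """Decode a MODR/M byte plus displacement; `code` starts at the MODR/M byte.

    Returns (REG operand, R/M operand, number of bytes consumed).
    """
    modrm = code[0]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    regs = REG16 if w else REG8
    if mod == 3:                                   # R/M is a register
        return regs[reg], regs[rm], 1
    if mod == 0 and rm == 6:                       # direct memory addressing
        disp = code[1] | (code[2] << 8)
        return regs[reg], f"DS:[{disp:04X}h]", 3
    segment = "SS" if "BP" in RM_BASE[rm] else "DS"  # implied segment register
    if mod == 0:
        return regs[reg], f"{segment}:[{RM_BASE[rm]}]", 1
    if mod == 1:                                   # d8, sign-extended
        disp = code[1] - 0x100 if code[1] & 0x80 else code[1]
        sign = "-" if disp < 0 else "+"
        return regs[reg], f"{segment}:[{RM_BASE[rm]} {sign} {abs(disp):02X}h]", 2
    disp = code[1] | (code[2] << 8)                # mod == 2: d16
    return regs[reg], f"{segment}:[{RM_BASE[rm]} + {disp:04X}h]", 3


if __name__ == "__main__":
    # 8B 56 12 = MOV DX, [BP + 12h]; decoding starts after the opcode byte 8B
    print(decode_modrm(bytes([0x56, 0x12])))
```

The opcode byte itself (with its D and W bits) is deliberately left to the caller, so the same function serves every instruction that uses a MODR/M byte.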
Note that for a register-to-register move there are two opcode possibilities, based on the direction bit (D) and on swapping the REG and R/M field encodings. These are equivalent:
8B D0    MOV DX, AX    (REG <-- R/M)
89 C2    MOV DX, AX    (R/M <-- REG)
Segment Override Prefix
For all data-manipulation instructions, specifying a segment register is optional: a default segment register is implied, which keeps the code compact.
The default segment register is SS for effective addresses containing BP, and DS for all other effective addresses.
This can be overridden with a Segment Override Prefix:
+---+---+---+---+---+---+---+---+
| 0   0   1   R   R   1   1   0 |  OPCODE ...
+---+---+---+---+---+---+---+---+
     Segment Override Prefix

26h   ES segment override prefix   RR=00
2Eh   CS segment override prefix   RR=01
36h   SS segment override prefix   RR=10
3Eh   DS segment override prefix   RR=11
The same RR field values are used in MOV to/from SR and PUSH/POP SR instructions.
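The prefix byte follows directly from the bit pattern 001RR110. A minimal sketch in Python, where `SEGMENT_RR`, `segment_override_prefix` and `decode_prefix` are names invented for this example:

```python
SEGMENT_RR = {"ES": 0b00, "CS": 0b01, "SS": 0b10, "DS": 0b11}


def segment_override_prefix(name: str) -> int:
    """Build the prefix byte 001RR110 for the given segment register."""
    return 0b00100110 | (SEGMENT_RR[name] << 3)


def decode_prefix(byte: int):
    """Return the segment register name for a prefix byte, else None."""
    if byte & 0b11100111 != 0b00100110:
        return None                      # fixed bits do not match 001xx110
    rr = (byte >> 3) & 0b11
    return next(name for name, value in SEGMENT_RR.items() if value == rr)


if __name__ == "__main__":
    for name in ("ES", "CS", "SS", "DS"):
        print(f"{segment_override_prefix(name):02X}h  {name}:")
    # 26 8B 17 would then read as MOV DX, ES:[BX]
    print(decode_prefix(0x26))
```

Prepending 26h to 8B 17 (MOV DX, [BX]) replaces the implied DS with ES for that one instruction.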
8086 Instruction bytes
+--------------+ +--------------+ +--------------+ +--------------+ +--------------+ +--------------+
| INSTRUCTION  | |   SEGMENT    | |    OPCODE    | |    MODR/M    | |     DISP     | |  IMMEDIATE   |
|    PREFIX    | |   OVERRIDE   | |              | |              | |              | |              |
+--------------+ +--------------+ +--------------+ +--------------+ +--------------+ +--------------+
    REP/LOCK                                                              0..2             0..2
8086 FLAGS
6 Status Flags
3 Control Flags
* Note to Motorola programmers: MOV does not affect the Flags.
On RESET all cleared.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| | | | | OF | DF | IF | TF | SF | ZF | | AF | | PF | | CF | PUSHF/POPF
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
FLAGH FLAGL
<------------------------------------->
8080/8085 FLAGS
LAHF/SAHF are implemented only for compatibility.
Even IA-32/64 has them. One use: moving the FPU condition codes (Cn) into the flags.
CF (carry flag)
ZF (zero flag)
SF (sign flag)
AF (auxiliary carry flag) tested only by decimal arithmetic adjust instructions (AAA, AAS, DAA, DAS)
PF (parity flag) = set if the low-order 8 bits of the result contain an even number of 1-bits. Meant for checking data transmission errors.
OF (overflow flag) an 8086 addition for signed-arithmetic support
DF (direction flag) clearing DF causes string instructions to auto-increment
IF (interrupt-enable flag)
TF (trap flag) puts the processor into single-step mode
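How the six status flags come out of an operation can be sketched with an 8-bit addition. This is an illustrative model only; the function name `add8_flags` is made up, and the bit positions are taken from the flags layout above.

```python
# Bit masks by position in the 16-bit flags word
CF, PF, AF, ZF, SF, TF, IF, DF, OF = (1 << b for b in (0, 2, 4, 6, 7, 8, 9, 10, 11))


def add8_flags(a: int, b: int):
    """Add two bytes and return (result, status flags) as after ADD r8, r8."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = 0
    if total > 0xFF:                             # carry out of bit 7
        flags |= CF
    if bin(result).count("1") % 2 == 0:          # even number of 1-bits
        flags |= PF
    if (a & 0x0F) + (b & 0x0F) > 0x0F:           # carry out of bit 3
        flags |= AF
    if result == 0:
        flags |= ZF
    if result & 0x80:                            # sign bit of the result
        flags |= SF
    if ~(a ^ b) & (a ^ result) & 0x80:           # same-sign operands, different-sign result
        flags |= OF
    return result, flags


if __name__ == "__main__":
    result, flags = add8_flags(0x7F, 0x01)
    print(f"result={result:02X}h flags={flags:04X}h")
```

Adding 7Fh and 01h sets OF but not CF, while FFh plus 01h sets CF but not OF: the same addition is checked for unsigned overflow through CF and for signed overflow through OF.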
There are a lot of compatibility decisions here. From the iAPX 86, 88 USER'S MANUAL AUGUST 1981:
- "The registers, flags and program counter in the 8080/8085 CPUs all have counterparts in the 8086 and 8088."
- "The AF, CF, PF, SF, and ZF flags are the same in both CPU families."
- "This 8080/8085 to 8086 mapping allows most existing 8080/8085 program code to be directly translated into 8086/8088 code."