HALICERY

free-time coding, hardware dev, articles


Last modified: Mon Jun 29 08:07:05 UTC+0200 2026 © A. Tarpai


Expand down segments

Stack overflow (e.g. too many PUSHes) is a common problem. Intel 286 protected mode was designed to catch and signal this violation through an exception, and to make it possible to increase the stack size.

Expand-down segments are primarily made for the stack, which pre-decrements SP before pushing data.

286 16-bit Expand down

The E=1 bit in DATA/STACK SEGMENT DESCRIPTORs (S=1 X=0) means valid offset must be above limit – see Descriptors.

          __
    FFFF    |
    ....    |
    ....    |  E=1 EXPAND-DOWN
    ....    |  VALID OFFSET > LIMIT
    ....  __|
    LIMIT   |
    ....    |
    ....    |
    ....    |  E=0
    ....    |  VALID OFFSET <= LIMIT
    ....    |
    ....    |
    0000  __|
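
The rule in the diagram above can be written as a small predicate. This is only a minimal sketch in Python, assuming a single byte access and 16-bit offsets; the function name `offset_valid_286` is made up for illustration.

```python
def offset_valid_286(offset, limit, expand_down):
    """Illustrative model of the 286 limit check for one byte access."""
    offset &= 0xFFFF
    limit &= 0xFFFF
    if expand_down:
        # E=1: valid offsets lie strictly above LIMIT (LIMIT+1 .. FFFF)
        return offset > limit
    # E=0: valid offsets are 0 .. LIMIT
    return offset <= limit


# With LIMIT = 27FFh the two types accept complementary ranges.
print(offset_valid_286(0x27FF, 0x27FF, expand_down=False))  # True
print(offset_valid_286(0x27FF, 0x27FF, expand_down=True))   # False
print(offset_valid_286(0x2800, 0x27FF, expand_down=True))   # True
```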

Why the hell do we need ED?

I can just set up a stack as E-UP and it works. Consider a 10K Size (2800h) stack allocated with E-UP type:

Start = alloc(Size)

|          |
+----------+  <--- LIMIT = 2800h - 1 = 27FFh
|          |
|   Size   |
|          |
|          |
+--Start---+  <--- BASE = Start
|          |
|          |

Then set SP = 2800 and start pushing..

SP = 2800
start pushing words...


2800  | INVALID               ^
-------------------           |
27FE  |                       |POP
27FC  |                       |
      |                       |      |
      | VALID                        |
0002  |                              | PUSH
0000  |                              |
-------------------                  v
FFFE  | INVALID

Both too many POPs and too many PUSHes will violate LIMIT, so it works just fine (too many POPs is already a severe error, so let's just talk about PUSH).

But the problem is: on stack overflow we cannot change the segment parameters to make room for more pushes, because the last valid offset was already 0. This segment is not expandable.

It could only work if a lower limit could be set and pushing started from the high end of the offset range.

With E-D it would be like this. We start pushing from SP = 0:

SP = 0
start pushing words...


0000  | INVALID               ^
-------------------           |
FFFE  |                       |POP
FFFC  |                       |
      |                       |      |
      | VALID                        |
D802  |                              | PUSH
D800  |                              |
-------------------                  v
D7FE  | INVALID

With this setup it's easy and possible to add more space to the stack by decreasing segment limit.
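
To make the comparison concrete, the two 10K setups can be simulated. The sketch below is a simplified Python model, not CPU-accurate microcode: it assumes word pushes only, checks both bytes of each word against the limit, and uses the made-up helper names `word_ok` and `pushes_until_fault`.

```python
def word_ok(sp, limit, expand_down):
    """Can a 16-bit word be stored at offset sp without a limit violation?"""
    if expand_down:
        return sp > limit and sp + 1 <= 0xFFFF
    return sp + 1 <= limit


def pushes_until_fault(sp, limit, expand_down):
    """Count word pushes that succeed before the first limit violation.

    Returns None if no push ever faults (limit checking effectively off).
    """
    for count in range(0x8001):
        nxt = (sp - 2) & 0xFFFF          # PUSH pre-decrements SP
        if not word_ok(nxt, limit, expand_down):
            return count
        sp = nxt
    return None


print(pushes_until_fault(0x2800, 0x27FF, expand_down=False))  # 5120
print(pushes_until_fault(0x0000, 0xD7FF, expand_down=True))   # 5120
# Only the expand-down stack gains room when LIMIT is changed:
print(pushes_until_fault(0x0000, 0xAFFF, expand_down=True))   # 10240
```

Both layouts hold the same 5120 words, but only the second one can be enlarged by touching LIMIT alone.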

But where is BASE and what is the LIMIT for 10K E-D?

Set up as expandable stack

Lets start with the same alloc of 10K.

Start = alloc(Size)

|          |  <--- Start + Size
+----------+
|          |
|   Size   |
|          |
|          |
+--Start---+  <--- LIMIT = −Size − 1 = NOT Size
|          |
|          |  <---  BASE = Start + Size − 64K

The first PUSH should move data to the top of the allocated area. PUSH is pre-decrement, and ED makes sure that only offsets above the limit are accepted. So we have to start from SP = 0; the first PUSH offset will then be FFFE (16-bit pushes words).

BASE is 64K lower than Start + Size. The reason is that 286 BASE values are 24-bit and the offset addition is unsigned:

+---+---+---+---+---+---+
| 0   0 |  16-BIT OFFS  |
+---+---+---+---+---+---+
            +
+---+---+---+---+---+---+
|     24-BIT SR BASE    |
+---+---+---+---+---+---+

LIMIT is 64K − Size − 1. 64K − Size is D800, but D800 itself has to remain a valid offset, so LIMIT is one less: D7FF. That leaves exactly 10K valid addresses above LIMIT. In 16-bit arithmetic this is a simple NOT operation (−a − 1 = NOT a).

LIMIT = NOT 2800 = D7FF
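
The BASE and LIMIT arithmetic above can be checked with a short Python sketch. The function name `expand_down_286` and the example block address 20000h are assumptions for illustration; the 24-bit masking of BASE models the 286 address width.

```python
SEG_64K = 0x10000


def expand_down_286(start, size):
    """Return (BASE, LIMIT) for a 16-bit expand-down stack occupying
    `size` bytes beginning at physical address `start`."""
    if not 0 < size < SEG_64K:
        raise ValueError("size must be 1 .. 64K-1 bytes")
    base = (start + size - SEG_64K) & 0xFFFFFF   # 64K below the top of the block
    limit = ~size & 0xFFFF                        # NOT Size = 64K - Size - 1
    return base, limit


start, size = 0x20000, 0x2800                     # hypothetical 10K block
base, limit = expand_down_286(start, size)
print(f"BASE={base:06X} LIMIT={limit:04X}")       # BASE=012800 LIMIT=D7FF

# First PUSH: SP goes 0 -> FFFE and lands on the top word of the block.
sp = (0 - 2) & 0xFFFF
print(f"{(base + sp) & 0xFFFFFF:06X}")            # 0227FE

# Lowest valid offset (LIMIT + 1) maps back to Start.
print(f"{(base + limit + 1) & 0xFFFFFF:06X}")     # 020000
```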

Segment Expansion

            |           |                                        |           |
            |           |                                        |           |
        ^   |           |                                SP=0    +-----------+
        |   |   NO-NO!  |                          PUSH  FFFE|   | xxxxxxxxx |
        |   |    AREA   |                          PUSH  FFFC|   | xxxxxxxxx |  allocated
64K     |   +-----------+  <-- LIMIT               PUSH  ..  |   | xxxxxxxxx |  stack
OFFS    |   | xxxxxxxxx |                          PUSH  ..  |   | xxxxxxxxx |
range   |   | xxxxxxxxx |  allocated               PUSH  ..  |   +-----------+  <-- LIMIT
        |   | xxxxxxxxx |  memory                  PUSH  ..  |   |   NO-NO!  |
        |   | xxxxxxxxx |                                    v   |    AREA   |
        0   +-----------+                                        |           |
            |           |                                        |           |
            |           |                                        |           |
            +-----------+ 0                                      +-----------+ 0
              EXPAND-UP                                           EXPAND-DOWN

On limit violation both types can be properly extended:

EXPAND limit up:

E.g. buffer overrun: the CPU can catch this with an exception. The handler can then allocate a larger block, copy the previous content over and free the old block, point the segment BASE at the new block and increase the segment limit. The program that caused the limit violation can gracefully continue (all pointer offsets and content are unchanged for the program). On 386+ with paging this is even more efficient: map a new page for the extension and continue, with no need to copy the previous content.

EXPAND limit down:

Hardware can catch stack overflow (e.g. too many PUSHes). But without the EXPAND-DOWN segment feature it would be impossible to keep all pointer offsets and content unchanged for the running program. Expansion is done by decreasing the segment limit; the initial SP has to be 0 for this to work.
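
One way an exception handler might grow such a stack is sketched below in Python. It assumes the handler moves the stack into a larger block, mirroring the expand-up procedure; if the memory directly below the old block happens to be free, lowering LIMIT alone is enough. The function name `grow_stack` and all addresses are hypothetical.

```python
SEG_64K = 0x10000


def grow_stack(old_size, new_start, new_size):
    """Parameters for moving a 16-bit expand-down stack to a larger block.

    Returns (BASE, LIMIT, copy_dst): the new descriptor values and the
    address where the old stack content must be copied (top of new block).
    """
    if not 0 < old_size < new_size < SEG_64K:
        raise ValueError("need 0 < old_size < new_size < 64K")
    base = (new_start + new_size - SEG_64K) & 0xFFFFFF
    limit = ~new_size & 0xFFFF                    # lower LIMIT = more room
    copy_dst = new_start + (new_size - old_size)  # old data stays at the top
    return base, limit, copy_dst


# Old 10K stack, moved into a hypothetical 20K block at 40000h.
base, limit, dst = grow_stack(0x2800, 0x40000, 0x5000)
print(f"BASE={base:06X} LIMIT={limit:04X} copy to {dst:06X}")
# BASE=035000 LIMIT=AFFF copy to 042800

# Offset D800 (bottom of the old stack) now resolves to the copied data,
# so SP and every stack pointer held by the program stay valid.
print(f"{(base + 0xD800) & 0xFFFFFF:06X}")        # 042800
```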

Remark. A stack limit violation at the highest privilege level ends in CPU shutdown: the CPU has no stack space left to push the parameters for the exception handler.

386 Expand down

B=1 32-bit Expand down Stack

Setting up BASE

When B = 1, the 32-bit offset checked against the limit and added to SS-BASE ranges from 0000_0000 to FFFF_FFFF.

Expand down stack initially sets ESP = 0. The first dword push will decrement ESP by 4 and move data to offset FFFF_FFFC.

              ESP = 0
push...       ESP = FFFF_FFFC      = -4
push...       ESP = FFFF_FFF8      = -8
push...       ESP = FFFF_FFF4      = -12
..            ..
..            ..

On the 32-bit 386, both BASE- and OFFSET values are 32-bit (SR is not used for address-space extension).

+---+---+---+---+---+---+---+---+
|         32-BIT OFFS           |
+---+---+---+---+---+---+---+---+
                +
+---+---+---+---+---+---+---+---+
|         32-BIT SR BASE        |
+---+---+---+---+---+---+---+---+

This addition can be considered signed, eg. the value FFFFFFFF = −1 (the addition wraps around 32-bit).

This makes it quite simple to set up BASE.
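
The wrap-around can be checked in a few lines of Python. This is a sketch only: the BASE value 0012_3000 is an arbitrary example standing in for Start + Size, and `linear_address` and `push_dword` are illustrative helper names.

```python
MASK32 = 0xFFFFFFFF


def linear_address(base, offset):
    """32-bit BASE + OFFSET; the carry out of bit 31 is discarded."""
    return (base + offset) & MASK32


def push_dword(esp):
    """ESP after a dword PUSH (pre-decrement by 4, modulo 2^32)."""
    return (esp - 4) & MASK32


base = 0x00123000        # hypothetical: Start + Size
esp = 0
for _ in range(3):
    esp = push_dword(esp)
    print(f"ESP={esp:08X} linear={linear_address(base, esp):08X}")
# ESP=FFFFFFFC linear=00122FFC
# ESP=FFFFFFF8 linear=00122FF8
# ESP=FFFFFFF4 linear=00122FF4
```

Treating FFFF_FFFC as −4 gives the same result: the first dword lands just below BASE.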

Setting up LIMIT: granularity and the G-bit

Consider the following table for calculating the LIMIT20 value of segment descriptors when G = 1, for a desired size in 4K units (the N value).

G = 1
                      E = 0 EXPAND UP                           E = 1 EXPAND DOWN
LIMIT20  LIMIT32      valid offset range      size      N×4K    valid offset range      size      N×4K

F_FFFF   FFFF_FFFF    0000_0000 - FFFF_FFFF   4GB       1M      N/A                     0         0
F_FFFE   FFFF_EFFF    0000_0000 - FFFF_EFFF   4GB-4KB   1M-1    FFFF_F000 - FFFF_FFFF   4KB       1
F_FFFD   FFFF_DFFF    0000_0000 - FFFF_DFFF   4GB-8KB   1M-2    FFFF_E000 - FFFF_FFFF   8KB       2
...      ...          ...                     ...       ...     ...                     ...       ...
...      ...          ...                     ...       ...     ...                     ...       ...
...      ...          ...                     ...       ...     ...                     ...       ...
...      ...          ...                     ...       ...     ...                     ...       ...
0_0002   0000_2FFF    0000_0000 - 0000_2FFF   12KB      3       0000_3000 - FFFF_FFFF   4GB-12KB  1M-3
0_0001   0000_1FFF    0000_0000 - 0000_1FFF   8KB       2       0000_2000 - FFFF_FFFF   4GB-8KB   1M-2
0_0000   0000_0FFF    0000_0000 - 0000_0FFF   4KB       1       0000_1000 - FFFF_FFFF   4GB-4KB   1M-1

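LIMIT20 calculation for G=1, with N = size in 4K units, as read from the table: for E=0, LIMIT20 = N − 1; for E=1, LIMIT20 = NOT N (20-bit).

4K is not coincidental: it matches the 4K page size, since in a more advanced (virtual memory) system allocation happens in N × 4K pages.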
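
The table rows can be regenerated with a short Python sketch. The helper names are made up for illustration, and the model assumes G=1, B=1 and byte accesses; the formulas (N − 1 for expand-up, 20-bit NOT N for expand-down) are the pattern visible in the table.

```python
MASK20 = 0xFFFFF
MASK32 = 0xFFFFFFFF


def limit32(limit20):
    """Effective limit for G=1: LIMIT20 shifted left 12, low 12 bits all ones."""
    return ((limit20 & MASK20) << 12) | 0xFFF


def limit20_for_pages(n, expand_down):
    """LIMIT20 for a segment of n 4K pages."""
    return (~n & MASK20) if expand_down else (n - 1)


def valid_range(limit20, expand_down):
    """(lowest, highest) valid offset, or None for an empty segment."""
    lim = limit32(limit20)
    if not expand_down:
        return (0, lim)
    return None if lim == MASK32 else (lim + 1, MASK32)


for n in (1, 2, 3):
    l20 = limit20_for_pages(n, expand_down=True)
    low, high = valid_range(l20, expand_down=True)
    print(f"N={n} LIMIT20={l20:05X} valid {low:08X}-{high:08X}")
# N=1 LIMIT20=FFFFE valid FFFFF000-FFFFFFFF
# N=2 LIMIT20=FFFFD valid FFFFE000-FFFFFFFF
# N=3 LIMIT20=FFFFC valid FFFFD000-FFFFFFFF
```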

Result: set up EXPAND DOWN 32-BIT Stack Segment (E=1 B=1) to physical address

Only with G=1 and in 4K Size increments (Size = N x 4K)

Size = N x 4K
Start = alloc(Size)

|          |  <--- BASE = Start + Size
+----------+
|          |
|   Size   |
|          |
|          |
+--Start---+  <--- G=1 LIMIT20 = NOT N
|          |
|          |

Max Size is 4GB − 4KB.
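
Putting BASE and LIMIT20 together, a minimal Python sketch of the setup follows. The name `expand_down_386` and the example block at 0020_0000 are assumptions; the check at the end confirms that the lowest valid offset resolves to Start.

```python
PAGE = 0x1000
MASK32 = 0xFFFFFFFF


def expand_down_386(start, n_pages):
    """Return (BASE, LIMIT20) for a B=1 E=1 G=1 stack of n_pages 4K pages
    beginning at physical address `start`."""
    if not 1 <= n_pages <= 0xFFFFF:
        raise ValueError("n_pages must be 1 .. 1M-1 (max size 4GB - 4KB)")
    base = (start + n_pages * PAGE) & MASK32     # BASE = Start + Size
    limit20 = ~n_pages & 0xFFFFF                 # LIMIT20 = NOT N
    return base, limit20


base, limit20 = expand_down_386(0x00200000, 4)   # hypothetical 16K block
print(f"BASE={base:08X} LIMIT20={limit20:05X}")  # BASE=00204000 LIMIT20=FFFFB

# Lowest valid offset is the effective limit + 1; it must map to Start.
lowest = (((limit20 << 12) | 0xFFF) + 1) & MASK32
print(f"{(base + lowest) & MASK32:08X}")         # 00200000
```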

B=0 16-bit Expand down

About limit check and Expand Down Segments

On the 32-bit 386 OFFSET-s, LIMIT-s and comparators are all 32-bit.

If limit check is simply a 32-bit OFFSET > LIMIT comparison, a 16-bit Expand down segment would make memory unprotected up to 4GB above LIMIT. There has to be another limit check for 16-bit B=0 EXPAND DOWN segments, such as a legacy stack.

Consider:

E = 1 EXPAND-DOWN
B = 0
LIMIT = 8000 (32K)


32-bit OFFSET-s:
           __
FFFF_FFFF    |  4GB
...._....    |
...._....    |
...._....    |  SHOULD BE
...._....    |  INVALID!
...._....    |
0001_0002    |
0001_0001    |
0001_0000  __|
0000_FFFF    |
0000_....    |
0000_....    |  VALID: OFFSET > LIMIT
0000_....    |
0000_....  __|
0000_8000    |  <-- LIMIT
0000_....    |
0000_....    |
0000_....    |  INVALID
0000_....    |
0000_....    |
0000_....    |
0000_0000  __|
...._....          <-- 32-bit BASE
...._....

16-bit segments are supposed to be used for legacy 16-bit code. For 16-bit legacy addressing modes this is not an issue: OFFSET-HI is zeroed after effective address calculation, so offsets wrap around. For implicit stack operations through a B=0 SS (push/pop) it is not an issue either: the processor zeroes OFFSET-HI automatically.

But 32-bit addressing modes and offsets can address much higher: 16-bit segment descriptors can be loaded while running D=1 code, and D=0 code can use the 67h prefix for 32-bit addressing modes at any time.

The B-bit meaning for Expand Down Segments

The B-bit affects the limit check of every B=0 E=1 (16-bit expand-down) segment. It has no effect on, and is not enforced for, normal expand-up (E=0) segments, where 16- and 32-bit offsets alike only have to be within the limit.

When B = 0 AND E = 1, there is an extra step before limit check. OFFSET-HI must be zero:

                 B = 0                        B = x
                 E = 1                        E = 0

         31                  0        31                  0
         +-------------------+        +-------------------+
         |      OFFSET       |        |      OFFSET       |
         +-------------------+        +-------------------+
                   |                            |
                   v                            |
         +-------------------+                  |
         |  0 0 0 0 x x x x  |                  |
         +-------------------+                  |
               OFFSET-HI                        |
               ZERO CHECK                       |
                   |                            |
                   v                            v
         +-------------------+        +-------------------+
         |  0 0 0 0 x x x x  |        |  x x x x x x x x  |
         +-------------------+        +-------------------+
              LIMIT check                 LIMIT check


         +-------------------+        +-------------------+
         |      SS:BASE      |        |      SS:BASE      |
 +       +-------------------+        +-------------------+
________________________________________________________________
         +-------------------+        +-------------------+
         |  LINEAR ADDRESS   |        |  LINEAR ADDRESS   |
         +-------------------+        +-------------------+

This makes sure that a 16-bit expand-down segment, which is intentionally at most 64K, cannot be over-addressed by instructions using 32-bit offsets.

Tested in boot block code for both SS and other data segments:

OFFSET-HI check

Set B=0 E=1 and segment LIMIT to 0 (max size): in theory, offset 0001_FFFF is way above zero, yet it fails the LIMIT check. Exception 12 for SS and exception 13 for other data segments.

OFFSET-HI is checked before limit

It did not help to set the limit above 64K (E=1 B=0): addressing above 64K still results in a LIMIT violation. I.e. the OFFSET-HI check does not depend on the limit and is performed for all B=0 E=1 segments.

Test an E=0 B=0 expand up segment

Set the segment limit higher than 64K (e.g. 000F_FFFF, the 386 can do that) and test offset 0001_0000, which is below the limit. NO LIMIT VIOLATION when accessing above 64K, i.e. OFFSET-HI is not checked for zero when E=0.
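
The three experiments are consistent with a simple model in which B only selects the upper bound of an expand-down segment (FFFF for B=0, FFFF_FFFF for B=1). The Python sketch below encodes that model for a single byte access; `access_ok` is an illustrative name, and the limit values in the second call are example numbers, not the ones used in the boot block test.

```python
def access_ok(offset, limit32, e, b):
    """Model of the 386 data-segment limit check for one byte access."""
    offset &= 0xFFFFFFFF
    if e == 0:
        # Expand-up: B plays no role, the offset only has to be <= LIMIT.
        return offset <= limit32
    # Expand-down: B=0 caps the offset at FFFF (OFFSET-HI must be zero).
    upper = 0xFFFFFFFF if b else 0xFFFF
    return limit32 < offset <= upper


# OFFSET-HI check: LIMIT = 0, offset above 64K is rejected.
print(access_ok(0x0001FFFF, 0x00000000, e=1, b=0))   # False
# A limit above 64K does not help either.
print(access_ok(0x00020000, 0x0001FFFF, e=1, b=0))   # False
# Expand-up B=0 segment with a limit above 64K: access is allowed.
print(access_ok(0x00010000, 0x000FFFFF, e=0, b=0))   # True
```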

Result: set up EXPAND DOWN 16-BIT Stack Segment (E=1 B=0) to physical address

When B=0, the 32-bit offset checked against limit and added to SS-BASE is from 0000_0000 to 0000_FFFF.

Setting BASE: for the first PUSH the 32-bit offset value will be 0000_FFFE (WORD PUSH) or 0000_FFFC (DWORD PUSH). So we shoot 64K below Start + Size:

Start = alloc(Size)

|          |
+----------+
|          |
|   Size   |
|          |
|          |
+--Start---+  <--- LIMIT = −Size − 1 = NOT Size
|          |
|          |  <---  BASE = Start + Size − 64K

Not so surprisingly – same as the 16-bit 286 calculation (emulation).

Max Size is 64K − 1 byte (E=1).

Setting LIMIT: byte-granularity (G=0) is more than enough to specify an expandable 16-bit segment. NB: setting max 64K limit, which can be done only with an expand up segment (E=0), effectively turns off limit checking (16-bit SP wraps around).

G=0 Byte-granularity LIMIT is straightforward. But in case allocation works in 4K-increments, the G=1 LIMIT possibilities are:

LIMIT32        G=1 LIMIT20   stack size   N     valid SP range

0000_FFFF      0000F         0            0     N/A
0000_EFFF      0000E         4K           1     0000_F000 - 0000_FFFF
0000_DFFF      0000D         8K           2     0000_E000 - 0000_FFFF
...            ...           ...         ..     ...
...            ...           ...         ..     ...
...            ...           ...         ..     ...
...            ...           ...         ..     ...
0000_2FFF      00002         52K         13     0000_3000 - 0000_FFFF
0000_1FFF      00001         56K         14     0000_2000 - 0000_FFFF
0000_0FFF      00000         60K (max)   15     0000_1000 - 0000_FFFF

G=1: 16-BIT EXPAND DOWN LIMIT20 = 15 - N. For G=1 B=0 E=1 16-bit stack the max size is 60K.

When the stack grows into the SP range 0000_0000 - 0000_0FFF, we get a limit violation.
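
The G=1 rows for a 16-bit stack can be reproduced with the Python sketch below. It assumes B=0 E=1 G=1 and uses the hypothetical helper name `limit20_16bit_expand_down`.

```python
def limit20_16bit_expand_down(n):
    """LIMIT20 for a B=0 E=1 G=1 stack of n 4K pages (0 <= n <= 15)."""
    if not 0 <= n <= 15:
        raise ValueError("a 16-bit G=1 expand-down stack holds 0..15 pages")
    return 15 - n


for n in (1, 2, 13, 14, 15):
    l20 = limit20_16bit_expand_down(n)
    l32 = (l20 << 12) | 0xFFF            # effective byte limit
    print(f"N={n:2d} LIMIT20={l20:05X} valid SP {l32 + 1:04X}-FFFF size {n * 4}K")
# N= 1 LIMIT20=0000E valid SP F000-FFFF size 4K
# N= 2 LIMIT20=0000D valid SP E000-FFFF size 8K
# N=13 LIMIT20=00002 valid SP 3000-FFFF size 52K
# N=14 LIMIT20=00001 valid SP 2000-FFFF size 56K
# N=15 LIMIT20=00000 valid SP 1000-FFFF size 60K
```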