Although no longer common, the processor used in the original IBM PC, the Intel 8088, provides the basis for understanding all subsequent processors in the Intel 80x86 family (and thus for all later versions of the IBM PC). Some of the restrictions of the 8088 have been removed and some additional registers have been provided, but all the features of the original 8088 apply to all subsequent members of the Intel 80x86 family and thus the 8088 provides a convenient basis for studying this family.
We will start with an overview of the 8088's memory (greatly simplified when compared with subsequent 80x86 processors). We will then discuss the various registers available within the 8088. Finally we will combine these two elements in a discussion of stack processing as it is performed on the 8088.
An original IBM PC could contain 1 Megabyte of byte-addressable memory. (The later "AT-class" computers can address more, but at this point we will restrict ourselves to the base PC/XT processors.) "Byte-addressable" means that each byte has its own unique address. Addresses always start at zero and go up in steps of one, so every number from zero up to one less than 1 "Meg" is an identifier or "address" of a unique 8-bit storage location.
1 Megabyte is two to the power of 20 (2^20) bytes or 1,048,576 bytes. Another way of looking at it is to say that 1 Megabyte is the number of locations that can be uniquely addressed by 20 bits. (The prefix "mega" normally means 1 million, but in computer systems this is only approximate.) (Another common term of measurement, the Kilobyte, is similarly derived: 1 Kilobyte is the number of bytes uniquely identifiable by a 10-bit pattern, 1024 bytes.)
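The relationship between the number of address bits and the number of addressable bytes can be checked with a few lines of Python, used here only as a calculator. The helper name `addressable_locations` is our own and is not part of any 8088 tool.

```python
def addressable_locations(bits):
    """Number of distinct addresses an n-bit pattern can identify."""
    return 2 ** bits


KILOBYTE = addressable_locations(10)   # 10 address bits
MEGABYTE = addressable_locations(20)   # 20 address bits

print(KILOBYTE, MEGABYTE, MEGABYTE // KILOBYTE)   # 1024 1048576 1024
```

The last number printed shows that 1 Megabyte is exactly 1024 Kilobytes.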
Unfortunately, the CPU used in the IBM PC (called an 8088) holds addresses in 16-bit registers, even though its address bus is 20 bits wide. That is, "normal" addresses generated by the IBM PC's CPU only contain 16 bits. A 16-bit address can only uniquely identify 64 Kilobytes (as opposed to the full 1 Megabyte that requires 20 bits for addressing). With a 16-bit address, we can only access a range of 64K bytes within the full 1M bytes of the computer's memory. However, by using those 16 bits as an "offset" or distance from some selectable beginning location, the 64K "segment" that we can address may be placed almost anywhere within the full 1M bytes.
            +-------------------------------+ ___
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
   start    |                               |
    of   -->+-------------------------------+ 1M Memory
  segment   |                               |
            |          64K Segment          | 20-bit addressable
            |                               |       space
            |       16-bit addressable      |  .
            |             space             |  .
            |                               |  .
            +-------------------------------+  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            |                               |  .
            +-------------------------------+ ___
In practice, a program may use several segments, each up to 64K bytes long and each with its own initial "segment address". A segment can begin at any address that is a multiple of 10h (16d) within the computer's 1 Megabyte range. Any address that is a multiple of 10h is called a "paragraph" address. "Paragraph" or "segment" addresses are stored as 4 hexadecimal digits and represent a physical byte address that is 10h times larger than their value, e.g. a segment that starts at segment address 1B37h actually starts at real-memory byte address 1B370h.
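As a quick check of the "10h times larger" rule, here is a minimal Python sketch. The names `PARAGRAPH`, `segment_start` and `is_paragraph_address` are our own, chosen for illustration.

```python
PARAGRAPH = 0x10  # a segment may begin at any multiple of 10h (16d) bytes


def segment_start(segment):
    """Physical byte address at which a 16-bit segment value begins."""
    return segment * PARAGRAPH


def is_paragraph_address(physical):
    """True if a physical byte address could be the start of a segment."""
    return physical % PARAGRAPH == 0


print(f"{segment_start(0x1B37):05X}")          # 1B370
print(is_paragraph_address(0x1B370))           # True
print(is_paragraph_address(0x1B371))           # False
```

Multiplying by 10h is the same as appending one hexadecimal zero to the segment value, which is why 1B37h becomes 1B370h.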
"Segment" and "offset" addresses that are both 16-bit numbers can be combined to form a 20-bit (5-hexadecimal-digit) address. Given the way the segment and offset values overlap, many different segment and offset combinations can resolve to the same physical address. (These different address combinations are called "aliases".)
For example, a byte located at an offset of 29D4h within a segment at segment address 1B37h (1B37:29D4), would have an "absolute" or physical address identical to a byte located at offset 29C4h within a segment at segment address 1B38h (1B38:29C4):
      1B37   segment            1B38   segment
    +  29D4  offset           +  29C4  offset
      ------                    ------
      1DD44h absolute           1DD44h absolute
Given different segment values 1DD4h and 1D00h, we can find the equivalent alias offsets by subtracting the start of each segment from the absolute address:
      1DD44h absolute           1DD44h absolute
    - 1DD4   segment          - 1D00   segment
      ------                    ------
        0004 offset               0D44 offset
The above shows that 1B37:29D4, 1B38:29C4, 1DD4:0004, and 1D00:0D44 are all aliases for the same 1DD44h 20-bit real memory location.
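The same arithmetic can be sketched in Python. This is only a model of the address calculation described above; the function name `physical_address` is our own. The final mask reflects the fact that the 8088 has only 20 address lines, so a sum that exceeds 1 Megabyte wraps around to low memory.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    # Shifting left by 4 bits multiplies the segment by 10h.
    return ((segment << 4) + offset) & 0xFFFFF


aliases = [(0x1B37, 0x29D4), (0x1B38, 0x29C4), (0x1DD4, 0x0004), (0x1D00, 0x0D44)]

for segment, offset in aliases:
    print(f"{segment:04X}:{offset:04X} -> {physical_address(segment, offset):05X}")
```

All four lines of output end in 1DD44, confirming that the four segment:offset pairs are aliases of one another.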
Exercise: Given address 1490:0040, find the alias address that uses segment 1350, i.e. find the ???? in 1350:????. Answer: 14900+0040=14940 and 14940-13500=1440; therefore the alias address 1350:1440 is equivalent to 1490:0040.
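The exercise can be generalized with a small Python helper. This is a sketch under our own naming (`alias_offset` is hypothetical); it returns `None` when the requested segment cannot reach the byte with a 16-bit offset.

```python
def alias_offset(segment, offset, new_segment):
    """Offset that reaches the same byte from new_segment, or None."""
    physical = (segment << 4) + offset          # absolute address of the byte
    new_offset = physical - (new_segment << 4)  # distance from the new segment
    if 0 <= new_offset <= 0xFFFF:               # must fit in 16 bits
        return new_offset
    return None


print(f"1350:{alias_offset(0x1490, 0x0040, 0x1350):04X}")   # 1350:1440
```

Not every segment can alias a given byte: the new segment must start at or below the byte, and no more than 64K below it.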
We will use the Intel assembly language convention that a memory address is a number or register enclosed in square brackets. For example, if register BX contains the value 0059 and ES contains 0069, then:
    MOV AX,BX       ; copies the contents of register BX (0059) into AX
    MOV AX,[BX]     ; copies the contents of memory at hex offset 0059 into AX
    MOV AX,10       ; copies the hex value 10 into register AX
    MOV AX,[10]     ; copies the contents of memory at hex offset 10 into AX
    MOV AX,[ES:10]  ; copies the contents of memory at 0069:0010 into AX
The general purpose registers can each be used as a single 16-bit register or as two 8-bit registers. As a 16-bit register, a general purpose register's name ends in an "X"; e.g. AX. As an 8-bit register, its name ends in either an "H", if it is the "high order" half of the 16-bits, or an "L", if it is the "low order" half; e.g. AH or AL.
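The relationship between a 16-bit register and its two 8-bit halves can be modelled in Python. This is only an illustration of the bit layout; the function names are our own.

```python
def split_register(value):
    """Return the (high, low) 8-bit halves of a 16-bit register value."""
    return (value >> 8) & 0xFF, value & 0xFF


def join_register(high, low):
    """Rebuild the 16-bit value from its high and low halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ah, al = split_register(0x1234)          # pretend AX contains 1234h
print(f"AH={ah:02X} AL={al:02X}")        # AH=12 AL=34
print(f"AX={join_register(ah, al):04X}") # AX=1234
```

Changing AL or AH therefore changes AX as well, because they are the same storage viewed in two ways.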
Except for BX, general-purpose registers cannot be used to address memory, i.e. "MOV AX,[BX]" is permitted, but "MOV AX,[CX]" is not. The default segment register pairing for memory address [BX] is DS, as in [DS:BX].
Since 20-bit addresses are made up of a segment and an offset, the Intel hardware contains both types of registers and a complete address needs both a segment and an offset. By default, each offset register is paired with a segment register so that you can omit the segment register and only specify the offset register. (You can also specify a different segment register if you need to.) Most of our simple .COM-style programs will use the default segment registers.
You cannot perform arithmetic on segment address registers. You have to move the segment value to a general purpose register, do the math, then move the value back again. You also cannot load a segment register with an immediate (constant) value. You must first put the value into a general purpose register and then move it from there to the segment register.
The Intel flags are a collection of bits that can be referenced individually and that are set by various instructions to indicate certain conditions. The conditional JUMP instructions check these flags, as shown in the list below. The most commonly used flags are these:
For a full list of DEBUG flags, see http://mirror.href.com/thestarman/asm/debug/8086REGs.htm#FLAGS
The stack is a temporary LIFO (Last In, First Out) area for storage; when words are stored in or "pushed on to" the stack and then retrieved from or "popped off" the stack, the last value "pushed" becomes the first value "popped".
In stack terminology, the latest thing pushed on the stack is said to be at the "top" of the stack; previously pushed values are said to be hidden "below" the top of the stack. In the Intel architecture, the stack SS:SP grows from high memory down to low memory. Every PUSH causes the Stack Pointer (SP) to decrement by two (one word) to become the new "top of stack" address. In the Intel world, the "top" of the stack is actually at a "lower" memory address, compared to items already pushed on the stack. Remember that Intel stacks push SP downward in memory, and pop SP upward in memory.
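The downward growth of the stack can be seen in a toy Python model. This is a sketch, not a description of the hardware: the class name `Stack8088` and the starting SP value of 0100h are arbitrary assumptions, and memory is simplified to a dictionary holding whole words rather than individual bytes.

```python
class Stack8088:
    """Toy model of the SS:SP stack; memory is keyed by word offset."""

    def __init__(self, sp=0x0100):
        self.sp = sp        # offset of the current top of stack (assumed start)
        self.memory = {}    # offset -> 16-bit word (simplified memory)

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF    # SP moves DOWN in memory first
        self.memory[self.sp] = word & 0xFFFF

    def pop(self):
        word = self.memory[self.sp]
        self.sp = (self.sp + 2) & 0xFFFF    # SP moves UP in memory
        return word


stack = Stack8088()
stack.push(0x1111)
stack.push(0x2222)
print(f"SP after two pushes: {stack.sp:04X}")   # 00FC
print(f"popped: {stack.pop():04X}")             # 2222
print(f"SP after one pop: {stack.sp:04X}")      # 00FE
```

The last value pushed (2222h) is the first value popped, and SP ends up two bytes higher than it was before the pop.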
Instructions that push word values on to the stack are: CALL, INT and PUSH. The corresponding instructions that pop word values off the stack are RET, IRET and POP. A subroutine called with CALL (which pushes the return address on the stack) will return using RET (which pops the return address off the stack). A service called with INT (which pushes the flags and the return address on the stack) will return using IRET (which pops the return address and the flags off the stack).
CALL and RET are associated with modular programming and subroutine usage. A subroutine is a sequence of instructions that is "called" into execution by instructions somewhere else and that, when it has finished its task, should "return" to the place from which it was called. It is therefore necessary, when "calling" a subroutine, to save the address from where the call is taking place; in this way, when the subroutine is completed, it will be able to return to the correct location. This is further complicated by the fact that a subroutine may call another subroutine, so multiple return addresses have to be saved.
Consider the following instruction sequence:
    address:      code:
    a0            .           ; start executing here
                  .
                  .
                  .
                  .
    a1            CALL B      ; call subroutine B at address b1
    a2            .
                  .
                  .
                  .
                  .
    b1      B:    .           ; start of subroutine "B"
                  .
                  .
                  .
                  .
    b2            CALL C      ; call subroutine C at address c1
    b3            .
                  .
                  .
    b4            RET         ; return from subroutine
                  .
                  .
                  .
                  .
    c1      C:    .           ; start of subroutine "C"
                  .
                  .
                  .
                  .
    c2            RET         ; return from subroutine
In the above example, the program begins at memory address a0 and instructions are executed sequentially until the CALL at address [a1] is reached.
The CALL at address [a1] causes the current program counter (which contains address [a2]) to be pushed on to the stack and then the CALL jumps to the address of subroutine B at [b1].
Instructions are executed sequentially from address [b1] until the next CALL at address [b2] is reached. The CALL instruction at address [b2] causes the current program counter (which contains the address [b3]) to be pushed onto the top of the stack above [a2] and then the CALL jumps to the address of subroutine C at [c1]. (Remember that when we say "top" and "above", we are using generic stack terminology; the Intel stack SS:SP grows downward in memory and it actually stores the top of the stack in memory addresses below the previously pushed items!)
                              +--------+
    logical top of stack ---> |  [b3]  |
                              +--------+
                              |  [a2]  |
                              +--------+
                              |   .    |
Instructions are executed sequentially from [c1] until the subroutine RET instruction at [c2] is reached. The RET instruction removes ("pops") the return address, [b3], from the top of the stack (so [a2] is again at the top of the stack) and then "jumps" to the address just removed from the stack (i.e. jumps to the return address [b3]).
                              +--------+
    logical top of stack ---> |  [a2]  |
                              +--------+
                              |   .    |
N.B. The previous top value is not erased or removed from the stack. (It's still in memory.) All that happens when the stack is "popped" is that a pointer to the "top of the stack" is moved down one position to point to the previously pushed value. (Remember that when we say "moved down", we are using generic stack terminology; the Intel stack SS:SP grows downward in memory, so that a stack "pop" causes the SP offset register to move up in memory!)
With the completion of subroutine "C", execution has returned to the statement following the "CALL" to subroutine "C" (namely to [b3]). Instruction execution continues sequentially from [b3] until the "RET" (return) instruction at [b4] is reached. As before, the "RET" instruction removes ("pops") the top value (address [a2]) from the stack and jumps to this address.
Instruction execution now continues with the instruction following the original "CALL" (to subroutine "B").
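The whole walk-through can be replayed with a short Python sketch. It models only the return-address bookkeeping, using the symbolic addresses from the skeleton above; the function `run_example` and its helpers are our own invention. A Python list stands in for the stack, with the end of the list as the logical top.

```python
def run_example():
    """Replay the CALL B / CALL C / RET / RET sequence from the skeleton."""
    stack = []      # logical stack; last element is the top
    visited = []    # (address now executing, copy of stack) after each step

    def call(return_address, target):
        stack.append(return_address)        # CALL pushes the return address
        visited.append((target, list(stack)))

    def ret():
        target = stack.pop()                # RET pops it and jumps there
        visited.append((target, list(stack)))

    call("a2", "b1")    # CALL B at a1
    call("b3", "c1")    # CALL C at b2
    ret()               # RET at c2
    ret()               # RET at b4
    return visited


for address, stack in run_example():
    print(f"now executing at {address}, stack (top last): {stack}")
```

The stack holds two return addresses only while subroutine C is running, and it is empty again once execution is back at [a2].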
As previously noted, on Intel 8088 hardware, "real" memory addresses are 20 bits, a combination of a 16-bit segment and a 16-bit offset address. For code, the 20-bit address is derived from a combination of the CS (Code Segment) and IP (Instruction Pointer) registers.
To CALL a subroutine anywhere in the full 20-bit (1MB) address space, both a CS segment and IP offset must be given to create the 20-bit destination address. A full two-word 20-bit segment:offset address is called a "far" address (or "far pointer").
If the CALL instruction and the subroutine being called are in the same 64KB segment (have the same CS value), then only the IP offset address needs to be specified in the CALL statement. A partial single-word 16-bit offset-only address with no segment is called a "near" address (or "near pointer").
Addresses pushed on the stack by CALL (and popped by RET) must either be two 16-bit words CS:IP for a "far" CALL (different CS code segment) or just one 16-bit word IP for a "near" CALL (same CS code segment). Subroutines are therefore classified as "near" or "far" subroutines depending upon whether or not they are in the same 64KB code segment as the CALL statement that calls them. A far CALL pushes a two-word return address (CS:IP); a near CALL pushes only a one-word return address (IP). The RET (return) statement on a subroutine has to be coded correctly to pop what was pushed; you can't have a "near" subroutine return using a "far" RET statement (or vice-versa).
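To see why the RET must match the CALL, here is a minimal Python model of the stack traffic only. The function names are our own, and the CS and IP values are made up for the example. The far CALL pushes CS before IP, so the far RET pops IP before CS.

```python
def call_near(stack, ip_next):
    stack.append(ip_next)           # one word: return offset only


def call_far(stack, cs, ip_next):
    stack.append(cs)                # two words: segment first ...
    stack.append(ip_next)           # ... then offset


def ret_near(stack):
    return stack.pop()              # pops IP only


def ret_far(stack):
    ip = stack.pop()                # pops IP, then CS
    cs = stack.pop()
    return cs, ip


stack = []
call_far(stack, 0x2000, 0x0105)
cs, ip = ret_far(stack)
print(f"far return to {cs:04X}:{ip:04X}, words left on stack: {len(stack)}")

mismatched = []
call_far(mismatched, 0x2000, 0x0105)
ret_near(mismatched)                # wrong kind of RET for a far CALL
print(f"words left after mismatched return: {len(mismatched)}")
```

In the mismatched case one word (the saved CS) is left behind on the stack, so every later POP or RET would retrieve the wrong value.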
In Intel assembly language, declarations on the subroutines allow the assembler program to keep track of whether CALL should be a near or far call, and whether RET should be a near or far return. The programmer need only write "CALL" and "RET" and the assembler program will substitute the correct near or far opcodes.
The "PUSH" instruction can be used to save the contents of a 16-bit register on the top of the stack. This is convenient when we need to use and change the contents of a register but don't want to lose its current value. We "PUSH" its current value on to the stack, use the register for something else, and then use the "POP" instruction to restore its original value.
This temporary storage and reloading of register contents is a frequent requirement and is especially common within a subroutine. We "call" a subroutine to perform some specific requirement. A subroutine should behave as a "blackbox"; it should perform its function without any observable side-effects to the higher level code that calls it. Specifically, all registers should have the same values when we "get back" from a subroutine as when we "called" it (unless the explicit function of the subroutine is to modify a register). Since it is normal for a subroutine to need to use and modify the contents of registers, a subroutine should save the contents of all registers to be used (normally by "pushing" them on to the stack) before using them; before "returning" to the calling code, the subroutine should restore the original values to these registers.
Here is another code skeleton for a main program that calls a subroutine, where the subroutine needs to use the AX and BX registers and so has to save these registers on the stack:
    address:      code:
                  .
                  .
                  .
                  .
                  .
                  .
    m1            CALL S      ; call subroutine S
    m2            .
                  .
                  .
                  .
                  .
                  .
                  .
    s1      S:    PUSH AX     ; start of subroutine "S" - save registers
    s2            PUSH BX
    s3            .           ; subroutine is now free to use and modify AX and BX
                  .
                  .
                  .
                  .
                  .
                  .
    s4            POP BX      ; restore old values of BX and AX before
    s5            POP AX      ; returning to calling program
    s6            RET         ; return to calling program
                  .
                  .
                  .
                  .
Note that, with the above "code skeleton", when program execution reaches the instruction at address [s3], the logical stack looks like:
                              +----------+
                              | original |
    logical top of stack ---> |    BX    |
                              |  value   |
                              +----------+
                              | original |
                              |    AX    |
                              |  value   |
                              +----------+
                              | address  |
                              |   [m2]   |
                              |          |
                              +----------+
                              |    .     |
                              |    .     |
Therefore, in restoring the registers before returning, BX must be restored ("popped") before AX. Note also, failure to "pop" one of these registers would result in the RET instruction attempting to use the original value in AX as its "return address" (instead of using the correct return address [m2]). The programmer must ensure that every PUSH in a subroutine is matched by a POP before the subroutine returns.
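The consequences of getting the POPs wrong can be demonstrated with a small Python model of subroutine S. This is a sketch: the register values 1111h and 2222h are made up, and `run_subroutine` and `demo` are hypothetical helper names. The `pop_order` argument stands for the POP instructions the programmer wrote at [s4] and [s5].

```python
def run_subroutine(stack, regs, pop_order):
    """Model subroutine S: save AX and BX, clobber them, restore, then RET."""
    stack.append(regs["AX"])            # s1: PUSH AX
    stack.append(regs["BX"])            # s2: PUSH BX
    regs["AX"], regs["BX"] = 0, 0       # s3: body overwrites both registers
    for name in pop_order:              # s4, s5: the POP instructions
        regs[name] = stack.pop()
    return stack.pop()                  # s6: RET pops the return address


def demo(pop_order):
    regs = {"AX": 0x1111, "BX": 0x2222}
    stack = ["m2"]                      # CALL S has already pushed [m2]
    return_address = run_subroutine(stack, regs, pop_order)
    return return_address, regs


print(demo(["BX", "AX"]))   # correct: reverse order of the pushes
print(demo(["AX", "BX"]))   # wrong order: AX and BX come back swapped
print(demo(["BX"]))         # missing POP AX: RET uses old AX as the address
```

Only the first call returns to "m2" with both registers intact. In the third call the "return address" is the number that used to be in AX, which is the failure described above.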
An INT (Interrupt) instruction is a "software interrupt" that calls an operating system service routine whose address is stored in a fixed, reserved area of memory: the Interrupt Vector Table.
As far as stack usage is concerned, the INT and IRET (interrupt return) instructions do the same push/pop as a "far" CALL and "far" RET, except that the flag register is also saved on the stack (along with CS:IP) by INT and restored by IRET. Thus, INT pushes three words (six bytes) on the stack, and IRET pops those three words (six bytes) again.
Here is what the logical stack looks like after an INT causes an interrupt service routine to execute:
                              +----------+
                              |  return  |
    logical top of stack ---> |  offset  |
                              | address  |
                              +----------+
                              |  return  |
                              | segment  |
                              | address  |
                              +----------+
                              | original |
                              |   flag   |
                              |  values  |
                              +----------+
                              |    .     |
Note that the Flags were pushed first, followed by CS, followed by IP. The IRET pops these values in the reverse order (LIFO).
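The push and pop order for INT and IRET can be sketched in Python as follows. This models only the three words placed on the stack and ignores everything else an interrupt does; the function names and the sample flag, CS and IP values are assumptions made for the example.

```python
def do_int(stack, flags, cs, ip_next):
    """Model the stack effect of INT: push flags, then CS, then IP."""
    stack.append(flags)
    stack.append(cs)
    stack.append(ip_next)


def do_iret(stack):
    """Model the stack effect of IRET: pop IP, then CS, then flags."""
    ip = stack.pop()
    cs = stack.pop()
    flags = stack.pop()
    return flags, cs, ip


stack = []
do_int(stack, flags=0x0202, cs=0x1000, ip_next=0x0102)
words_pushed = len(stack)
restored = do_iret(stack)
print(words_pushed, [f"{word:04X}" for word in restored])   # 3 ['0202', '1000', '0102']
```

Three words go on and three words come off, in the reverse order, so the interrupted program resumes with its original flags and at its original CS:IP.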
Remember that all the above stack diagrams refer to generic stack terminology; the Intel stack SS:SP grows downward in memory and it actually stores the top of the stack in memory addresses below the previously pushed items!