Converting an assembler source program into machine-executable form in the IBM PC / MS-DOS environment takes two steps: an "assembler" program converts the source into an "object" file, and then one or more "linker" programs convert "object" files into an "executable" program file. The standard assembler is Microsoft's MASM; most IBM PC assembler source programs are written to meet the MASM coding requirements.
Several non-commercial (shareware and freeware) assemblers and linkers are also available. To be useful, such programs need to accept the "source" format defined by the MASM standard. One such assembler is the freeware "Arrowsoft Assembler" (ASM, called the "Arrow Assembler" here); its associated linker, "VAL", performs the functions of both LINK/TLINK and EXE2BIN.
You can see a PowerPoint slide set and an AVI file video on using the ASM and VAL programs here: 28MASM_Arrow_ASM.pps 28Arrow.avi
Within the Arrow system, generating a .COM file (instead of a .EXE) only requires adding the /COM switch (abbreviated to /CO in the session below) to the VAL command:
C:\>ASM
Arrowsoft Assembler
Public Domain v2.00c (64K Model)
Copyright (c) 1986, 1987 Arrowsoft Systems, Inc.
Input filename (.asm): GCD_EXE.ASM
Object filename (GCD_EXE.obj):
Listing filename(nul.lst): GCD_EXE.LST
Cross-reference (nul.crf):
Free memory=49418, Warnings=0, Errors=0
Assembly Ended.
C:\>VAL /CO
VAL 8086 linker - Mar 27 1995
OBJ file(s): GCD.OBJ
COM file[c:\gcd.com]:
LST file:
LIB file(s):
C:\>GCD
[... program runs ...]
The following names and symbols are reserved by the assembler. Do not use them for your own data or labels:
$           DF          GROUP       ORG
*           DGROUP      GT          %OUT
+           DOSSEG      HIGH        PAGE
-           DQ          IF          PARA
.           DS          IF1         PROC
/           DT          IF2         PTR
=           DUP         IFB         PUBLIC
?           DW          IFDEF       PURGE
[ ]         DWORD       IFDIF       QWORD
.186        ELSE        IFE         .RADIX
.286        END         IFIDN       RECORD
.286P       ENDIF       IFNB        REPT
.287        ENDM        IFNDEF      .SALL
.386        ENDP        INCLUDE     SEG
.386P       ENDS        INCLUDELIB  SEGMENT
.387        EQ          IRP         .SEQ
.8086       EQU         IRPC        .SFCOND
.8087       .ERR        LABEL       SHL
ALIGN       .ERR1       .LALL       SHORT
.ALPHA      .ERR2       LARGE       SHR
AND         .ERRB       LE          SIZE
ASSUME      .ERRDEF     LENGTH      SMALL
AT          .ERRDIF     .LFCOND     STACK
BYTE        .ERRE       .LIST       @STACK
.CODE       .ERRIDN     LOCAL       .STACK
@CODE       .ERRNB      LOW         STRUC
@CODESIZE   .ERRNDEF    LT          SUBTTL
COMM        .ERRNZ      MACRO       TBYTE
COMMENT     EVEN        MASK        .TFCOND
.CONST      EXITM       MEDIUM      THIS
.CREF       EXTRN       MOD         TITLE
@CURSEG     FAR         .MODEL      TYPE
@DATA       @FARDATA    NAME        .TYPE
.DATA       .FARDATA    NE          WIDTH
@DATA?      @FARDATA?   NEAR        WORD
.DATA?      .FARDATA?   NOT         @WORDSIZE
@DATASIZE   @FILENAME   NOTHING     .XALL
DB          FWORD       OFFSET      .XCREF
DD          GE          OR          .XLIST
                                    XOR
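As a rough illustration of how you might check a label against the list above before using it, here is a minimal Python sketch. The names RESERVED and is_reserved are hypothetical, the set holds only a hand-picked subset of the list, and the check assumes the assembler ignores case in names (the symbol table in the listing later in these notes prints every label in upper case, which suggests this, but treat it as an assumption).

```python
# Hypothetical helper, not part of ASM or VAL.
# RESERVED is only a hand-picked subset of the reserved-name list above.
RESERVED = {
    "$", "AND", "BYTE", "DB", "DD", "DW", "END", "ENDS", "EQU",
    "LENGTH", "MOD", "NAME", "NOT", "OFFSET", "OR", "ORG", "PROC",
    "SEGMENT", "SIZE", "STACK", "TYPE", "WORD", "XOR",
}


def is_reserved(name: str) -> bool:
    """Return True if name collides with a reserved name.

    Assumption: names are compared without regard to case.
    """
    return name.upper() in RESERVED


if __name__ == "__main__":
    # "dog", "barkmsg" and "while" are labels used in noecho.asm.
    for candidate in ("dog", "barkmsg", "while", "size", "Type"):
        verdict = "reserved" if is_reserved(candidate) else "ok"
        print(candidate, verdict)
```

With this subset, the labels used in noecho.asm pass the check, while "size" and "Type" are rejected.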
Here's an example of assembling some files downloaded to my C:\ directory.
I start up a DOS window in that directory, and type this to assemble and link the file "noecho.asm" in the current directory:
C:\>asm
Arrowsoft Assembler
Public Domain v2.00c (64K Model)
Copyright (c) 1986, 1987 Arrowsoft Systems, Inc.
Input filename (.asm): noecho
Object filename (noecho.obj):
Listing filename(nul.lst): noecho.lst
Cross-reference (nul.crf):
Free memory=50996, Warnings=0, Errors=0
Assembly Ended.
C:\>val /com
VAL 8086 linker - Mar 27 1995
OBJ file(s): noecho
COM file[C:\noecho.com]:
LST file:
LIB file(s):
C:\>noecho
Bark! Bark! Bark! Ended on: d
C:\>
I recommend using the command lines given at the start of ONEPAGE.ASM, since they skip the interactive prompts. Here is the same assembly and link, with the options and file names given on the command line:
C:\>asm /s noecho ;
C:\>val /com noecho ;
C:\>noecho
Bark! Bark! Bark! Ended on: d
C:\>
Here are all the files:
C:\>dir noecho.*
Directory of C:\
NOECHO OBJ 146 07/07/00 12:28 NOECHO.OBJ
NOECHO ASM 2,396 07/07/00 12:21 noecho.asm
NOECHO COM 69 07/07/00 12:28 NOECHO.COM
3 file(s) 2,611 bytes
C:\>
Below is an excerpt of the output written into the LST file by the Assembler. (Normally there is no LST file written; you can request one.) The LST file shows you both your assembler source code and the resulting machine code.
After the line number, note the standard five columns: Location, Code, Labels, Mnemonics, Operands. An instruction that contains a relocatable address is flagged with "R" after that address in the Code column. The result is quite similar to the format we've been using for LMC code and mnemonics, except that Intel instructions are not all "one mailbox" long.
LINE  LOC   CODE BYTES            LABELS   MNEM     OPERANDS
----  ----  --------------------  -------  -------  -------------------
19    0000                        Noecho   segment
20                                         assume   CS:Noecho,DS:Noecho
21    0100                                 org      100h
23    0100                        start:
26    0100  C6 06 0141 R 78 90             mov      dog,'x'
30    0106  80 3E 0141 R 64       while:   cmp      dog,'d'
31    010B  74 09                          je       barkout
33    010D  B4 07                          mov      ah,07h
34    010F  CD 21                          int      21h
35    0111  A2 0141 R                      mov      dog,al
37    0114  EB F0                          jmp      while
38    0116                        barkout:
42    0116  BA 0123 R                      mov      dx,offset barkmsg
43    0119  B4 09                          mov      ah,09h
44    011B  CD 21                          int      21h
46    011D  B0 00                          mov      al,0h
47    011F  B4 4C                          mov      ah,4Ch
48    0121  CD 21                          int      21h
54    0123  0D 0A                 barkmsg  db       0Dh,0Ah
55    0125  42 61 72 6B 21 20 42           db       'Bark! Bark! Bark! Ended on: '
56          61 72 6B 21 20 42 61
57          72 6B 21 20 45 6E 64
58          65 64 20 6F 6E 3A 20
60    0141  ??                    dog      db       ?
61    0142  0D 0A                          db       0Dh,0Ah
62    0144  24                             db       "$"
64    0145                        Noecho   ends
65                                         end      start
SEGMENTS and GROUPS:
-NAME- SIZE ALIGN COMBINE CLASS
NOECHO --------------------------- 0145 PARA NONE
SYMBOLS:
-NAME- TYPE VALUE ATTR
BARKMSG -------------------------- L BYTE 0123 NOECHO
BARKOUT -------------------------- L NEAR 0116 NOECHO
DOG ------------------------------ L BYTE 0141 NOECHO
START ---------------------------- L NEAR 0100 NOECHO
WHILE ---------------------------- L NEAR 0106 NOECHO
Free memory=49920, Warnings=0, Errors=0
Assembly Ended.
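The numbers in the listing also account for the 69-byte NOECHO.COM in the directory listing earlier. The sketch below redoes that arithmetic in Python, assuming the .COM file holds exactly the bytes from the "org 100h" origin up to the end of the segment. The function name com_file_size and the constant names are made up for illustration.

```python
# Values read off the listing above.
ORG = 0x0100           # "org 100h": where the first byte of the program sits
SEGMENT_END = 0x0145   # SIZE of segment NOECHO in the SEGMENTS table
BARKMSG_TEXT = 0x0125  # location of the 'Bark! ...' string
DOG = 0x0141           # location of the byte that follows the string


def com_file_size(segment_end: int, org: int = ORG) -> int:
    """Bytes in the .COM image, assuming it spans org .. segment_end - 1."""
    return segment_end - org


if __name__ == "__main__":
    print(com_file_size(SEGMENT_END))   # size of NOECHO.COM in bytes
    print(DOG - BARKMSG_TEXT)           # length of the 'Bark! ...' string
```

The first result is 69, matching the DIR output, and the second is 28, the number of characters in 'Bark! Bark! Bark! Ended on: '.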
And, for your education and amusement, here's a DEBUG dump of the assembled NOECHO.COM program memory, using DEBUG's "U" (UnAssemble) command to turn the machine code in the *.COM file back into assembler mnemonics (you can try doing this yourself):
C:\>debug noecho.com
-u
1353:0100 C606410178 MOV BYTE PTR [0141],78
1353:0105 90 NOP
1353:0106 803E410164 CMP BYTE PTR [0141],64
1353:010B 7409 JZ 0116
1353:010D B407 MOV AH,07
1353:010F CD21 INT 21
1353:0111 A24101 MOV [0141],AL
1353:0114 EBF0 JMP 0106
1353:0116 BA2301 MOV DX,0123
1353:0119 B409 MOV AH,09
1353:011B CD21 INT 21
1353:011D B000 MOV AL,00
1353:011F B44C MOV AH,4C
-u
1353:0121 CD21 INT 21
1353:0123 0D0A42 OR AX,420A
1353:0126 61 DB 61
1353:0127 726B JB 0194
1353:0129 2120 AND [BX+SI],SP
1353:012B 42 INC DX
1353:012C 61 DB 61
1353:012D 726B JB 019A
1353:012F 2120 AND [BX+SI],SP
1353:0131 42 INC DX
1353:0132 61 DB 61
1353:0133 726B JB 01A0
1353:0135 2120 AND [BX+SI],SP
1353:0137 45 INC BP
1353:0138 6E DB 6E
1353:0139 64 DB 64
1353:013A 65 DB 65
1353:013B 64 DB 64
1353:013C 206F6E AND [BX+6E],CH
1353:013F 3A20 CMP AH,[BX+SI]
-u
[...etc...]
-q
C:\>
You will note that, as I've been saying, all the labels are missing from the *.COM file - all we have are the machine instructions. Also, DEBUG doesn't know which bytes are instructions and which are data, so it tries to turn everything into instruction mnemonics, even the data.
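Even with the labels gone, DEBUG can still print jump targets such as "JZ 0116" and "JMP 0106", because a short jump stores a signed one-byte displacement measured from the end of the two-byte instruction. Here is a minimal Python sketch of that calculation, using the two jumps from the dump above. The function name short_jump_target is hypothetical.

```python
def short_jump_target(address: int, displacement_byte: int) -> int:
    """Target of a 2-byte short jump located at `address`.

    The second byte of the instruction is a signed 8-bit displacement,
    added to the address of the *next* instruction (address + 2).
    """
    if displacement_byte >= 0x80:            # interpret as a negative value
        displacement = displacement_byte - 0x100
    else:
        displacement = displacement_byte
    return (address + 2 + displacement) & 0xFFFF


if __name__ == "__main__":
    # 010B: 74 09  (je barkout)   and   0114: EB F0  (jmp while)
    print(format(short_jump_target(0x010B, 0x09), "04X"))
    print(format(short_jump_target(0x0114, 0xF0), "04X"))
```

The results are 0116 and 0106, the locations of the barkout and while labels in the listing.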
As you see from our instruction "MOV DX,0123", our BARKMSG string starts at offset 0123h, so everything from memory offset 0123 onward is actually data, not instructions. Some of that data just happens to have the same bit patterns as instructions, and DEBUG tries to display it as such.
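To see the data as data, you can decode those bytes as ASCII instead of as instructions. The Python sketch below takes the bytes DEBUG showed from offset 0123 through 0140 (copied from the dump above) and turns them back into text. It also shows that the operand bytes "23 01" of the MOV DX instruction are the offset 0123 stored low byte first. The variable names are made up for illustration.

```python
# The three bytes of "MOV DX,0123" as shown by DEBUG: opcode BA, then the operand.
mov_dx = bytes.fromhex("BA2301")
# The 16-bit operand is stored low byte first (little-endian).
offset = int.from_bytes(mov_dx[1:3], "little")

# Bytes that DEBUG disassembled from 0123 to 0140, grouped as in the dump.
dump = bytes.fromhex(
    "0D0A42 61 726B 2120 42 61 726B 2120 42 61 726B 2120 "
    "45 6E 64 65 64 206F6E 3A20"
)
text = dump.decode("ascii")

if __name__ == "__main__":
    print(format(offset, "04X"))  # offset of BARKMSG
    print(repr(text))             # the "instructions" read as characters
```

The decoded text is a carriage return and line feed followed by 'Bark! Bark! Bark! Ended on: ', which is exactly what the program prints.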