Converting an Assembler source program into machine-executable form in the IBM PC / MS-DOS environment requires an "Assembler" program to convert the source into an "object" file, and then one or more "Linker" programs to convert "object" files into an "executable" program file. The standard Assembler is Microsoft's MASM; most IBM PC Assembler source programs are written to meet the MASM coding requirements.
Several non-commercial (shareware or freeware) Assembler and Linker programs are also available. To be useful, such programs need to accept the source format defined by MASM. One such Assembler is the freeware "Arrow Assembler"; its associated Linker is called "VAL", which performs the functions of both LINK/TLINK and EXE2BIN.
A PowerPoint slide set (28MASM_Arrow_ASM.pps) and an AVI video (28Arrow.avi) show how to use the ASM and VAL programs.
With the Arrow tools, generating a .COM file (instead of a .EXE) is only a matter of adding a /com switch to the VAL command:
    C:\>ASM
    Arrowsoft Assembler  Public Domain v2.00c (64K Model)
    Copyright (c) 1986, 1987 Arrowsoft Systems, Inc.
    Input filename (.asm): GCD_EXE.ASM
    Object filename (GCD_EXE.obj):
    Listing filename(nul.lst): GCD_EXE.LST
    Cross-reference (nul.crf):
    Free memory=49418, Warnings=0, Errors=0
    Assembly Ended.
    C:\>VAL /CO
    VAL 8086 linker - Mar 27 1995
    OBJ file(s): GCD.OBJ
    COM file[c:\\gcd.com]:
    LST file:
    LIB file(s):
    C:\>GCD
    [... program runs ...]
Do not use these names or symbols for your own data or labels:
    $           DF          GROUP       ORG
    *           DGROUP      GT          %OUT
    +           DOSSEG      HIGH        PAGE
    _           DQ          IF          PARA
    .           DS          IF1         PROC
    /           DT          IF2         PTR
    =           DUP         IFB         PUBLIC
    ?           DW          IFDEF       PURGE
    [ ]         DWORD       IFDIF       QWORD
    .186        ELSE        IFE         .RADIX
    .286        END         IFIDN       RECORD
    .286P       ENDIF       IFNB        REPT
    .287        ENDM        IFNDEF      .SALL
    .386        ENDP        INCLUDE     SEG
    .386P       ENDS        INCLUDELIB  SEGMENT
    .387        EQ          IRP         .SEQ
    .8086       EQU         IRPC        .SFCOND
    .8087       .ERR        LABEL       SHL
    ALIGN       .ERR1       .LALL       SHORT
    .ALPHA      .ERR2       LARGE       SHR
    AND         .ERRB       LE          SIZE
    ASSUME      .ERRDEF     LENGTH      SMALL
    AT          .ERRDIF     .LFCOND     STACK
    BYTE        .ERRE       .LIST       @STACK
    .CODE       .ERRIDN     LOCAL       .STACK
    @CODE       .ERRNB      LOW         STRUC
    @CODESIZE   .ERRNDEF    LT          SUBTTL
    COMM        .ERRNZ      MACRO       TBYTE
    COMMENT     EVEN        MASK        .TFCOND
    .CONST      EXITM       MEDIUM      THIS
    .CREF       EXTRN       MOD         TITLE
    @CURSEG     FAR         .MODEL      TYPE
    @DATA       @FARDATA    NAME        .TYPE
    .DATA       .FARDATA    NE          WIDTH
    @DATA?      @FARDATA?   NEAR        WORD
    .DATA?      .FARDATA?   NOT         @WORDSIZE
    @DATASIZE   @FILENAME   NOTHING     .XALL
    DB          FWORD       OFFSET      .XCREF
    DD          GE          OR          .XLIST
                                        XOR
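If you want to check a batch of candidate label names against this list, a small script can do it. The Python sketch below is only an illustration: `RESERVED_SAMPLE` is a hand-picked subset of the names above (not the full list), `is_reserved` is a hypothetical helper name, and the comparison assumes names are matched without regard to case, which is MASM's default behaviour.

```python
# A hand-picked SUBSET of the reserved names listed above (assumption:
# extend this set with the rest of the table before relying on it).
RESERVED_SAMPLE = {
    "DB", "DW", "DD", "END", "ENDP", "ENDS", "EQU", "PROC", "SEGMENT",
    "OFFSET", "ORG", "PTR", "BYTE", "WORD", "NEAR", "FAR", "SHORT",
    "LABEL", "MACRO", "TITLE", "NAME", "TYPE", "SIZE", "LENGTH", "STACK",
}


def is_reserved(name):
    """Return True if `name` collides with a reserved name in the sample.

    Assumes case-insensitive matching, so 'name' and 'NAME' are the same.
    """
    return name.upper() in RESERVED_SAMPLE


# The labels used in noecho.asm are safe; 'name' and 'size' are not.
for candidate in ["dog", "barkmsg", "barkout", "name", "size"]:
    verdict = "reserved" if is_reserved(candidate) else "ok"
    print(f"{candidate:10s} {verdict}")
```

Names such as "name", "size", "type" and "length" are the easiest ones to pick by accident, because they look like ordinary variable names.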
Here's an example of assembling some files downloaded to my C:\ directory.
I start up a DOS window in that directory, and type this to assemble and link the file "noecho.asm" in the current directory:
    C:\>asm
    Arrowsoft Assembler  Public Domain v2.00c (64K Model)
    Copyright (c) 1986, 1987 Arrowsoft Systems, Inc.
    Input filename (.asm): noecho
    Object filename (noecho.obj):
    Listing filename(nul.lst): noecho.lst
    Cross-reference (nul.crf):
    Free memory=50996, Warnings=0, Errors=0
    Assembly Ended.
    C:\>val /com
    VAL 8086 linker - Mar 27 1995
    OBJ file(s): noecho
    COM file[C:\noecho.com]:
    LST file:
    LIB file(s):
    C:\>noecho
    Bark! Bark! Bark! Ended on: d
    C:\>
I recommend using the command lines given at the start of ONEPAGE.ASM, which supply the options and file names on the command line and so skip the interactive prompts. Here is the same assembly and link done that way:
    C:\>asm /s noecho ;
    C:\>val /com noecho ;
    C:\>noecho
    Bark! Bark! Bark! Ended on: d
    C:\>
Here are all the files:
    C:\>dir noecho.*
     Directory of C:\
    NOECHO   OBJ           146  07/07/00  12:28 NOECHO.OBJ
    NOECHO   ASM         2,396  07/07/00  12:21 noecho.asm
    NOECHO   COM            69  07/07/00  12:28 NOECHO.COM
             3 file(s)          2,611 bytes
    C:\>
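The 69-byte size of NOECHO.COM can be cross-checked against the assembler listing excerpt later in these notes, which shows the code starting at "org 100h" and the segment ending at offset 0145h. A .COM file holds only the bytes from offset 0100h onward, so its size should be the difference between those two numbers. The Python sketch below does that arithmetic; the variable names are mine and purely illustrative.

```python
# Offsets taken from the assembler listing for noecho.asm.
ORG = 0x0100          # "org 100h": where a .COM program's first byte is loaded
SEGMENT_END = 0x0145  # segment size reported under SEGMENTS and GROUPS

# A .COM file contains only the bytes from ORG up to the end of the segment.
com_size = SEGMENT_END - ORG
print(f"expected .COM size: {com_size} bytes (hex {com_size:X})")

# File sizes as reported by the DIR command above.
dir_sizes = {"NOECHO.OBJ": 146, "noecho.asm": 2396, "NOECHO.COM": 69}
total = sum(dir_sizes.values())
print(f"total of the three files: {total:,} bytes")

# The computed size agrees with what DIR shows for NOECHO.COM.
assert com_size == dir_sizes["NOECHO.COM"]
```

Note how small the finished program is compared with its source: the comments, labels and directives in the 2,396-byte .asm file leave no trace in the 69-byte .COM file.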
Below is an excerpt of the output written into the LST file by the Assembler. (Normally there is no LST file written; you can request one.) The LST file shows you both your assembler source code and the resulting machine code.
Note the standard five columns: Location, Code, Labels, Mnemonics, Operands. Relocatable instructions are flagged with "R" after the Code. The result is quite similar to the format we've been using for LMC code and mnemonics, except that Intel instructions are not all "one mailbox" long.
    LINE LOC  CODE BYTES           LABELS   MNEM    OPERANDS
    ---- ---- -------------------- -------- ------- -------------------
      19 0000                      Noecho   segment
      20                                    assume  CS:Noecho,DS:Noecho
      21 0100                               org     100h
      23 0100                      start:
      26 0100 C6 06 0141 R 78 90            mov     dog,'x'
      30 0106 80 3E 0141 R 64      while:   cmp     dog,'d'
      31 010B 74 09                         je      barkout
      33 010D B4 07                         mov     ah,07h
      34 010F CD 21                         int     21h
      35 0111 A2 0141 R                     mov     dog,al
      37 0114 EB F0                         jmp     while
      38 0116                      barkout:
      42 0116 BA 0123 R                     mov     dx,offset barkmsg
      43 0119 B4 09                         mov     ah,09h
      44 011B CD 21                         int     21h
      46 011D B0 00                         mov     al,0h
      47 011F B4 4C                         mov     ah,4Ch
      48 0121 CD 21                         int     21h
      54 0123 0D 0A                barkmsg  db      0Dh,0Ah
      55 0125 42 61 72 6B 21 20 42          db      'Bark! Bark! Bark! Ended on: '
      56      61 72 6B 21 20 42 61
      57      72 6B 21 20 45 6E 64
      58      65 64 20 6F 6E 3A 20
      60 0141 ??                   dog      db      ?
      61 0142 0D 0A                         db      0Dh,0Ah
      62 0144 24                            db      "$"
      64 0145                      Noecho   ends
      65                                    end     start

    SEGMENTS and GROUPS:
    -NAME-                          SIZE  ALIGN  COMBINE  CLASS
    NOECHO ------------------------ 0145  PARA   NONE

    SYMBOLS:
    -NAME-                          TYPE    VALUE  ATTR
    BARKMSG ----------------------- L BYTE  0123   NOECHO
    BARKOUT ----------------------- L NEAR  0116   NOECHO
    DOG --------------------------- L BYTE  0141   NOECHO
    START ------------------------- L NEAR  0100   NOECHO
    WHILE ------------------------- L NEAR  0106   NOECHO

    Free memory=49920, Warnings=0, Errors=0
    Assembly Ended.
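The two jump instructions in the listing are a good place to check the machine code by hand. In a two-byte short jump such as "74 09" (je) or "EB F0" (jmp), the second byte is a signed 8-bit displacement that the CPU adds to the offset of the next instruction. The Python sketch below reproduces that arithmetic; `short_jump_target` is a hypothetical helper name of mine, not part of any assembler tool.

```python
def short_jump_target(instr_offset, displacement_byte):
    """Return the target offset of a 2-byte short jump.

    instr_offset      -- offset of the jump instruction itself (LOC column)
    displacement_byte -- second byte of the encoding, 0..255
    """
    # Interpret the byte as a signed 8-bit value (80h..FFh are negative).
    if displacement_byte >= 0x80:
        signed = displacement_byte - 0x100
    else:
        signed = displacement_byte
    # The displacement is relative to the instruction AFTER the jump,
    # which starts 2 bytes later; offsets wrap within a 64K segment.
    return (instr_offset + 2 + signed) & 0xFFFF


# Line 31: "010B 74 09  je barkout" jumps forward 9 bytes.
je_target = short_jump_target(0x010B, 0x09)
# Line 37: "0114 EB F0  jmp while" jumps backward 16 bytes (F0h = -16).
jmp_target = short_jump_target(0x0114, 0xF0)

print(f"je  target = {je_target:04X}")   # offset of label barkout
print(f"jmp target = {jmp_target:04X}")  # offset of label while
```

The results match the SYMBOLS table: BARKOUT is at 0116 and WHILE is at 0106. This is also why these two instructions carry no "R" flag in the listing: a relative displacement stays correct wherever the program is loaded.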
And, for your education and amusement, here's a DEBUG dump of the assembled NOECHO.COM program memory, using DEBUG's "U" (UnAssemble) command to turn the machine code in the *.COM file back into assembler mnemonics (you can try doing this yourself):
    C:\>debug noecho.com
    -u
    1353:0100 C606410178    MOV     BYTE PTR [0141],78
    1353:0105 90            NOP
    1353:0106 803E410164    CMP     BYTE PTR [0141],64
    1353:010B 7409          JZ      0116
    1353:010D B407          MOV     AH,07
    1353:010F CD21          INT     21
    1353:0111 A24101        MOV     [0141],AL
    1353:0114 EBF0          JMP     0106
    1353:0116 BA2301        MOV     DX,0123
    1353:0119 B409          MOV     AH,09
    1353:011B CD21          INT     21
    1353:011D B000          MOV     AL,00
    1353:011F B44C          MOV     AH,4C
    -u
    1353:0121 CD21          INT     21
    1353:0123 0D0A42        OR      AX,420A
    1353:0126 61            DB      61
    1353:0127 726B          JB      0194
    1353:0129 2120          AND     [BX+SI],SP
    1353:012B 42            INC     DX
    1353:012C 61            DB      61
    1353:012D 726B          JB      019A
    1353:012F 2120          AND     [BX+SI],SP
    1353:0131 42            INC     DX
    1353:0132 61            DB      61
    1353:0133 726B          JB      01A0
    1353:0135 2120          AND     [BX+SI],SP
    1353:0137 45            INC     BP
    1353:0138 6E            DB      6E
    1353:0139 64            DB      64
    1353:013A 65            DB      65
    1353:013B 64            DB      64
    1353:013C 206F6E        AND     [BX+6E],CH
    1353:013F 3A20          CMP     AH,[BX+SI]
    -u
    [...etc...]
    -q
    C:\>
You will note that, as I've been saying, all the labels are missing from the *.COM file - all we have are the machine instructions. Also, DEBUG doesn't know which bytes are instructions and which are data, so it tries to turn everything into instruction mnemonics, even the data.
As you see from our instruction "MOV DX,0123", our BARKMSG string starts at offset 0123h, so everything from memory offset 0123 onward is actually data, not instructions. Some of that data just happens to have the same bit patterns as instructions, and DEBUG tries to display it as such.
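To see the data for what it really is, you can decode those bytes as ASCII text instead of as instructions. The Python sketch below first pulls the 16-bit operand out of the "BA 23 01" encoding of MOV DX,0123 (the operand is stored low byte first) and then decodes the bytes that the listing shows at offsets 0123h through 0140h. The names `imm16` and `DATA_HEX` are mine, chosen for illustration only.

```python
def imm16(instruction_bytes):
    """Return the 16-bit immediate stored after a 1-byte opcode.

    The 8086 stores 16-bit values low byte first (little-endian),
    so the bytes 23 01 mean the value 0123h.
    """
    return int.from_bytes(instruction_bytes[1:3], "little")


# "MOV DX,0123" as it appears in the DEBUG dump at offset 0116.
MOV_DX = bytes.fromhex("BA2301")
message_offset = imm16(MOV_DX)
print(f"message starts at offset {message_offset:04X}")

# Bytes at offsets 0123h-0140h, copied from the CODE BYTES column
# of the listing (lines 54 through 58).
DATA_HEX = (
    "0D 0A "
    "42 61 72 6B 21 20 42 "
    "61 72 6B 21 20 42 61 "
    "72 6B 21 20 45 6E 64 "
    "65 64 20 6F 6E 3A 20"
)
data = bytes.fromhex(DATA_HEX)

# Decoded as ASCII, the "instructions" DEBUG printed turn back into text:
# a carriage return, a line feed, and then the bark message.
text = data.decode("ascii")
print(repr(text))
```

The same 30 bytes that DEBUG displayed as OR, JB, AND and INC instructions decode to a carriage return, a line feed and the text "Bark! Bark! Bark! Ended on: ". Only the path the CPU actually takes through memory decides which bytes get executed as instructions.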