Assembler Programming Tools

Converting an Assembler source program into machine-executable form in the IBM PC / MS-DOS environment requires an "Assembly" program to convert the source into an "object" file, and then one or more "Linker" programs to convert "object" files into an "executable" program file. The standard "Assembly" program is Microsoft's MASM; most IBM PC Assembler source programs are written to meet the MASM coding requirements.

Several non-commercial (shareware or freeware) Assembler and Linker programs are also available. To be useful, such programs need to accept the "source" format expected by the MASM standard. One such Assembler is the freeware "Arrow Assembler"; the "Linker" associated with Arrow Assembler is called "VAL", which performs the functions of both LINK/TLINK and EXE2BIN.


Use of the ASM Arrow Assembler and VAL Link Editor
in building .COM programs

You can see a PowerPoint slide set and an AVI file video on using the ASM and VAL programs here: 28MASM_Arrow_ASM.pps 28Arrow.avi

Within the Arrow-system, generation of a .COM file (instead of a .EXE) is only a matter of adding a /com switch to the VAL command:

      C:\>ASM
      Arrowsoft Assembler
      Public Domain v2.00c (64K Model)
      Copyright (c) 1986, 1987 Arrowsoft Systems, Inc.

      Input filename  (.asm): GCD_EXE.ASM
      Object filename (GCD_EXE.obj):
      Listing filename(nul.lst): GCD_EXE.LST
      Cross-reference (nul.crf):
      Free memory=49418, Warnings=0, Errors=0

      Assembly Ended.

      C:\>VAL    /CO
      VAL 8086 linker - Mar 27 1995
      OBJ file(s):  GCD.OBJ
      COM file[c:\gcd.com]:
      LST file:
      LIB file(s):

      C:\>GCD
      [... program runs ...]

ASM Arrow Assembler reserved words

Do not use these names or symbols for your own data or labels:

   $         DF        GROUP      ORG
   *         DGROUP    GT         %OUT
   +         DOSSEG    HIGH       PAGE
   _         DQ        IF         PARA
   .         DS        IF1        PROC
   /         DT        IF2        PTR
   =         DUP       IFB        PUBLIC
   ?         DW        IFDEF      PURGE
   [  ]      DWORD     IFDIF      QWORD
   .186      ELSE      IFE        .RADIX
   .286      END       IFIDN      RECORD
   .286P     ENDIF     IFNB       REPT
   .287      ENDM      IFNDEF     .SALL
   .386      ENDP      INCLUDE    SEG
   .386P     ENDS      INCLUDELIB SEGMENT
   .387      EQ        IRP        .SEQ
   .8086     EQU       IRPC       .SFCOND
   .8087     .ERR      LABEL      SHL
   ALIGN     .ERR1     .LALL      SHORT
   .ALPHA    .ERR2     LARGE      SHR
   AND       .ERRB     LE         SIZE
   ASSUME    .ERRDEF   LENGTH     SMALL
   AT        .ERRDIF   .LFCOND    STACK
   BYTE      .ERRE     .LIST      @STACK
   .CODE     .ERRIDN   LOCAL      .STACK
   @CODE     .ERRNB    LOW        STRUC
   @CODESIZE .ERRNDEF  LT         SUBTTL
   COMM      .ERRNZ    MACRO      TBYTE
   COMMENT   EVEN      MASK       .TFCOND
   .CONST    EXITM     MEDIUM     THIS
   .CREF     EXTRN     MOD        TITLE
   @CURSEG   FAR       .MODEL     TYPE
   @DATA     @FARDATA  NAME       .TYPE
   .DATA     .FARDATA  NE         WIDTH
   @DATA?    @FARDATA? NEAR       WORD
   .DATA?    .FARDATA? NOT        @WORDSIZE
   @DATASIZE @FILENAME NOTHING    .XALL
   DB        FWORD     OFFSET     .XCREF
   DD        GE        OR         .XLIST
                                  XOR

ASM and VAL Example

Here's an example of assembling some files downloaded to my C:\ directory.

I start up a DOS window in that directory, and type this to assemble and link the file "noecho.asm" in the current directory:

    C:\>asm
    Arrowsoft Assembler
    Public Domain v2.00c (64K Model)
    Copyright (c) 1986, 1987 Arrowsoft Systems, Inc.

    Input filename  (.asm): noecho
    Object filename (noecho.obj):
    Listing filename(nul.lst): noecho.lst
    Cross-reference (nul.crf):
    Free memory=50996, Warnings=0, Errors=0

    Assembly Ended.

    C:\>val /com
    VAL 8086 linker - Mar 27 1995
    OBJ file(s):  noecho
    COM file[C:\noecho.com]:
    LST file:
    LIB file(s):

    C:\>noecho
    Bark! Bark! Bark! Ended on: d
    C:\>

I recommend using the command lines given at the start of ONEPAGE.ASM, since they skip the interactive prompts. Here is the same assembly and link, with the options and file names given directly on the command line:

    C:\>asm /s noecho ;
    C:\>val /com noecho ;
    C:\>noecho
    Bark! Bark! Bark! Ended on: d
    C:\>

Here are all the files:

    C:\>dir noecho.*
     Directory of C:\
    NOECHO   OBJ           146  07/07/00  12:28 NOECHO.OBJ
    NOECHO   ASM         2,396  07/07/00  12:21 noecho.asm
    NOECHO   COM            69  07/07/00  12:28 NOECHO.COM
             3 file(s)          2,611 bytes
    C:\>
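The 69-byte size of NOECHO.COM can be checked against the listing excerpt further down, which has "org 100h" and reports a segment size of 0145 for NOECHO. Here is a minimal Python sketch of that arithmetic. It assumes the .COM file holds only the bytes from offset 100h up to the end of the segment; the function and constant names are mine, for illustration only.

```python
# Sketch: relate the .COM file size to offsets taken from the assembler listing.
# Assumption: the file on disk holds only the bytes from the ORG offset
# up to the end of the segment.

ORG_OFFSET = 0x0100   # "org 100h" in the listing
SEGMENT_END = 0x0145  # segment size reported for NOECHO in the listing


def com_file_size(segment_end: int, org_offset: int = ORG_OFFSET) -> int:
    """Number of bytes from the ORG offset up to (not including) segment_end."""
    return segment_end - org_offset


print(com_file_size(SEGMENT_END))  # prints 69
```

That matches the 69 bytes shown by DIR above.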

Below is an excerpt of the output written into the LST file by the Assembler. (By default no LST file is written; you request one by giving a name at the "Listing filename" prompt, as in the first transcript above.) The LST file shows you both your assembler source code and the resulting machine code.

Note the standard five columns that follow the line number: Location, Code, Labels, Mnemonics, Operands. Relocatable addresses are flagged with "R" after the Code. The result is quite similar to the format we've been using for LMC code and mnemonics, except that Intel instructions are not all "one mailbox" long.

LINE LOC   CODE BYTES               LABELS  MNEM    OPERANDS
---- ----  -----------------        ------  ------- -------------------
  19 0000                           Noecho  segment 
  20                                        assume  CS:Noecho,DS:Noecho 
  21 0100                                   org     100h            
  23 0100                           start: 
  26 0100  C6 06 0141 R 78 90               mov     dog,'x'
  30 0106  80 3E 0141 R 64          while:  cmp     dog,'d'
  31 010B  74 09                            je      barkout
  33 010D  B4 07                            mov     ah,07h
  34 010F  CD 21                            int     21h 
  35 0111  A2 0141 R                        mov     dog,al
  37 0114  EB F0                            jmp     while 
  38 0116                           barkout:
  42 0116  BA 0123 R                        mov     dx,offset barkmsg 
  43 0119  B4 09                            mov     ah,09h          
  44 011B  CD 21                            int     21h 
  46 011D  B0 00                            mov     al,0h           
  47 011F  B4 4C                            mov     ah,4Ch          
  48 0121  CD 21                            int     21h 
  54 0123  0D 0A                    barkmsg db      0Dh,0Ah
  55 0125  42 61 72 6B 21 20 42             db      'Bark! Bark! Bark! Ended on: ' 
  56       61 72 6B 21 20 42 61     
  57       72 6B 21 20 45 6E 64     
  58       65 64 20 6F 6E 3A 20     
  60 0141  ??                       dog     db      ?
  61 0142  0D 0A                            db      0Dh,0Ah
  62 0144  24                               db      "$"
  64 0145                           Noecho  ends 
  65                                        end start       

SEGMENTS and GROUPS:

        -NAME-                          SIZE    ALIGN   COMBINE CLASS

NOECHO ---------------------------      0145    PARA    NONE

SYMBOLS:            

        -NAME-                          TYPE    VALUE   ATTR         

BARKMSG --------------------------      L BYTE  0123    NOECHO
BARKOUT --------------------------      L NEAR  0116    NOECHO
DOG ------------------------------      L BYTE  0141    NOECHO
START ----------------------------      L NEAR  0100    NOECHO
WHILE ----------------------------      L NEAR  0106    NOECHO

Free memory=49920, Warnings=0, Errors=0
Assembly Ended.
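You can check the two short jumps in the listing by hand. "74 09" at 010B and "EB F0" at 0114 each hold a one-byte signed displacement that is added to the offset of the next instruction. Here is a small Python sketch of that calculation; the function name is my own, and it only handles two-byte short jumps like these.

```python
# Sketch: compute the target of a 2-byte short jump (opcode + signed 8-bit displacement).

def short_jump_target(instr_offset: int, disp_byte: int) -> int:
    """Return the target offset of a short jump located at instr_offset."""
    # Interpret the displacement byte as a signed 8-bit value (-128..127).
    disp = disp_byte - 0x100 if disp_byte >= 0x80 else disp_byte
    # The displacement is relative to the instruction that follows the jump.
    next_ip = instr_offset + 2
    # Offsets wrap around within the 64K segment.
    return (next_ip + disp) & 0xFFFF


print(f"{short_jump_target(0x010B, 0x09):04X}")  # je barkout -> prints 0116
print(f"{short_jump_target(0x0114, 0xF0):04X}")  # jmp while  -> prints 0106
```

The results agree with the symbol table: BARKOUT is at 0116 and WHILE is at 0106.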

And, for your education and amusement, here's a DEBUG dump of the assembled NOECHO.COM program memory, using DEBUG's "U" (UnAssemble) command to turn the machine code in the *.COM file back into assembler mnemonics (you can try doing this yourself):

    C:\>debug noecho.com
    -u
    1353:0100 C606410178    MOV     BYTE PTR [0141],78
    1353:0105 90            NOP
    1353:0106 803E410164    CMP     BYTE PTR [0141],64
    1353:010B 7409          JZ      0116
    1353:010D B407          MOV     AH,07
    1353:010F CD21          INT     21
    1353:0111 A24101        MOV     [0141],AL
    1353:0114 EBF0          JMP     0106
    1353:0116 BA2301        MOV     DX,0123
    1353:0119 B409          MOV     AH,09
    1353:011B CD21          INT     21
    1353:011D B000          MOV     AL,00
    1353:011F B44C          MOV     AH,4C
    -u
    1353:0121 CD21          INT     21
    1353:0123 0D0A42        OR      AX,420A
    1353:0126 61            DB      61
    1353:0127 726B          JB      0194
    1353:0129 2120          AND     [BX+SI],SP
    1353:012B 42            INC     DX
    1353:012C 61            DB      61
    1353:012D 726B          JB      019A
    1353:012F 2120          AND     [BX+SI],SP
    1353:0131 42            INC     DX
    1353:0132 61            DB      61
    1353:0133 726B          JB      01A0
    1353:0135 2120          AND     [BX+SI],SP
    1353:0137 45            INC     BP
    1353:0138 6E            DB      6E
    1353:0139 64            DB      64
    1353:013A 65            DB      65
    1353:013B 64            DB      64
    1353:013C 206F6E        AND     [BX+6E],CH
    1353:013F 3A20          CMP     AH,[BX+SI]
    -u
    [...etc...]
    -q

    C:\>

You will note that, as I've been saying, all the labels are missing from the *.COM file - all we have are the machine instructions. Also, DEBUG doesn't know which bytes are instructions and which are data, so it tries to turn everything into instruction mnemonics, even the data.

As you see from our instruction "MOV DX,0123", our BARKMSG string starts at offset 0123h, so everything from memory offset 0123 onward is actually data, not instructions. Some of that data just happens to have the same bit patterns as instructions, and DEBUG tries to display it as such.
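To convince yourself that those bytes really are data, you can decode them as ASCII instead of as instructions. The Python sketch below takes the bytes that DEBUG shows from offset 0123 up to 0140 (copied by hand from the dump above) and prints them as text; the names are mine, for illustration only.

```python
# Sketch: decode the bytes DEBUG displayed at offsets 0123..0140 as ASCII text.
# The hex digits are copied from the "code bytes" column of the dump above.

DATA_HEX = (
    "0D0A"                   # carriage return, line feed
    "4261726B2120"           # "Bark! "
    "4261726B2120"           # "Bark! "
    "4261726B2120"           # "Bark! "
    "456E646564206F6E3A20"   # "Ended on: "
)


def decode_data(hex_text: str) -> str:
    """Turn a string of hex digits into the ASCII text it encodes."""
    return bytes.fromhex(hex_text).decode("ascii")


print(repr(decode_data(DATA_HEX)))  # prints '\r\nBark! Bark! Bark! Ended on: '
```

So the "OR AX,420A", "JB 0194" and the rest are just DEBUG's reading of the BARKMSG string.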