Chapter 4

Defining and Using Simple Data Types


This chapter covers the concepts essential for working with simple data types in assembly-language programs. The first section shows how to declare integer variables. The second section describes basic operations including moving, loading, and sign-extending numbers, as well as calculating. The last section describes how to do various operations with numbers at the bit level, such as using bitwise logical instructions and shifting and rotating bits.

The complex data types introduced in the next chapter (arrays, strings, structures, unions, and records) use many of the operations illustrated in this chapter. Floating-point operations require a different set of instructions and techniques. These are covered in Chapter 6, “Using Floating-Point and Binary Coded Decimal Numbers.”

Declaring Integer Variables

An integer is a whole number, such as 4 or 4,444. Unlike the real numbers discussed in Chapter 6, integers have no fractional part. You can initialize integer variables in several ways with the data allocation directives. This section also explains how to use the SIZEOF and TYPE operators to provide information to the assembler about the types in your program. For information on symbolic integer constants, see “Integer Constants and Constant Expressions” in Chapter 1.

Allocating Memory for Integer Variables

When you declare an integer variable by assigning a label to a data allocation directive, the assembler allocates memory space for the integer. The variable’s name becomes a label for the memory space. The syntax is:

[[name]] directive initializer

 

The following directives indicate the integer’s size and value range:

Directive                           Description of Initializers

BYTE, DB (byte)                     Allocates unsigned numbers from 0 to 255.

SBYTE (signed byte)                 Allocates signed numbers from –128 to +127.

WORD, DW (word = 2 bytes)           Allocates unsigned numbers from
                                    0 to 65,535 (64K).

SWORD (signed word)                 Allocates signed numbers from
                                    –32,768 to +32,767.

DWORD, DD (doubleword = 4 bytes)    Allocates unsigned numbers from
                                    0 to 4,294,967,295 (4 gigabytes).

SDWORD (signed doubleword)          Allocates signed numbers from
                                    –2,147,483,648 to +2,147,483,647.

FWORD, DF (farword = 6 bytes)       Allocates 6-byte (48-bit) integers. These
                                    values are normally used only as pointer
                                    variables on the 80386/486 processors.

QWORD, DQ (quadword = 8 bytes)      Allocates 8-byte integers used with
                                    8087-family coprocessor instructions.

TBYTE, DT (10 bytes)                Allocates 10-byte (80-bit) integers if the
                                    initializer has a radix specifying the
                                    base of the number.

 

See Chapter 6 for information on the REAL4, REAL8, and REAL10 directives that allocate real numbers.

The SIZEOF and TYPE operators, when applied to a type, return the size of an integer of that type. The size attribute associated with each data type is:

Data Type           Bytes

BYTE, SBYTE         1
WORD, SWORD         2
DWORD, SDWORD       4
FWORD               6
QWORD               8
TBYTE               10
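
As a brief illustration, the following sketch loads the results of these operators into registers. The variable names count and total are hypothetical, and the values in the comments follow from the size attributes listed above.

        .DATA
count   WORD    0
total   SDWORD  0
        .CODE
        mov     ax, TYPE count      ; AX = 2 (count is a WORD)
        mov     bx, SIZEOF total    ; BX = 4 (total is one SDWORD)
        mov     cx, TYPE QWORD      ; CX = 8 (size of the type itself)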

 

The data types SBYTE, SWORD, and SDWORD tell the assembler to treat the initializers as signed data. It is important to use these signed types with high-level constructs such as .IF, .WHILE, and .REPEAT, and with PROTO and INVOKE directives. For descriptions of these directives, see the sections “Loop-Generating Directives,” “Declaring Procedure Prototypes,” and “Calling Procedures with INVOKE” in Chapter 7.
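
The following is a minimal sketch of why the signed types matter, assuming a hypothetical variable named temp. Because temp is declared as SWORD, the .IF directive generates a signed comparison. If temp were declared as WORD, the comparison would be unsigned and the condition could never be true.

        .DATA
temp    SWORD   -10
        .CODE
        .IF     temp < 0            ; Signed comparison because temp is SWORD
        mov     temp, 0             ; Executed only when temp is negative
        .ENDIF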

The assembler stores integers with the least significant bytes lowest in memory. Note that assembler listings and most debuggers show the bytes of a word in the opposite order, high byte first.

Figure 4.1 illustrates the integer formats.

   

Figure 4.1   Integer Formats
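
The byte ordering described above can be observed with a short sketch such as the following. The variable name value is hypothetical. The BYTE PTR operator is used here only to read individual bytes of the doubleword.

        .DATA
value   DWORD   12345678h               ; Stored as 78h, 56h, 34h, 12h
        .CODE
        mov     al, BYTE PTR value[0]   ; AL = 78h (least significant byte)
        mov     ah, BYTE PTR value[3]   ; AH = 12h (most significant byte)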

Although the TYPEDEF directive’s primary purpose is to define pointer variables (see “Defining Pointer Types with TYPEDEF” in Chapter 3), you can also use TYPEDEF to create an alias for any integer type. For example, these declarations

char    TYPEDEF SBYTE
long    TYPEDEF DWORD
float   TYPEDEF REAL4
double  TYPEDEF REAL8

allow you to use char, long, float, or double in your programs if you prefer the C type names.
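
Once defined, an alias can stand in for the directive when you declare a variable. The following sketch assumes the char and long aliases shown above. The variable names delta and total are hypothetical.

char    TYPEDEF SBYTE
long    TYPEDEF DWORD
        .DATA
delta   char    -1          ; One signed byte, same as SBYTE -1
total   long    100000      ; One doubleword, same as DWORD 100000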

Data Initialization

You can initialize variables when you declare them with constants or expressions that evaluate to constants. The assembler generates an error if you specify an initial value too large for the variable type.

A ? in place of an initializer indicates you do not require the assembler to initialize the variable. The assembler allocates the space but does not write in it. Use ? for buffer areas or variables your program will initialize at run time.

You can declare and initialize variables in one step with the data directives, as these examples show.

integer         BYTE    16          ; Initialize byte to 16
negint          SBYTE   -16         ; Initialize signed byte to -16
expression      WORD    4*3         ; Initialize word to 12
signedexp       SWORD   4*3         ; Initialize signed word to 12
empty           QWORD   ?           ; Allocate uninitialized long int
                BYTE    1,2,3,4,5,6 ; Initialize six unnamed bytes
long            DWORD   4294967295  ; Initialize doubleword to
                                    ;   4,294,967,295
longnum         SDWORD  -2147483648 ; Initialize signed doubleword
                                    ;   to -2,147,483,648
tb              TBYTE   2345t       ; Initialize 10-byte binary number
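
As a further sketch of the range check and of the ? initializer, consider the following declarations. The names maxbyte, toobig, and buffer are hypothetical. The second line is commented out because the assembler should reject it as too large for a byte.

        .DATA
maxbyte BYTE    255         ; Largest value an unsigned byte can hold
;toobig BYTE    256         ; Error if uncommented: too large for a byte
buffer  WORD    ?           ; Space reserved, contents undefined until
                            ;   the program stores a value at run time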

For information on arrays and on using the DUP operator to allocate initializer lists, see “Arrays and Strings” in Chapter 5.

Working with Simple Variables

Once you have declared integer variables in your program, you can copy, move, and sign-extend them in your MASM code. This section shows how to do these operations as well as how to add, subtract, multiply, and divide numbers and do bit-level manipulations with logical, shift, and rotate instructions.

Since MASM instructions require operands to be the same size, you may need to operate on data in a size other than that originally declared. You can do this with the PTR operator. For example, you can use the PTR operator to access the high-order word of a DWORD-size variable. The syntax for the PTR operator is

type PTR expression

where the PTR operator forces expression to be treated as having the type specified. An example of this use is

        .DATA
num     DWORD   0
        .CODE

        mov     ax, WORD PTR num[0] ; Loads a word-size value from
        mov     dx, WORD PTR num[2] ;   a doubleword variable

Copying Data

The primary instructions for moving data from operand to operand and loading them into registers are MOV (Move), XCHG (Exchange), CWD (Convert Word to Double), and CBW (Convert Byte to Word).

Moving Data

The most common method of moving data, the MOV instruction, is essentially a copy instruction, since it always copies the source operand to the destination operand without affecting the source. After a MOV instruction, the source and destination operands contain the same value.

The following example illustrates the MOV instruction. As explained in “General-Purpose Registers,” Chapter 1, you cannot move a value from one location in memory to another in a single operation.

; Immediate value moves
        mov     ax, 7       ; Immediate to register
        mov     mem, 7      ; Immediate to memory direct
        mov     mem[bx], 7  ; Immediate to memory indirect

; Register moves
        mov     mem, ax     ; Register to memory direct
        mov     mem[bx], ax ; Register to memory indirect
        mov     ax, bx      ; Register to register
        mov     ds, ax      ; General register to segment register

; Direct memory moves
        mov     ax, mem     ; Memory direct to register
        mov     ds, mem     ; Memory to segment register

; Indirect memory moves
        mov     ax, mem[bx] ; Memory indirect to register
        mov     ds, mem[bx] ; Memory indirect to segment register

; Segment register moves
        mov     mem, ds     ; Segment register to memory
        mov     mem[bx], ds ; Segment register to memory indirect
        mov     ax, ds      ; Segment register to general register

 

The following example shows several common types of moves that require two instructions.

; Move immediate to segment register
        mov     ax, DGROUP  ; Load AX with immediate value
        mov     ds, ax      ; Copy AX to segment register

; Move memory to memory
        mov     ax, mem1    ; Load AX with memory value
        mov     mem2, ax    ; Copy AX to other memory

; Move segment register to segment register
        mov     ax, ds      ; Load AX with segment register
        mov     es, ax      ; Copy AX to segment register

The MOVSX and MOVZX instructions for the 80386/486 processors extend and copy values in one step. See “Extending Signed and Unsigned Integers,” following.

Exchanging Integers

The XCHG (Exchange) instruction exchanges the data in the source and destination operands. You can exchange data between registers or between registers and memory, but not from memory to memory:

        xchg    ax, bx       ; Put AX in BX and BX in AX
        xchg    memory, ax   ; Put "memory" in AX and AX in "memory"
;       xchg    mem1, mem2   ; Illegal: can't exchange memory locations
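
One way to swap two memory operands is to route the exchange through a register. The following sketch assumes that mem1 and mem2 are word-size variables and that the contents of AX need not be preserved.

        mov     ax, mem1     ; AX = original mem1
        xchg    ax, mem2     ; mem2 = original mem1, AX = original mem2
        mov     mem1, ax     ; mem1 = original mem2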

Extending Signed and Unsigned Integers

Since moving data between registers of different sizes is illegal, you must “sign-extend” integers to convert signed data to a larger size. Sign-extending means copying the sign bit of the unextended operand into every added high-order bit of the next larger size. This widens the operand while maintaining its sign and value.
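
The effect on the bit pattern can be sketched with the CBW instruction, which extends AL into AX. The values here are chosen only for illustration. The comments show how the sign bit of AL fills the high byte of AX.

        mov     al, 11111011b   ; AL = FBh, the signed byte -5
        cbw                     ; AX = 1111111111111011b (FFFBh), still -5
        mov     al, 00000101b   ; AL = 05h, the signed byte +5
        cbw                     ; AX = 0000000000000101b (0005h), still +5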

8086-based processors provide four instructions specifically for sign-extending. The four instructions act only on the accumulator register (AL, AX, or EAX), as shown in the following list.

Instruction                                    Sign-extend

CBW (convert byte to word)                     AL to AX
CWD (convert word to doubleword)               AX to DX:AX
CWDE (convert word to doubleword extended)*    AX to EAX
CDQ (convert doubleword to quadword)*          EAX to EDX:EAX

*Requires an extended register and applies only to 80386/486 processors.

 

On the 80386/486 processors, the CWDE instruction converts a signed 16-bit value in AX to a signed 32-bit value in EAX. The CDQ instruction converts a signed 32-bit value in EAX to a signed 64-bit value in the EDX:EAX register pair.

This example converts signed integers using CBW, CWD, CWDE, and CDQ.

        .DATA
mem8    SBYTE   -5
mem16   SWORD   +5
mem32   SDWORD  -5
        .CODE
        .
        .
        .
        mov     <