Standard Library -- Control Structures


UCR StdLib v2.0: Control Structures
5.1 - Interface
5.2 - Overview of Control Structures
5.3 - ForLp, Next
5.3.1 - Calling Conventions and Assertions
5.3.2 - Syntax & Examples
5.4 - ForEach, EndFor, Yield, and Iterators
5.4.1 - Calling Conventions and Assertions
5.4.2 - Syntax and Examples
5.5 - Implementation of the FORLP..NEXT Macros
5.6 - Implementation of the FOREACH Macros



Although MASM provides "High Level Macros" that support IF statements, REPEAT..UNTIL loops, and WHILE loops, it does not support two other important control structures: FOR/NEXT loops and FOREACH/ENDFOR/Iterator loops. The macros provided in the control structures package fill this gap.

The Standard Library actually supports some additional control structures. See the description of exceptions for information on the try..endtry blocks.


5.1 Interface

To access the routines in the control structures package, your assembly language module must include the file "control.a" during assembly. You can accomplish this with either of the following include statements in your assembly code:



	include	control.a
or
	include	ucrlib.a

Note that the "ucrlib.a" include file automatically includes all include files associated with the UCR Standard Library.

The control.a include file exports several symbols. The UCR Standard Library prefaces all "private" names with a dollar sign ("$"). You should not call any routine in this package that begins with this symbol unless otherwise advised. To avoid name conflicts, you should not define any symbols in your programs that begin with a dollar sign ("$"). Note that future versions of the stdlib (that remain compatible with this release) may change "private" names. To remain compatible with future releases, you must not refer to these "private" names within your programs.

Source code appearing in this chapter is current as of Version Two, Release 40. There may be minor changes between this source code and the current release.


5.2 Overview of Control Structures

The control structures header file provides macros that make it easy to create and use FORLP..NEXT loops and FOREACH..ENDFOR loops. These macros also simplify the creation of iterators (for use with the FOREACH..ENDFOR loops).


5.3 ForLp, Next

The ForLp and Next statements let you create traditional FOR..NEXT loops within your assembly language programs (the name FORLP was chosen because MASM already uses the "FOR" keyword). Such loops iterate under control of a "loop control variable".

A typical FORLP..NEXT loop takes the form:




                forlp   i, 1, 10
                 .
                 .
                 .
          < loop body >
                 .
                 .
                 .
                next

This particular loop initializes "i" with the starting value ("1") and then compares this against the ending value ("10"). If i's initial value is less than or equal to ten, the ForLp statement executes the statements in the loop body. Upon hitting the "NEXT" statement, this code increments "i" by one and transfers control to the beginning of the loop where it once again checks to see if "i" is greater than 10. This repeats until i's value exceeds 10.

Although the ForLp..Next construct doesn't generate completely horrible code, you can almost always do better by directly coding the loop yourself in assembly language. The ForLp..Next construct is intended for those situations where convenience is more important than performance.
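
As a rough illustration of that trade-off, a loop that only needs to execute its body ten times can be hand coded as a countdown on CX. This is a minimal sketch: the label name DoBody is hypothetical, the loop body is assumed to preserve CX, and CX counts from 10 down to 1 rather than from 1 up to 10.

```asm
                mov     cx, 10          ;Number of iterations.
DoBody:
                 .
                 .                      ;Loop body (must preserve CX).
                 .
                dec     cx              ;Sets the zero flag when CX reaches zero.
                jnz     DoBody          ;Repeat until CX = 0.
```

Compared with the expansion shown in section 5.5, this version needs no separate compare instruction and no unconditional jump back to the top of the loop.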


5.3.1 Calling Conventions and Assertions



ForLp:
Inputs:	Operand1: Loop control variable, Operand2: Starting value, Operand3: Termination value.
Outputs:	None.
Errors:	None.
Side Effects:	Initializes the loop control variable with the Starting value.
Assertions:	None.

Next:
Inputs:	None.
Outputs:	None.
Errors:	None.
Side Effects:	Jumps to the corresponding FORLP statement.
Assertions:	None.


5.3.2 Syntax & Examples

The syntax for the ForLp statement is:



ForLp	LoopCtrlVar, Start, Stop

LoopCtrlVar should be a 16-bit register or memory location. Start and Stop must each be a 16-bit register, a memory location, or a constant. (Technical note: as long as LoopCtrlVar is not a memory location, or Start and Stop are not both memory locations, the size of the values is irrelevant.)
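
The combinations below sketch the common operand forms. The names i, first, and last are hypothetical 16-bit word variables. According to the macro source in section 5.5, the last form loads and compares through AX, so the loop body cannot rely on AX surviving from one iteration to the next.

```asm
                ForLp   i, 1, 10        ;Memory loop variable, constant bounds.
                 .
                Next

                ForLp   cx, first, last ;Register loop variable, memory bounds.
                 .                      ; (The body must not modify CX.)
                Next

                ForLp   i, first, last  ;All memory operands: the macro uses
                 .                      ; AX as a scratch register.
                Next
```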

The ForLp structure assumes the operands are signed integer values (-32768..32767).
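
One consequence of the signed comparison, sketched below under the assumption that i is a 16-bit word variable: a termination value above 32767 is treated as a negative number, so the loop body never executes.

```asm
                ForLp   i, 0, 40000     ;40000 is 9C40h, which is -25536 as a
                 .                      ; signed 16-bit value. Since 0 > -25536,
                 .                      ; the signed test ends the loop on the
                 .                      ; first pass and the body is skipped.
                Next
```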

The syntax for the NEXT directive is:



	next

This next is automatically associated with the previous ForLp statement in the source file. It will generate an error if no such ForLp statement exists.

Note that you can nest ForLp..Next statements in a program and the ForLp..Next macros will work properly. The following code demonstrates such a nested ForLp..Next construct:




        ForLp   I, 1, 10
        ForLp   J, 0, 7
         .
         .
         .
        Next    ; Repeats the "ForLp J..." loop.
        Next    ; Repeats the "ForLp I..." loop.

You may specify a register as a loop control variable. However, you must take care not to modify that register's value within the loop.



var
    char CharArray[128]
EndVar

        ForLp   bx, 0, 127
        mov     CharArray[bx], 0
        Next

5.4 ForEach, EndFor, Yield, and Iterators

The UCR Standard Library control.a include file provides macros to help support a sophisticated looping control structure: iterators. An iterator is a special kind of function that returns two things: a return value (like a typical function) and "failure" or "success." In addition to returning two distinct objects, an iterator differs from a standard function insofar as it maintains its activation record as long as it returns success. That is, until it returns failure, it does not deallocate local variables and parameters from the stack.

Iterators control the execution of a FOREACH loop. A FOREACH loop calls an iterator. If that iterator returns "success", then the FOREACH loop body executes and uses the iterator's other return value as the loop control value. Upon hitting the ENDFOR statement, the FOREACH loop repeats, calling the iterator again and testing its return value for "success" or "failure." When the iterator fails, the FOREACH loop transfers control to the first statement following ENDFOR.

An iterator function always takes the following form:




MyIterator      iter
                push    bp
                mov     bp, sp
;Optional:      sub     sp, SomeValue   ;Allocate local storage
                 .
                 .                      ;Statements that compute some value
                 .
                Yield                   ;Return "Success" and a return value 
                 .                      ; to FOREACH loop.
                 .
                 .
                pop     bp
                add     sp, 2           ;Remove success address
                ret     n               ;Return through failure 
                                        ; address and pop parameters.
MyIterator      endi

To begin with, iterators are always near procedures. Therefore, they must reside in the same segment as the FOREACH loop that calls them. This also means that if you've pushed any parameters on the stack, the last such parameter you've pushed will be at offset BP+6 on the stack (the failure address is at BP+4).

Upon entry into the iterator, you must push the BP register onto the stack and then copy SP's value into BP. The yield statement will not work if you fail to do this. Optionally, you can drop the stack pointer down in memory to reserve space for local variables.

When you exit the iterator, you must pop BP's value and then pop the "normal" (i.e., success) return address off the stack. An iterator has two return addresses: one address that returns control to the body of the FOREACH loop (the success address) and one that transfers control to the first statement beyond the ENDFOR statement (the failure address). When an iterator "returns" to the calling code and removes the activation record, it is returning failure. Therefore, the code has to remove the success return address and then return through the failure address. If you've pushed any parameters onto the stack, you should remove them when returning.

The "Yield" statement is how you return "success" to the FOREACH loop. Typically, you would load a register (e.g., AX) with a function return value and then execute the Yield. When yield returns to the FOREACH loop, it maintains the activation record of the iterator (i.e., the parameters, local variables, and other state information on the stack). When you hit the ENDFOR statement, control continues with the first statement past the Yield that yielded control to the FOREACH loop.
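
If the iterator reserves local storage, that storage has to be released before BP is popped on the failure path, because "pop bp" expects SP to point at the saved BP value. The following is a minimal sketch under stated assumptions: CountDown, CDLoop, and CDFail are hypothetical names, the iterator takes no parameters, and it keeps one word of local storage. As written it should yield the values 3, 2, and 1 in AX and then fail.

```asm
CountDown       iter
                push    bp
                mov     bp, sp
                sub     sp, 2                   ;One word of local storage at [bp-2].
                mov     word ptr [bp-2], 3      ;Local counter := 3.
CDLoop:         cmp     word ptr [bp-2], 0
                jle     CDFail                  ;Fail once the counter reaches zero.
                mov     ax, [bp-2]              ;Return value in AX.
                Yield                           ;Local counter survives the yield.
                dec     word ptr [bp-2]
                jmp     CDLoop

CDFail:         mov     sp, bp                  ;Release local storage.
                pop     bp
                add     sp, 2                   ;Remove success address.
                ret                             ;Return through failure address.
CountDown       endi
```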

Warning: the Yield statement modifies the value in the DX register. If you want to preserve other registers, you need to push them at the beginning of the iterator and immediately after each Yield. You need to pop these registers immediately before each Yield and immediately before the failure return. Note that the iterator and Yield automatically preserve the value in BP. You must not attempt to preserve this value yourself. Example: The following code preserves the values of BX and CX across calls to the iterator.




MyIterator      iter
                push    bp
                mov     bp, sp
;Optional:      sub     sp, SomeValue   ;Allocate local storage
                push    bx              ;Preserve BX and CX values
                push    cx
                 .
                 .      ;Statements that compute some value
                 .
                pop     cx
                pop     bx
                Yield                   ;Return "Success" and a return
                                        ; value to FOREACH loop.
                push    bx
                push    cx
                 .
                 .
                 .
                pop     cx
                pop     bx
                pop     bp
                add     sp, 2           ;Remove success address
                ret     n               ;Return through failure address 
                                        ; and pop parameters.
MyIterator      endi



For more information about iterators, consult "The Art of Assembly Language Programming," Chapters Eleven and Twelve. You can find this text on-line at "http://webster.ucr.edu".


5.4.1 Calling Conventions and Assertions



ForEach:
Inputs:	Operand1: Iterator to call.
Outputs:	None.
Errors:	None.
Side Effects:	Sets up the stack and calls the specified iterator.
Assertions:	None.

EndFor:
Inputs:	None.
Outputs:	None.
Errors:	None.
Side Effects:	Resumes control back to the iterator after the last yield.
Assertions:	This statement occurs at the bottom of a FOREACH loop.

Yield:
Inputs:	None.
Outputs:	None.
Errors:	None.
Side Effects:	Puts a "resume frame" on the stack and returns success from an iterator back to the calling FOREACH loop.
Assertions:	You are currently executing inside some iterator.


5.4.2 Syntax and Examples

The following "Range" iterator expects starting and ending values on the stack. It successively yields values in the range start..stop and then fails. Note that a FOREACH loop calling the Range iterator is semantically identical to a standard FOR loop.




; iterator Range(start:integer; stop:integer):integer
;
; Returns iterator result in AX register.
;
; Stack contents:
;
; bp+8: Start
; bp+6: Stop
; bp+4: Failure address
; bp+2: Success address
; bp+0: Old BP value

Range           iter
                push    bp
                mov     bp, sp
RangeLp:        mov     ax, [bp+8]
                cmp     ax, [bp+6]
                jg      FailRange
                Yield
                inc     word ptr [bp+8]         ;Start := Start + 1;
                jmp     RangeLp

FailRange:      pop     bp
                add     sp, 2                   ;Pop Success address off stack.
                ret     4                       ;Pop Start, Stop off stack.
Range           endi
                 .
                 .
                 .

; Foreach AX in Range(1, 10)

                push    1
                push    10
                foreach Range
                print   "AX = "
                mov     cx, 2
                putisize
                putcr
                endfor
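
For comparison, the same loop written with ForLp..Next might look like the following sketch. It assumes a hypothetical 16-bit word variable named i and copies it into AX at the top of the body so that the rest of the body matches the FOREACH version above.

```asm
; For i := 1 to 10

                forlp   i, 1, 10
                mov     ax, i           ;The FOREACH version receives this
                print   "AX = "         ; value in AX from the iterator.
                mov     cx, 2
                putisize
                putcr
                next
```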

Note that you may have more than one yield statement in an iterator. The following iterator simply returns the values 1, 2, and 3, demonstrating this fact:



One23           iter
                push    bp
                mov     bp, sp

                mov     ax, 1
                yield
                mov     ax, 2
                yield
                mov     ax, 3
                yield

                pop     bp
                add     sp, 2   ;Pop Success address off stack.
                ret             ;No parameters to pop off stack.
One23           endi
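
A FOREACH loop that drives this iterator could be sketched as follows, mirroring the Range example above. Since One23 takes no parameters, nothing is pushed before the foreach statement. The loop body should run three times, with AX holding 1, 2, and 3 in turn.

```asm
; Foreach AX in One23()

                foreach One23
                print   "AX = "
                mov     cx, 2
                putisize
                putcr
                endfor
```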


5.5 Implementation of the FORLP..NEXT Macros

The FORLP and NEXT macros are somewhat complicated by the fact that one can nest for loops inside other for loops. Technically, this means that we need to use a context free grammar to specify the syntax for these loops and we need to use a PDA (pushdown automaton) to implement the recognizer/code generator for these loops. Unfortunately, a PDA requires a stack to maintain state information between the FORLP and NEXT statements, and MASM's macro facilities don't directly support the use of a stack for this purpose. Fortunately, it is easy enough to create our own stack.

To see why a stack is necessary, consider the following example:



                forlp   i, 1, 10
                 .
                 .
                 .
                next

The FORLP and NEXT macros convert this to the following sequence of machine instructions:




                mov     i, 0    ;Emitted by the FORLP macro.
ForLoop0:       inc     i       ;   "    "   "    "     "
                cmp     i, 10   ;   "    "   "    "     "
                jg      Next0   ;   "    "   "    "     "
                 .
                 .
                 .
                jmp     ForLoop0 ;Emitted by the NEXT macro.
Next0:                           ;   "     "  "    "    "

To prevent duplicate symbol errors, the FORLP and NEXT macros increment the numeric value they append to the "ForLoop" and "Next" labels appearing in the code above. Note a slight problem, though: the code emitted by the FORLP macro refers to the label generated by the NEXT macro. Likewise, the code the NEXT macro emits refers to the label output by the FORLP macro. Somehow these macros must communicate this information with each other.

One simple solution is to use a global (to the macros) assembly language symbol that contains the numeric value to append to the end of the "ForLoop" and "Next" labels. The "NEXT" macro would increment this symbol by one. This would guarantee unique labels if you use the FORLP..NEXT statement more than once in a procedure.

Unfortunately, this simple solution doesn't work if you nest one for loop inside another. Consider the following example and the code we would generate if we relied on the "NEXT" macro to increment a global counter for each for loop we have:



                forlp   i, 1, 10
                forlp   j, 0, 5
                 .
                 .
                 .
                next    ;j
                next    ;i

Generated code:
                mov     i, 0            ;Generated by
ForLoop0:       inc     i               ; forlp i, 1, 10
                cmp     i, 10
                jg      Next0
                
                mov     j, -1           ;Generated by
ForLoop0:       inc     j               ; forlp j, 0, 5
                cmp     j, 5
                jg      Next0
                 .
                 .
                 .
                jmp     ForLoop0        ;Generated by first NEXT
Next0:          
                jmp     ForLoop1        ;Generated by second NEXT
Next1:

As you can plainly see, this code won't even assemble without error, much less work properly. The correct code we would have liked these macros to generate should have looked something like the following:



                mov     i, 0            ;Generated by
ForLoop0:       inc     i               ; forlp i, 1, 10
                cmp     i, 10
                jg      Next0
                
                mov     j, -1           ;Generated by
ForLoop1:       inc     j               ; forlp j, 0, 5
                cmp     j, 5
                jg      Next1
                 .
                 .
                 .
                jmp     ForLoop1        ;Generated by first NEXT
Next1:          
                jmp     ForLoop0        ;Generated by second NEXT
Next0:

Note that the second NEXT (not the first) must generate the label and jump for the first FORLP instruction. In general, the last FORLP instruction matches the first NEXT statement. Likewise, the first FORLP instruction matches the last NEXT statement. This is a typical "last-in first-out" (LIFO) organization. As you may recall from your study of data structures, you use a stack to implement LIFO operations.

As pointed out earlier, MASM's macro facilities don't let us process context free constructs (those requiring a stack for implementation) directly. However, we can construct our own stack and use that. To create a stack, we'll just use a MASM symbol. Although MASM's symbols can only hold a single value, we can build a stack by using a string variable and concatenating data to the end of the string (to push data) and extracting data from the string (to pop data). The "macro.a" file contains several macros specifically designed to manipulate text equates in this fashion. In particular, the $PUSH macro lets you "push" a value onto a specified "stack" while the $POP macro pops a value from a user-defined text stack. So the first thing we'll need is a variable that will hold the stack the $PUSH and $POP macros will manipulate. In the FORLP source code, the following equate defines the stack and initializes it to the empty stack:




$ForStk         textequ <>                      ;Stack variable for FOR loops.


Once we have the stack out of the way, we can concentrate on the FORLP macro itself. This macro begins by checking to see if the "$ForCnt" symbol is defined. "$ForCnt" is the counter that keeps track of how many for loops we've processed. Its value is what the FORLP macro pushes onto the stack and this is the value that the FORLP and NEXT macros append to the ForLoop and Next labels. If the symbol does not already exist, the FORLP macro defines it and initializes it to zero. Otherwise, if the symbol is already defined, the FORLP macro increments its value by one for the new loop. After having done this, the macro pushes the current value of this variable onto the for loop stack ($ForStk).




ForLp           macro   LCV, start, stop        ;LCV="Loop Ctrl Var"
                local   ForLoop

                ifndef  $ForCnt
$ForCnt         =       0
                else
$ForCnt         =       $ForCnt + 1
                endif

                $push   $ForStk, <%$ForCnt>
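
To make the bookkeeping concrete, the comments in the following sketch trace $ForCnt and the logical contents of $ForStk across a nested pair of loops followed by a third loop. The loop variables are hypothetical, the trace assumes these are the first loops in the file, and the stack is shown as a list of numbers only; the exact text that $PUSH and $POP store in $ForStk depends on their definitions in macro.a, which do not appear in this chapter.

```asm
                forlp   i, 1, 10        ;$ForCnt = 0, push 0  (stack: 0)
                forlp   j, 0, 5         ;$ForCnt = 1, push 1  (stack: 0 1)
                 .
                next                    ;Pops 1, closes the J loop (stack: 0)
                next                    ;Pops 0, closes the I loop (stack: empty)

                forlp   k, 1, 3         ;$ForCnt = 2, push 2  (stack: 2)
                 .
                next                    ;Pops 2, closes the K loop (stack: empty)
```

Note that $ForCnt only ever increases, which is what keeps the label suffixes unique, while the stack decides which suffix each NEXT uses.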


The next section of the FORLP macro determines whether the starting value is a constant or a variable or if the loop control variable is a register. If the starting value is a numeric constant, the macro emits a move instruction that loads the value of the constant, minus one, into the loop control variable (this is true whether the loop control variable is a memory location or a register). If the starting value is not a constant, the macro checks the loop control variable to see if it's a register. If so, the macro generates code to move the starting value (presumably a memory location) into the register and then it decrements the register. If the starting value is not a constant and the loop control variable is not a register, then the FORLP macro copies the starting value into the AX register, decrements this value, and stores it into the loop control variable. The macro also emits a message to let the user know that this macro has modified the value of the AX register.




;; Emit the instructions to initialize the loop control variable.

                if      IsConst(Start)            ;If immediate constant
                mov     LCV, Start-1
                else
                if      IsReg(LCV)                ;If register and variable
                mov     LCV, Start
                dec     LCV
                else                              ;If two memory locations
                mov     ax, Start
                dec     ax
                mov     LCV, ax
                echo    Warning! FORLP macro modifies AX
                endif
                endif


After emitting the initialization code, the FORLP macro constructs the "$ForLoopx" label for this loop (where x is the current value of $ForCnt) by concatenating the textual equivalent of $ForCnt's value to the end of the string "$ForLoop". The macro emits this label to the source file. After emitting the label, the FORLP macro emits an instruction to increment the loop control variable (remember, the initialization code above initializes the loop control variable with the starting value minus one, so executing this increment instruction the first time through produces the true initial value).




;; Output Loop Label:

ForLoop         catstr  <$ForLoop>, %$ForCnt
&ForLoop&:

                inc     LCV                     ;Bump up for next value.

The final segment of the FORLP macro emits the code that checks to see if the for loop should terminate. This code, like the initialization code, depends on the types of the operands. If the termination value is a constant or the loop control variable is a register, the emitted code directly compares the loop control variable against the termination value. Otherwise (the termination value is not a constant and the loop control variable is not a register), the emitted code uses the AX register to do the comparison. In either case the macro emits a JG to the $Nextx label, where x is the current value for the $ForCnt symbol.



;; Output test to see if the loop is done:

                if      IsConst(Stop) or IsReg(LCV)
                cmp     LCV, Stop
                else
                mov     ax, LCV
                cmp     ax, Stop
                endif
                jg      @catstr(<$Next>, %$ForCnt)

                endm



Compared to the FORLP macro, the NEXT macro is a piece of cake. First it pops a value off the $ForStk stack and then emits a jump to the $ForLoopx label, where x is the value popped off the $ForStk stack. It also emits the $Nextx label after the jump (x, once again, is the value popped off $ForStk). The $Nextx label is the target of the loop termination jump emitted by the FORLP macro.



; Here is the NEXT macro:


Next            macro
                local   ForDone

                $pop    $ForStk, ForDone

                jmp     @catstr(<$ForLoop>, %&ForDone&)
ForDone         catstr  <$Next>, %ForDone
&ForDone&:
                endm
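
Putting the two macros together, a loop written as "forlp i, first, last" followed by "next" should expand to roughly the following. This sketch assumes it is the first loop in the file (so $ForCnt is 0) and that i, first, and last are hypothetical word variables, which selects the AX-based code path in both the initialization and the test.

```asm
                mov     ax, first       ;Start is not a constant and LCV is not
                dec     ax              ; a register, so initialize through AX.
                mov     i, ax           ;i := first - 1
$ForLoop0:
                inc     i               ;Bump up for next value.
                mov     ax, i           ;Stop is a memory operand, so compare
                cmp     ax, last        ; through AX.
                jg      $Next0          ;Exit when i > last (signed compare).
                 .
                 .                      ;Loop body
                 .
                jmp     $ForLoop0       ;Emitted by the NEXT macro.
$Next0:                                 ;   "     "  "    "    "
```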


5.6 Implementation of the FOREACH Macros

The FOREACH macro family consists of the ForEach macro, the EndFor macro, the Yield macro, and the iter/endi text equates.

The "iter" and "endi" text equates let you use the "iter" and "endi" lexemes in place of "proc" and "endp" when defining iterators. Since iterator support in the Standard Library requires the use of a near procedure, the text equate substitutes "proc near" for "iter" in your code.



iter            textequ <proc near>
endi            textequ <endp>

$ForEachStk     textequ <>
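
In other words, under these equates an iterator declaration is simply a near procedure declaration. A small sketch, using the hypothetical name MyIterator:

```asm
MyIterator      iter                    ;Assembles as: MyIterator proc near
                 .
                 .
                 .
MyIterator      endi                    ;Assembles as: MyIterator endp
```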


The ForEach macro begins in a manner quite similar to the FORLP macro. First it checks to see if the symbol $ForCnt is defined (if not, it defines it and sets it to zero; if so, it increments its value). Next it pushes the current value for $ForCnt onto the $ForEachStk stack variable. Then it pushes the address of the (synthesized) failure label onto the stack and calls the specified iterator function (passed as a macro parameter to FOREACH).




ForEach         macro   Iterator
                local   Failure

                ifndef  $ForCnt
$ForCnt         =       0
                else
$ForCnt         =       $ForCnt + 1
                endif

                $push   $ForEachStk, %$ForCnt
Failure         catstr  <$Failure>, %$ForCnt

                push    offset &Failure&
                call    Iterator
                endm


The EndFor macro emits a return near instruction (that returns control to the iterator) and then emits the Failure label referenced by the ForEach macro. It also pops the failure label's numeric suffix from the $ForEachStk stack variable.





EndFor          macro
                local   Failure

                retn
                $pop    $ForEachStk, Failure
@catstr(<$Failure>,<%Failure>):
                endm
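
Taken together, and assuming $ForCnt is 0 when the ForEach macro is invoked, the "foreach Range" loop from section 5.4.2 should expand to roughly the following sequence:

```asm
                push    1                ;Start parameter
                push    10               ;Stop parameter
                push    offset $Failure0 ;Failure address (from ForEach)
                call    Range            ;Pushes success address (from ForEach)
                 .
                 .                       ;Loop body; runs after each Yield
                 .
                retn                     ;Resume the iterator (from EndFor)
$Failure0:                               ;Iterator's failure return lands here
```

Because EndFor is a plain near return, the loop body has to leave the stack pointer where it found it. Otherwise the retn would not find the resume address that Yield placed on the stack.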


The yield macro is invoked inside an iterator to return control to the ForEach loop. For details on how iterators work, you should consult chapters Eleven and Twelve in "The Art of Assembly Language Programming." You can find this text on-line at http://webster.ucr.edu




Yield           macro

                ifndef  $YieldedOnce
                echo    Warning: Yield modifies DX!
$YieldedOnce    =       0
                endif

                push    bp
                mov     dx, [bp+2]
                mov     bp, [bp]
                call    dx
                pop     bp
                endm
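
The following annotated copy of the five instructions that Yield emits traces what each one does, assuming the iterator began with the standard "push bp / mov bp, sp" entry sequence described in section 5.4. The comments are an interpretation of the macro source above rather than text from the library.

```asm
                push    bp              ;Save the iterator's frame pointer.
                mov     dx, [bp+2]      ;DX := success address (the loop body).
                mov     bp, [bp]        ;Restore the FOREACH caller's BP.
                call    dx              ;Push resume address, run the loop body.
                                        ; EndFor's RETN returns to the next line.
                pop     bp              ;Back in the iterator: restore its BP.
```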