HLA Strings and the HLA String Library

Chapter Overview

A string is a collection of objects stored in contiguous memory locations. Strings are usually arrays of bytes, words, or (on 80386 and later processors) double words. The 80x86 microprocessor family supports several instructions specifically designed to cope with strings. This chapter explores some of the uses of these string instructions.

The 80x86 CPUs can process three types of strings: byte strings, word strings, and double word strings. They can move strings, compare strings, search for a specific value within a string, initialize a string to a fixed value, and do other primitive operations on strings. The 80x86's string instructions are also useful for manipulating arrays, tables, and records. You can easily assign or compare such data structures using the string instructions. Using string instructions may speed up your array manipulation code considerably.

Character Strings

Since you'll encounter character strings more often than other types of strings, they deserve special attention. The following paragraphs describe character strings and various types of string operations.

At the most basic level, the 80x86's string instructions operate only on arrays of characters. However, since most string data types contain an array of characters as a component, the 80x86's string instructions are handy for manipulating that portion of the string.

Probably the biggest difference between a character string and an array of characters is the length attribute. An array of characters contains a fixed number of characters. Never any more, never any less. A character string, however, has a dynamic run-time length, that is, the number of characters contained in the string at some point in the program. Character strings, unlike arrays of characters, have the ability to change their size during execution (within certain limits, of course).

To complicate things even more, there are two generic types of strings: statically allocated strings and dynamically allocated strings. Statically allocated strings are given a fixed, maximum length at program creation time. The length of the string may vary at run-time, but only between zero and this maximum length. Most systems allocate and deallocate dynamically allocated strings from a memory pool as the program uses them. Such strings may be any length (up to some reasonable maximum value). Accessing such strings is less efficient than accessing statically allocated strings. Furthermore, garbage collection may take additional time. Nevertheless, dynamically allocated strings are much more space efficient than statically allocated strings and, in some instances, accessing dynamically allocated strings is faster as well.

A string with a dynamic length needs some way of keeping track of this length. While there are several possible ways to represent string lengths, the two most popular are length-prefixed strings and zero-terminated strings. A length-prefixed string begins with a single byte, word, or double word that contains the length of that string. Immediately following this length value are the characters that make up the string. Assuming the use of byte prefix lengths, you could define the string "HELLO" as follows:

byte 5, "HELLO";

 

Length-prefixed strings are often called Pascal strings since this is the type of string variable supported by most versions of Pascal.

Another popular way to specify string lengths is to use zero-terminated strings. A zero-terminated string consists of a string of characters terminated with a zero byte. These types of strings are often called C-strings since they are the type used by the C/C++ programming language. If you are manually creating string values, zero terminated strings are a little easier to deal with because you don't have to count the characters in the string. Here's an example of a zero terminated string:

byte "HELLO", 0;

 

Pascal strings are much better than C/C++ strings for several reasons. First, computing the length of a Pascal string is trivial. You need only fetch the first byte (or word) of the string and you've got the length of the string. Computing the length of a C/C++ string is considerably less efficient. You must scan the entire string (e.g., using the SCASB instruction) for a zero byte. If the C/C++ string is long, this can take a long time. Furthermore, C/C++ strings cannot contain the NUL character (ASCII code zero). On the other hand, C/C++ strings can be any length, yet require only a single extra byte of overhead. Pascal strings, however, can be no longer than 255 characters when using only a single length byte. For strings longer than 255 characters, you'll need two or more bytes to hold the length for a Pascal string. Since most strings are less than 256 characters in length, this isn't much of a disadvantage.
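The difference in cost between the two length computations can be sketched as follows. This is only an illustration: the names lengths_demo, pStr, and zStr are hypothetical, the program assumes the two byte declarations shown earlier are placed in a STATIC section behind @nostorage labels, and it uses a simple WHILE loop rather than SCASB to keep the scan easy to follow.

```hla
// Sketch: fetching a length prefix versus scanning for a zero byte.
// All names here are hypothetical.

program lengths_demo;
#include( "stdlib.hhf" )

static
    pStr :byte; @nostorage;     // Label for the length-prefixed string.
          byte 5, "HELLO";

    zStr :byte; @nostorage;     // Label for the zero-terminated string.
          byte "HELLO", 0;

begin lengths_demo;

    // Length-prefixed: a single fetch of the prefix byte.

    movzx( pStr, ecx );

    // Zero-terminated: step through the data until the zero byte.

    lea( ebx, zStr );
    mov( 0, edx );
    while( (type byte [ebx+edx]) <> 0 ) do

        inc( edx );

    endwhile;

    stdout.put
    (
        "prefixed length=",
        (type uns32 ecx),
        ", scanned length=",
        (type uns32 edx),
        nl
    );

end lengths_demo;
```

Both values should agree for the same character data; the point is that the first costs one memory access while the second costs one access per character.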

Common string functions like concatenation, length, substring, index, and others are much easier to write (and much more efficient) when using length-prefixed strings. So from a performance point of view, length-prefixed strings seem to be the better way to go. However, Windows requires the use of zero-terminated strings, so if you're going to call Win32 APIs, you've either got to use zero-terminated strings or convert your strings to zero-terminated form before each call.

HLA takes a different approach. HLA's strings are both length-prefixed and zero terminated. Therefore, HLA strings require a few extra bytes but enjoy the advantages of both schemes. HLA's string functions like concatenation are very efficient without losing Windows compatibility.

HLA's strings are actually an extension of length-prefixed strings because HLA's strings contain two lengths: a maximum length and a dynamic length. The dynamic length field is similar to the length field of Pascal strings insofar as it holds the current number of characters in the string. HLA's length field, however, is four bytes, so HLA strings may contain over four billion characters. The maximum length field holds the maximum number of characters the string may contain. By adding this extra field HLA can check the validity of operations like string concatenation and string assignment to verify that the destination string is large enough to hold the result. This is an extra integrity check that is often missing in string libraries found in typical high level languages.

In addition to providing two lengths, HLA also zero terminates its strings. This lets you pass HLA strings as parameters to Win32 and other functions that work with zero-terminated strings. Also, in those few instances where zero-terminated strings are more convenient, HLA's string format still shines. Of course, the drawback to zero-terminated strings is that you cannot put the NUL character (ASCII code zero) into such a string; fortunately, the need to do so is not very great.

HLA's strings have a few other attributes that improve their efficiency. First of all, HLA almost always aligns string data on double word boundaries. HLA also allocates data for a string in four-byte chunks. By aligning strings on double word boundaries and allocating storage that is an even multiple of four bytes long, HLA allows you to use double word string instructions when processing strings. Since the double word instructions are often four times faster than the byte versions, this is an important benefit. As a result of this storage and alignment, HLA's string library routines are very efficient.

Of course, HLA strings are not without their disadvantages. To represent a string containing n characters requires between n+9 and n+12 bytes in memory. HLA's strings require at least n+9 bytes because of the two double word length values and the zero terminating byte. Furthermore, since the entire object must be an even multiple of four bytes long, HLA strings may need up to three bytes of padding to ensure this.

HLA string variables are always pointers. HLA even treats string constants as literal pointer constants. The pointer points at the first byte of the character string. Successive memory locations contain successive characters in the string up to the zero terminating byte. This format is compatible with zero-terminated strings like those that C/C++ uses. The dynamic (current) length field is situated four bytes before the first character in the string (that is, at the pointer address minus four). The maximum (static) length field appears eight bytes before the first character of the string. The figure titled "HLA String Format" shows this layout.

 
HLA String Format
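The layout just described can be checked with a short program. The following is a minimal sketch (the names layout_demo and s are hypothetical) that reads both length fields through their offsets and then fetches the byte that follows the last character, which should be the zero terminator if the string is laid out as described.

```hla
// Sketch: inspecting the fields of an HLA string through its pointer.
// The program and variable names are hypothetical.

program layout_demo;
#include( "stdlib.hhf" )

static
    s :string := "HELLO";

begin layout_demo;

    mov( s, ebx );                          // EBX = address of the first character.
    mov( [ebx-4], ecx );                    // Current (dynamic) length.
    mov( [ebx-8], edx );                    // Maximum length.
    movzx( (type byte [ebx+ecx]), eax );    // Byte just past the last character.

    stdout.put
    (
        "length=",
        (type uns32 ecx),
        ", maximum length=",
        (type uns32 edx),
        ", terminator=",
        (type uns32 eax),
        nl
    );

end layout_demo;
```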

For more information on strings and HLA strings, see the chapter on character strings in The Art of Assembly Language (AoA).

HLA Standard Library String Functions

The HLA Standard Library contains a large number of efficient string functions that perform all the common string operations, and then some. This section discusses the HLA string functions and suggests some uses for many of these functions and other objects.

The stralloc and strfree Routines

procedure stralloc( strsize: uns32 ); returns( "eax" );

procedure strfree( strToFree: string );

 

This text has already discussed the stralloc and strfree routines in the chapter on character strings, but a review is probably useful here. These routines dynamically allocate and deallocate storage for a string object in memory. They are the principal mechanism HLA provides for allocating storage for string variables. Therefore, you need to be comfortable using these procedures.

The first thing to note about these routines is that they are not actually a part of the HLA String Library. They are members of the memory allocation package in the HLA Standard Library. The reason for mentioning this fact is to point out that the names of these routines are stralloc and strfree. Most of the string routines in the HLA Standard Library belong to the str namespace and, therefore, have names like str.cpy and str.length. Note that most HLA string function names have a period between the str and the base function name; this is not true for stralloc and strfree since they are not a part of the HLA string package.

The stralloc parameter specifies the maximum number of characters for the string it allocates. The stralloc routine allocates at least enough storage for this many characters plus the 9-12 bytes of overhead required for a string object. It initializes the MaxStrLen field to at least strsize (it could be as large as strsize+3 depending on strsize and the need for padding bytes in the string object). This function also initializes the length field to zero and stores a zero byte in the first character position of the string data (that is, it zero terminates the empty string it creates). Since the other HLA string functions require double word aligned strings, stralloc returns a pointer that points at a double word boundary.
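To see how the requested size relates to the MaxStrLen value that stralloc stores, you could run a small experiment like the one below. This is only a sketch: the names maxlen_demo, request, and s are hypothetical, and the values it prints depend on the padding policy of the library version in use, so no particular output is claimed here.

```hla
// Sketch: compare the size passed to stralloc with the MaxStrLen
// field of the string it returns. All names are hypothetical.

program maxlen_demo;
#include( "stdlib.hhf" )

static
    request :uns32;
    s       :string;

begin maxlen_demo;

    for( mov( 1, request ); request <= 8; inc( request )) do

        stralloc( request );
        mov( eax, s );
        mov( (type str.strRec [eax]).MaxStrLen, edx );

        stdout.put
        (
            "requested=",
            request,
            ", MaxStrLen=",
            (type uns32 edx),
            nl
        );

        strfree( s );

    endfor;

end maxlen_demo;
```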

Upon return from stralloc, the EAX register contains the address of the string object. Generally you would store this 32-bit pointer into a string variable or pass it on to some other function that needs the address of a string object. Like any other string pointer, the value stralloc returns points at the first character position in the storage it allocates.

Internally, the stralloc routine calls malloc to allocate the storage for the string data on the heap. However, the pointer that stralloc returns is not the same value that malloc returns, because string objects require an eight-byte prefix that holds the MaxStrLen and length fields. Therefore, stralloc actually returns a pointer that is eight bytes beyond the value that the internal call to malloc returns. As a result, you cannot call the free procedure to return this string storage to the heap, because free requires a pointer to the beginning of the storage that malloc allocates. Instead, call the strfree routine to return string object storage to the system. The strfree parameter is the address of a string object that you allocated with stralloc.

Note that you must not use strfree to attempt to free storage for objects that you do not allocate (directly or indirectly) with stralloc. In particular, do not attempt to free statically initialized strings or strings you create with str.strvar.

Many of the HLA Standard Library string routines begin with a name of the form "str.a_*****". This "a_" prefix on the function name indicates that the string function automatically allocates storage for a new string by calling stralloc. These functions typically return a pointer to the new string in the EAX register, just like stralloc. When you are done with the string these functions create, you can free the storage for the string by calling strfree.

 

 

// stralloc and strfree demonstration program.

program str_alloc_free_demo;
#include( "stdlib.hhf" )

static
    str1 :string;
    str2 :string;

begin str_alloc_free_demo;

    // Allocate a string with a maximum length of 16 characters.

    stralloc( 16 );
    mov( eax, str1 );

    // Initialize this string with the str.cpy routine:

    str.cpy( "Hello ", str1 );

    // Allocate storage for a second string with 16 characters
    // and initialize the string data:

    stralloc( 16 );
    mov( eax, str2 );
    str.cpy( "world", str2 );

    // Concatenate str2 to the end of str1 (the second parameter of
    // str.cat is the destination) and print the pertinent data:

    str.cat( str2, str1 );

    mov( str1, ebx );
    mov( [ebx-4], ecx );    // Get the current string length.
    mov( [ebx-8], edx );    // Get the maximum string length.

    stdout.put
    (
        "str1='",
        str1,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    mov( str2, ebx );
    mov( [ebx-4], ecx );    // Get the current string length.
    mov( [ebx-8], edx );    // Get the maximum string length.

    stdout.put
    (
        "str2='",
        str2,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    // Okay, we're done with the strings, free the storage
    // associated with them:

    strfree( str1 );
    strfree( str2 );

end str_alloc_free_demo;

 

Example of stralloc and strfree Calls

The str.strRec Data Structure

The str.strRec data structure lets you directly access the maximum and current length prefix values of an HLA string. This allows you to use symbolic (and meaningful) names to access these fields rather than using numeric offsets like -4 and -8. By using str.strRec you don't have to remember which offset is associated with the two different length values.

The str.strRec type definition is a RECORD with the following fields:

MaxStrLen

length

strData

 

The MaxStrLen field specifies the offset (-8) of the maximum string length double word in a string. The length field specifies the offset (-4) of the current dynamic length field. The strData field specifies the offset (0) of the first character in the string; generally, you do not use this last field because accessing the character data in a string is trivial (your string variable points directly at the first character in the string).

Generally, you use the str.strRec type to coerce a string pointer appearing in a 32-bit register. For example, if EAX contains the address of an HLA string variable, then "mov( (type str.strRec [eax]).length, ecx );" extracts the current string length. In theory, you could use this type to declare string headers, but no one really uses this data type for that purpose; instead, this type exists mainly as a mechanism for type coercion. The following sample program is a modification of the previous program that uses str.strRec rather than literal numeric offsets.

 

// str.strRec demonstration program.

program strRec_demo;
#include( "stdlib.hhf" )

static
    str1 :string;
    str2 :string;

begin strRec_demo;

    // Allocate a string with a maximum length of 16 characters.

    stralloc( 16 );
    mov( eax, str2 );

    // Initialize this string with the str.cpy routine:

    str.cpy( "Hello ", str2 );

    // Allocate storage for a second string with 16 characters
    // and initialize the string data:

    stralloc( 16 );
    mov( eax, str1 );
    str.cpy( "world", str1 );

    // Concatenate the two strings and print the pertinent data:

    str.cat( str1, str2 );

    mov( str1, ebx );
    mov( (type str.strRec [ebx]).length, ecx );     // Get the current str len
    mov( (type str.strRec [ebx]).MaxStrLen, edx );  // Get the max str len

    stdout.put
    (
        "str1='",
        str1,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    mov( str2, ebx );
    mov( (type str.strRec [ebx]).length, ecx );     // Get the current str len
    mov( (type str.strRec [ebx]).MaxStrLen, edx );  // Get the max str len

    stdout.put
    (
        "str2='",
        str2,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    // Free the storage associated with these strings:

    strfree( str1 );
    strfree( str2 );

end strRec_demo;

 

Programming Example that uses the str.strRec Data Type

The str.strvar Macro

The str.strvar macro statically allocates storage for a string in the STATIC variable declaration section (you cannot use str.strvar in any of the other variable declaration sections). This provides a convenient mechanism for declaring static strings when you know the maximum size at compile-time.

Example:

static

StaticString: str.strvar( 32 );

 

This macro invocation does two things: (1) it reserves sufficient storage for a string that can hold at least 32 characters (plus an additional nine bytes for the string overhead); (2) it allocates storage for a string pointer variable and initializes that variable with the address of the string storage. When you reference the object named StaticString you are actually accessing this pointer variable.

Note that str.strvar uses parentheses rather than square brackets to specify the string size. Syntactically, square brackets would be nice since this gives the illusion of declaring an array of characters. However, str.strvar is a macro and the character count is a parameter; macro parameters always appear within parentheses, so you must use parentheses in this declaration.

 

// str.strvar demonstration program.

program strvar_demo;
#include( "stdlib.hhf" )

static
    demoStr :str.strvar( 16 );

begin strvar_demo;

    // Initialize our string via str.cpy (the str.strvar declaration
    // above already reserved static storage for the string data):

    str.cpy( "Hello World", demoStr );

    mov( demoStr, ebx );
    mov( (type str.strRec [ebx]).length, ecx );     // Get the current str len
    mov( (type str.strRec [ebx]).MaxStrLen, edx );  // Get the max str len

    stdout.put
    (
        "demoStr='",
        demoStr,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

end strvar_demo;

 

Program that Demonstrates the use of the str.strvar Declaration

The str.length Function and the str.mLength Macro

procedure str.length( s:string ); returns( "eax" );

macro str.mLength( s ); // s must be a 32-bit register or a string variable

 

The str.length function and str.mLength macro compute the length of an HLA string and copy this length into the EAX register. The macro version (str.mLength) is more efficient since it compiles into a single MOV instruction (accessing the str.strRec.length field directly). For this reason you should generally use the macro (str.mLength) to compute the length rather than the str.length function. You should only use the str.length function when you need procedure call semantics (e.g., when you need to pass the address of the length function to some other procedure).

You may question why HLA even provides a length function. After all, extracting the string's length using the str.strRec type definition is easy enough to do. The principal reason HLA provides a length function is that "str.length(s)" is much easier to read and understand than "mov( (type str.strRec [eax]).length, eax);". Of course, the str.mLength macro compiles directly into this instruction, so there is no efficiency reason for using the direct access mechanism. The only time you should really use the str.strRec RECORD type is when you need to move the string length into a register other than EAX.

The str.length and str.mLength parameters must be a string variable or a 32-bit register (which, presumably, contains the address of a string in memory). Remember, string variables are really nothing more than pointers, so when you pass a string variable as a parameter to an HLA string function, HLA passes the value of that pointer which happens to be the address of the first character in the string.

There is a big difference between the two calls "str.length( eax );" and "str.length( (type string [eax]) );". The first call assumes that EAX contains the value of a string pointer (that is, EAX points directly at the first character of the actual string); in this first example, HLA simply passes the value in the EAX register to the str.length function. In the second example, "str.length( (type string [eax]) );", HLA assumes that EAX contains the address of a string variable (which is a pointer) and passes the 32-bit address at the location contained within EAX. In this example, EAX is a pointer to a string variable rather than the string itself.
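The two calling forms can be placed side by side in a short program. The following is a sketch only (the names strptr_demo and s are hypothetical): the first call loads the string pointer itself into EAX, while the second loads the address of the string variable and lets the type coercion fetch the pointer from memory.

```hla
// Sketch: passing a string pointer in EAX versus passing the
// address of a string variable in EAX. Names are hypothetical.

program strptr_demo;
#include( "stdlib.hhf" )

static
    s :string := "Hello World";

begin strptr_demo;

    // Case 1: EAX holds the string pointer (the value stored in s).

    mov( s, eax );
    str.length( eax );
    mov( eax, ecx );

    // Case 2: EAX holds the address of the variable s itself, so
    // the coercion makes HLA pass the dword stored at [EAX].

    lea( eax, s );
    str.length( (type string [eax]) );

    stdout.put
    (
        "length via pointer value=",
        (type uns32 ecx),
        ", length via variable address=",
        (type uns32 eax),
        nl
    );

end strptr_demo;
```

Both calls should report the same length, since each one ultimately hands str.length the same string pointer.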

Computing the length of a string is one of the most common string operations. In fact, length computation is probably the most frequently used function in a string library, since most of the other string functions need to compute the string length in order to do their work. This is why HLA's length-prefixed string data structure is so important: computing the string length is a common operation and length-prefixed strings make this computation trivial.

 

// str.length demonstration program.

program strlength_demo;
#include( "stdlib.hhf" )

static
    demoStr :string;

begin strlength_demo;

    // Initialize our string via str.a_cpy (note that a_cpy automatically
    // allocates storage for the string on the heap):

    str.a_cpy( "Hello World" );
    mov( eax, demoStr );

    mov( eax, ebx );
    str.mLength( ebx );     // Can use a register or str var with str.mLength.
    mov( eax, ecx );
    str.length( demoStr );  // Can use a register or str var with str.length.

    stdout.put
    (
        "demoStr='",
        demoStr,
        "', length via str.mLength=",
        (type uns32 ecx ),
        ", length via str.length=",
        (type uns32 eax),
        nl
    );

    // Free the storage allocated by the str.a_cpy procedure:

    strfree( demoStr );

end strlength_demo;

 

Example of str.length and str.mLength Function Calls

The str.init Function

procedure str.init( var b:byte; numBytes:dword ); returns( "eax" );

There are four ways you can allocate storage for an HLA compatible string: you can use the str.strvar macro (see The str.strvar Macro) to statically allocate storage for a string, you can initialize a string variable in a STATIC or READONLY section, you can dynamically allocate storage using a function like stralloc, or you can manually reserve the storage yourself. To manually reserve storage you must set aside enough storage for the string, the maximum length, the current length, the zero terminating byte, and any necessary padding bytes. You must also ensure that the string begins on a double word boundary and that the entire structure's byte count is an even multiple of four. After you reserve sufficient storage, you must also initialize the MaxStrLen and length fields and supply a zero terminating byte for the string. This turns out to be quite a bit of work. Fortunately, the str.init function takes care of most of this work for you.

This function initializes a block of memory for use as a string object. It takes the address of a character array variable b and aligns this address to a double word boundary. Then it initializes the MaxStrLen, length, and zero terminating byte fields at the resulting address. Finally, it returns a pointer to the newly created string object in EAX. The numBytes parameter specifies the size of the entire buffer area, not the desired maximum length of the string. The numBytes parameter must be 16 or greater, else this routine will raise an ex.ValueOutOfRange exception. Note that string initialization may consume as many as 15 bytes: up to three bytes to align the address on a double word boundary, four bytes for the MaxStrLen field, four bytes for the length field, and up to four bytes for the zero terminating byte and the padding that keeps the string data area a multiple of four bytes long. This is why numBytes must be 16 or greater. Note that this function initializes the resulting string to the empty string. The MaxStrLen field will contain the maximum number of characters that you can store into the resulting string after subtracting the zero terminating byte, the sizes of the length fields, and any alignment bytes that were necessary.

In general, if you want the maximum string length to be at least m characters, you should reserve m+16 bytes and pass the address of this buffer to str.init. Note that the actual maximum length HLA writes to the MaxStrLen field is the maximum number of characters one could legally put into the string (after subtracting the overhead and padding bytes). If you need to set a specific MaxStrLen value of exactly m, then allocate m+16 bytes of storage, call str.init (passing the address of the buffer and m+16), and then store m into the MaxStrLen field upon return from str.init.
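The last step above, forcing MaxStrLen to exactly m, might look like the following sketch. It assumes m = 32, so the buffer is 48 bytes; the names exact_max_demo, buffer, and s are hypothetical.

```hla
// Sketch: create a string in a 48-byte buffer and then set its
// MaxStrLen field to exactly 32. All names are hypothetical.

program exact_max_demo;
#include( "stdlib.hhf" )

static
    buffer :byte[ 48 ];     // m + 16 bytes, with m = 32.
    s      :string;

begin exact_max_demo;

    // Let str.init align the buffer and fill in the string fields:

    str.init( buffer, 48 );
    mov( eax, s );

    // Overwrite the MaxStrLen value str.init computed with m:

    mov( 32, (type str.strRec [eax]).MaxStrLen );

    mov( s, ebx );
    mov( (type str.strRec [ebx]).MaxStrLen, edx );
    stdout.put( "MaxStrLen=", (type uns32 edx), nl );

end exact_max_demo;
```

Lowering MaxStrLen this way is safe because the buffer still holds at least m characters plus the overhead; raising it above the value str.init computed would not be.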

 

// str.init demonstration program.

program strinit_demo;
#include( "stdlib.hhf" )

static
    theStr  :string;
    unalign :byte;          // Do this so strData is not dword aligned.
    strData :byte[ 48 ];    // Storage for a string with 32 characters.

begin strinit_demo;

    // Create a string variable using the "strData" array to hold the
    // string data:

    str.init( strData, 48 );
    mov( eax, theStr );

    // Initialize our string via str.cpy:

    str.cpy( "Hello there World, how are you?", theStr );

    mov( theStr, ebx );
    str.length( ebx );
    mov( eax, ecx );        // str.length returns the length in EAX.
    mov( (type str.strRec [ebx]).MaxStrLen, edx );
    lea( esi, strData );

    stdout.put
    (
        "theStr='",
        theStr,
        "', length=",
        (type uns32 ecx ),
        ", max length=",
        (type uns32 edx),
        nl,
        "Address of strData: ",
        esi,
        nl,
        "Address of start of string data: ",
        (type dword theStr),
        nl
    );

end strinit_demo;

 

Programming Example that uses the str.init Function

The str.cpy and str.a_cpy Functions

procedure str.cpy( src:string; dest:string );

procedure str.a_cpy( src:string ); returns( "eax" );

 

The str.cpy routine copies the character data from one string to another and adjusts the destination string's length field accordingly. The destination string's maximum string length must be at least as large as the current size of the source string or str.cpy will raise a string overflow exception. Before calling this routine, you must ensure that both strings have storage allocated for them or the program will raise an exception. Note that simply declaring a destination string variable does not allocate storage for the string object. You must call stralloc or somehow otherwise allocate data storage for the string. Failing to allocate storage for the destination string is probably the most common mistake beginning programmers make when calling the str.cpy routine.

Note that there is a fundamental difference between the following two code sequences:

mov( srcStr, eax );

mov( eax, destStr );

 

and

 

str.cpy( srcStr, destStr );

 

The two MOV instructions above copy a string by reference whereas the call to str.cpy copies the string by value. Usually, copying a string by reference is much faster than copying the string by value, since you need only copy four bytes (the string pointer) when copying by reference. Copy by value, on the other hand, requires copying the length value (four bytes), each character in the string (length bytes), plus a zero terminating byte. This is slower than simply copying a pointer and can be much slower if the string is long. However, keep in mind that if you copy a string by reference, then the two string objects are aliases of one another. Any change you make to one of the strings is reflected in the other. When you copy a string by value (using str.cpy), each string variable has its own data, so changes to one string will not affect the other.
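The aliasing effect is easy to observe. The following sketch (the names alias_demo, original, refCopy, and valCopy are hypothetical) makes one copy by reference and one by value, then modifies the original string with str.cat. The reference copy should show the change while the value copy should not.

```hla
// Sketch: copy by reference versus copy by value.
// All names are hypothetical.

program alias_demo;
#include( "stdlib.hhf" )

static
    original :str.strvar( 32 );
    refCopy  :string;
    valCopy  :string;

begin alias_demo;

    str.cpy( "abc", original );

    // Copy by reference: refCopy now points at the same data.

    mov( original, eax );
    mov( eax, refCopy );

    // Copy by value: valCopy gets its own storage on the heap.

    str.a_cpy( original );
    mov( eax, valCopy );

    // Change the original string's data:

    str.cat( "def", original );

    stdout.put( "original='", original, "'", nl );
    stdout.put( "refCopy='", refCopy, "'", nl );
    stdout.put( "valCopy='", valCopy, "'", nl );

    // Only valCopy was allocated with stralloc (via str.a_cpy):

    strfree( valCopy );

end alias_demo;
```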

Although str.cpy does not automatically allocate storage for the destination string, the need to do this arises quite often. The str.a_cpy routine handles this common requirement. As you can see above, the str.a_cpy routine does not have a destination operand. Instead, str.a_cpy calls stralloc to allocate sufficient storage for a new string and copies the source string to this new string. After copying the data, str.a_cpy returns a pointer to the new string in the EAX register. When you are done with this string data you should call strfree to return the storage back to the system.

 

// str.cpy demonstration program.

program strcpy_demo;
#include( "stdlib.hhf" )

static
    strConst :string := "This is a string";
    srcStr   :str.strvar( 32 );
    destStr  :string;
    smallStr :str.strvar( 12 );

begin strcpy_demo;

    // Use str.cpy to initialize srcStr by copying the
    // static string constant <<strConst>> to srcStr.

    str.cpy( strConst, srcStr );
    str.length( srcStr );
    stdout.put
    (
        "srcStr='",
        srcStr,
        "', length=",
        (type uns32 eax ),
        nl
    );

    // Okay, now use str.a_cpy to make a copy of srcStr
    // whose storage is dynamically allocated on the heap:

    str.a_cpy( srcStr );
    mov( eax, destStr );
    str.mLength( eax );
    stdout.put
    (
        "destStr='",
        destStr,
        "', length=",
        (type uns32 eax ),
        nl
    );

    // Now let's demonstrate what can go wrong if a string
    // overflow occurs:

    try

        str.cpy( srcStr, smallStr );

      anyexception

        stdout.put( "An exception occurred while copying srcStr to smallStr" nl );

    endtry;

    // Don't forget to free the storage associated with destStr:

    strfree( destStr );

end strcpy_demo;

 

Program that uses the str.cpy and str.a_cpy Procedures

The str.cat and str.a_cat Functions

procedure str.cat( src: string; dest: string );

procedure str.a_cat( leftSrc: string; rightSrc: string ); returns( "eax" );

 

These two functions concatenate two strings. The str.cat procedure directly concatenates one string to the end of the destination string (that the second parameter specifies). The str.a_cat procedure creates a new string on the heap (by calling stralloc) and copies the string the first parameter specifies to this new string. Immediately thereafter, it concatenates the string object the second parameter specifies to the end of this new string. Finally, str.a_cat returns the address of the new string in the EAX register. Note that str.a_cat, unlike str.cat, does not affect the value of either string appearing in the parameter list. When you finish using the string that str.a_cat allocates, you can return the storage to the system by passing the address to strfree.

String concatenation is easily one of the most common string operations (the others being string copy and string comparison). Concatenation is a fundamental operation that you use to build larger strings up from smaller strings. A few common examples of string concatenation include applying suffixes (like ".HLA") to filenames and merging a person's first and last names together to form a single string.
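A minimal sketch of both routines follows, using only the prototypes given above. The names strcat_demo, fileName, and greeting are hypothetical. The first half appends a suffix to a filename in place with str.cat; the second half builds a new string on the heap with str.a_cat.

```hla
// Sketch: str.cat and str.a_cat. All names are hypothetical.

program strcat_demo;
#include( "stdlib.hhf" )

static
    fileName :str.strvar( 32 );
    greeting :string;

begin strcat_demo;

    // str.cat appends its first parameter to the end of the
    // destination string (the second parameter):

    str.cpy( "myprog", fileName );
    str.cat( ".HLA", fileName );
    stdout.put( "fileName='", fileName, "'", nl );

    // str.a_cat allocates a new string that holds the first
    // parameter followed by the second; neither source changes:

    str.a_cat( "Hello ", "world" );
    mov( eax, greeting );
    stdout.put( "greeting='", greeting, "'", nl );

    // Only greeting was allocated on the heap, so only it is freed:

    strfree( greeting );

end strcat_demo;
```

Note that fileName must have room for the combined text (10 characters here, well under its maximum of 32); otherwise str.cat would be expected to raise a string overflow exception just as str.cpy does.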

 

 

Examples of the str.cat and str.a_cat Procedures

The String Comparison Routines

procedure str.eq( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ne( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.lt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.le( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.gt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ge( lftOperand: string; rtOperand: string ); returns( "al" );

 

procedure str.ieq( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ine( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ilt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ile( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.igt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ige( lftOperand: string; rtOperand: string ); returns( "al" );

 

These procedures compare two strings. They are equivalent to the boolean expression:

lftOperand op rtOperand

 

where op represents one of the relational operators "=", "<>" ("!=" to C programmers), "<", "<=", ">", or ">=". These functions return true (1) or false (0) in the EAX register depending upon the result of the comparison6. For example, "str.lt( s, r );" returns true in EAX if s < r and false otherwise. This feature lets you use these procedures as boolean expressions. The following example shows how you could use str.lt in an IF statement:

if( str.lt( s, r )) then

 

-- do something if s < r --

 

endif;

 

As you've probably noticed, there are two different sets of string comparison functions. Those that have names of the form "str.i**" do case insensitive string comparisons. That is, these functions compare the strings ignoring differences in alphabetic case. For example, these functions treat "Hello" and "hello" as though they were the same string. Note that case insensitive comparisons are relatively inefficient compared with case sensitive comparisons, so you should only use these forms if you absolutely need a case insensitive comparison.

These functions do not modify their parameters.
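
To see where the extra cost of the case insensitive forms comes from, here is a minimal C sketch of the two kinds of comparison. It is not the HLA library's implementation; the names cmp_sensitive and cmp_insensitive are hypothetical, the code works on zero-terminated C strings, and it assumes characters compare by their unsigned character codes.

```c
#include <ctype.h>

/* Hypothetical helper: returns <0, 0, or >0 as s is less than,
   equal to, or greater than r. */
int cmp_sensitive(const char *s, const char *r)
{
    while (*s != '\0' && *s == *r) {
        s++;
        r++;
    }
    return (unsigned char)*s - (unsigned char)*r;
}

/* Same comparison, but every character is folded to lower case first.
   The two tolower calls per position are the additional work. */
int cmp_insensitive(const char *s, const char *r)
{
    while (*s != '\0' &&
           tolower((unsigned char)*s) == tolower((unsigned char)*r)) {
        s++;
        r++;
    }
    return tolower((unsigned char)*s) - tolower((unsigned char)*r);
}
```

In this model, a test such as str.lt( s, r ) corresponds to cmp_sensitive(s, r) < 0, and str.ieq( s, r ) corresponds to cmp_insensitive(s, r) == 0.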

 

// String comparisons demonstration program.

 

program strcmp_demo;

#include( "stdlib.hhf" )

 

static

str1 :string := "abcdefg";

str2 :string := "hijklmn";

str3 :string := "AbCdEfG";

 

procedure cmpStrs( s1:string; s2:string ); nodisplay;

var

eq :boolean;

ne :boolean;

lt :boolean;

le :boolean;

ge :boolean;

gt :boolean;

begin cmpStrs;

stdout.put( nl "String #1 = '", s1, "'" nl );

stdout.put( "String #2 = '", s2, "'" nl nl );

 

str.eq( s1, s2 ); mov( al, eq );

str.ne( s1, s2 ); mov( al, ne );

str.lt( s1, s2 ); mov( al, lt );

str.le( s1, s2 ); mov( al, le );

str.ge( s1, s2 ); mov( al, ge );

str.gt( s1, s2 ); mov( al, gt );

stdout.put

(

"eq = ", eq, nl,

"ne = ", ne, nl,

"lt = ", lt, nl,

"le = ", le, nl,

"ge = ", ge, nl,

"gt = ", gt, nl

);

 

str.ieq( s1, s2 ); mov( al, eq );

str.ine( s1, s2 ); mov( al, ne );

str.ilt( s1, s2 ); mov( al, lt );

str.ile( s1, s2 ); mov( al, le );

str.ige( s1, s2 ); mov( al, ge );

str.igt( s1, s2 ); mov( al, gt );

stdout.put

(

"ieq = ", eq, nl,

"ine = ", ne, nl,

"ilt = ", lt, nl,

"ile = ", le, nl,

"ige = ", ge, nl,

"igt = ", gt, nl

);

end cmpStrs;

begin strcmp_demo;

 

cmpStrs( str1, str2 );

stdout.put( nl "------------------------------" nl );

 

cmpStrs( str1, str3 );

stdout.put( nl "------------------------------" nl );

 

cmpStrs( str2, str3 );

stdout.put( nl "------------------------------" nl );

 

 

cmpStrs( str2, str1 );

stdout.put( nl "------------------------------" nl );

 

cmpStrs( str3, str1 );

stdout.put( nl "------------------------------" nl );

 

cmpStrs( str3, str2 );

stdout.put( nl "------------------------------" nl );

 

 

end strcmp_demo;

 

Examples of the HLA String Comparison Functions

The str.prefix and str.prefix2 Functions

procedure str.prefix( src:string; prefixStr:string ); returns( "al" );

procedure str.prefix2( src:string; offs:uns32; prefixStr:dword );

returns("al");

 

The str.prefix and str.prefix2 functions are similar to str.eq insofar as they compare two strings and return true or false based on the comparison. Unlike str.eq, however, these two functions return true if one string begins with the other (that is, if the second string is a prefix of the first string).

The str.prefix function compares prefixStr against src. If prefixStr is equal to src, or the src string begins with the characters in prefixStr and contains additional characters, then the str.prefix function returns true in EAX7. If the src string does not begin with the characters in prefixStr, then str.prefix returns false.

The str.prefix2 function lets you specify a starting index (the offs parameter) within the src string at which this function begins comparing against prefixStr.
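
The prefix test itself is simple enough to sketch in C. The following is an illustration only; has_prefix and has_prefix_at are hypothetical names, the strings are zero-terminated C strings, and the treatment of an out-of-range offset (simply returning false) is an assumption rather than documented library behavior.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the prefix test: true if src begins with prefix. */
bool has_prefix(const char *src, const char *prefix)
{
    size_t n = strlen(prefix);
    return strlen(src) >= n && memcmp(src, prefix, n) == 0;
}

/* Same test, but starting at zero-based position offs within src. */
bool has_prefix_at(const char *src, size_t offs, const char *prefix)
{
    if (offs > strlen(src))
        return false;   /* assumption: an out-of-range offset just fails */
    return has_prefix(src + offs, prefix);
}
```

Note that has_prefix("hello", "hello") is true (the strings are equal), while has_prefix("hello", "hello world") is false because the second string is longer than the first.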

 
Examples of the HLA str.prefix Function

The str.substr and str.a_substr Functions

procedure str.substr( src:string; dest:string; index:dword; len:dword );

procedure str.a_substr( src:string; index:dword; len:dword ); returns("eax");

 

The str.cat and str.a_cat procedures let you assemble different strings to produce larger strings; the str.substr and str.a_substr functions do the converse: they let you disassemble strings by extracting small substrings from a larger string. The substring functions are another set of very common string operations. Programs that do a bit of string manipulation will probably use the substring functions in addition to the copy and concatenation functions.

Like all the HLA string functions that produce a string result, the substring functions come in two flavors: one that stores the resulting substring into a string object you've preallocated (str.substr) and a second form that automatically allocates storage on the heap for the result (str.a_substr). As usual, this second form returns a pointer to the new string in EAX and you should recover this storage by calling strfree when you're done using the string data.

The substring functions extract a portion of an existing string by specifying the starting character position in the string (the index parameter) and the length of the resulting string (the length parameter). The index parameter specifies the zero-based index of the first character to copy into the substring. That is, if index contains zero then the substring functions begin copying the string data starting with the first character of the string; likewise, if index contains five, then the substring functions begin copying the string data with the sixth character in the source string. The value of the index parameter must be between zero and the current length of the source string minus one. The substring functions will raise an exception if index is outside this range.

The length parameter specifies the length of the destination string; that is, it specifies how many characters to copy from the source string to the destination string. If the sum of index+length exceeds the current length of the source string, then the substring functions only copy the data from location index to the end of the source string; in particular, these functions do not raise an exception if index's value is okay but the sum of index and length exceeds the length of the source string. You can take advantage of this fact to copy all the characters from some point in a string to the end of that string by specifying a really large value for the length parameter; the convention is to use -1, which is $FFFF_FFFF (the largest possible unsigned integer), for this purpose.

The str.substr function copies the substring data to the string object specified by the dest parameter. This string must have sufficient storage to hold a string whose maximum length is length characters (or from position index to the end of the source string if the sum index+length exceeds the source string length). The str.substr function updates the destination string's length field (but does not change the MaxStrLen field) and zero terminates the resulting string.

The str.a_substr function doesn't have a destination string parameter. Instead, this function allocates storage for the destination string on the heap, copies the substring to the new string object, and then returns a pointer to this string object in the EAX register. When allocating storage for the new string, the str.a_substr function allocates just enough storage to hold the string and the necessary overhead bytes (between nine and twelve bytes). This function will not raise a string overflow error since it always allocates sufficient storage to hold the destination string (note, however, that a memory allocation failure can raise an exception).

Note that it is perfectly possible, and reasonable, to specify zero as the length parameter for these substring functions. Doing so will extract a zero length (empty) string from the source string.
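
The index and length rules above (reject a bad index, but quietly clamp an oversized length) can be summarized in a short C sketch. This is a model of the rules as this section states them, not the library's code: substr_model is a hypothetical name, the strings are zero-terminated C strings, and a return value of -1 stands in for the exception the HLA routines raise.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the substring rules described above.
 * Copies up to len characters of src, starting at index, into dest
 * (which must have room for the result plus a terminator).
 * Returns the number of characters copied, or -1 if index is out of
 * range.
 */
long substr_model(const char *src, size_t index, size_t len, char *dest)
{
    size_t srclen = strlen(src);

    if (index >= srclen)
        return -1;                      /* index must be 0..srclen-1 */

    size_t avail = srclen - index;      /* characters from index to end */
    if (len > avail)
        len = avail;                    /* clamp instead of failing */

    memcpy(dest, src + index, len);
    dest[len] = '\0';
    return (long)len;
}
```

Passing (size_t)-1 as len plays the same role as the -1 convention mentioned above: the clamp reduces it to "everything from index to the end of the string".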

A common use of the substring functions is to extract words, numbers, or other special sequences of characters from a string. To do this you must first locate the start of the special sequence in the string and then determine the length of that special sequence; then you can use one of the substring functions to easily extract the sequence of characters you want from the string. This is such a common operation that HLA provides a set of special routines that automatically extract such sequences for you. Details on these functions appear later in this chapter (see The str.tokenize and str.tokenize2 Functions).

 

// str.substr demonstration program.

 

program substr_demo;

#include( "stdlib.hhf" )

 

static

aLongStr :string := "Extract a substring from this one";

subStr1 :string;

subStr2 :string;

 

begin substr_demo;

 

// Allocate storage to hold the first substring:

 

mov( stralloc( 64 ), subStr1 );

// Extract the word "Extract" from the string above:

 

str.substr( aLongStr, 0, 7, subStr1 );

stdout.put( "Extracting 'Extract' = <", subStr1, ">" nl );

 

// Extract all the different string lengths from the string above:

 

stdout.put( nl "str.substr demonstration:" nl nl );

 

mov( str.length( aLongStr ), edx );

for( mov( 0, ecx ); ecx < edx; inc( ecx ) ) do

 

str.substr( aLongStr, 0, ecx, subStr1 );

stdout.put( "'", subStr1, "'" nl );

 

endfor;

stdout.newln();

 

// Demonstrate the use of str.a_substr and exceeding the string length:

 

str.a_substr( aLongStr, 30, 100 );

mov( eax, subStr2 );

stdout.put( "End of the string is '", subStr2, "'" nl );

 

strfree( subStr2 ); // Free the storage allocated by str.a_substr

 

// Demonstrate what happens if the index exceeds the string's bounds

 

try

 

str.substr( aLongStr, 64, 4, subStr1 );

 

// We won't get here

 

anyexception

 

stdout.put

(

"Exception occurred when indexing beyond the length of aLongStr"

nl

);

 

endtry;

 

end substr_demo;

 

Using the str.substr and str.a_substr Procedures

The str.insert and str.a_insert Functions

procedure str.insert( src:string; dest:string; index:dword );

procedure str.a_insert( src:string; in_to:string; index:dword ); returns("eax");

 

These two functions insert a source string into a destination string. Unlike the concatenation functions, these routines let you insert the source string into the destination string at any character position, not just at the end of the string. Therefore, these functions are a generalization of the string concatenation operation.

The str.insert function inserts a copy of the src string into the dest string starting at character position index in the destination. The index value must be in the range 0..str.length(dest) or the program will raise an exception. The destination string must have sufficient storage to hold its original value plus the new string or the function will raise an exception.

The str.a_insert function does not modify its destination string (the in_to parameter). Instead, this function allocates storage for a new string on the heap, copies the data from the in_to string to this new string object, and then inserts the src string into this string object8. Like the other "str.a_****" routines, this function returns a pointer to the new string in EAX and you should free this storage by calling strfree when you are done using the string data.

When copying the source string to the destination, the string insertion routines insert the source string before the character at position index in the destination string. Note that the index value may lie in the range 0..str.length( dest ) or 0..str.length( in_to ). Most string functions only allow values in the range 0..(str.length(stringValue)-1). The insert procedures allow the index value to be one greater; doing so tells these routines to insert the source string at the end of the destination string. In this case, the string insertion routines degenerate into string concatenation9.
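
The following C sketch models the insertion rules just described, including the special case where index equals the length of the destination. It is an illustration under stated assumptions: insert_model is a hypothetical name, the destination is a zero-terminated C string in a buffer of cap bytes (cap plays the role of the maximum length check), and a return value of -1 stands in for the exceptions the HLA routines raise.

```c
#include <string.h>

/*
 * Hypothetical model of string insertion. Inserts src into the
 * zero-terminated string held in dest, before position index.
 * Returns 0 on success, -1 if index is out of range or dest is too
 * small to hold the combined string.
 */
int insert_model(const char *src, char *dest, size_t cap, size_t index)
{
    size_t slen = strlen(src);
    size_t dlen = strlen(dest);

    if (index > dlen)                   /* note: index == dlen is allowed */
        return -1;
    if (slen + dlen + 1 > cap)
        return -1;

    /* Open a gap of slen bytes; this also moves the terminator. */
    memmove(dest + index + slen, dest + index, dlen - index + 1);
    memcpy(dest + index, src, slen);
    return 0;
}
```

When index equals the destination's length, the memmove moves only the terminator and the memcpy appends src, which is exactly concatenation.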

 

// str.insert demonstration program.

 

program insert_demo;

#include( "stdlib.hhf" )

 

static

insertInMe :string := "Insert into this string";

strToInsert :string := " 'a string to insert'";

dest1 :string;

dest2 :string;

 

begin insert_demo;

 

// Allocate storage to hold the first combined string:

 

mov( stralloc( 64 ), dest1 );

str.cpy( insertInMe, dest1 );

// Display the strings we're going to work with:

 

stdout.put( "Insert into: '", insertInMe, "'" nl );

stdout.put( "String to insert:", strToInsert, nl );

 

// Insert strToInsert at index 6 in the string (just after the word

// "Insert"). Note that we can't actually insert into insertInMe because

// the string it points at is a literal string whose length is fixed;

// therefore, we will actually insert strToInsert into the copy of

// insertInMe that we've made in dest1:

 

str.insert( strToInsert, dest1, 6 );

 

stdout.put( nl "Combined string: <", dest1, ">" nl );

 

// Demonstrate the same thing using str.a_insert:

 

str.a_insert( strToInsert, insertInMe, 6 );

mov( eax, dest2 );

stdout.put( "Combined via str.a_insert: <", dest2, ">" nl );

 

 

// Demonstrate what happens if the index exceeds the string's bounds

 

try

 

str.insert( strToInsert, dest1, 64 );

 

// We won't get here

 

anyexception

 

stdout.put

(

"Exception occurred when indexing beyond the length of dest1"

nl

);

 

endtry;

 

end insert_demo;

 

Using the str.insert and str.a_insert Procedures

The str.delete and str.a_delete Functions

procedure str.delete( dest:string; index:dword; length:dword );

procedure str.a_delete( src:string; index:dword; length:dword ); returns("eax");

 

These functions remove characters from the string parameter. They remove the number of characters the length parameter specifies starting at the zero-based position found in the index parameter. The str.delete procedure removes the characters directly from the string the dest parameter specifies. The str.a_delete procedure does not modify its string parameter; instead, it makes a copy of the string on the heap and deletes the characters from that copy. The str.a_delete procedure returns a pointer to the new string in the EAX register. Like the other "str.a_*****" routines, you should call strfree to release this string storage when you are done using it.

The string delete procedures will raise an exception if the index parameter is greater than the current length of the string. If index is equal to the length of the string, then these procedures do not delete any characters from the string. If the sum of index and length is greater than the current length of the string, then these routines will delete all the characters from position index to the end of the string. You can use this behavior to delete all the characters from some position to the end of the string by specifying a large value for the length (the convention is to use -1 for this purpose).
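
The deletion rules can be modeled the same way. The sketch below is not the library's implementation; delete_model is a hypothetical name, it operates in place on a zero-terminated C string, and -1 stands in for the exception raised when index exceeds the string's length.

```c
#include <string.h>

/*
 * Hypothetical model of in-place deletion. Removes up to len
 * characters from s starting at index. Returns the number of
 * characters removed, or -1 if index is beyond the end of the string.
 */
long delete_model(char *s, size_t index, size_t len)
{
    size_t slen = strlen(s);

    if (index > slen)
        return -1;                      /* index == slen deletes nothing */

    size_t avail = slen - index;
    if (len > avail)
        len = avail;                    /* delete through end of string */

    /* Slide the tail (including the terminator) down over the gap. */
    memmove(s + index, s + index + len, avail - len + 1);
    return (long)len;
}
```

As with the substring model, passing (size_t)-1 for len deletes everything from index to the end of the string.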

 

// str.delete demonstration program.

 

program delete_demo;

#include( "stdlib.hhf" )

 

static

aLongStr :string := "Delete a substring from this one";

dest1 :string;

dest2 :string;

 

begin delete_demo;

 

// Allocate storage for a string so we can demonstrate str.delete:

 

mov( stralloc( 64 ), dest1 );

str.cpy( aLongStr, dest1 );

 

// Okay, demonstrate deleting a substring from an existing string

// (Delete "sub" from "substring" in dest1, the copy of aLongStr):

 

str.delete( dest1, 9, 3 );

stdout.put( "Original string: '", aLongStr, "'" nl );

stdout.put( "Resultant string: '", dest1, "'" nl nl );

 

// Okay, now demonstrate the str.a_delete procedure.

// Also demonstrate what happens when the length exceeds the

// string bounds (but the starting index is within bounds):

 

str.a_delete( aLongStr, 18, 100 );

mov( eax, dest2 );

stdout.put( "Original string: '", aLongStr, "'" nl );

stdout.put( "Resultant string: '", dest2, "'" nl nl );

 

// Demonstrate what happens if the index exceeds the string's bounds

 

try

 

str.delete( dest1, 64, 4 );

 

// We won't get here

 

anyexception

 

stdout.put

(

"Exception occurred when indexing beyond the length of dest1"

nl

);

 

endtry;

 

end delete_demo;

 

Using the str.delete and str.a_delete Procedures

The str.replace and str.a_replace Functions

procedure str.replace( dst:string; from:string; to:string );

procedure str.a_replace( src:string; from:string; to:string ); returns( "eax" );

 

These two functions replace characters in a string via a small lookup table. They scan through the dst/src string a character at a time and search through the from string for this character. If the routines do not find this character, they copy the current character to the destination string. If these routines find the current character in the from string, then they copy the character at the corresponding position in the to string to the destination string (in place of the original character).

As usual for the HLA string functions, the difference between str.replace and str.a_replace is that the str.replace function manipulates the dst string directly while the str.a_replace procedure copies and translates the characters from src to a new destination string it allocates on the heap via stralloc. Of course, you should free the strings str.a_replace allocates by calling strfree when you are done using the string data.

Usually, the from and to strings will be the same length because these routines use the index into the from string to select the translation character in the to string. However, it is not an error if these two strings have different lengths. If the to string is longer than the from string, then the replace routines simply ignore the extra characters in the to string. If the to string is shorter than the from string, then the replace routines will delete any characters found in the from string that don't have a corresponding character in the to string.

An example may help clarify the purpose of these routines. In past chapters, you've seen how to use the XLAT instruction to translate lower case to upper case characters. One drawback to using XLAT is that you have to create a 256-byte lookup table. You can accomplish this with somewhat less effort using the str.replace procedure. Here's the code that will translate lower case to upper case within a string:

str.replace

(

theString,

"abcdefghijklmnopqrstuvwxyz",

"ABCDEFGHIJKLMNOPQRSTUVWXYZ"

);

 

If theString contains "Hello", then the call above looks up "H" in the second parameter and doesn't find it. Therefore, it doesn't change the first character of theString. Next, str.replace looks up "e" in the second parameter; this time it finds the character, so it replaces "e" in theString with the character at the corresponding position (index 4, the fifth character) in the third parameter. The fifth character position contains an "E", so str.replace substitutes an "E" for the "e" in the second character position of theString. This process repeats for the remaining characters in theString; since they are all lower case characters (present in the second parameter), the str.replace routine converts them to upper case.

Note that these routines are not particularly efficient. For each character appearing in the first string parameter, these functions have to scan through the second parameter. If the first parameter is n characters long and the second string is m characters long, this process could require as many as n*m comparisons. If the from string is rather long, you will get much better performance by using a lookup table and the XLAT instruction (that requires only n steps). Certainly you should never use these functions for case conversion (as in this example) because the HLA Standard Library already provides efficient routines for translating the case of characters within a string (see The str.upper, str.a_upper, str.lower, and str.a_lower Functions). Nevertheless, these functions are convenient to use and are not especially inefficient if the from string is not very large (say less than 10 characters or so).
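
The translate, keep, or delete behavior described in this section, and the source of the n*m cost, may be easier to see in code. The following C sketch models the rules as stated above; it is not the library's implementation, replace_model is a hypothetical name, and it works in place on a zero-terminated C string.

```c
#include <string.h>

/*
 * Hypothetical model of the table-driven replacement described above.
 * Each character of s found in from is replaced by the character at the
 * same position in to; if to is too short to have that position, the
 * character is deleted. Characters not in from are kept unchanged.
 * Works in place and returns the new length.
 */
size_t replace_model(char *s, const char *from, const char *to)
{
    size_t tolen = strlen(to);
    char *out = s;

    for (const char *in = s; *in != '\0'; in++) {
        const char *hit = strchr(from, *in);   /* linear scan: the n*m cost */
        if (hit == NULL) {
            *out++ = *in;                      /* not in table: keep */
        } else if ((size_t)(hit - from) < tolen) {
            *out++ = to[hit - from];           /* translate */
        }                                      /* else: drop the character */
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

In this model, listing unwanted characters at the end of from, beyond the last position that to covers, removes them from the result; for example, a from string of "0123456789abc" with a to string of ten periods turns "a1b22c" into "...".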

 

// str.replace demonstration program.

 

program replace_demo;

#include( "stdlib.hhf" )

 

static

digitsStr :string := "Count 1 the 2 number 3 of 4 digits 5 in 6 "

"this 7 string 8";

dest1 :string;

dest2 :string;

 

begin replace_demo;

 

// Allocate storage for a string so we can demonstrate str.replace:

 

mov( stralloc( 64 ), dest1 );

str.cpy( digitsStr, dest1 );

 

// Convert all the digits to periods. Characters that do not appear in

// the from string are copied unchanged, so the length displayed below

// is the length of the whole string, not a count of the digits.

 

str.replace( dest1, "0123456789", ".........." );

str.length( dest1 );

stdout.put

(

"Original string: '", digitsStr, "'" nl

"Result string: '", dest1, "'" nl

"Length of result: ", (type uns32 eax), nl

);

 

// As above, but demonstrate str.a_replace by converting the alphabetic

// characters to periods in a new string allocated on the heap.

 

str.a_replace

(

digitsStr,

"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",

"...................................................."

);

mov( eax, dest2 );

str.length( dest2 );

stdout.put

(

"Original string: '", digitsStr, "'" nl

"Result string: '", dest2, "'" nl

"Length of result: ", (type uns32 eax), nl

);

strfree( dest2 );

 

end replace_demo;

 

Example that Uses the str.replace and str.a_replace Routines

The str.setstr and str.a_setstr Functions

procedure str.setstr( fill:char; dest:string; count:dword );

procedure str.a_setstr( fill:char; count:dword ); returns( "eax" );

 

The str.setstr and str.a_setstr functions create a new character string whose length the count parameter specifies. These routines fill the string with count copies of the fill character. The str.setstr routine fills the dest string with the characters; the dest string's MaxStrLen value must be greater than or equal to count or str.setstr will raise a string overflow exception. The str.a_setstr function allocates sufficient storage for a new string on the heap and initializes this string with the specified number of characters; str.a_setstr returns a pointer to this new string in the EAX register. As usual, you should call strfree to deallocate the string str.a_setstr creates when you are done with the string.

These functions are especially useful for creating "padding" strings when formatting data for output. If you have some code that translates some data object's representation to a string for output, you can use str.setstr (or str.a_setstr) along with string concatenation to adjust the output string to some minimum width. The example below shows both functions in use; it builds the strings needed to draw a box on the display:

 

// str.setstr demonstration program.

 

program setstr_demo;

#include( "stdlib.hhf" )

 

static

topBottom :str.strvar(40);

leftRight :string;

 

begin setstr_demo;

 

// We are going to use topBottom and leftRight to draw a

// 30x20 box on the display. Begin by initializing these

// strings using str.setstr.

 

str.setstr( '*', topBottom, 30 );

str.a_setstr( ' ', 28 );

mov( eax, leftRight );

 

// Okay, draw the box using the strings we've created:

 

stdout.put( topBottom, nl );

for( mov( 18, ecx ); ecx > 0; dec( ecx )) do

 

stdout.put( '*', leftRight, '*', nl );

 

endfor;

stdout.put( topBottom, nl );

 

// Okay, free up the storage we allocated via str.a_setstr above.

 

strfree( leftRight );

 

end setstr_demo;

 

Using str.setstr and str.a_setstr to Format Output Strings
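
The padding use mentioned earlier in this section (a fill string followed by concatenation, to reach a minimum field width) can be sketched in C as follows. This is an illustration only; pad_left is a hypothetical helper, the strings are zero-terminated C strings, and memset and memcpy stand in for the "fill" and "concatenate" steps.

```c
#include <string.h>

/*
 * Hypothetical helper: right-justifies text in a field of at least
 * width characters by prefixing copies of fill. out must have room for
 * max(width, strlen(text)) + 1 bytes.
 */
void pad_left(const char *text, char fill, size_t width, char *out)
{
    size_t len = strlen(text);
    size_t pad = (len < width) ? width - len : 0;

    memset(out, fill, pad);             /* the fill step: pad copies of fill */
    memcpy(out + pad, text, len + 1);   /* the concatenation step */
}
```

If the text is already at least width characters long, pad is zero and the text is copied unchanged, so the field width is a minimum and never truncates the data.
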
The str.index/str.index2 and str.rindex/str.rindex2 Functions

procedure str.index( source:string; searchStr:string ); returns( "eax" );

procedure str.rindex( source:string; searchStr:string ); returns( "eax" );

procedure str.index2( source:string; offs:uns32; searchStr:string );

returns( "eax" );

procedure str.rindex2( source:string; offs:uns32; searchStr:string );

returns( "eax" );

 

The str.index and str.rindex functions search for an occurrence of one string (searchStr) within another string (source). They return a zero-based index of the position of the searchStr within the source string in the EAX register. The term "position" means the index of the first character in source that matches the first character of searchStr once these routines locate searchStr within source. If these routines cannot find the searchStr within the source string, they return -1 ($FFFF_FFFF) in the EAX register.

Note that if the length of searchStr is greater than the length of the source string, these functions always return -1. If the lengths of the two strings are equal, these functions return zero if the two strings are equal and -1 otherwise.

The str.index function returns the index of the first occurrence of searchStr within source. If multiple occurrences exist, this function ignores all but the first occurrence. The str.rindex (reverse index) locates the last occurrence of the searchStr in source (that is, this function searches for the searchStr in the backwards direction starting at the end of the string).

These functions use a "brute-force" algorithm that is fine for short source strings but is inefficient for really large source and searchStr combinations. For most strings (where the source string is less than 100-200 characters) using str.index and str.rindex is probably okay; however, if you want to search through strings that are thousands of characters long, there are better algorithms available (Boyer-Moore string matching comes to mind). For short strings, the overhead of these fancier algorithms diminishes their effectiveness, so don't be afraid to use str.index and str.rindex on short strings.

The str.index2 and str.rindex2 functions work much like str.index and str.rindex except they let you specify a starting position in the source string where these functions begin searching for the second string. If these functions find the search string within the source string, they return the index from the beginning of the source string (not from the offs value) to the location of the substring they locate.

You should not use these functions to search for individual characters within the source string. The next section describes a more efficient solution for searching for single characters within a string.
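
For reference, a brute-force search of the kind described above looks like this in C. The sketch is a model, not the library's code: index_from and rindex_model are hypothetical names, the strings are zero-terminated C strings, and only the forward search takes a starting offset because this section does not spell out how the reverse search interprets offs.

```c
#include <string.h>

/*
 * Hypothetical brute-force search: tries every starting position in
 * source from offs upward and returns the zero-based index of the first
 * match of search, or -1 if there is none.
 */
long index_from(const char *source, size_t offs, const char *search)
{
    size_t n = strlen(source), m = strlen(search);

    for (size_t i = offs; i + m <= n; i++)
        if (memcmp(source + i, search, m) == 0)
            return (long)i;             /* index is from start of source */
    return -1;
}

/* Same idea, scanning from the end: index of the last match, or -1. */
long rindex_model(const char *source, const char *search)
{
    size_t n = strlen(source), m = strlen(search);

    if (m > n)
        return -1;
    for (size_t i = n - m + 1; i-- > 0; )
        if (memcmp(source + i, search, m) == 0)
            return (long)i;
    return -1;
}
```

Each candidate position may need up to m character comparisons, which is why the cost grows with the product of the two lengths on long inputs.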

 

// str.index/str.rindex demonstration program.

 

program strindex_demo;

#include( "stdlib.hhf" )

 

static

source :string := "the world says ""hello there"" slowly";

 

begin strindex_demo;

 

stdout.put( "Original string: '", source, "'" nl );

 

// Output the character position underneath each character

// so the user can easily see what's happening:

 

stdout.put( " " );

mov( 0, dl );

for( mov( 0, ecx ); ecx < str.length( source ); inc( ecx )) do

 

stdout.put( (type uns8 dl ) );

inc( dl );

if( dl > 9 ) then

 

mov( 0, dl );

 

endif;

 

endfor;

stdout.put( nl " " );

mov( 0, dl );

mov( 0, dh );

for( mov( 0, ecx ); ecx < str.length( source ); inc( ecx )) do

 

inc( dl );

if( dl > 9 ) then

 

inc( dh );

stdout.put( (type uns8 dh));

mov( 0, dl );

 

endif;

 

endfor;

stdout.put( nl nl );

 

// Use str.index and str.rindex to locate the substring "the" within

// the source string:

 

str.index( source, "the" );

mov( eax, ecx );

str.rindex( source, "the" );

 

stdout.put

(

"First location of ""the"" within """,

source,

""" is ",

(type uns32 ecx),

nl

"Last location of ""the"" within """,

source,

""" is ",

(type uns32 eax),

nl

);

 

 

end strindex_demo;

 

Using the str.index and str.rindex Functions

The str.chpos/str.chpos2 and str.rchpos/str.rchpos2 Functions

procedure str.chpos( source:string; searchFor:char ); returns( "eax" );

procedure str.rchpos( source:string; searchFor:char ); returns( "eax" );

procedure str.chpos2( source:string; offs:uns32; searchFor:char );

returns( "eax" );

procedure str.rchpos2( source:string; offs:uns32; searchFor:char );

returns( "eax" );

 

These functions are very similar to the str.index and str.rindex functions of the previous section. The difference is that these routines search for a single character (searchFor) within the source string rather than a sequence of characters. These functions return the zero-based index of the searchFor character within the source string, assuming that the character is present within the string. These functions return -1 in EAX if the character is not present in the string.

The str.chpos function searches for the first occurrence of the searchFor character within the source string. It ignores any additional matching characters after the first occurrence it locates. The str.rchpos function locates the last occurrence of searchFor within source (that is, str.rchpos searches backwards through the source string for the searchFor character). The str.rchpos function ignores any earlier characters once it locates the last occurrence of the character within the string.

The str.chpos2 and str.rchpos2 procedures work in an identical manner to str.chpos and str.rchpos except that they let you specify a starting index in the source string. Note that these procedures return an index from the first character in the string rather than from the starting position in the string.
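
A single-character search needs only one comparison per position, which is why it is cheaper than a substring search. The C sketch below models the behavior described above using the standard strchr and strrchr functions; chpos_from and rchpos_model are hypothetical names, the strings are zero-terminated C strings, and the searched-for character is assumed not to be the zero terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: index of the first ch at or after offs, or -1. */
long chpos_from(const char *source, size_t offs, char ch)
{
    if (offs > strlen(source))
        return -1;
    const char *p = strchr(source + offs, ch);
    return p ? (long)(p - source) : -1;   /* index is from start of source */
}

/* Hypothetical model: index of the last ch in source, or -1. */
long rchpos_model(const char *source, char ch)
{
    const char *p = strrchr(source, ch);
    return p ? (long)(p - source) : -1;
}
```

Note that chpos_from subtracts source, not source + offs, so the result is measured from the first character of the string, matching the rule stated above for the functions that take a starting index.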

 

// str.chpos/str.rchpos demonstration program.

 

program strchpos_demo;

#include( "stdlib.hhf" )

 

static

source :string := "the world says ""hello there"" slowly";

 

begin strchpos_demo;

 

stdout.put( "Original string: '", source, "'" nl );

 

// Output the character position underneath each character

// so the user can easily see what's happening:

 

stdout.put( " " );

mov( 0, dl );

for( mov( 0, ecx ); ecx < str.length( source ); inc( ecx )) do

 

stdout.put( (type uns8 dl ) );

inc( dl );

if( dl > 9 ) then

 

mov( 0, dl );

 

endif;

 

endfor;

stdout.put( nl " " );

mov( 0, dl );

mov( 0, dh );

for( mov( 0, ecx ); ecx < str.length( source ); inc( ecx )) do

 

inc( dl );

if( dl > 9 ) then

 

inc( dh );

stdout.put( (type uns8 dh));

mov( 0, dl );

 

endif;

 

endfor;

stdout.put( nl nl );

 

// Use str.chpos and str.rchpos to locate the first and last space

// within the source string:

 

str.chpos( source, ' ' );

mov( eax, ecx );

str.rchpos( source, ' ' );

 

stdout.put

(

"First space within """,

source,

""" is at position ",

(type uns32 ecx),

nl

"Last space within """,

source,

""" is at position ",

(type uns32 eax),

nl

);

 

 

end strchpos_demo;

 

Using the str.chpos and str.rchpos Functions

The str.upper, str.a_upper, str.lower, and str.a_lower Functions

procedure str.lower( dest:string );

procedure str.upper( dest:string );

procedure str.a_lower( src:string ); returns( "eax" );

procedure str.a_upper( src:string ); returns( "eax" );

 

These functions translate alphabetic characters in their parameter strings to upper case (str.upper and str.a_upper) or to lower case (str.lower and str.a_lower). The str.lower and str.upper functions translate the characters directly in the dest string parameter. The str.a_lower and str.a_upper functions copy the src string and translate the data while copying it; they return a pointer to the new string in EAX. As with all str.a_xxxx routines, you should free the storage by calling strfree when you are done with the strings that str.a_lower and str.a_upper create.

 

// str.upper/str.lower demonstration program.

 

program struplow_demo;

#include( "stdlib.hhf" )

 

static

str1 :string := "abcdefg";

str2 :string := "hijklmn";

str3 :string := "AbCdEfG";

 

l_result1 :str.strvar(16);

l_result2 :str.strvar(16);

l_result3 :str.strvar(16);

 

u_result1 :str.strvar(16);

u_result2 :str.strvar(16);

u_result3 :str.strvar(16);

 

la_result1 :string;

la_result2 :string;

la_result3 :string;

 

ua_result1 :string;

ua_result2 :string;

ua_result3 :string;

begin struplow_demo;

 

// Use the str.lower and str.upper functions

// to convert str1...str3 to all lower and

// all upper case:

 

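str.cpy( str1, l_result1 ); str.lower( l_result1 );

str.cpy( str2, l_result2 ); str.lower( l_result2 );

str.cpy( str3, l_result3 ); str.lower( l_result3 );

 

str.cpy( str1, u_result1 ); str.upper( u_result1 );

str.cpy( str2, u_result2 ); str.upper( u_result2 );

str.cpy( str3, u_result3 ); str.upper( u_result3 );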

 

// Use the str.a_lower and str.a_upper functions

// to convert str1..str3 to all lower and all upper

// case, while allocating storage for the results.

 

mov( str.a_lower( str1 ), la_result1 );

mov( str.a_lower( str2 ), la_result2 );

mov( str.a_lower( str3 ), la_result3 );

 

mov( str.a_upper( str1 ), ua_result1 );

mov( str.a_upper( str2 ), ua_result2 );

mov( str.a_upper( str3 ), ua_result3 );

 

// Compare and display the strings we've processed.

// Compare each combination of o