What is HLA All About, Anyway?

(A Primer for Those Who Already Know Assembly Language.)

 

by Randy Hyde

 

When I first began writing HLA in the fall of 1996, I had visions of creating one of those "universally accepted" language projects that everyone would love. I'd get big "pats on the back" and people would talk about how great I was. As the design for HLA began to solidify, and I had to make some hard decisions about compromises in the language, those visions of glory faded quickly. When I released v1.0 of the HLA prototype for public consumption in September, 1999, I fully expected some people to voice rigorous complaints about the language itself. I was not disappointed. One thing, however, that I should have foreseen and prepared for is the fact that a large number of people would voice complaints about HLA in total ignorance. That is, I didn't anticipate some individuals looking at some sample code and making assumptions about the language without first looking to see if those complaints were valid. Part of the reason for this is that the HLA system does have a substantial amount of documentation (over 200 pages) and someone new to HLA isn't going to bother reading all this before raising some complaint. The purpose of this paper is to provide a synopsis of HLA (in a "nutshell" so to speak) to address the majority of complaints about the language I've seen to date.

 

This document consists of three major sections: the first discusses the design goals and design decisions I made while creating HLA. A defense of the HLA language is not possible without this introductory material, so bear with me on this. The second section directly addresses the complaints I've heard to date about the design of the HLA language. The third section discusses HLA from the point of view of "Hey, I already know assembly language, why should I bother to learn HLA?"

 

Design Goals and Decisions

 

HLA was originally conceived as a tool to teach assembly language programming. In early 1996 I decided to do a Windows version of my electronic text "The Art of Assembly Language Programming" (AoA). After an attempt to develop a new version of the "UCR Standard Library for 80x86 Programmers" (a mainstay of AoA), I came to the conclusion that MASM just wasn't powerful enough to make learning assembly language really easy. I decided to develop an assembler with sufficient power, one that would provide the tools for a good standard library as well as satisfy some other requirements. The High Level Assembler was the result of this.

 

The principal goal of HLA was to leverage students' existing programming knowledge. For example, a good Pascal programmer can get their first C/C++ program operational in a few minutes. All they've got to do is note the similarities between the two programming languages, make the appropriate syntactical changes, and they're up and running. Take that same Pascal programmer and expect them to learn LISP or Prolog the same way, and you'll not meet with the same success. LISP and Prolog are completely different; they use a different "programming paradigm," so the student has to "start over from scratch" when learning these languages. Although assembly language is an imperative language (like Pascal and C/C++), there is a considerable "paradigm shift" when moving from one of these high level languages to assembly. In HLA, I wanted to create a language with high level control structures and declarations that made it possible for someone familiar with an imperative language like Pascal or C/C++ to get their first HLA program running in a matter of minutes (or, at worst, a matter of hours). Of course, to achieve this goal, I needed to add high-level data declarations and high-level control constructs to the HLA language.

 

The astute reader will quickly point out that high level control structures are not assembly language and letting the students use these types of statements is not really teaching them assembly language. This is quite true; since the purpose of teaching an assembly language course is to teach the students "assembly language programming" it is quite clear that HLA would fail if it only provided these high level control structures. Fortunately, this is not the case. HLA supports all standard assembly language instructions including CMP and Jcc instructions, so you can still write "pure" assembly language programs without using those high level language control structures. However, it does take time to learn the several hundred different machine instructions. Traditionally, it's taken my students (using only MASM) about five weeks before they could really write any meaningful programs in assembly language (you have to cover things like numeric representation, basic CPU architecture, addressing modes, data types, and introduce the instruction set before any real programs can be written).

 

HLA lets students write meaningful programs within about a week of its introduction (e.g., the first assignment I gave last quarter was to write an "addition table" program that computes the outer product [addition table] of the two vectors 0..15 and 0..15, printing the table formatted nicely). They achieve this by using statements they already know (like IF and WHILE) with the injection of just a few assembly language concepts (registers, and the MOV and ADD instructions) plus an introduction to the HLA Standard Library. Over the next several weeks, these students write more and more complex programs as they are introduced to new assembly language and HLA concepts (e.g., data representation, basic architecture, addressing modes, data types, and additional instructions). At about the sixth week, I begin "weaning" these students off the high level language statements and force them to use the low level machine instructions. It turns out that they learn how to simulate an IF statement at roughly the same point in the quarter as they did when they used only MASM, but the big difference is that they've written a lot more code up to that point proving out other concepts in machine organization and assembly language programming. In my limited experience with classroom testing, I've found that students spend less time on the class, cover more material, and retain the knowledge better (by the time of the final exam) than they did when I only used MASM.
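
To give a feel for what such a first-week program might look like, here is a minimal sketch of an addition table. It is not the actual assignment solution; the program name, the register choices, and the use of the stdout.putu32Size and stdout.newln routines are assumptions made for illustration, so check the routine names against the Standard Library version you have. Note that it uses only WHILE loops, registers, MOV, and ADD.

    program AddTable;
    #include( "stdlib.hhf" );

    begin AddTable;

        mov( 0, ebx );                  // ebx = row value, 0..15
        while( ebx < 16 ) do

            mov( 0, ecx );              // ecx = column value, 0..15
            while( ecx < 16 ) do

                mov( ebx, eax );        // eax := row
                add( ecx, eax );        // eax := row + column

                // Assumed library routine: print eax as an unsigned
                // decimal value in a field of width 4, padded with blanks.
                stdout.putu32Size( eax, 4, ' ' );

                add( 1, ecx );          // next column

            endwhile;
            stdout.newln();             // end of this row
            add( 1, ebx );              // next row

        endwhile;

    end AddTable;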

 

The general goal of reducing the learning curve for students is achieved several ways.

 

(1) As noted above, HLA allows a gradual transition from high level languages into pure assembly language. My favorite analogy here is the Nicoderm CQ smoking cessation system ("gradual steps are better."). Like the Nicoderm system, HLA lets students learn assembly language in gradual steps rather than throwing them into the water and shouting "sink or swim!"

 

(2) In addition to letting the students employ high level language statements in their assembly language programs, HLA contains several other familiar concepts and syntactical items that ease the transition from high level language programming to assembly language. For example, HLA uses the familiar (to C/C++ programmers) "/*" and "*/" comment delimiters (as well as the "//" comment delimiter). Statements generally end with a semicolon (just as in high level languages). Machine instructions use a functional notation rather than "mnemonic-operand" notation. Constant, type, and variable declarations should look very familiar to Pascal programmers. HLA's standard library should look comfortable to anyone who has used the C/C++ standard library.
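
The following short sketch pulls several of these items together. The program and identifier names (SyntaxTour, MaxCount, total) are invented for illustration only.

    program SyntaxTour;
    #include( "stdlib.hhf" );

    const
        MaxCount := 10;             // Pascal-like constant declaration

    static
        total: uns32 := 0;          /* typed, initialized variable */

    begin SyntaxTour;

        mov( MaxCount, eax );       // functional notation: mov( source, dest );
        add( eax, total );          // total := total + eax
        stdout.put( "total = ", total, nl );

    end SyntaxTour;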

 

In addition to syntactical similarities, well-written HLA programs share a similar programming style with modern high level languages. So a student who has learned how to write readable Pascal, C/C++, or Java programs will be able to write readable HLA programs with almost no additional study. Contrast this with the style guide I've written for (MASM) assembly language programmers that is quite a bit different than high level languages and takes a while to master.

 

Another factor many people don't consider is the evaluation of a programming project. At UCR we are given about 1.5-2 hours per student per quarter of reader (student grader) time to grade projects. Experienced readers who can grade (or want to grade) assembly language projects are few and far between. Most readers get "stuck" with grading the assembly class rather than volunteer for the job. The fact that most student assembly language projects have a horrible programming style and are hard to read only exacerbates this situation. HLA helps solve this problem. Since good HLA programming style is very similar to good C/C++ style, UCR's readers have a much easier time reading the projects and evaluating their programming style. Also, since the students have (presumably) learned good programming style in the prerequisite course(s), they tend to write easier to read HLA programs than MASM programs. This lets me assign more projects without fear of exceeding my reader budget each quarter.

 

HLA's advantages are easily summed up by a complaint I had from a student once. She said "HLA drives me nuts. It's so similar to C++ that I often get confused and try out something that would work in C++ only to have the HLA compiler reject it." I agreed with this student that this was a bit of a problem, but I also mentioned "what about all the times you've tried something from C++ and it HAS worked?" She thought about it for a moment and walked away agreeing with my assessment of her complaint. Had this student been learning assembly the traditional way, she wouldn't have bothered to try anything. She would have had to spend extra time learning how to achieve what she wanted by reading an assembly text or she would have missed out on the opportunity to actually learn something new. HLA's similarity to C++ encouraged her to try something out on her own. The experiments weren't always successful, but in those cases where they were, she benefited greatly from this. This anecdote, more than any other, sums up what my goals with HLA were and describes the success I believe I have achieved with it.

 

 

Complaints with the Design of HLA by Experienced Assembly Language Programmers

 

By far, the largest number of complaints about the HLA language come from people who've learned x86 programming with one of the available assemblers, take a quick look at HLA, and reject it because it's radically different than what they're used to programming with. Common complaints include "you've switched the operands of the instructions around" or "why did you switch the MOV operands and not the LEA operands?" (funny, those same people rarely complain about the fact that I didn't swap the CMP operands, either). Other common complaints include "why do I have to type all those damn parentheses (or semicolons)?" and "What does `@c' stand for?" (the carry flag, if you're wondering). It boils down to "Why didn't I use a conventional syntax so HLA would be easier to learn?" The answer is quite simple; I didn't design HLA to make it easy for people who already know assembly language to learn HLA. I designed HLA to make it easy for people who don't know assembly language, but do know a high level language, to learn assembly language.

 

While there are some technical reasons for choosing the order of the operands the way I did, the primary reason I chose this ordering was because it seems more natural to beginners (do you remember how weird "mov dest,src" seemed to you when you were first learning x86 assembly language?). Since most people read "mov( eax, ebx);" as "move eax to ebx" rather than "move eax from ebx" I chose the former meaning. Likewise, I find that "lea( eax, x);" more naturally reads "load eax with the effective address of x" (some people, early on, pointed out that LEA could also be read as "load the effective address of x into eax." I concurred and allowed both syntactical forms "lea( eax, x);" and "lea( x, eax );". There is never any ambiguity because one operand is always a 32-bit register and the other is always a memory location.) Intel's syntax for CMP is right on; "cmp eax, x" is generally read as "compare eax to x" hence HLA keeps this same operand ordering (i.e., "cmp( eax, x );" ). The operand order was always chosen to reflect how someone would normally state the instruction in plain English. To the beginning student, who doesn't have the baggage of Intel syntax already hanging around their neck, this syntax is a little more natural. Of course, to an existing assembly language programmer, HLA's operands will always seem backwards (I must admit, it took me about a week to get used to HLA's syntax and now I have trouble getting the operands right every time I use MASM). But remember, HLA was not designed to make life easy on existing assembly language programmers. They're already converted; I don't need to teach them assembly programming. HLA was designed to bring more assembly language programmers into the fold.
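
As a quick reference, the fragment below lines up a few HLA instructions with their MASM counterparts and the plain-English reading each ordering is meant to follow. It is only a fragment and assumes x is a 32-bit variable declared elsewhere.

    mov( eax, ebx );    // MASM: mov ebx, eax    "move eax to ebx"
    add( 5, eax );      // MASM: add eax, 5      "add 5 to eax"
    lea( eax, x );      // MASM: lea eax, x      "load eax with the effective address of x"
    lea( x, eax );      // same instruction, alternate HLA operand order
    cmp( eax, x );      // MASM: cmp eax, x      "compare eax to x"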

 

Another common complaint from experienced programmers is that HLA isn't a true low level programming language. They view it as some sort of "mid-level" language (a term synonymous with "C"). I believe that persons who make this statement simply are not familiar with HLA. They see a few HLA examples involving statements like IF..THEN..ELSE..ENDIF or WHILE..ENDWHILE and assume that these are the only control statements in HLA, as though I've somehow eliminated compares and jumps in favor of these "structured" statements. As pointed out earlier, this isn't true. HLA supports both high level and low level control structures. Those who still insist on calling HLA a medium level language in view of these facts must also consider MASM and TASM to be medium level languages as they support high level control structures as well. For example, the HLA statements

 




if( eax <= ebx ) then
    mov( 0, eax );
endif;

 

is written in MASM/TASM as

 




.if eax <= ebx
    mov eax, 0
.endif

 

Syntactically, there is very little difference between the two. If you must call HLA a medium level language, then this label must apply to MASM/TASM as well.

 

For those who would insist that the proper way to write this in MASM is to ignore those HLL statements and write the code as

 




    cmp eax, ebx
    jnbe NotBE
    mov eax, 0
NotBE:



 

Well, I'd point out that you could also write this in HLA using the low-level code:

 




    cmp( eax, ebx );
    jnbe NotBE;
    mov( 0, eax );
NotBE:

 

There are very few things you can do with MASM that cannot also be done directly in HLA using the same programming paradigm (actually, MASM/TASM's boolean expressions in high level language statements are more powerful than those in HLA; I explicitly chose not to handle such fancy expressions because I wanted students to get used to using lower level constructs).

 

I did get one complaint that the documentation was too big to download (about 2 megs worth of PDF files, about 1 meg worth of HTML files). I won't comment any further on this complaint other than to say the problem is a lot better than not having enough documentation. The one very valid complaint about the HLA documentation is that it is unstructured, unindexed, and doesn't even have a table of contents (in other words, you can't find anything in the documentation except via search/find commands). I apologize profusely for this situation. As soon as I get done working on the HLA version of AoA I will rectify this situation. A good "quickstart" HLA manual for people who already know assembly language is another good idea. Once again, when I get the time...

 

A consistent complaint I hear is that I've attempted to turn assembly language into a high level language like C or Pascal (what did they think "High Level Assembler" meant, anyway?). I proudly admit to this. Somehow, though, people voicing this complaint seem to think that I'm doing this as a way to trick them into giving up their precious low-level language and sneakily get them to program in a high level language. I have no such designs. Quite the opposite really, I want to trick those who would only work in a high level language into giving up their precious C/C++ or Pascal compiler and work their way towards assembly language. If you're already a competent assembly language programmer, many of HLA's design goals are irrelevant in your particular case.

 

Hard-core assembly language programmers will complain that HLA emits code that they didn't write. For example, by default all procedures automatically emit the standard entry and exit sequences. Likewise, certain HLA `pseudo-instructions' compile into two or more instructions (e.g., "mov( mem, mem);" is syntactically legal in HLA; HLA compiles this to a push and a pop instruction). In other cases HLA creates variables behind your back to allow certain illegal forms of x86 instructions (e.g., HLA allows "mul( 10, eax);" even though the MUL instruction doesn't normally allow constants. It achieves this by creating a static variable initialized with the value ten). In almost all cases you can elect not to use these forms of the instructions or you can turn off the automatic code generation. Except for some exception handling code and the initialization of the main program, it is quite possible to tell HLA not to generate any code that you don't explicitly write yourself. Note that HLA does not translate LEA instructions into MOV instructions, or anything else like that; however, do keep in mind that HLA's output is assembled by MASM so any translations MASM does to your code, HLA will wind up doing as well (indirectly, at least).
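
To make the two examples above concrete, the fragment below shows each pseudo-instruction next to a hand-written equivalent. It is a sketch of the idea rather than a listing of HLA's actual output; x and y are assumed to be dword variables, and "ten" is a hypothetical name for the hidden static variable.

    // What you write:
    mov( x, y );            // memory-to-memory "pseudo-instruction"

    // Roughly what is emitted on your behalf:
    push( x );
    pop( y );

    // What you write:
    mul( 10, eax );         // MUL with a constant operand

    // Roughly equivalent hand-written form, assuming a declaration
    // such as   static ten: dword := 10;   elsewhere in the program:
    mul( ten, eax );

If you don't want the hidden instructions or the hidden variable, writing the second form of each pair yourself gives you exactly the code you typed.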

 

Some people have gotten wind that a future version of HLA will offer code optimization. They've let me know in no uncertain terms that they don't want an assembler touching their precious code. Fear not, you'll always be able to turn this kind of stuff off in HLA.

 

A few people have noticed that HLA does not support the MMX instruction set or the SIMD instruction set. This is a valid complaint. I will deal with this issue in a future version of HLA. In the meantime, there are two quick workarounds: you can use the #asm..#endasm directives to embed MASM assembly code directly into an HLA program, or you can write macros that compile to the appropriate instructions you need. Neither solution is elegant, but I will incorporate the real instructions at one point or another and these two techniques provide a decent work-around in the meantime.
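
As a sketch of the first workaround, an embedded block might look like the following. The particular MMX instructions are chosen only for illustration, and whether MASM accepts them depends on the MASM version and the processor directives in effect, which is an assumption here. Everything between the two directives is ordinary MASM source, so the operands appear in Intel order (destination first) and comments begin with a semicolon.

    #asm
        movq  mm0, mm1      ; copy mm1 into mm0
        paddb mm0, mm2      ; packed byte add: mm0 := mm0 + mm2
        emms                ; leave MMX state before any FPU code runs
    #endasm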

 

One (actually valid) complaint I hear is that it takes more typing to write an HLA program than it does to write the equivalent MASM program. While I haven't done the research to prove or disprove this (certainly there are a lot of extra parentheses and semicolons to type, but HLA's high level function call syntax and high level language statements save a lot of typing; I'm not sure what the end sum turns out to be), I argue that the point is irrelevant. Within certain bounds (of course) the amount of typing does not affect program development time. On the average, a line of code in a program is read ten times more often than it's written. Hence, it's ten times as important to produce a language whose statements are easy to read over being easy to write. I didn't invent most of HLA's language constructs. They were carefully chosen from language features that researchers have shown to produce easier to read and maintain programs. What may seem like an extra keyword or symbol you have to type probably has a big impact on either the readability of the resulting code (positive) or on the ability of the compiler to easily pinpoint and report errors in the source code (also a positive thing). My apologies to those who can't touch type, but I want a language that lets you easily write readable programs; if I wanted to reduce typing, I would have adopted Jim Neil's TERSE language a long time ago. If you personally feel that shorter programs are more readable, by all means go to www.terse.com and take a look at Jim Neil's work. He has done a good job of converting x86 assembly language to an expression based language. As its name suggests, TERSE programs are very short and involve a whole lot less typing than HLA programs.

 

At the other extreme, I've heard a few comments that suggest that HLA is irrelevant. If you want (Pascal-based) high level language statements intermixed with assembly, why not use Delphi and Borland's built-in assembler (BASM)? There are two reasons why not. First, BASM is a very crude assembler (certainly weaker than any stand-alone assembler I've used). Second, HLA is actually higher level than Delphi in many respects. Delphi does not support a decent macro preprocessor, parameter passing mechanisms other than reference and value, thunks, iterators, and several other features. Most importantly, though, while BASM is good for writing a few machine instructions in a Delphi program, it is not well suited for writing true assembly language applications; HLA is well suited for this purpose.

 

A complaint I've heard once or twice is that HLA does not support segments, 16-bit addressing modes, and you can't write DOS programs with it. These are valid complaints. If you need to do all of these things, HLA is not the language for you. HLA was designed exclusively for use with flat-model 32-bit operating systems. I make no apologies for that design decision.

 

Several of my students have complained that HLA doesn't run under Linux. There are two reasons HLA won't be "Linux-compatible" soon: HLA v1.x emits MASM (only) compatible assembly code; and the HLA Standard Library was specifically written for Windows. This is not to say that a Linux version will never be written. Indeed, I plan to get started on HLA v2.0 in the summer of 2000. That version will be cross platform. However, it is impractical to port the current version to Linux. The HLA sources (compiler and library code) are publicly available and public domain. So if you're itching for a project, have at it.

 

One extremely valid criticism that I've yet to hear anyone state is that support tools for HLA are few and far between (actually, nonexistent would be a better term here). An integrated development environment (IDE) or incorporation into an existing IDE (e.g., Visual Studio) would be a welcome addition. An absolute necessity is an HLA-based source level debugger. A profiler, source code "pretty printer," and other development tools would also be welcome. My personal plans are to get a source level debugger up and running immediately after I get HLA v2.0 operational (this has to wait for v2.0 because that will be the first version that directly generates object code and, therefore, can emit debugging information that is pertinent to HLA).

 

 

What Does HLA Offer to Someone Who Already Knows Assembly Language?

 

Let's face it, if you already know assembly language, learning HLA is going to be a lot of work. Will the effort you expend really be worth it? Let me suggest up front that if you're a "bare-bones" programmer who rarely uses macros, structs, arrays, symbolic constants, large table lookups, or other advanced assembly language features, there is no reason why you would want to spend the time learning HLA. Although HLA is easy to learn, if you never plan on using any of HLA's advanced features then any time you spend learning HLA is wasted; after all, if you already know how to do it in MASM, why bother learning HLA if all you're going to achieve is learning how to do exactly what you do now, just a different way?

 

HLA's primary design goal was to create a system that beginners could use to learn assembly language programming. This might seem to imply that HLA doesn't really contain features of use to an advanced assembly language programmer. Nothing could be farther from the truth! Remember, I designed HLA because I wanted to rewrite the Art of Assembly Language Programming; originally, I attempted to use MASM in the new edition of the Art of Assembly but gave up because MASM lacked the power I needed to achieve this. When I designed HLA I incorporated lots of really fancy features into the language to overcome MASM's limitations. Of course, I do not expose these advanced features to beginning students in my textbook, but they are there (and mostly documented).

 

The real power of HLA (for advanced users) centers around the use of what I call the "HLA Compile-Time Language," or CTL. The CTL incorporates many features commonly found in modern assemblers including macros, conditional assembly, assembly time loops, assembly time expression evaluation, and built-in assembly-time functions. While very few of these general feature areas are unique to HLA (e.g., MASM provides examples in each of these areas), HLA's implementation is generally more sophisticated.
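
As a small taste of the CTL, the fragment below uses an assembly-time loop to emit the same instruction four times. It is a minimal sketch that assumes a version of HLA supporting the "?" compile-time assignment and the #while..#endwhile directives; the name i is arbitrary.

    ?i := 0;                    // compile-time value, not a run-time variable
    #while( i < 4 )

        shl( 1, eax );          // this instruction is emitted four times
        ?i := i + 1;            // evaluated by the compiler, emits no code

    #endwhile

The loop runs entirely during compilation; the resulting program contains four consecutive SHL instructions and no loop overhead at all.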

 

For example, HLA's macro processing facilities are more powerful than those found in any programming language of which I'm aware. HLA's macro facilities, for example, let you implement all of HLA's high level control statements with macros (they are built into the compiler for performance reasons, but you could implement them as macros). Indeed, I've provided an example file with the HLA release that demonstrates exactly how to do this. While there is no need to reimplement any of HLA's existing HLL control structures, the fact that you can do this means that you can also implement some other control structure that I haven't incorporated into HLA. In fact, a sister article to this one describes how to implement a variant of the WHILE loop using HLA's macro facilities. If you're the type who likes to play around with language control structures, you can have a lot of fun with HLA's macro facilities.
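
The fragment below sketches how a simple WHILE-style loop could be put together as a multi-part macro. It is not the example file shipped with HLA; the macro names _myWhile and _endMyWhile are hypothetical, and the exact spelling of the macro directives (#macro, #terminator, #endmacro, and the list of local labels after the colon) is an assumption that may differ between HLA versions, so check it against the reference manual for your release.

    // Declaration section:
    #macro _myWhile( expr ):topOfLoop, exitLoop;

        topOfLoop:
            jf( expr ) exitLoop;    // leave the loop when expr is false

    #terminator _endMyWhile;

            jmp topOfLoop;          // go back and test expr again
        exitLoop:

    #endmacro;

    // Use in the body of the program (counts ecx from 0 up to 10):
        mov( 0, ecx );
        _myWhile( ecx < 10 )
            add( 1, ecx );
        _endMyWhile

The macro expands to nothing more than a label, a conditional jump, an unconditional jump, and a second label, which is exactly what you would write by hand with low-level instructions.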

 

HLA's compile-time functions include some very sophisticated string handling functions. This allows you to manipulate portions of your assembly language source code, during assembly, to affect the overall code generation of your program. In fact, it is possible to write a simple compiler inside an HLA source file using nothing more than the HLA CTL (I've actually given a small example of this in the "u32Expr.HLA" source file found in the Examples directory of the HLA distribution). This feature is so powerful, you can use HLA to create DSELs (Domain Specific Embedded Languages). That is, you can design your own programming language and implement it using HLA.

 

Another neat feature in HLA is its support for pass by value/result, pass by result, pass by name, and pass by lazy evaluation parameters. HLA also provides support for a variant of "thunks" that encapsulate a sequence of instructions and their execution environment for execution elsewhere in the code.

 

Of course, HLA provides support for classes and object oriented programming. HLA's OOP features far surpass those of TASM and other assemblers that support object oriented programming. Combined with HLA's macro facilities, you can create procedures and methods that support a variable number of parameters, procedure overloading, and several other advanced features.

 

In addition to the features found directly in the HLA language, advanced programmers will also appreciate many of the advanced macros and procedures found in the HLA standard library. The HLA Standard Library, for example, provides a very rich set of string and pattern matching procedures that are very powerful. You have to look at very high level languages like SNOBOL4 or Icon in order to duplicate the string processing features found in HLA (HLA's string processing capabilities are certainly far more powerful than those found in C/C++ and similar languages). Similarly, the HLA Standard Library arrays module provides very powerful array declaration and manipulation capabilities. For example, using the HLA arrays module, you can easily declare and use multi-dimensional dynamic arrays whose sizes your program determines at run-time.

 

For the very advanced programmer, HLA's source listings are available. Therefore, sophisticated programmers can actually modify the HLA source files to their heart's content (HLA was written with FLEX/BISON and Borland C++ v5.0; FLEX and BISON are provided with the HLA source code, but you will have to supply your own legal copy of Borland C++). Source code (written in HLA) to the HLA standard library is also included with the distribution package.

 

Conclusion

 

HLA is definitely not for everyone. If you already know assembly language and you're happy with the capabilities of the assembler you're using (especially if you're not a big fan of macros), I doubt HLA will really improve your outlook on life. I might argue that HLA will let you write more readable programs than assemblers like MASM, but that's just my opinion, you may feel perfectly free to disagree with me on this in your particular case (just don't try to make that argument in general).

 

Although HLA was designed for beginners, until the HLA version of AoA appears, HLA is probably not a good choice for beginning assembly language programmers. Without documentation geared towards the beginning assembly language student, HLA can be very confusing. Fortunately, this problem will go away shortly (if it still exists while you're reading this).

 

HLA is definitely the most powerful x86 assembler I've ever seen (and I've seen quite a few). If you're interested in working with a "power tool" instead of a hand tool, HLA very well could be the assembler for you. I believe that once you spend the time to really learn HLA, you'll agree whole-heartedly.