Microsoft Basic Logical Expression Evaluation
 




This whitepaper on the History of Truth and Logical Expression Evaluation in Microsoft Basic was written, and copyright is held, by Microsoft Visual Basic MVP:
Dan Barclay
Copyright ©2001, Barclay Software, Inc.

Summary

This document describes logical expression evaluation in Microsoft Basic. The primary purpose is to describe expression evaluation from a language standpoint. That is, what the programmer should have in mind when writing the logical expression. This document was originally targeted at answering a few frequently asked questions, then wandered into the territory of a tutorial. I stopped adding (perhaps too late) when it seemed to be heading off in the direction of a dissertation.

Scope: This description applies to currently or previously shipped versions of MS Basic including 8 bit (CP/M TRSDOS), 16bit DOS (QuickBasic, Qbasic, IBM Basic Compiler, BASICA, PDS, VB for DOS), OS/2 (PDS), 16bit for Windows (Visual Basic versions 1-3), 32bit for Windows (Visual Basic versions 4-6). There are likely other MS Basic derivative products to which this applies that I have forgotten or am unaware of. Note: This behavior does not apply to Beta1 of VB.Net. It also will not apply to some other versions of Basic (some early Apple versions, for example, even when authored by Microsoft).

In addition to the language view, some effort is spent to describe optimizations that have been made in previous versions of MS Basic to emphasize some points in the language view.

The Questions

Much discussion has occurred recently concerning logical expression evaluation. Questions such as “why are logical operators bitwise?”; “Why is the value of true –1 and not simply 1?”; “Why doesn’t Basic include short circuiting?” are common. These are natural questions for those whose background is predominately another language with different characteristics. Basic has different syntax and different behavior than other languages… or it would not be another language. These questions are answered below.

This is how it works, by definition

First, understand that logical operations and tests are two different things.

For tests only (that is, the expression tested by IF and whatnot), a zero value is judged to be false, any nonzero value is judged to be the truth (well, True).

Logical operations have both numeric operands (we'll get to Boolean Data Type in a minute) and numeric results. The operators are bitwise on the numeric operands. That is, they operate on each corresponding bit of each operand. For example:

Assignment Bits Value As Integer
a = 0 0000000000000000  0
b = Not a 1111111111111111 -1

Assignment

Bits

Value As Integer
a = 1 0000000000000001  1
b = 3 0000000000000011  3
a And b 0000000000000001  1
a Or b 0000000000000011  3
a Xor b 0000000000000010  2
   Not b 1111111111111100 -4 (Trust me on this one!)

That is And, Or, Xor and Not compare each pair of corresponding bits in the operands and perform the corresponding logical operation. For those familiar with bitwise operations, I apologize for unneeded detail. For those wishing an even more detailed explanation of what is meant by “bitwise operations”, you may want to refer to Ethan Winer’s book Basic Techniques and Utilities “Bit Operations” in Ch 3 (out of print, but available for free download at http://www.ethanwiner.com).

If you use

   MyFalse% = 0
   MyTrue% = Not MyFalse%

you get all bits set in MyTrue%. (That happens to represent -1, by the way). Now, since all bits behave the same way (since they're all the same), you can use the bitwise Boolean operators to simulate value level Boolean operations.

This is why, in the olden days, you would find statements at the top of most routines such as:

   False% = 0
   True% = Not False%

It is also why, when the Boolean "constants" False and True were introduced their representation was as 0 and -1. Likewise, when the Boolean Data Type was introduced, even later, it had the behavior of representing false/true as 0/-1. In addition, coercion rules used the test rules to set the value. That is

   Dim MyBool As Boolean
   MyBool = 1       ' 1 *tests* as True, so MyBool is set to True (-1)
   MyInt% = MyBool  ' Now MyInt% equals -1

Why “True” is not 1

You could use the value of 1 for True if you want, since it will test properly, but then True is not equal to Not False, which will seem a bit strange. As such, it cannot be used to represent the Boolean value of True in value level Boolean expressions. Example:

   a = 1             ' true, in the testing sense
   b = 2             ' also true using the <>0 criteria

   If a And b Then   ' this is false because:
                     ' 0000000000000001
                     ' 0000000000000010
                     ' ----------------
                     ' 0000000000000000   And

When you are trying to deal with value level Boolean expressions, be certain you are using the values 0 and –1 so that all bits are either zero or one. Some have said the result of ignoring this advice will be unpredictable. That is incorrect. The result of doing this is entirely predictable but may be undesirable from your standpoint.

With Boolean Data Type it doesn’t matter much

Enter the Boolean data type. If you are trying to accomplish value-level Boolean expression evaluation you can assure you stick with the proper values (zero and minus one) by using Boolean variables. In fact, if you do you will find that the bitwise operators behave as value level Boolean operators.

Still, it's bits. Only zero and 1 are allowed (exist). The key here is that Boolean operators are bitwise and it is important what all the bits do. With the Boolean data type you are simply assured they are all doing the same thing.

It’s all in the numbers

Remember that Boolean Expressions are numeric in Basic. The operators take numeric operands (all bits) and produce a numeric result. The following is perfectly valid:

   A = 1
   B = 2
   C = B > A

In this fragment, the result left in C is -1. Consider also:

   A = 1
   B = 2
   C = B Or A

In this fragment, the result left in C is 3. Further, the following is perfectly valid:

   C = 1.23 * (B > A) + 1000 * A

Other Contracts

A few other things you should know about logical expression evaluation:

Order of Execution

Operator Precedence

Precedence of operations is Comparison (>,<,= and friends) then Logical (And, Or and friends). Within the same operation level, precedence is left to right. Example:

   If MyValue% And MyBitmask% And A > B Then

In the above expression, the bitwise And comparison of MyValue% and MyBitMask% is Anded (on a bitwise basis) to the result of A>B. In this case A>B is a switch turning on or off all the resulting bits.

Other (order of appearance)

Order of expression execution based on order of appearance is not guaranteed, even though precedence is guaranteed. Compiler optimizations may change the execution order.

Short Circuiting

Short Circuiting is a process by which execution of parts of a logical expression may be skipped entirely. For example:

   If A > B And B < C Then

In processing the above fragment, if A<=B then it simply does not matter if B<C so the compiler can skip that comparison. (Likewise, if B=>C it does not matter if A>B). The process of skipping that part of the expression is called “Short Circuiting” the expression. Short circuiting may be implemented for a number of reasons including performance (not having to perform comparisons as shown above) or flow control. A good flow control example would be:

   If A <> 0 And ((B / A) > 10) Then

In this case, if short circuiting were implemented for flow control, the expression B/A would avoid divide by zero because it would never be executed. Note that parenthesis are added in the above expression for clarity even though they are not required.

While it is possible that future versions of MS Basic will provide guaranteed execution order, current versions do not (as discussed above). Short circuiting is provided as a performance enhancement in the Basic PDS and VB for DOS optimizing compilers, but reliance on it for flow control is not supported.

In short, in Basic short circuiting is provided as an optimization by some compilers, but in currently available products neither short circuiting itself nor the order of execution is defined as a behavior of the language. The way to short circuit for flow control is to use explicit short circuiting (nested If statements). Similarly in other languages such as C the way you assure execution of expression fragments is to explicitly promote them out of logical expressions.

Embedded functions

Functions embedded in logical expressions are guaranteed to be evaluated, in no particular order. That is, with the expression:

   If A > MyFunction(Param1) And B = MyOtherFunction(Param2) Then

Both MyFunction and MyOtherFunction are evaluated, regardless of any short circuiting of the logical expression by the compiler. In addition, since execution order is not guaranteed the order of function execution is not guaranteed.

Compiler Optimizations

So, with all these numeric conversions and tests, isn’t performance impacted? Not necessarily. All isn’t as it may appear to be (to the compiler anyway). The purpose of discussing a few compiler optimizations is to show that actual implementation is not necessarily a rote implementation of the language view. The developer must always keep the language view in mind, as this determines behavior and is the same for all MS Basic products. Optionally, the developer should consider the actual implementation (which will vary from product to product) in order to take advantage of performance enhancements.

The VB for DOS compiler is used because it is convenient and illustrates a wide variety cases in which the language view and actual optimized implementation are different.

Direct Logical evaluation.

One way performance is maintained is by good expression evaluation by the compiler. An example would be a simple logical comparison. Consider the fragment:

   If A% > B% Then
      B% = 1
   End If

Viewing this from the language standpoint the developer should assume the following evaluation is occurring: First, the comparison A% > B% occurs, yielding a zero or –1. Next, the value of that expression is tested by If for Truth (nonzero).

However, the compiler sees that only a simple comparison is being made and can optimize the entire expression to a comparison and a jump over the assignment of B if warranted as follows:

   0030   0006    If A% > B% Then
   0030   0006            B% = 1
   0030   0006    End If
   0030    **        I00002:   mov     ax,B%
   0033    **                  cmp     ax,A%
   0037    **                  jnl     I00003
   0039    **                  mov     B%,0001h
   003F   000A
   003F   000A
   003F    **        I00003:

Short Circuiting

Short circuiting has been provided in some previous versions of MS Basic products, in a way that maintains language behavior but optimizes performance. Consider the fragment:

   If A% > B% And B% > C% And D% > E% Then
      B% = 1
   End If

Viewing this from the language standpoint, each comparison (A>B; B>C; D>E) yields a true or false numeric value (0 or –1). Each of the comparison results is then Anded to yield a numeric result for If to test.

However, the compiler sees that these are simple comparisons and furthermore short circuits them for performance:

   0030   0006    If A% > B% And B% > C% And D% > E% Then
   0030   0006            B% = 1
   0030   0006    End If
   0030    **        I00002:   mov     ax,B%
   0033    **                  cmp     ax,A%
   0037    **                  jnl     I00003
   0039    **                  cmp     ax,C%
   003D    **                  jng     I00003
   003F    **                  mov     ax,E%
   0042    **                  cmp     ax,D%
   0046    **                  jnl     I00003
   0048    **                  mov     B%,0001h
   004E   0010
   004E   0010
   004E    **        I00003:

Note the short circuiting (multiple jumps to I00003) as well as register reuse. The point of short circuiting expression evaluation is to not waste time comparing expressions that don’t matter anymore, since the first test has already failed.

Order of Execution

   If MyValue% And MyMask% And A% > B% Then
      B% = 1
   End If

This is a mix of comparison and numeric logicals. Viewing this from the language standpoint, a value (MyValue%) containing binary data is masked using the bitmask MyMask% and then is switched using the comparison A% > B%. This occurs using bitwise Anding of the value and the mask, followed by bitwise Anding of the result of the A>B comparison.

However, the compiler sees this a bit differently and reorders the expression a bit:

   0030   0006    If MyValue% And MyMask% And A% > B% Then
   0030   0006            B% = 1
   0030   0006    End If
   0030    **        I00002:   mov     ax,B%
   0033    **                  cmp     ax,A%
   0037    **                  mov     cx,0000h
   003A    **                  jnl     $+03h
   003C    **                  dec     cx
   003D    **                  and     cx,MyValue%
   0041    **                  and     cx,MyMask%
   0045    **                  and     cx,cx
   0047    **                  je      I00003
   0049    **                  mov     B%,0001h
   004F   000E
   004F   000E
   004F    **        I00003:

Note that the comparison is promoted to the top of the expression (as opposed to the order in which it was written).

It is not predictable which expressions will be executed and in what order, the compiler has full freedom to optimize at will. Even the use of parentheses do not guarantee order of execution, though they change precedence and certainly will make complex expressions more easily read.

   If (A% > B% And (B% > C% And (D% > E% Or F% > G%))) Then
      B% = 1
   End If

Though parentheses are used, the order of execution of the comparisons is neither fully inside out or outside in nor is it left to right or right to left:

   0030   0006    If (A% > B% And (B% > C% And (D% > E% Or F% > G%))) Then
   0030   0006            B% = 1
   0030   0006    End If
   0030    **        I00002:   mov     ax,B%
   0033    **                  cmp     ax,A%
   0037    **                  jnl     I00003
   0039    **                  mov     ax,E%
   003C    **                  cmp     ax,D%
   0040    **                  jl      I00004
   0042    **                  mov     ax,G%
   0045    **                  cmp     ax,F%
   0049    **                  jnl     I00003
   004B    **        I00004:   mov     ax,C%
   004E    **                  cmp     ax,B%
   0052    **                  jnl     I00003
   0054    **                  mov     B%,0001h
   005A   0014
   005A   0014
   005A    **        I00003:

It is key to note that the compiler has freedom to execute the expression in any way it wishes, but no freedom to change the precedence or logical meaning of the expression.

Function Promotion

The following fragment contains embedded functions.

   If A% > B% And B% > MyFunc1%(C%) And D% > MyFunc1%(E%) Then
      B% = 1
   End If

Early behavior established that functions embedded in logical expressions are always executed. This behavior is preserved by the compiler, even during short circuiting optimizations. This expression compiles to:

   004E   0006    If A% > B% And B% > MyFunc1%(C%) And D% > MyFunc1%(E%) Then
   004E   0006            B% = 1
   004E   0006    End If
   004E    **        I00003:   mov     ax,offset C%
   0051    **                  push    ax
   0052    **                  call    MYFUNC1%
   0057    **                  mov     _T0008%,ax
   005A    **                  mov     ax,offset E%
   005D    **                  push    ax
   005E    **                  call    MYFUNC1%
   0063    **                  mov     _T000C%,ax
   0066    **                  mov     ax,B%
   0069    **                  cmp     ax,A%
   006D    **                  jnl     I00005
   006F    **                  cmp     ax,_T0008%
   0073    **                  jng     I00005
   0075    **                  mov     ax,_T000C%
   0078    **                  cmp     ax,D%
   007C    **                  jnl     I00005
   007E    **                  mov     B%,0001h

The calls to user functions are promoted to the top of the evaluation using hidden variables to hold intermediate results, followed by short circuit evaluation of the logical expression itself.

Conclusion

The behavior of logical expressions in MS Basic follows well established and clear rules. Knowing this behavior is critical to proper execution of any application. Knowledge of the behavior includes both the defined and undefined characteristics.

Misunderstanding the defined behavior, or dependence on undefined behavior, lead to unwanted (even if predictable) behavior.

-- D.A. Barclay