Input, control and automation

Thursday 17 September 2015

VBA: 32,767 Leagues Under the Sea

04:38 Posted by Roxton No comments
A few weeks ago I attended a friend's stag-do in Dublin. He, I, and almost everyone else there had met while studying (in my case "attending" or "appearing on a register" might be a better description) at Imperial College, and we are all now techies of varying degrees of hardness (the groom does webby, 2.0-esque stuff, but we try to be kind). During a conversation I mentioned that as I was doing more automation work now than previously I was dealing with a lot of VBA, and that, despite its many shortcomings, I rather enjoyed it. My interlocutor snorted in a rather pitying way. "Don't you ever get frustrated by not being able to multiply two five-digit numbers together?" he asked, wiping Wee Whore (it was a stag-do in Dublin; of course there were humorously-named ales) from his chin. I replied, perhaps more sharply than I would have done had my face not been covered by a fine mist of Wee, that automation rarely required me to do so, and changed the subject to a recent systems cock-up at his grocery company.

There is, though, something rather silly about the way in which VBA handles number types. Wikipedia describes VBA's typing discipline as a static/dynamic/strong/weak hybrid, which, while lacking the verve of "to or for, by with and from everybody", is nevertheless a very good description. To see just how omni-shambolic it is, load up your favourite VBA IDE (by which I mean, press Alt+F11 in Excel) and play along. Let's start by printing a VLN (very large number):
Debug.Print 123456789123456789
You will notice that not only is VBA capable of handling VLNs, but that it automatically converts such numbers into scientific notation. So far, so good. Now, let's try the relatively simple task of multiplying two five-digit numbers together:
Debug.Print 11111 * 11111
Readers who still recall arithmetic tricks from before the shapes of the office began to close upon them will know that the output should be 123,454,321 - much smaller than our VLN above. However, VBA declines to frame that fearful symmetry and instead hands us an overflow error. Clearly, something else is going on. Rather than try to build the mystery any further (because to do so would only lead to a terrible bathos, and because most of you have probably already guessed the answer), let's jump straight to it. Here's another fine and revealing mess:
Debug.Print 32768     'This prints perfectly well
Debug.Print 32768 + 1 'And so does this
Debug.Print 32767 + 1 'But this returns an overflow error
That the magic number here is 32,767 will give the game away to those of you who have, in proper Snow Crash style, re-wired their brains to see the world in binary. For those who haven't, 32,768 is 2^15, and 2^15-1 is, as any fule no, the upper limit of the VBA Integer type (protip: that trivium can, at STP, end 90% of first dates within ten seconds of use. The more you know, etc.).

If you've played any submarine games or read Willard Price's Adventure series of books, you'll know that under the sea, in addition to singing crustaceans, there is "thermal layer" where warm water rests on top of cold, and that a submarine or other object passing through this layer must do so at speed lest it bounce off (actual submariners will know that almost none of that is true). The analogy is, I assume, obvious to you all. No?

Well, while other, fancier, la-di-da languages have signed- and unsigned- 32- and 16- and even 64-bit integers, VBA eschews the trappings of its decadent capitalist creators and instead supplies three honest, working-man's number-types: Byte, Integer, and Long (yes, I know it also has doubles and singles and those weird string-number things, but it's Party policy to treat those like Tony Benn fiddling his inheritance tax and pretend they never happened). VBA, as already mentioned, is weakly, strongly, statically and dynamically typed (which makes it fantastic for meeting diversity quotae), and will do its best to choose types for you should you fail to do so. This would work perfectly well, if not for the fact that output objects (such as those that are returned in the Debug.Print statements) are, if not otherwise specified, assumed to be of the same type as their input objects. When we asked VBA to multiply 11,111 by 11,111, it created two Integer objects to hold those values, performed the multiplication, and tried to pipe the output into an object of the same type - an Integer. Of course, 123,454,321 does not fit in an Integer object, and so we get an overflow. However, when we asked it to add 1 to 32,768, it created a Long object to hold 32,768 (which would not fit in an Integer), and assumed the output to be Long as well (which it is). Once we've passed the thermal layer we can move around as we please; it is only when crossing it that we have difficulties. Curiously, this limitation does not apply when dealing with explicitly typed values; this:
Dim int1 As Integer
Dim int2 As Integer
int1 = 11111
int2 = 11111
Debug.Print int1 * int2
works perfectly well. So, now you know: you can multiply two five-digit numbers together in VBA, but only if you pass them in variables. In other words, don't hard-code your values, you Whore-drinking muppet.

Bootnote:

Readers curious to explore the limitations of types of VBA in more detail should know that there is one more "thermal layer"; a Zimmerman's Valley before we pass into the abyss that is the Variant type. You can find it at 2,147,483,647.

Tuesday 15 September 2015

Arrays in VBA

06:02 Posted by Roxton , No comments
Judged entirely on its own merits, the implementation of arrays in VBA is neither significantly better nor worse than in most other scripting languages. They are, however, idioscyncratic in ways that seem specifically designed to confuse visitors - rather like the Paris Metro. Flicking back and forth between C#, C++, Java and VBA, I often catch myself using square brackets instead of parentheses or trying to add values dynamically and suffering the wrath of the debugger as a result. Here, then, for my reference as much as for anyone else, is a quick cheat-sheet for the use of arrays in VBA.

Defining arrays:

Defining an array is the same as defining any other variable, except that you include the dimensions of the array in parentheses:
Dim single_word As String    'This is a string variable
Dim array_of_ten_words(10) As String    'This is an array of strings
To write to or write from the array, you can use it as you would any other variable, provided that you specify the index in parentheses:
array_of_ten_words(1) = "one"
array_of_ten_words(5) = "five"
array_of_ten_words(10) = "ten"
Debug.Print array_of_ten_words(5)    'Outputs "five"
If you try to read from an index that has not been initialized (that is, an index to which no value has been assigned), you will receive a different value depending on the type of the array:
Dim array_of_ten_words(10) As String
Debug.Print array_of_ten_words(10)   'Returns an empty string ("")

Dim array_of_ten_ints(10) As Integer
Debug.Print array_of_ten_ints(10)    'Returns 0

Dim array_of_ten_bools(10) As Boolean
Debug.Print array_of_ten_bools(10)   'Returns false
If you try to initialize or read an index that exceeds the dimensions, you will receive Run-time error '9': Subscript out of range:
Dim array_of_ten_words(10) As String
array_of_ten_words(11) = "eleven"    'Fails with error 9
Debug.Print array_of_ten_words(11)   'Also fails with error 9
It's worth remembering that in VBA arrays are by default zero-based, which means that their first index is 0. Therefore, our array_of_ten_words actually contains eleven strings: 0 to 10:
Dim array_of_ten_words(10) As String
array_of_ten_words(0) = "zero"
Debug.Print array_of_ten_words(0)    'Returns "zero"
It is good practice, though unnecessary, to specify the lower bound of your arrays. In blatant defiance of Dijkstra's instructions, dimension specifications in VBA are inclusive:
Dim array_of_five_words(5 to 10) As String
array_of_five_words(5) = "five"
array_of_five_words(10) = "ten"
array_of_five_words(4) = "four"      'This line will fail with error 9
Thus, defining an array with a single subscript is just shorthand for specifying 0 as the lower bound:
Dim array_of_ten_words(10) As String
'Is functionally identical to
Dim array_of_ten_words(0 to 10) As String
If you want to get the upper and lower bounds of an array, you can use the UBound and LBound functions:
Dim array_of_five_words(5 to 10) As String
Debug.Print LBound(array_of_five_words)    'Returns 5
Debug.Print UBound(array_of_five_words)    'Returns 10

Dynamic arrays: 

So far we have seen only fixed-length arrays - once their dimensions have been set, they cannot be changed. However, you can also create dynamic arrays, the dimensions of which can be modified throughout your procedure. Creating a dynamic array is simple - don't include dimensions during definition:
Dim dynamic_array() As String
At this point the dynamic array is fairly useless, as read or write calls to any of its indexes will return errors. To use the array, you must assign it dimensions using the ReDim command:
Dim dynamic_array() As String
ReDim dynamic_array(5 to 10) As String    'dynamic_array can now be used as a normal array
When re-dimensioning a dynamic array the same rules apply as when defining a fixed-size array. Note that you cannot re-dimension an array that was originally defined as a fixed-size array:
Dim fixed_array(10) As String
ReDim fixed_array(20) As String    'This returns an error: "Array Already Dimensioned"
While that error message implies that you can't re-dimension an array more than once, you can in fact re-dimension a dynamic array as many times as you like; it is only fixed-size arrays that you cannot re-dimension. When you re-dimension a dynamic array, all the data stored in that array will be lost, and the array is effectively un-initialized. You can use this behaviour as a quick way to "wipe" an array:
ReDim dynamic_array(LBound(dynamic_array) To UBound(dynamic_array))
Being a keen-eyed and sharp-witted reader (or indeed a sharp-eyed and keen-witted one; for all practical purposes such readers are interchangeable), you will have noticed that I did not include the type (As String) in that last line. This is because VBA is eager to please and knows how you defined it the first time round. You can reaffirm its type if you like (though you don't need to), but don't get carried away and try and change it. If you do something like this:
Dim dynamic_array() As String
ReDim dynamic_array(10) As Integer
then you'll receive a compile error: Can't change data types of array elements, muggins.  If you need to have multiple data types in the same array you can simply omit the type declaration:
Dim mixed_array(10)
mixedarray(1) = "string"
mixedarray(2) = 4
mixedarray(3) = false
'And so on
But back to ReDim. If you would rather that all your lovely data not be lost when you re-dimension a dynamic array, you can use the Preserve keyword to instruct VBA to prevent records from being un-initialized:
Dim dynamic_array() As String
ReDim dynamic_array(5)
dynamic_array(1) = "one"
ReDim Preserve dynamic_array(10)
Debug.Print dynamic_array(1)    'Returns "one"
Preserve is a very useful modifier, not least because it allows us to append items to an array without having to go through the hassle of looping out the contents first:
Dim no_more() As String
ReDim no_more(2) As String
no_more(1) = "Blandford Forum"
no_more(2) = "Mortehoe"
ReDim Preserve no_more(UBound(no_more) + 1)    'Increase the size of the array by 1
no_more(UBound(no_more)) = "Midsomer Norton"   'Add an entry to the last index in the array
Unfortunately, Preserve is not without its quirks. Using Preserve you can increase or reduce the upper bound of an array, but you cannot change the lower bound:
Dim dynamic_array() As String
ReDim dynamic_array(5 to 10)
ReDim Preserve dynamic_array(5 to 11)    'This works
ReDim Preserve dynamic_array(5 to 8)     'And so does this
ReDim Preserve dynamic_array(6 to 10)    'But this will throw an error
ReDim Preserve dynamic_array(4 to 10)    'As will this
To be honest I don't think I've ever had occasion to change the lower bound of an array, but I'm sure it must be done from time to time.

One final point about bounds. I mentioned earlier that by default VBA arrays are zero-based; that is, a an array that is defined without an explicitly stated lower bound is assumed to have a lower bound of 0. If you wish to change the default lower bound to 1 because you're on holiday from MATLAB, you can do so by using the Option Base command. As with other VBA Option commands, this must be stated at the start of a module before any procedures. Its only argument is the default base, which can be either 1 or 0:
Option Base 0
Sub test()
    Dim array(10) As String
    Debug.Print LBound(array)    'Returns 0
End Sub

'In another module:
Option Base 1
Sub anotherTest()
    Dim array(10) As String
    Debug.Print LBound(array)    'Returns 1
End Sub

Multi-dimensional arrays: 

For the most part, you can use multi-dimensional arrays in exactly the same way you can the single-dimensional arrays we've seen so far:
Dim 1D_array(10) As String
Dim 2D_array(10, 5) As String
Dim 3D_array(10, 5, 4 to 8) As String
'And so on
Just separate your dimensions by commas and you're good to go. You can even mix dimensions with explicitly-stated lower bounds and those without, as can be seen in the 3D_array example above. There are, however, two features of arrays that act slightly differently when dealing with multiple dimensions. First, you can't use ReDim to change the number of dimensions if you're using Preserve:
Dim array()
ReDim array(4)    'This works, as we already know
ReDim array(4, 10, 5 to 8)    'This works too
ReDim array(3, 5)             'You can reduce the number of dimensions, no problem
ReDim Preserve array(3)       'But this will Error #9
ReDim Preserve array(3, 5, 6) 'And so will this
Second, if you want to use UBound or LBound, you'll need to specify the dimension to which you're referring (which is perfectly reasonable). If you don't, it will return the bound of the first dimension:
Dim array(4, 10)
Debug.Print UBound(array)    'Outputs 4
Debug.Print UBound(array, 1) 'Also outputs 4
Debug.Print UBound(array, 2) 'Outputs 10
Debug.Print UBound(array, 3) 'Returns Error #9
Now, being a normal, rational person (or as normal and rational as any person is who chooses to read an article about VBA arrays), you may be wondering what function returns the number of dimensions of an array. Is it .Rank, perhaps? NumDims? No. Astonishingly, there is no native VBA function this at all. This is acknowledged by Microsoft in a Retired Knowledge Base article which recommends looping through the dimensions until you hit an error. If you're really keen on showing off, it is possible to parse the array in binary and deduce its dimensions using kernel32.dll, but to be honest unless you have tens of thousands of dimensions it's almost certainly faster, and always-certainly simpler to use the error method, which I have re-written as a function below:
Function numDimensions(testArray as Variant) As Long
    Dim i As Long
    On Error GoTo ErrHandler
    For i = 1 to 60000    'According to KB this is the maximum number of dimensions an array can have
        'Do something, anything, to trigger an error
        'Trying to read the bounds of a non-existent dimension should do it:
        ErrorCheck = LBound(testArray, i)
    Next i

ErrHandler:
    numDimensions = i - 1
End Function

Limits

This is probably a good time to talk about size limits on arrays. While the aforementioned Knowledge Base article claims that the maximum number of dimensions an array can have is 60,000, you're going to be hard-pressed to test that. On my fairly sturdy little developer's laptop, I could only get up to 26 single-unit arrays before I ran into the dread Error #7: Out of Memory. Your mileage may vary, of course, but unless you're running VBA on a 2001-esque monolith (and if you are, please stop; I'm pretty sure that constitutes an act of intergalactic aggression), you're going to run up against hardware RAM limits long before you're constrained by those imposed by the software. For example, the size of a single dimension in a VBA array is supposedly limited to 2^31 - 1 (2147483647), but I can't test that because I can only get up to 99999999 before I run out of memory. If I'd added more than one dimension I wouldn't have got that far, because of course the size of an array is the the sum of the sizes of its dimensions.

Tuesday 8 September 2015

VBA: Test if style exists in Word document

06:51 Posted by Roxton , , 3 comments
Frustratingly, there isn't a native function for testing whether or not a particular style exists in a Word document or template. The most direct and obvious way to work this out is to loop through all the styles in the document, thus:
Dim style As Word.Style

For Each style In ActiveDocument.Styles
    If style = "style_to_test" Then
        Debug.Print "style_to_test exists in this document"
        Exit For
    End If
Next style
While this works, it's extremely slow. I don't mean that in the special "if you run this a million times it will add twelve microseconds to your process" sense of "slow" reserved for quants and 4K gamers, either: for some reason it can take Word a good half-second or so to retrieve the information for each style in your loop, so if you have a document with a hundred styles it can take about a minute for VBA to do something that you could visually check in a couple of seconds.

If you want to speed things up, you'll have to resort to error handling. I know, I know, it's terribly bad manners, but I think it's justified here - rather like police officers wearing made-up ties. I've tried a few different approaches to style errors, and I think this is the fastest and least likely to crash:
Dim testStyle as Word.Style

On Error Resume Next
Set testStyle = ActiveDocument.Styles(style_to_test)
if testStyle is nothing then
    debug.print "style_to_test is not in the document"
else
    debug.print "style_to_test is in the document"
end if
On Error Goto 0    'Or whatever error handler you were using before

Because this is code I tend to use quite frequently, I find it easier to bundle it up into a boolean-returning function:
Function styleExists(ByVal styleToTest As String, ByVal docToTest as Word.Document) As Boolean
    Dim testStyle as Word.Style
    On Error Resume Next
    Set testStyle = docToTest.Styles(styleToTest)
    styleExists = Not testStyle Is Nothing
End Function
The other nice thing about using a function is that you don't have to worry about resetting your error handling; the On Error Resume Next setting is contained within the function. Now we have our function, we can use it whenever we need to check a style:
    ...
    If styleExists(myStyle,ActiveDocument) = True Then
        'Do some tests involving myStyle
    End If
    ...