If you've done a lot of web site development with ASP,
particularly using VBScript and generating dynamic page elements such
as tables from a database query and the like, then you know full well
that VBScript (and, in general, ANY interpreted scripting language)
absolutely STINKS at string concatenation. In .NET, you have a much more
efficient object, StringBuilder, that was specially designed for this
purpose. But with VBScript, and even in a compiled Visual Basic COM DLL,
repetitive string concatenation is not only notoriously slow, it can run
the CPU right off the top of the meter! Ouch! Not good, eh?
You know what I'm referring to:
While Not rs.EOF
    strTable = strTable & "<TR><TD>" & rs("Name") _
        & "</TD><TD>" & rs("address") _
        & ....
    ' ... etc. etc.
    rs.MoveNext
Wend
A few years ago, Francesco Balena put out a tremendous
article called something like "A Fast Class for Strings" that
used (among other things) the Win API functions CopyMemory and FillMemory,
working with byte arrays and pointers to speed up most string
operations in VB. Since that time I've seen a few stabs taken at optimizing
string concatenation in ASP, and so here's my own.
I think the key item here is first to understand how
VBScript (and Visual Basic too) handles strings, and see if there is a
way "around" it. Once you understand what you are asking the
scripting engine to do when you write "strMyBigString = strMyBigString
& strMyLittleBittyString" 100 times, you'll know why you
want to avoid this. Not only will those dynamically assembled ASP web
pages render up to 13 times faster, but your CPU will run a
lot cooler as well. In my book, both of those are good goals!
So what does VB actually do when you want to concatenate
a substring to an existing string?
Strings in Visual Basic are stored as BSTRs. If you
use the function VarPtr on a variable of type String, you will
get the address of the variable itself, which holds the BSTR pointer, so
VarPtr gives you a pointer to a pointer to the string. To get the address
of the string buffer itself, you can use the StrPtr function, which
returns the address of the first character of the string. Keep in mind
that strings are stored as Unicode in Visual Basic. So in VB, the variable
of type String, "strMyString", is really a four-byte POINTER to the actual
Unicode data, which is preceded in memory by a four-byte length prefix.
When you "concatenate" strings with the "&" operator, VB builds a new,
longer BSTR behind the scenes and copies the old contents plus the new
piece into it. Because every concatenation recopies everything accumulated
so far, the total time grows roughly with the square of the number of
concatenations, not in proportion to it.
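To get a rough feel for how the copying adds up, here is a minimal sketch of the arithmetic, written in JavaScript only because it is easy to run anywhere. It assumes a simplified model in which every concatenation copies the whole string built so far, while a join copies each piece just once. The function names and the figures (100 appends of a 10-character piece) are made up for illustration, and no real scripting engine is being measured here.

```javascript
// Simplified cost model: count characters copied, not real time.
// Assumption: each "&" builds a brand-new string from the old
// contents plus the new piece.
function copiedNaive(appends, pieceLen) {
  let total = 0;
  let currentLen = 0;
  for (let i = 0; i < appends; i++) {
    currentLen += pieceLen; // the new string is one piece longer
    total += currentLen;    // and all of it has to be written out
  }
  return total;
}

// Assumption: a join copies every piece exactly once into the result.
function copiedJoin(appends, pieceLen) {
  return appends * pieceLen;
}

console.log(copiedNaive(100, 10)); // 50500
console.log(copiedJoin(100, 10));  // 1000
```

Under this model, doubling the number of appends roughly quadruples the naive copy count but only doubles the join count.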
But what about arrays? VB and VBScript, its stunted little
sister, have intrinsic array functions such as Join that are, not surprisingly,
MUCH FASTER at concatenating variant array elements. So how about if we
just cobble together a little VBScript Class that keeps our stuff in an
array (it's always a Variant anyway, so what's the difference?) and then,
when we're finished with all of our concatenations, has it just do
a Join on the array with NO DELIMITER so it all comes back as ONE BIG
LONG STRING, in a SINGLE OPERATION? Make sense?
Here's my take on a Fast String Class for VBScript:
Class FastString
    Dim stringArray, growthRate, numItems

    ' Start with an empty array of growthRate + 1 slots.
    Private Sub Class_Initialize()
        growthRate = 50
        numItems = 0
        ReDim stringArray(growthRate)
    End Sub

    Public Sub Append(ByVal strValue)
        ' Next line prevents a type mismatch error if strValue
        ' is Null. Performance hit is negligible.
        strValue = strValue & ""
        ' Grow the array by growthRate slots once it is full.
        If numItems > UBound(stringArray) Then
            ReDim Preserve stringArray(UBound(stringArray) + growthRate)
        End If
        stringArray(numItems) = strValue
        numItems = numItems + 1
    End Sub

    Public Sub Reset
        Erase stringArray
        Class_Initialize
    End Sub

    ' Resize the array to the items in use (plus one empty slot),
    ' then join everything in a single operation.
    Public Function concat()
        ReDim Preserve stringArray(numItems)
        concat = Join(stringArray, "")
    End Function
End Class
When the class is instantiated, Class_Initialize() sets
the growthRate and ReDims our stringArray. When we call Append, we are
simply adding our substring as a new element and incrementing the numItems
counter to keep track of "how many". If the array is already full,
we first ReDim Preserve our existing elements and add another
growthRate worth of elements to the array. Reset is just a convenience
member that you may or may not need. Finally, "concat" performs the
Join and returns our final string.
So how much faster is it? Well, just CLICK
ON THIS LINK to bring up a client-side VBScript page that will perform
the operation 5,000 times on a substring "This is a substring"
the old way, and then the new way, and show you the times and the speed
difference ratio. (Remember, I said "5000 times" - so give it
a few seconds to finish and display). To get the code, just "View
source" on the page that comes up from the link.
Now of course, most programmers who are performance-minded
will be asking, "I wonder if this will work in JavaScript,
too?" You bet it does! In fact, the difference with JavaScript is
even more dramatic! To find out for yourself, try this
on for size! (Again, this is client-side code, so it may take a
while to do the 5,000 iterations with each method.)
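For readers who want to see the shape of the idea in JavaScript, here is a minimal sketch of an array-and-join string builder. It is not the code behind the linked test page: the class and method names simply mirror the VBScript class above, and the null handling is an assumption modeled on the `& ""` guard. JavaScript arrays grow on their own, so no growthRate bookkeeping is needed. Whether this beats plain `+=` depends heavily on the engine, so treat it as something to measure, not a guarantee.

```javascript
// Hypothetical JavaScript counterpart of the FastString class.
class FastString {
  constructor() {
    this.parts = []; // collected pieces, joined only on demand
  }

  // Store one piece; null and undefined become empty strings.
  append(value) {
    this.parts.push(value == null ? "" : String(value));
    return this; // allows chained calls
  }

  // Throw away everything collected so far.
  reset() {
    this.parts = [];
  }

  // Build the final string in a single join with no delimiter.
  concat() {
    return this.parts.join("");
  }
}

const row = new FastString();
for (let i = 0; i < 3; i++) {
  row.append("<TD>").append(i).append("</TD>");
}
console.log(row.concat()); // <TD>0</TD><TD>1</TD><TD>2</TD>
```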
So if you are saying "that's cool" and yet
you go off and continue to concatenate your strings the old way, I leave
you with this little gem to ponder:
"How many psychiatrists does it take
to change a lightbulb?"
"Just one, but it will take a long time, and the bulb has to really
want to change."
Peter Bromberg is an independent consultant specializing in distributed .NET solutions
in Orlando and a co-developer of the EggheadCafe.com
developer website. He can be reached at pbromberg@yahoo.com