High Performance Delphi

 Home   Fundamentals   Guide   Code   Links   Tools   Feedback 

String Optimization Guidelines


Contents

Style Guide

HyperString

string initialization

Pre-allocating memory

Thread Safety

"fixing" D5 strings

short strings

Copy

longstrings

delete vs copy

Concatenation

casting to pchar


Style Guidelines

Use Hyperstring instead of "rolling your own"

There is no point in reinventing the wheel. The HyperString freeware library addresses the shortcomings and inefficiencies of the native AnsiString function set included with Delphi. If you are basically doing "normal" sorts of string operations but need more speed you should start here first.

Do not double-initialize strings

The default string type, AnsiString, is automatically initialized to be empty upon creation. Consequently, there is no need to initialize it a second time. For instance the code s := ''; is redundant below:

procedure GreatestOnEarth;
var 
  S: string; // a long string, not short!
begin
  S := '';
  ...
end;

Note that this does not extend to functions that return a string since the behavior of the result variable in this case is better characterized as a passed var parameter than a local variable.

Use SetLength to preallocate longstrings (AnsiStrings) whereever possible.

Dynamic allocation makes AnsiStrings very powerful. Unfortunately, it is quite easy to abuse this power. A typical situation looks something like this:

S2 := '';
for I := 2 to length(S1) do 
  S2 := S2 + S1[I];

Ignoring the fact that Delete could be used for this, the problem here is that memory for the S2 string may need to be re-allocated repeatedly inside the loop. This takes time. A simple and potentially much more efficient alternative is this:

setlength(S2, length(S1) - 1);
for I := 2 to length(S1) do 
  S2[I-1] := S1[I];

Here, memory for S2 is allocated only once, prior to the loop.

This sort of "memory manager abuse" is common with AnsiStrings only because re-allocation is automatic and thus easily ignored. With PChar and manual allocation, the programmer is made painfully aware of the problem with this coding style. The older Pascal style strings avoided this problem entirely by using static allocation.

Thread safety of strings and dynamic arrays - Applies to: Version 5+ and CPU's before Pentium III and Athlon

The thread safety of strings and dynamic arrays has been improved by preventing reference count problems. Previously, Reference counts were read altered then saved resulting in the potential for another reference on another thread to read or write in between those operations. This has been fixed by directly altering the reference count and locking that single instruction to prevent preemption. Everything has a price unfortunately. The lock CPU instruction prefix used to achieve this thread safety is quite expensive on Pentium II processors. My measure of the effect of this change is an additional 28 cycles per refcount adjustment, which in a worst case scenario can result in a 2x decrease in performance. Real world reports have placed the impact in the 0 to 20% range.

Reverting back to version 4 longstring behavior

It is possible to undo the change to longstring behavior described above. You can even make longstrings even faster before. To do this you need to make some changes in system.pas and recompile it.

The easiest way to recompile system.pas is to use the make utility and the makefile located in the /source/Rtl directory. Copy the source to a new directory! You don't want to replace the originals. The make also expects certain other subdirectoies such as lib and bin to be present. Make sure your new location has these as well. Also you will need TASM as there are many external asm files that need compiling as well.

The changes that need to be made are fairly simple. First, you need to get rid of all the lock prefixes. I prefer to do a global replace of 'lock' with '{lock}'. This will return strings and dynamic arrays to the Pre Version 5 performance levels. To go beyond that you need to eliminate two xchg instructions. These instructions have implicit lockprefixes. The original code is shown below:

procedure _LStrAsg{var dest: AnsiString; source: AnsiString};
...
@@2:    XCHG    EDX,[EAX]
...
procedure       _LStrLAsg{var dest: AnsiString; source: AnsiString};
...
        XCHG    EDX,[EAX]                       { fetch str                    }
...

In both cases you can replace the XCHG instruction by using three move instructions and ecx as a temp register:

procedure _LStrAsg{var dest: AnsiString; source: AnsiString};
...
@@2: {   XCHG    EDX,[EAX]}
        mov ecx,[eax]
		mov [eax],edx
		mov edx,ecx 
...
procedure       _LStrLAsg{var dest: AnsiString; source: AnsiString};
...
        {XCHG    EDX,[EAX]}                       { fetch str                    }
        mov ecx,[eax]
		mov [eax],edx
		mov edx,ecx 
...

The above changes will result in string assignments executing about 6 times faster than they do in Version 5. (2 times faster than Version 4).

Avoid using ShortStrings - Applies to: Version 5+

Presumably in an effort to phase out all the old shortstring methods and maintain only one set of string routines, shortstrings are converted to longstrings prior to many manipulations. This effectively makes these shortstring operations much slower.

Avoid using Copy to create dynamic string temporaries.

This also relates to memory manager abuse. A typical situation looks something like this:

if Copy(S1,23,64) = Copy(S2,15,64) then
  ...

Once again, the problem here is memory allocation for the string temporaries which takes time. It is unfortunate but the native AnsiString functions offer little alternative other than something like this:

I:=1;
Flag := False;
repeat
  Flag := S1[I+22] <> S2[I+14];
  Inc(I);
until Flag or (I>64);
if Not Flag then
  ...

Use longstrings (AnsiString) exclusively and cast to PChar when necessary.

Popular myth has it that AnsiString is somehow inherently less efficient. This stems from poor coding practices, memory manager abuse and lack of native support functions as described above. Once an AnsiString has been dynamically allocated, it is just like any other string; a linear series of bytes in memory, and no more or less efficient. With adequate support functions and proper coding, the performance difference with AnsiString is negligible.

Prefer Delete over Copy to remove from the end of the string

copy will always copy the entire string. However, delete will just cut off the end of the current one.

Change: AString :=copy(AString, 1, length(AString)-10);
To: Delete(AString, length(AString)-10, 10);

Concatenating Strings

The best way to concatenate strings is also the simplest. s1:=s2+s3+s4; will produce the best results regardless of the number of strings or whether they are compile-time constants or not. Note: In D2 when combining compile-time constants the s1:=Format([%s%s],s2,s3) approach may be faster.

Casting to PChar

Essentially there are 3 ways to convert a string to a pchar: typecast as pchar, take the address of the first character, and typecast the string to a generic pointer. Each of these does different things. Taking the address of the first character (i.e. p:=@s[1]; ) will force a call to UniqueString to ensure that the pchar returned points to a unique string only referenced by s in the above example. Typecasting a string to a PChar returns the address of the first character or if the string was empty it returns the address of a null. Thus the pchar is guarenteed to be non-nil. The simplest is casting as a generic pointer (i.e. p:=pointer(s);). This is also the fastest, as there is no hidden function call.

 Home   Fundamentals   Guide   Code   Links   Tools   Feedback 


Copyright © 2002 Robert Lee ([email protected])