- DelphiTools - https://www.delphitools.info -

Efficient String Concatenation in Delphi

You may all know about String concatenation in Delphi, but do you know about the implicit String variables the compiler may create for you?

Along with the implicit variables come implicit exception frames, and a whole lot of hidden stack juggling, which can quickly become hidden complexity bottlenecks [1].


Looking Innocent

What’s more innocent-looking than a function like this one:

function Apples(nb : Integer) : String;
begin
   Result := IntToStr(nb) + ' apple(s)';
end;

but we all know looks can be deceiving, and sometimes where there are apples, there can be a worm…

Well, here it is. Look behind the code at the asm generated for that function:

Project65.dpr.31: begin
005D82B0 55               push ebp
005D82B1 8BEC             mov ebp,esp
005D82B3 6A00             push $00
005D82B5 53               push ebx
005D82B6 56               push esi
005D82B7 8BF2             mov esi,edx
005D82B9 8BD8             mov ebx,eax
005D82BB 33C0             xor eax,eax
005D82BD 55               push ebp
005D82BE 68F8825D00       push $005d82f8
005D82C3 64FF30           push dword ptr fs:[eax]
005D82C6 648920           mov fs:[eax],esp
Project65.dpr.32: Result := IntToStr(nb) + ' apple(s)';
005D82C9 8D55FC           lea edx,[ebp-$04]
005D82CC 8BC3             mov eax,ebx
005D82CE E8BD5FE4FF       call IntToStr
005D82D3 8B55FC           mov edx,[ebp-$04]
005D82D6 8BC6             mov eax,esi
005D82D8 B910835D00       mov ecx,$005d8310
005D82DD E85AF9E2FF       call @UStrCat3
Project65.dpr.33: end;
005D82E2 33C0             xor eax,eax
005D82E4 5A               pop edx
005D82E5 59               pop ecx
005D82E6 59               pop ecx
005D82E7 648910           mov fs:[eax],edx
005D82EA 68FF825D00       push $005d82ff
005D82EF 8D45FC           lea eax,[ebp-$04]
005D82F2 E8E9EAE2FF       call @UStrClr
005D82F7 C3               ret

That was scary, eh? Does it matter in terms of performance? Yep, it is about twice as slow as the alternative I will give below, and it’ll be even worse in a multi-threaded setting. In the profiler, it will often show up with begin/end as the bottleneck [2] points.

All that extra complexity is there because IntToStr is a function that returns a String, so the compiler needs a temporary variable to store that String. And a String is a reference-counted type, so the compiler needs an implicit exception frame to make sure the temporary is released even if an exception is raised (and yes, with ARC objects in NextGen you get the same kind of exception frames).
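To make the hidden work visible, here is a hand-written approximation of what the compiler arranges for the one-line version. It is only a sketch of the semantics, not something to write in real code: the names ApplesExpanded and tmp are mine, and since tmp is itself a managed local, compiling this would add its own implicit frame on top of the explicit one.

```delphi
program ApplesExpandedDemo;

{$APPTYPE CONSOLE}

uses
   System.SysUtils;

// Illustration only: "tmp" plays the role of the hidden temporary
// that lives at [ebp-$04] in the listing above.
function ApplesExpanded(nb : Integer) : String;
var
   tmp : String;
begin
   tmp := '';                        // the temporary starts out empty (push $00)
   try                               // stands in for the implicit exception frame
      tmp := IntToStr(nb);           // IntToStr writes into the temporary
      Result := tmp + ' apple(s)';   // the @UStrCat3 call
   finally
      tmp := '';                     // the @UStrClr call releasing the temporary
   end;
end;

begin
   WriteLn(ApplesExpanded(3));       // prints: 3 apple(s)
end.
```

Seen this way, the cost is easier to picture: one extra String to initialize and release, plus a try/finally around a function that looked like a one-liner.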

Looking Stupid

A simple way of avoiding the implicit variable is to use the one variable you have, Result:

function Apples(nb : Integer) : String;
begin
   Result := IntToStr(nb);
   Result := Result + ' apple(s)';
end;

While that does look stupid, if you look behind the code at the asm, it is much simpler now:

Project65.dpr.31: begin
005D82B0 53               push ebx
005D82B1 56               push esi
005D82B2 8BDA             mov ebx,edx
005D82B4 8BF0             mov esi,eax
Project65.dpr.32: Result := IntToStr(nb);
005D82B6 8BD3             mov edx,ebx
005D82B8 8BC6             mov eax,esi
005D82BA E8D15FE4FF       call IntToStr
Project65.dpr.33: Result := Result + ' apple(s)';
005D82BF 8BC3             mov eax,ebx
005D82C1 BADC825D00       mov edx,$005d82dc
005D82C6 E819F9E2FF       call @UStrCat
Project65.dpr.34: end;
005D82CB 5E               pop esi
005D82CC 5B               pop ebx

And it executes about twice as fast as well. As a bonus, it will be even faster in multi-threaded applications due to reduced pressure on the memory manager.

And if you inline the functions, the second one will still keep its advantage, as inlining will only move the implicit variable and frame to the function where the inlining takes place.
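If you want to check the timing on your own machine, a rough harness along these lines should do. It is a minimal sketch: the names ApplesTemp, ApplesResult and cIterations are mine, and it assumes a Delphi version with unit scope names and TStopwatch in System.Diagnostics. Actual figures will depend on compiler version, options and memory manager, so none are given here.

```delphi
program ApplesBench;

{$APPTYPE CONSOLE}

uses
   System.SysUtils, System.Diagnostics;

// Version with the implicit temporary and exception frame
function ApplesTemp(nb : Integer) : String;
begin
   Result := IntToStr(nb) + ' apple(s)';
end;

// Version that reuses Result
function ApplesResult(nb : Integer) : String;
begin
   Result := IntToStr(nb);
   Result := Result + ' apple(s)';
end;

const
   cIterations = 10000000;   // arbitrary, adjust to taste
var
   i : Integer;
   s : String;
   sw : TStopwatch;
begin
   sw := TStopwatch.StartNew;
   for i := 1 to cIterations do
      s := ApplesTemp(i);
   WriteLn('implicit temporary: ', sw.ElapsedMilliseconds, ' ms');

   sw := TStopwatch.StartNew;
   for i := 1 to cIterations do
      s := ApplesResult(i);
   WriteLn('reusing Result:     ', sw.ElapsedMilliseconds, ' ms');
end.
```

Run it as a release build outside the debugger, and a few times over, before reading anything into the numbers.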

Not so stupid after all

Note that this “optimization”, reusing an explicit variable instead of letting the compiler create an implicit one, can be leveraged with all reference-counted types (dynamic arrays, interfaces, objects in ARC compilers…), and it can also help with debugging, as it makes it very simple to inspect the intermediate returned values.
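As an illustration with a dynamic array, the same pattern could look like the sketch below. Squares and SquaresWithZero are hypothetical helpers made up for this example, n is assumed to be non-negative, and whether your compiler version really avoids the temporary here is something to confirm in the CPU view rather than take on faith.

```delphi
program SquaresDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: returns the squares 1, 4, 9, ... n*n
function Squares(n : Integer) : TArray<Integer>;
var
   i : Integer;
begin
   SetLength(Result, n);
   for i := 0 to n - 1 do
      Result[i] := (i + 1) * (i + 1);
end;

// Same idea as Apples: receive the returned array in Result,
// then keep working on Result instead of on an intermediate value
function SquaresWithZero(n : Integer) : TArray<Integer>;
begin
   Result := Squares(n);
   SetLength(Result, n + 1);   // grow by one element
   Result[n] := 0;             // append a trailing zero
end;

var
   values : TArray<Integer>;
begin
   values := SquaresWithZero(3);
   WriteLn(Length(values));    // prints: 4
end.
```

A breakpoint on the SetLength line also shows the debugging benefit: Result can be inspected right after Squares returns.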

So sometimes, it can pay twice to write a little more code: it’ll be easier to debug/maintain, and it will run faster.

That was for a single concatenation. What happens if you have many?

Check the followup article: Efficient String Building in Delphi [5].