You may all know about String concatenation in Delphi, but do you know about the implicit String variables the compiler may create for you?
Along with the implicit variables come implicit exception frames, and a whole lot of hidden stack juggling, which can quickly become hidden complexity bottlenecks [1].
Looking Innocent
What’s more innocent-looking than a function like this one:
function Apples(nb : Integer) : String;
begin
  Result := IntToStr(nb) + ' apple(s)';
end;
But we all know looks can be deceiving, and sometimes where there are apples, there can be a worm…
Well, here it is: look behind the code at the asm generated for that function.
Project65.dpr.31: begin
005D82B0 55               push ebp
005D82B1 8BEC             mov ebp,esp
005D82B3 6A00             push $00
005D82B5 53               push ebx
005D82B6 56               push esi
005D82B7 8BF2             mov esi,edx
005D82B9 8BD8             mov ebx,eax
005D82BB 33C0             xor eax,eax
005D82BD 55               push ebp
005D82BE 68F8825D00       push $005d82f8
005D82C3 64FF30           push dword ptr fs:[eax]
005D82C6 648920           mov fs:[eax],esp
Project65.dpr.32: Result := IntToStr(nb) + ' apple(s)';
005D82C9 8D55FC           lea edx,[ebp-$04]
005D82CC 8BC3             mov eax,ebx
005D82CE E8BD5FE4FF       call IntToStr
005D82D3 8B55FC           mov edx,[ebp-$04]
005D82D6 8BC6             mov eax,esi
005D82D8 B910835D00       mov ecx,$005d8310
005D82DD E85AF9E2FF       call @UStrCat3
Project65.dpr.33: end;
005D82E2 33C0             xor eax,eax
005D82E4 5A               pop edx
005D82E5 59               pop ecx
005D82E6 59               pop ecx
005D82E7 648910           mov fs:[eax],edx
005D82EA 68FF825D00       push $005d82ff
005D82EF 8D45FC           lea eax,[ebp-$04]
005D82F2 E8E9EAE2FF       call @UStrClr
005D82F7 C3               ret
That was scary, eh? Does it matter in terms of performance? Yes: it is about twice as slow as the alternative I will give below, and it will be even worse in a multi-threaded setting. In a profiler, it will often show up as begin/end being the bottleneck points [2].
All that extra complexity arises because IntToStr is a function that returns a String, so the compiler needs a temporary variable to hold that String. And since String is a reference-counted type, the compiler needs an implicit exception frame to make sure the temporary is released even if an exception is raised (and yes, with ARC objects in NextGen you get the same kind of exception frames).
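To make the hidden work visible, here is a rough hand-written equivalent of what the compiler arranges for the function above. It is only a sketch: the names ApplesExpanded and tmp are made up for illustration, the unit name System.SysUtils assumes a Delphi version with unit scope names, and the real generated code clears the temporary through @UStrClr rather than through an assignment.

```delphi
program ImplicitTempSketch;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hand-written approximation of the first Apples function.
// "tmp" (hypothetical name) stands in for the hidden String temporary.
function ApplesExpanded(nb : Integer) : String;
var
  tmp : String; // managed locals start out empty, like the "push $00" in the listing
begin
  try
    tmp := IntToStr(nb);            // IntToStr writes into the temporary
    Result := tmp + ' apple(s)';    // the concatenation then reads it back
  finally
    tmp := '';                      // plays the role of the @UStrClr call
  end;
end;

begin
  Writeln(ApplesExpanded(3));
end.
```

Compiling this version gains nothing, since the explicit tmp needs the same frame as the hidden one; it only spells out what the one-liner costs.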
Looking Stupid
A simple way of avoiding the implicit variable is to use the one variable you have, Result:
function Apples(nb : Integer) : String;
begin
  Result := IntToStr(nb);
  Result := Result + ' apple(s)';
end;
While that does look stupid, if you look behind the code at the asm, it is much simpler now:
Project65.dpr.31: begin
005D82B0 53               push ebx
005D82B1 56               push esi
005D82B2 8BDA             mov ebx,edx
005D82B4 8BF0             mov esi,eax
Project65.dpr.32: Result := IntToStr(nb);
005D82B6 8BD3             mov edx,ebx
005D82B8 8BC6             mov eax,esi
005D82BA E8D15FE4FF       call IntToStr
Project65.dpr.33: Result := Result + ' apple(s)';
005D82BF 8BC3             mov eax,ebx
005D82C1 BADC825D00       mov edx,$005d82dc
005D82C6 E819F9E2FF       call @UStrCat
Project65.dpr.34: end;
005D82CB 5E               pop esi
005D82CC 5B               pop ebx
It also executes about twice as fast and, as a bonus, it will be even faster in multi-threaded applications due to reduced pressure on the memory manager.
And if you inline the functions, the second one will still keep its advantage, as inlining will only move the implicit variable and frame to the function where the inlining takes place.
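If you want to check the difference on your own compiler and machine, a minimal timing harness could look like the one below. This is a sketch under assumptions: the function names ApplesImplicit and ApplesExplicit, the iteration count and the use of TStopwatch from System.Diagnostics are my choices for illustration, and the ratio you measure will depend on compiler version, target platform and memory manager.

```delphi
program ApplesBench;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Diagnostics;

// First version: IntToStr goes through a hidden String temporary.
function ApplesImplicit(nb : Integer) : String;
begin
  Result := IntToStr(nb) + ' apple(s)';
end;

// Second version: Result is reused, so no hidden temporary is needed.
function ApplesExplicit(nb : Integer) : String;
begin
  Result := IntToStr(nb);
  Result := Result + ' apple(s)';
end;

const
  cIterations = 10000000; // arbitrary, adjust to get stable timings

var
  i : Integer;
  s : String;
  sw : TStopwatch;
begin
  sw := TStopwatch.StartNew;
  for i := 1 to cIterations do
    s := ApplesImplicit(i);
  Writeln('implicit temporary: ', sw.ElapsedMilliseconds, ' ms');

  sw := TStopwatch.StartNew;
  for i := 1 to cIterations do
    s := ApplesExplicit(i);
  Writeln('reusing Result:     ', sw.ElapsedMilliseconds, ' ms');
end.
```

Run it from a release build, outside the debugger, and more than once, since the first loop also pays for warming up the memory manager.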
Not so stupid after all
Note that this “optimization”, the use of an explicit local variable, can be leveraged with all reference-counted types (dynamic arrays, interfaces, objects in ARC compilers…), and it can also help with debugging, as it makes it very simple to inspect the intermediate returned values.
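As an illustration with an interface rather than a String, the sketch below contrasts a chained call with an explicit local. ICounter, TCounter, NewCounter and the two procedures are hypothetical names invented for this example. It is meant to show where the intermediate reference lives and how it becomes inspectable, not to make a measured performance claim.

```delphi
program ExplicitLocals;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

type
  // Hypothetical reference-counted type used only for this example.
  ICounter = interface
    function Next : Integer;
  end;

  TCounter = class(TInterfacedObject, ICounter)
  private
    FValue : Integer;
  public
    function Next : Integer;
  end;

function TCounter.Next : Integer;
begin
  Inc(FValue);
  Result := FValue;
end;

function NewCounter : ICounter;
begin
  Result := TCounter.Create;
end;

procedure Chained;
begin
  // The ICounter returned by NewCounter has to live somewhere:
  // the compiler keeps it in a hidden temporary you cannot inspect.
  Writeln(NewCounter.Next);
end;

procedure Explicit;
var
  counter : ICounter; // same reference, but named and visible in the debugger
begin
  counter := NewCounter;
  Writeln(counter.Next);
end;

begin
  Chained;
  Explicit;
end.
```

In the explicit form you can put a breakpoint after the assignment and look at counter directly, and the same variable can be reused for later calls instead of leaving the compiler to allocate further temporaries.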
So sometimes, it can pay twice to write a little more code: it’ll be easier to debug/maintain, and it will run faster.
That was for a single concatenation. What happens if you have many?
Check the follow-up article: Efficient String Building in Delphi [5].