Code Optimization: Go For the Jugular

Previous: Stalking the Prey

Repeat Until Belly.Full;

Those first 33% were easily gained. Let’s go for another round of SamplingProfiler:

Going For The Jugular - Further Profiling Results

Things are more satisfying: the line performing the actual work now takes up most of the CPU time, with the “case of” line coming second. For further speed improvements, we now need to move the conditional out of the loop:

procedure DoSomething3(var data : array of Integer; what : TDoWhat);

   procedure RaiseUnsupported(what : TDoWhat);
   begin
      raise Exception.Create('Unsupported: '+GetEnumName(TypeInfo(TDoWhat), Integer(what)));
   end;

var
   i : Integer;
begin
   case what of
      dwInc :
         for i:=Low(data) to High(data) do
            Inc(data[i]);
      dwDec :
         for i:=Low(data) to High(data) do
            Dec(data[i]);
   else
      RaiseUnsupported(what);
   end;
end;

We have increased the line count noticeably, but most of those extra lines are still cosmetic. What further makes it a reasonable trade-off is that the execution time has been reduced by 66% from the initial version: at about a third of the original run time, it now executes 3 times faster!

Are there any more easy gains to be had? Let’s run the last version through SamplingProfiler:

Going For The Jugular - Final Profiling Results

More than 92% of the execution time now goes to the loop and the actual work. Only a wee bit is left for stack setup (line 96) and the “case of” (line 97). At this point, it is clear that going faster means increasing the line count and code complexity significantly: you would need to replace the two-line loops with something else, which is bound to be heavier (unrolling, SIMD, etc.).
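To give a sense of how much heavier, here is a minimal sketch of what a four-way unrolled version of the dwInc loop could look like. IncUnrolled4 and the small test array are hypothetical names for illustration, not part of the article's code, and this variant was not profiled: whether it gains anything depends on the compiler and the CPU.

```delphi
program UnrollSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper (not from the article): increments every element,
// handling four elements per loop iteration.
procedure IncUnrolled4(var data : array of Integer);
var
   i, n : Integer;
begin
   n := Length(data);   // open arrays are zero-based: valid indexes are 0..n-1
   i := 0;
   // Main loop: four increments per iteration, so the loop test and
   // jump are paid once per four elements instead of once per element.
   while i <= n-4 do begin
      Inc(data[i]);
      Inc(data[i+1]);
      Inc(data[i+2]);
      Inc(data[i+3]);
      Inc(i, 4);
   end;
   // Tail loop: the 0 to 3 elements left over when the length
   // is not a multiple of four.
   while i < n do begin
      Inc(data[i]);
      Inc(i);
   end;
end;

var
   sample : array [0..9] of Integer;
   k : Integer;
begin
   for k := 0 to 9 do
      sample[k] := k;
   IncUnrolled4(sample);
   // Print the incremented values on one line.
   for k := 0 to 9 do
      Write(sample[k], ' ');
   Writeln;
end.
```

The two-line loop has become two loops and a dozen lines, and the same treatment would have to be repeated for dwDec, which is the kind of complexity cost described above.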

Next: Rest Under A Tree

6 thoughts on “Code Optimization: Go For the Jugular”

  1. > the compiler won’t “know” about the exception in the called procedure

    That’s something that I really miss in Delphi. You can’t give the compiler a hint that a procedure never returns. C++0x has something for this but Delphi doesn’t. Maybe somebody should file a feature request in QC.

  2. Good article! This reminds me I need to try your SamplingProfiler on some of my code, hopefully it will help me remove some bottlenecks.

  3. > the compiler won’t “know” about the exception in the called procedure

    The approach I’d have taken would have been to have a utility function that returns the Exception to be raised.

    i.e. line 105 to read
    raise CreateUnsupportedException(what);

    and defined as
    function CreateUnsupportedException(what : TDoWhat): Exception;
    begin
       Result := Exception.Create('Unsupported: '+GetEnumName(TypeInfo(TDoWhat), Integer(what)));
    end;

    Wouldn’t that still move the String cleanup code into the function, but also allow the compiler to “see” the raise?

  4. I think this is one of the most helpful technical articles, on any subject, that I’ve ever seen. It starts with a real example, and it deals with the real questions that result. And what’s good is that you actually focus on the weird stuff, explain why it is, and how to fix it – or why it should not be fixed.

    By dealing with the tough questions, in the same way that they would occur to another developer, you make the task of optimizing seem obvious and natural.

    Actually saying that a CreateFmt would be the first fix, but then explaining why it won’t help in this case, is perfect.

  5. Yes, great educational article, thank you Eric.

    But interestingly I tested this same procedure under Delphi 7 as an exercise to become familiar with SamplingProfiler, and in my tests the exception frame overheads were insignificant.

    The time spent on the “end” statement in DoSomething1 varies between 0 and 11%.

    I tested with array sizes from 40,000,000 to 200,000,000 integers, with the following typical number of samples for each array size:

    Array size:     40M    80M   200M
    DoSomething1    105    212    544
    DoSomething4    104    211    539

    So it seems to me that with Delphi 7 it’s not worth the disadvantages of moving the exception to a procedure.

    Or am I missing something? (The “stack setup” line did not appear in any of my profiling, nor did the “case of” line for DoSomething4).
