Archive

Posts Tagged ‘Tips’

Knowing what and when to optimize…

April 20th, 2009

…is as important as knowing how to optimize.

In this thread on the Delphi forums, Ante Bonic brought back to attention this excellent Delphi Optimization Guide article by Robert Lee. The article has aged a bit, but many tips remain true with the Delphi 2009 compiler (sadly so). Like many optimization articles, Robert’s focuses mostly on local optimization tips, which can draw warnings like this one by Anders Isaksson:

Optimization should be done after profiling, not before.

I couldn’t agree more. But to be fair, Robert states so in his article, as do most authors of optimization articles. Recipes and local optimization tips are to be used after all algorithmic and data structure improvements have been taken advantage of.

One can list tips and tricks for local optimization, do’s and don’ts that hold true often enough to be good tips in many scenarios. However, it’s practically impossible to come up with a “reusable” list of tips for algorithms and data structures. Too many specifics can come together: even when the problems are similar, considerations of scale or reactivity can drastically influence architectural and algorithmic options.

Hence the most visible optimization recipes are often local optimization ones, but mostly because there are few global optimization recipes: you only have global optimization methodologies. But even these methodologies can usually be summarized in a few words:

  1. Time, profile, analyze and confirm your bottlenecks.
  2. Improve algorithms & data structures.
  3. Exhaust 1 & 2 before looking at local optimizations, and then don’t forget 1.

To optimize efficiently, i.e. not waste your time, you have to master the first point.
To optimize effectively, i.e. not waste the machine’s time, you have to master the second.

And the third point, you ask? It’s a razor’s edge: when applied effectively, it can be very efficient, with very few changes, like in this case; if not, it’s a good way to end up there. To be effective, local optimization has to be about taking care of hidden machinery: hidden shortcomings of the compiler, hidden algorithms and data structures that get in the way.

I’ll close this post by quoting Robert Lee’s article on timing:

Timing code is generally called “profiling”. If you want to improve the performance of your code, you first need to know precisely what that performance is. Additionally, you need to re-measure with each change you apply to your code. Do not spend a single second twiddling code to improve performance until you have analytically determined exactly where the application is spending its time. I cannot emphasize this enough.
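As a minimal sketch of the "measure first, re-measure after each change" advice, a plain stopwatch around a routine can look like the program below. It relies on QueryPerformanceCounter and QueryPerformanceFrequency from the Windows unit; CodeUnderTest is a hypothetical workload that stands in for your own code. This is only a coarse timer, not a substitute for a profiler.

```pascal
program TimingSketch;

{$APPTYPE CONSOLE}

uses
   Windows, SysUtils;

// Hypothetical workload, stands in for the code being investigated
procedure CodeUnderTest;
var
   i : Integer;
   sum : Int64;
begin
   sum := 0;
   for i := 1 to 10000000 do
      sum := sum + (i and 7);
   // Touch the result so the compiler does not flag it as unused
   if sum < 0 then
      Writeln(sum);
end;

var
   freq, start, stop : Int64;
begin
   // Ticks per second of the high-resolution counter
   if not QueryPerformanceFrequency(freq) then
      RaiseLastOSError;

   QueryPerformanceCounter(start);
   CodeUnderTest;
   QueryPerformanceCounter(stop);

   // Convert elapsed ticks to milliseconds
   Writeln(Format('Elapsed: %.3f ms', [(stop - start) * 1000 / freq]));
end.
```

Run it several times before and after a change: a single measurement is easily skewed by whatever else the machine is doing.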


begin…end as bottlenecks?

March 25th, 2009

There will come a time when SamplingProfiler reports that begin or end are your bottlenecks. This may sound a little surprising, but it’s actually quite a common occurrence, and something that instrumenting profilers are not going to point out, so it is worth a little explanation.

This can be illustrated with the minimalistic example of an array property getter. Witness the innocuous-looking code below:

function TMyList.GetItem(index : Integer) : T;
begin
   if (index < 0) or (index >= Count) then
      Error(index);
   Result := FItems[index];
end;

Nothing out of the ordinary there: you can find similar-looking code in practically every array-based collection in the RTL and in many third-party libraries. But someday, that GetItem will be a bottleneck, and you could be left looking at profiling results like these:

[Screenshot: begin-end-critical-01]

Yes, those are the begin and end lines taking up more than 70% of the CPU time spent inside GetItem.
You knew it! Sampling profilers are unreliable… or are they? Surely the index range checking must be the culprit? Or the assignment and the reference counting business? Well, they could be, but in this case they aren’t.

To understand why, let’s have a look at the CPU view. Place a breakpoint on your begin, run up to there and hit Ctrl+Alt+C. Here is what you could see:

[Screenshot: begin-end-critical-02]

That’s a whole lot of traffic to the stack: 3 registers saved, 3 copies. Those things aren’t free, they can dwarf what your explicit code does, and in this example, they do. We didn’t even have any local variables; if we did, they would have required setup and teardown code, and that code would have been “hidden” in begin and end too.

This illustrates a difference between sampling and instrumenting profilers: a sampling profiler can pinpoint an actual bottleneck even if it is “outside” of your explicit code, so you don’t waste time trying to optimize what isn’t critical.

Now what can you do to improve things locally? With generics, an interface type and Delphi 2009 SP2, nothing much, short of going BASM. The bottleneck code is compiler-generated, so optimizing the assignment or the range checking would only provide minimal benefits. If you want to go faster, you’ll have to reduce the number of calls to GetItem, i.e. open that “Show Callers” pane, have a look there, and solve the issue at the higher-level routines that are involved.
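To make "reduce the number of calls" a bit more concrete, here is a small sketch of what the fix can look like on the caller side. TIntList is a hypothetical, non-generic stand-in for the list above (so it won't show the same begin/end cost by itself), and the two SumOfCubes functions are invented for illustration: the only point is that the second one goes through GetItem once per iteration instead of three times.

```pascal
program FewerGetItemCalls;

{$APPTYPE CONSOLE}

uses
   SysUtils;

type
   // Hypothetical stand-in for the list from the post (non-generic for brevity)
   TIntList = class
   private
      FItems : array of Integer;
      function GetCount : Integer;
      function GetItem(index : Integer) : Integer;
   public
      constructor Create(n : Integer);
      property Count : Integer read GetCount;
      property Items[index : Integer] : Integer read GetItem; default;
   end;

constructor TIntList.Create(n : Integer);
var
   i : Integer;
begin
   inherited Create;
   SetLength(FItems, n);
   for i := 0 to n-1 do
      FItems[i] := i;
end;

function TIntList.GetCount : Integer;
begin
   Result := Length(FItems);
end;

function TIntList.GetItem(index : Integer) : Integer;
begin
   if (index < 0) or (index >= Count) then
      raise ERangeError.CreateFmt('Index out of bounds (%d)', [index]);
   Result := FItems[index];
end;

// Three GetItem calls per iteration
function SumOfCubesNaive(list : TIntList) : Int64;
var
   i : Integer;
begin
   Result := 0;
   for i := 0 to list.Count-1 do
      Result := Result + Int64(list[i]) * list[i] * list[i];
end;

// One GetItem call per iteration: the value is read once into a local
function SumOfCubesCached(list : TIntList) : Int64;
var
   i, v : Integer;
begin
   Result := 0;
   for i := 0 to list.Count-1 do begin
      v := list[i];
      Result := Result + Int64(v) * v * v;
   end;
end;

var
   list : TIntList;
begin
   list := TIntList.Create(100);
   try
      // Both variants compute the same value, only the call count differs
      Writeln(SumOfCubesNaive(list));
      Writeln(SumOfCubesCached(list));
   finally
      list.Free;
   end;
end.
```

Whether such a change pays off is something only a new profiling run can tell.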

But there are other situations in which you can influence the auto-generated begin/end code. The solutions then typically revolve around distributing the code across smaller local functions or methods, tweaking your variable usage, separating branches, or if all else fails, going BASM… but that is food for future posts!
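As a rough sketch of the "smaller functions" idea, consider a rarely-taken error branch that needs a string. The example assumes that a local of a managed type (here a String) makes the compiler add hidden initialization and finalization code to begin/end, which is then paid on every call; all routine names are hypothetical. Moving the branch to its own procedure, much like the Error(index) call in GetItem above, keeps that machinery out of the hot routine.

```pascal
program LeanHotPath;

{$APPTYPE CONSOLE}

uses
   SysUtils;

// Before: the String local is a managed type.
// Assumption: this adds hidden setup/teardown to begin/end on every call,
// even though the error branch is rarely taken.
function GetCheckedInline(const data : array of Integer; index : Integer) : Integer;
var
   msg : String;
begin
   if (index < 0) or (index > High(data)) then begin
      msg := Format('Index out of bounds (%d)', [index]);
      raise ERangeError.Create(msg);
   end;
   Result := data[index];
end;

// After: the string handling lives in its own routine...
procedure RaiseOutOfBounds(index : Integer);
begin
   raise ERangeError.CreateFmt('Index out of bounds (%d)', [index]);
end;

// ...and the hot routine has no managed locals left
function GetCheckedSplit(const data : array of Integer; index : Integer) : Integer;
begin
   if (index < 0) or (index > High(data)) then
      RaiseOutOfBounds(index);
   Result := data[index];
end;

var
   data : array of Integer;
begin
   SetLength(data, 3);
   data[0] := 10;
   data[1] := 20;
   data[2] := 30;
   // Both variants behave the same, only the generated begin/end may differ
   Writeln(GetCheckedInline(data, 1));
   Writeln(GetCheckedSplit(data, 1));
end.
```

The CPU view is the place to check what the compiler actually generated for each variant, and a new profiling run is the place to check whether it matters.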


SamplingProfiler 1.6.0 out of the woods

March 20th, 2009

Version 1.6.0 of the Delphi sampling profiler is now available from its downloads page!

[Screenshot: cpu-usage-options]

The main addition is the ability to have sampling conditioned by CPU usage, i.e. only gather profiling information when the CPU usage is high, either for the system or the process.
This was added with three goals in mind:

  • eliminate all that happens when the CPU isn’t busy from the profiling results, making it easier to focus on the CPU bottlenecks that matter.
  • gather profiling information only when the system is under stress, and find out if your code copes well with system stress… or is a poor OS citizen and just adds to the trouble.
  • identify sources of high CPU usage in your code, that could be reducing battery life when running on a mobile platform.

Note that CPU-usage based sampling can have the side-effect of eliminating I/O and other waits from the profiling results, so if your application’s bottlenecks aren’t CPU-based, you could miss them.

Other changes: support for the “Pause” key to pause profiling, a time limit for sample collection that now starts from the first time sampling is enabled (rather than from application start), and support for multi-selection when opening results.

This is also the first SamplingProfiler version compiled with Delphi 2009; no oddities are known at this point, but some are to be expected. For all bug reports & suggestions, head to the forums.


Saving results & merging

March 9th, 2009

SamplingProfiler run results can be saved to .spr files (Sampling Profiler Results) and later reused for comparison purposes, or for merging, one of the less obvious features of the profiler.
You can merge results by right-clicking on a results tab and selecting… “Merge results”, oddly enough. After this, the samples will be aggregated across the runs you selected, hopefully providing more statistical accuracy.

This can be particularly useful when analysing the results from multiple runs, collected from multiple users in the field via SamplingProfiler’s silent mode for instance. It can also be useful if you collect profiling information from automated test tools, each stressing the same library or base code in different ways.

Merging results isn’t so much about getting high numerical precision on your bottlenecks. Sure, you can use it for numerical accuracy, but who cares if a routine takes 95% or 92.24638% of the CPU time? Identifying the bottleneck is usually all that matters.

Merging is about figuring out the bottlenecks that matter in everyday use, bottlenecks which may not come up in your routine tests, or may not be seen as critical when seen in isolation. It can be about getting information on that odd, hard-to-reproduce, slowdown your users may be experiencing from time to time. It can also be about identifying the minor bottlenecks that could be the cause of a “sluggish” feel to your UI.

A last word on the SPR files: those are persistence streams in SamplingProfiler’s native format. They’re binary, highly compact, and for you, the user, highly proprietary and blackboxy. If you want to do your own analysis on the profiling results, there is an alternative: you can save results as an XML file, which will include all the data in a verbose fashion. Be warned however that a deceptively small SPR can result in a huge XML file.


Control sampling from your code

March 2nd, 2009

One issue when trying to profile a “live” application is that you may be getting a lot of noise, resulting from a particular library or section of code being executed from multiple contexts. You may also be after profiling only one particular case, and want some reproducibility between runs… in short: you want finer-grained control over when or for what the profiling will take place.

In those cases, you can control SamplingProfiler’s samples collection from your code with the following:

OutputDebugString('SAMPLING ON');
// ...whatever needs to be profiled...
OutputDebugString('SAMPLING OFF');

Those calls to OutputDebugString() are understood as commands to turn sampling ON or OFF. Usually you’ll want to use this in conjunction with the “Start sampling on command only” option, but it can also be used in reverse to “pause” sample collection. OutputDebugString() is declared in the Windows.pas unit.
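If the profiled section can raise an exception, one way to make sure sampling still gets turned off is a try/finally around it. The sketch below is just that pattern applied to the two commands above; CodeToProfile is a hypothetical placeholder for your own code.

```pascal
program SamplingSection;

{$APPTYPE CONSOLE}

uses
   Windows;

// Hypothetical placeholder for the code you want sampled
procedure CodeToProfile;
var
   i : Integer;
   sum : Int64;
begin
   sum := 0;
   for i := 1 to 50000000 do
      sum := sum + (i and 3);
   if sum < 0 then
      Writeln(sum);
end;

begin
   OutputDebugString('SAMPLING ON');
   try
      CodeToProfile;
   finally
      // Sent even if CodeToProfile raises an exception
      OutputDebugString('SAMPLING OFF');
   end;
end.
```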

As of version 1.5.2, another command that is accepted via OutputDebugString() is ‘SAMPLING THREAD threadID’, which is used to define from which threadID samples must be collected. This is useful when you want to profile a particular thread in a multi-threaded application… but that’s another can o’worms for another day!
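A minimal sketch of building that command for the calling thread could look like this. It assumes the threadID is expected as a plain decimal number, which the text above does not spell out, and SampleCurrentThreadOnly is a hypothetical helper name.

```pascal
program SamplingThread;

{$APPTYPE CONSOLE}

uses
   Windows, SysUtils;

// Hypothetical helper: restricts sample collection to the calling thread
procedure SampleCurrentThreadOnly;
var
   command : String;
begin
   // Assumption: the thread ID is passed in decimal notation
   command := 'SAMPLING THREAD ' + IntToStr(Int64(GetCurrentThreadId));
   OutputDebugString(PChar(command));
end;

begin
   SampleCurrentThreadOnly;
   // ...code running in this thread that needs to be profiled...
end.
```

In a real application the helper would typically be called at the start of the thread's own code, so that GetCurrentThreadId returns the ID of the thread of interest.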


Using SamplingProfiler from the IDE

February 27th, 2009

SamplingProfiler comes as a stand-alone application, but it’s also ready for integration in the IDE via the Tools menu. Go to the Tools menu configuration and add an entry for SamplingProfiler. Set the parameters field to $EXENAME. Voilà!

From now on, when working on a project, you can compile and then hit the SamplingProfiler entry in the Tools menu: it will open on your current project executable. If you saved a profiling project (.spp file) alongside your executable, it will be loaded automatically too.
