- DelphiTools - https://www.delphitools.info -

String Lookup: Hash, Sorted or Unsorted ?

StringLookupShowdownWhen looking up a string, what is the fastest strategy?
A hash map, a sorted list or an unsorted list?

Of course it depends on how many strings you have, but where are the cutoff points?
Here is a quick test, and an interesting tidbit is uncovered…

Without further ado, here is the result chart.

Time is the vertical axis (lower is better), horizontally is the number of items, note that it is logarithmic.

StringLookupShowdown [1]

Those are results for random, 12 characters strings, but the curves where almost unchanged if the strings were “less random” (with 6 identical characters at the start f.i.)

“TDictionary” and “Hash FNV1a” are both hash-maps, but “Hash FNV1a” uses a fast hash, while “TDictionary” uses the slower Bob-Jenkins Hash (which is also more prone to collisions [2], but these did not affect that particular test)

Observations

Tidbit of note

Sorted & Unsorted lists are both using TStringList, with just an overloaded CompareStrings method hardcoded to a case-insensitive comparison, and the difference is just whether Sorted is set to true or not… so why the difference even for the case where you have 1 or 2 elements?

In both cases IndexOf() was called, but IndexOf merely reroutes the execution to TStrings.IndexOf orĀ TStringList.Find depending on the value of Sorted, and that makes all the difference…

When there are a couple elements, an unsorted search in TStringList is actually spent doing a fair share of Automatic Reference Counting tasks (there is also a bit of range checking, but it’s negligible in comparison).