Delphi offers two ways of enumerating files in a directory and its sub-directories: the first is the classic (and buggy) FindFirst/FindNext, the second is IOUtils TDirectory.GetFiles, which is not very efficient.
Here is why and how I implemented DWScript’s dwsXPlatform.CollectFiles, and a tip about getting a small system-wide boost as a bonus.
8dot3 file naming
The old 8dot3 naming convention dating back to the DOS ancestry of Windows has been obsolete for a while, but it’s still likely to cost you time… or trouble.
It affects both Delphi methods negatively, because the underlying Windows API function they use (FindFirstFile) is obsolete as well, and obsolete in two ways:
- it spends time returning both the regular (long) file name and the 8dot3 name (which can mean extra lookups in the file system), even though the 8dot3 name is not used by Delphi.
- it doesn’t filter extensions appropriately (for compatibility with 8dot3 names)
In the case of FindFirst, it means that if you search for ‘*.dpr’, you’ll get .dproj files as well.
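To make the issue concrete, here is a minimal sketch of a Delphi-side workaround: ask FindFirst for ‘*.dpr’, then re-check the extension of each returned name. ListExactDpr is a hypothetical helper written for illustration only. Whether the extra .dproj entries show up at all is assumed to depend on the volume having 8dot3 names enabled.

```pascal
program ListDprFiles;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: enumerates what FindFirst reports for '*.dpr' in ADir
// and flags the entries whose extension is not exactly '.dpr'.
procedure ListExactDpr(const ADir: string);
var
  sr: TSearchRec;
begin
  if FindFirst(IncludeTrailingPathDelimiter(ADir) + '*.dpr', faAnyFile, sr) = 0 then
  try
    repeat
      // Ignore directories, only files are of interest here.
      if (sr.Attr and faDirectory) = 0 then
      begin
        if SameText(ExtractFileExt(sr.Name), '.dpr') then
          Writeln('match:   ', sr.Name)
        else
          // Typically a .dproj file returned by the API for the '*.dpr' pattern.
          Writeln('skipped: ', sr.Name);
      end;
    until FindNext(sr) <> 0;
  finally
    FindClose(sr);
  end;
end;

begin
  ListExactDpr(GetCurrentDir);
end.
```

Comparing the extension with SameText is enough for a simple ‘*.ext’ pattern. More general masks need a real matcher.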
TDirectory.GetFiles solves the filtering by doing it Delphi-side with TMask from the Masks unit. TMask uses a quite efficiently implemented state machine, but IOUtils invokes it through the MatchesMask function, which creates and destroys a TMask every single time…
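The cost of rebuilding the mask for every file name can be avoided by creating the TMask once and reusing it for all names. The sketch below assumes TMask is the class from the Masks unit, with a constructor taking the mask string and a Matches method. FilterNames and the sample file names are made up for the example, and this is not how IOUtils itself is written.

```pascal
program ReuseMask;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, Masks;

// Hypothetical helper: copies into AMatches the names from ANames that
// match AMask, building the mask once instead of once per name.
procedure FilterNames(ANames, AMatches: TStrings; const AMask: string);
var
  mask: TMask;
  i: Integer;
begin
  mask := TMask.Create(AMask);
  try
    for i := 0 to ANames.Count - 1 do
      if mask.Matches(ANames[i]) then
        AMatches.Add(ANames[i]);
  finally
    mask.Free;
  end;
end;

var
  names, matches: TStringList;
begin
  names := TStringList.Create;
  matches := TStringList.Create;
  try
    // Sample names standing in for the results of a directory scan.
    names.Add('Project1.dpr');
    names.Add('Project1.dproj');
    names.Add('Unit1.pas');
    FilterNames(names, matches, '*.dpr');
    Write(matches.Text);
  finally
    matches.Free;
    names.Free;
  end;
end.
```

Since the mask has to match the whole name, ‘Project1.dproj’ should be rejected here, which is the filtering FindFirst does not provide on its own.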
The internal logic of IOUtils is also quite complex and heavy-weight (with anonymous procedures, implicit exception frames, implicit conversions and generally redundant code), and the GetFiles implementation doesn’t scale well as it relies on a dynamic array as return value (FastMM mitigates the issue, but not entirely).
So in practice, if you’ve got a fast SSD or if everything is in the Windows file system memory cache, IOUtils will be the bottleneck, not the file system.
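As an illustration of the alternative to returning a dynamic array, here is a rough sketch that appends results to a caller-supplied TStrings, using only FindFirst/FindNext and a plain recursive procedure. CollectByExt is a hypothetical name and this is not the dwsXPlatform.CollectFiles code. It also assumes there are no directory junctions or symbolic links forming a loop under the start directory.

```pascal
program CollectIntoList;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: appends to AList the full path of every file under
// ADir (sub-directories included) whose extension equals AExt.
procedure CollectByExt(const ADir, AExt: string; AList: TStrings);
var
  sr: TSearchRec;
  base: string;
begin
  base := IncludeTrailingPathDelimiter(ADir);
  if FindFirst(base + '*', faAnyFile, sr) <> 0 then
    Exit;
  try
    repeat
      if (sr.Attr and faDirectory) <> 0 then
      begin
        // Skip the '.' and '..' pseudo-entries, recurse into the rest.
        if (sr.Name <> '.') and (sr.Name <> '..') then
          CollectByExt(base + sr.Name, AExt, AList);
      end
      else if SameText(ExtractFileExt(sr.Name), AExt) then
        AList.Add(base + sr.Name);
    until FindNext(sr) <> 0;
  finally
    FindClose(sr);
  end;
end;

var
  files: TStringList;
begin
  files := TStringList.Create;
  try
    CollectByExt(GetCurrentDir, '.pas', files);
    Writeln(files.Count, ' file(s) found');
  finally
    files.Free;
  end;
end.
```

Passing the list in lets the caller reuse it across calls and leaves capacity management to the list, and the extension is checked Delphi-side, so the loose filtering of the API doesn’t matter.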