This version generates code that is up to twice as fast as OpenBLAS for matrices that fit in the CPU cache (usually 100×100 to 200×200), and maintains a marginal lead for larger sizes, though those are bound by memory bandwidth. The performance profile is similar on both AMD and Intel CPUs.
A test version of SamplingProfiler 64bit is available here (3.2 MB).
It has only been tested with 64bit binaries compiled by Delphi 10.3 with detailed map files. It should work with other Delphi versions (TD32 and other debug information formats have not been tested yet).
There are other known issues with stack traces from DLLs, so it is rough around the edges but should be functional.
Just created a new repository with a “LibCBLAS” unit, meant for using the OpenBLAS library in its Windows 64bit incarnation from Delphi 10.3+.
Support for floating point modulus has been added to DWScript: it extends the “mod” operator to accept floats.
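As a rough illustration of what a floating point modulus computes, here is a small Python sketch (not DWScript). The excerpt does not say which sign convention DWScript uses for negative operands, so both common ones are shown. The helper names `truncated_mod` and `floored_mod` are made up for this example.

```python
import math

def truncated_mod(a: float, b: float) -> float:
    """Remainder that takes the sign of the dividend (C fmod convention)."""
    return math.fmod(a, b)

def floored_mod(a: float, b: float) -> float:
    """Remainder that takes the sign of the divisor (Python % convention)."""
    return a % b

# Both conventions agree when the operands are positive.
print(truncated_mod(7.5, 2.0))   # 1.5
print(floored_mod(7.5, 2.0))     # 1.5

# They differ when the dividend is negative.
print(truncated_mod(-7.5, 2.0))  # -1.5
print(floored_mod(-7.5, 2.0))    # 0.5
```

Which of the two results a float “mod” returns for negative values is worth checking against the DWScript sources or tests before relying on it.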
Here is a summary of recent DWScript changes; the major one is a change in operator precedence to something similar to Delphi and FreePascal.
Other changes are related to sets and bug fixes.