Introducing TRefCountedObject

The DWScript source code recently gained a newcomer: TRefCountedObject.

This base class takes the place of TObject in dwsUtils, and is now used throughout the DWScript code. What it adds is, well, manual reference counting.

If you’re not interested in the DWScript internals, you just need to know that it made possible a 5% to 25% reduction in the memory usage of compiled scripts (depending on what a script does), along with a minor speedup in script execution, at the cost of a minor compile slowdown (which should disappear once I improve the current hash table approach). It also helped catch a handful of memory management issues that had previously gone undetected, even by FastMM (as they had no side effects).

Using TRefCountedObject

The class itself is a simple affair:

TRefCountedObject = class
   private
      function GetRefCount : Integer; inline;
      procedure SetRefCount(n : Integer); inline;
   public
      function IncRefCount : Integer; inline;
      function DecRefCount : Integer;
      property RefCount : Integer read GetRefCount write SetRefCount;
      procedure Free;
end;

You’ll notice, however, that it introduces its own Free method, which is implemented as:

procedure TRefCountedObject.Free;
begin
   if Self<>nil then
      DecRefCount;
end;

So it doesn’t directly release the object; it decreases the reference count.

In practice, when you introduce a new reference to an existing instance, you invoke IncRefCount, and when you remove such a reference, you invoke DecRefCount or Free. That’s about it. It’s manual, so it’s not foolproof, but it is simple, and the overhead is much lower than with interfaces. Unlike with interfaces, you also still benefit from faster calling conventions (for non-virtual methods) and direct read/write access for simple properties.

Note that this is a generalization of a mechanism that was previously present only for constant unification (TUnifiedConstExpr), whose purpose was to allow the same instance to be referenced in multiple places. TUnifiedConstExpr used a very low-level hack to enforce that and wasn’t as flexible. TRefCountedObject is a much higher-level hack, but it’s still a hack: to behave correctly, a TRefCountedObject must never be freed through a TObject reference, or TObject.Free will be invoked instead and the reference count will be bypassed.
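The usage pattern described here can be sketched as follows. This is a minimal, self-contained illustration and not the dwsUtils code: TSketchRefCounted, TSketchLeaf and vDestroyed are hypothetical names, the counter lives in an ordinary field, and the convention that a count of 0 means "one remaining owner" is an assumption made for the sketch.

```pascal
program RefCountSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
   // Hypothetical stand-in for TRefCountedObject, using a plain field
   TSketchRefCounted = class
      private
         FRefCount : Integer;   // assumption: 0 means "one owner left"
      public
         function IncRefCount : Integer;
         function DecRefCount : Integer;
         procedure Free;        // hides the non-virtual TObject.Free
         property RefCount : Integer read FRefCount;
   end;

   // Hypothetical leaf node that can be shared by several parents
   TSketchLeaf = class (TSketchRefCounted)
      public
         Value : Integer;
         destructor Destroy; override;
   end;

var
   vDestroyed : Integer = 0;   // counts destructor calls for the demo

function TSketchRefCounted.IncRefCount : Integer;
begin
   Inc(FRefCount);
   Result:=FRefCount;
end;

function TSketchRefCounted.DecRefCount : Integer;
begin
   if FRefCount=0 then begin
      // last owner is letting go: actually release the instance
      Destroy;
      Result:=0;
   end else begin
      Dec(FRefCount);
      Result:=FRefCount;
   end;
end;

procedure TSketchRefCounted.Free;
begin
   if Self<>nil then
      DecRefCount;
end;

destructor TSketchLeaf.Destroy;
begin
   Inc(vDestroyed);
   inherited;
end;

var
   leaf, alias : TSketchLeaf;
begin
   leaf:=TSketchLeaf.Create;
   leaf.Value:=42;

   alias:=leaf;          // a second reference to the same instance...
   alias.IncRefCount;    // ...must be declared manually

   leaf.Free;            // only drops one reference, instance stays alive
   WriteLn('destroyed after first Free: ', vDestroyed);   // 0
   WriteLn('value still readable: ', alias.Value);        // 42

   alias.Free;           // last reference gone, destructor runs
   WriteLn('destroyed after second Free: ', vDestroyed);  // 1
end.
```

Because Free is resolved statically on the declared type, the same two calls made through variables typed as TObject would invoke TObject.Free and destroy the instance on the first call.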

Recovery of the hidden TMonitor field

TRefCountedObject also has another trick up its sleeve: it uses the hidden TMonitor field to store the reference counter. From D2009 up to XE2, TMonitor is still buggy and isn’t really suitable for real-world use, so not being able to use TMonitor on a TRefCountedObject is no big loss, and it recovers some bytes that were previously wasted in every class instance.

TInterfacedSelfObject was also re-based on TRefCountedObject, and benefits from the same “recovery” of the TMonitor field bytes.
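As a rough illustration of how a counter can live in the hidden monitor slot, here is a sketch that assumes Delphi 2009 or later, where the System unit exposes the hfFieldSize and hfMonitorOffset constants describing the hidden field at the end of each instance. It is not copied from dwsUtils: TSketchSlotCounted, CounterPtr and vObserved are hypothetical names, and the sketch is not expected to compile with Free Pascal.

```pascal
program MonitorSlotSketch;

{$APPTYPE CONSOLE}

type
   // Hypothetical class keeping its counter in the hidden monitor slot
   TSketchSlotCounted = class
      private
         function CounterPtr : PInteger;
      public
         function IncRefCount : Integer;
         function DecRefCount : Integer;
         function GetRefCount : Integer;
   end;

var
   vObserved : Integer;   // count observed by the demo below

function TSketchSlotCounted.CounterPtr : PInteger;
begin
   // The hidden field sits in the last hfFieldSize bytes of the instance;
   // InitInstance zero-fills it, so the counter starts at 0.
   Result:=PInteger(NativeInt(Self)+InstanceSize-hfFieldSize+hfMonitorOffset);
end;

function TSketchSlotCounted.GetRefCount : Integer;
begin
   Result:=CounterPtr^;
end;

function TSketchSlotCounted.IncRefCount : Integer;
var
   p : PInteger;
begin
   p:=CounterPtr;
   Inc(p^);
   Result:=p^;
end;

function TSketchSlotCounted.DecRefCount : Integer;
var
   p : PInteger;
begin
   p:=CounterPtr;
   if p^=0 then begin
      // The slot is zero again here, so instance cleanup will not
      // mistake the counter for a live monitor pointer.
      Destroy;
      Result:=0;
   end else begin
      Dec(p^);
      Result:=p^;
   end;
end;

var
   obj : TSketchSlotCounted;
begin
   obj:=TSketchSlotCounted.Create;
   obj.IncRefCount;                 // one extra reference stored in the slot
   vObserved:=obj.GetRefCount;
   WriteLn('count stored in monitor slot: ', vObserved);
   obj.DecRefCount;                 // back to the single-owner state
   obj.DecRefCount;                 // releases the instance
end.
```

The trade-off is the one described above: an instance that stores a counter in this slot can no longer be used with TMonitor, since both would be writing to the same bytes.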

With the introduction of TRefCountedObject, a whole subset of TVarExpr is now also getting unified, as the constants were, which in practice is what yields the above-mentioned reduction in compiled-script memory usage.

4 thoughts on “Introducing TRefCountedObject”

  1. Manual handling of reference counting is not an easy task.
    About the memory model, most of today’s programmers are used to either a garbage-collector paradigm (C#/Java/scripts) or full manual allocation/release (Delphi/ObjectiveC).
    Sounds to me like the opposite of the standard reference counting in interfaces (and e.g. the Apple’s ARC model): by default, it will increment the reference count, and you’ll have to explicitly define a “weak assignment” when you do not want the reference count to be increased. I’m a bit confused here…

    So is it correct to guess that the purpose of this modification is to by-pass the DWS internal garbage collector, and provide deterministic memory allocation/release?
    Does make sense on server side, but such an implementation sounds very weird to me.
    Could you provide some “real-world” sample code and use cases?

  2. A. Bouchez :

    So is it correct to guess that the purpose of this modification is to by-pass the DWS internal garbage collector, and provide deterministic memory allocation/release?

    The DWS internal collector for script execution is a hybrid: objects are already reference-counted, and a GC takes care of releasing the cycles.

    This new class is aimed at the DWS engine itself, the Abstract Syntax Tree and the compiler in particular. The real-world issue came from there: some leaf nodes in the AST are similar (constant & variable references), but to be able to actually share the instances in Delphi, they would either have to be under a separate memory management scheme or use interfaces, as the branches can be, and are, manipulated in various ways (by the optimizers f.i.).
    Since the AST needs to be created fast, browsed fast and released fast, interfaces weren’t looking very good: they have an overhead, and accessing interface methods or properties has an overhead too.

    The TRefCountedObject solves that by providing the convenience of reference counting for memory management, while staying at the (fast) object level. Manual reference counting is also trivial to implement, since it only happens when reusing a common leaf.

    Script execution benefits from TRefCountedObject through better CPU cache efficiency: less memory for the AST of compiled scripts and lower memory requirements for allocated objects.

    A. Bouchez :

    By the way, did you envisage creating a JVM backend for DWS?

    I envisaged it, alongside .Net IL, LLVM and CUDA, though for speeding up script executions, Java didn’t make sense. It would have made sense as an app compiler (like for SmartMS), though that would have fallen squarely against Oxygene & FPC efforts, so the added value would have been quite limited for a not-so-negligible effort.

    In theory, the DWScript AST could get back-end compilers to just about anything. In practice, the question is whether or not it’s worth the effort.
    The compiler is probably the least time-intensive aspect: it’s thought and complexity intensive, but not time intensive, while a full-blown environment (libraries, IDE, docs, etc.) is just going to require many man hours to be competitive.

  3. @Eric
    Thanks for the detailed answer.
    It was not clear to me that this class was used within the compiler, and not on script side.

    It now makes perfect sense!
    Nice work! 🙂
