Unicode Leftover Bug From Hell

Or in other words, before getting to the gory details, DWScript now works when compiled with {$HIGHCHARUNICODE ON} on a machine with Cyrillic code-page 1251.

DWScript was converted to Unicode years ago, and has been working just fine.

But there was a leftover bug from that crossing of the Styx.

(more…)

Crouching Smileys, Hidden Diacritics

As noted in a recent post, Unicode is not so straightforward. In particular, claims that utf-16 is simpler than utf-8, or that you do not have to care about Unicode complexities, do not hold up.

Maybe that was the case ten years ago, but the Unicode jungle is much closer to home these days.

Here are a few dangers lurking in the not-so-dark shadows.
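
As a rough illustration of the two dangers named in the title, here is a small sketch in Python (chosen only for brevity, it is not DWScript or Delphi code). The helper name `utf16_units` and the sample strings are assumptions made up for this example. It shows that a smiley needs two utf-16 code units, and that the same accented letter can be stored either precomposed or as a base letter plus a combining diacritic.

```python
import unicodedata

def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store s as utf-16 (hypothetical helper)."""
    return len(s.encode("utf-16-le")) // 2

smiley = "\U0001F600"      # one code point outside the Basic Multilingual Plane
composed = "\u00e9"        # "é" as a single precomposed code point
decomposed = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT

# One code point, but a surrogate pair (two code units) in utf-16
# and four bytes in utf-8.
print(len(smiley), utf16_units(smiley), len(smiley.encode("utf-8")))

# Both strings render as "é", yet they differ code point by code point.
print(composed == decomposed)

# Normalizing to NFC folds the combining sequence into the precomposed form.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Code that indexes a utf-16 string by code unit, or compares strings without normalizing them first, can trip over either case.
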
(more…)

UTF-8, UTF-16 or both? (poll)

The FreePascal version of DWScript has been stalled for a little while on the incomplete UnicodeString (utf-16) support among other things.

It’s hard to blame the FreePascal team for that, given that Linux is primarily utf-8, and that utf-8 has quite a few advantages over utf-16.

(more…)

Zero-based Strings indexes?

In a now infamous and enormous thread I won’t name, Allen Bauer dropped a bomb:

<bomb>Oh, and strings may become immutable and 0-based …</bomb>

(more…)

SynEdit DWS Highlighter, Unicode Identifiers, Type refactorings

Here is a summary of recent changes for DWScript SVN-side:

A syntax highlighter for SynEdit (Unicode version, available in SynEdit’s SVN) is now available. It introduces support for DWScript-specific features (keywords, string delimiters…) and is essentially a fork of the Pascal syntax highlighter.
Unicode identifiers are now supported, though as always with Unicode identifiers, it may not be wise to use them too much if you want to keep scripts maintainable. They make it possible to access exposed classes or libraries that themselves use Unicode identifiers.
The type system refactoring is in progress. Most of the changes are purely internal; a handful of corner-case bugs were spotted and fixed in the process.
The compiler will now hint about variables that are never written to (when the coSymbolDictionary option is set).
Inc/Dec/Succ/Pred are now magic functions, and will return the correct type when invoked on an enumeration.

Optimistic Unicode case-insensitive CompareText

In Unicode Delphi (Delphi 2009 and later), there are two ways of making case-insensitive string comparisons: CompareText, which is only case-insensitive in the ASCII range (unaccented characters), and the judiciously misnamed AnsiCompareText, which works on the whole Unicode range by calling into the Windows API.
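
To make the difference between the two behaviours concrete, here is a minimal sketch in Python rather than Delphi. The function names `same_text_ascii` and `same_text_unicode` are hypothetical, and the Unicode variant uses Python's `str.casefold`, which only approximates what a locale-aware Windows API comparison does.

```python
def ascii_fold(s: str) -> str:
    """Lower-case only the letters A..Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)

def same_text_ascii(a: str, b: str) -> bool:
    # Case-insensitive in the ASCII range only, in the spirit of CompareText.
    return ascii_fold(a) == ascii_fold(b)

def same_text_unicode(a: str, b: str) -> bool:
    # Case-insensitive over the whole Unicode range (an approximation,
    # not the actual AnsiCompareText algorithm).
    return a.casefold() == b.casefold()

print(same_text_ascii("Hello", "HELLO"))      # True: only ASCII letters involved
print(same_text_ascii("ÉTÉ", "été"))          # False: accented letters are not folded
print(same_text_unicode("ÉTÉ", "été"))        # True: full Unicode case folding
```

The ASCII-only variant is a simple per-character operation, which is why it is tempting to try it first and fall back to the full comparison only when non-ASCII characters are present.
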

(more…)

DelphiTools

DWS, Profiler and other Pascal tools

Tag Archives: Unicode
