Title: Unicode tip #3 - UTF-16 Number of Printable Characters
Author: Bob Swart
Posted: 11/24/2008 12:14:10 PM (GMT+1)
Content:
The number of elements in a string can be retrieved by calling the Length function. Using yesterday's example, Length(Clef) = 4, meaning there are 4 WideChar elements (UTF-16 code units) in the Clef constant. However, this count includes the two square brackets as well as the two surrogate characters, which together represent only one printable character.
To get the number of printable characters in a UTF-16 string, we need to check for surrogates and count each surrogate pair as one printable character (instead of two). This is implemented in the following UTF16Length function, which counts every non-surrogate element once and every surrogate pair once, at its second (low) surrogate:
function UTF16Length(const S: String): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i:=1 to Length(S) do
    if not IsSurrogate(S[i]) then
      Result := Result + 1
    else // Surrogate
      if (i > 1) and IsSurrogatePair(S[i-1],S[i]) then
        Result := Result + 1
end;
Calling UTF16Length on Clef returns 3: the opening and closing brackets are counted as well as the surrogate pair (being equal to a single printable character).
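As a usage sketch, the console program below reproduces both counts. Two things in it are assumptions on my part rather than taken from the text above: the definition of Clef (square brackets around U+1D11E, MUSICAL SYMBOL G CLEF, written as the surrogate pair $D834 $DD1E), since the constant itself comes from the previous tip, and the uses clause, which assumes IsSurrogate and IsSurrogatePair are the routines from the Delphi 2009 Character unit. The program name ClefDemo is hypothetical.

```pascal
program ClefDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  Character; // assumed source of IsSurrogate and IsSurrogatePair (Delphi 2009)

const
  // Assumed definition: '[' + U+1D11E as a surrogate pair + ']'
  Clef = '[' + #$D834#$DD1E + ']';

function UTF16Length(const S: String): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if not IsSurrogate(S[i]) then
      Result := Result + 1 // ordinary element: one printable character
    else // Surrogate
      // Count a pair once, when we reach its second (low) surrogate
      if (i > 1) and IsSurrogatePair(S[i-1], S[i]) then
        Result := Result + 1
end;

begin
  WriteLn(Length(Clef));      // number of WideChar elements: 4
  WriteLn(UTF16Length(Clef)); // number of printable characters: 3
end.
```

Note that a surrogate without a matching partner is not counted at all by this function, so malformed UTF-16 input gives a lower count than Length.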
This tip is the third in a series of Unicode tips taken from my Delphi 2009 Development Essentials book published today on Lulu.com.