The TextFile of Delphi 2009 can only write AnsiStrings and not Unicode Strings, which means we can only write ANSI data to text files, right?
Wrong!
Since UTF8Strings are also (special) AnsiStrings, we can still write Unicode data to text files, provided we convert the Unicode String to a UTF8String (with no data loss) before writing to the file.
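To make the "no data loss" claim concrete, here is a minimal sketch that converts a string to a UTF8String and back. It assumes Delphi 2009 or later, where string is a UTF-16 UnicodeString and UTF8ToString is available in the System unit; the program and variable names are made up for illustration. The G clef (U+1D11E) takes two UTF-16 code units and four UTF-8 bytes, so the two lengths should differ while the round trip compares equal.

```pascal
program Utf8RoundTrip;
{$APPTYPE CONSOLE}
var
  S, Back: string;   // string = UnicodeString (UTF-16) in Delphi 2009
  U: UTF8String;     // AnsiString with code page 65001 (UTF-8)
begin
  S := #$D834#$DD1E;        // U+1D11E (musical G clef) as a surrogate pair
  U := UTF8String(S);       // explicit cast converts UTF-16 to UTF-8
  Back := UTF8ToString(U);  // and back to UTF-16
  Writeln('UTF-16 code units: ', Length(S));
  Writeln('UTF-8 bytes:       ', Length(U));
  Writeln('Round trip equal:  ', Back = S);
end.
```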
Note that we should also write the UTF-8 BOM to the output, so that the text file is recognized as UTF-8 when we read it back afterwards:
program UnicodeTextFile;
uses
  Windows, SysUtils;
const
  // '[' + U+1D11E (G clef) as a UTF-16 surrogate pair + ']'
  Clef = #$5B + #$D834 + #$DD1E + #$5D;
var
  F: Text;
  B: Byte;
begin
  Assign(F, 'output.txt');
  Rewrite(F);
  // write the UTF-8 BOM (preamble) byte by byte
  for B in TEncoding.UTF8.GetPreamble do
    Write(F, AnsiChar(B));
  // convert the UTF-16 string to UTF-8 before writing
  Writeln(F, UTF8String('[' + Clef + ']'));
  Close(F);
end.
Since UTF8String is an AnsiString, we can combine the code above with writeln of normal strings, which will be converted to AnsiStrings, as long as we keep away from high-ASCII characters (above #127). These end up in the file as ANSI code page bytes above $7F, which a UTF-8 reader would take for part of a UTF-8 multi-byte sequence:
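program UnicodeTextFile;
uses
  Windows, SysUtils;
const
  // '[' + U+1D11E (G clef) as a UTF-16 surrogate pair + ']'
  Clef = #$5B + #$D834 + #$DD1E + #$5D;
var
  F: Text;
  B: Byte;
begin
  Assign(F, 'output.txt');
  Rewrite(F);
  // write the UTF-8 BOM (preamble) byte by byte
  for B in TEncoding.UTF8.GetPreamble do
    Write(F, AnsiChar(B));
  // explicitly converted to UTF-8
  Writeln(F, UTF8String('[' + Clef + ']'));
  // implicitly converted to AnsiString: safe only for characters up to #127
  Writeln(F, 'This is a UTF-16 String which will be written as AnsiString');
  Close(F);
end.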
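To see why high-ASCII characters are the problem, the following sketch dumps the bytes that the two conversions produce for a single accented character. DumpBytes is a hypothetical helper written for this illustration, not part of the RTL. The UTF-8 encoding of U+00E9 is the two bytes C3 A9; the ANSI result depends on the system code page (on Windows-1252 it would be the single byte E9, which is not a valid UTF-8 sequence on its own).

```pascal
program HighAsciiBytes;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: prints every byte of Data in hexadecimal.
// RawByteString accepts any AnsiString type without code page conversion.
procedure DumpBytes(const Caption: string; const Data: RawByteString);
var
  I: Integer;
begin
  Write(Caption, ':');
  for I := 1 to Length(Data) do
    Write(' ', IntToHex(Ord(Data[I]), 2));
  Writeln;
end;

var
  S: string;
begin
  S := #$00E9;                         // U+00E9, e with acute accent
  DumpBytes('ANSI ', AnsiString(S));   // bytes in the system ANSI code page
  DumpBytes('UTF-8', UTF8String(S));   // bytes in UTF-8
end.
```

If the two lines differ on your machine, a plain writeln of that string into the BOM-prefixed file would produce bytes that do not match the encoding the BOM announces.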
As long as we convert UTF-16 Unicode Strings to UTF-8 before writing them to text files, and don’t forget to write the UTF-8 BOM as a prefix, this will work fine for producing files with Unicode UTF-8 output.
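For reading the file back, one possible approach (a sketch, not the only way) is to load it into a TStringList. This assumes Delphi 2009, where TStrings.LoadFromFile inspects the BOM to choose the encoding, and it assumes output.txt was produced by one of the programs above.

```pascal
program ReadUnicodeTextFile;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Without an explicit encoding, the BOM decides how the bytes are decoded
    Lines.LoadFromFile('output.txt');
    Writeln('Lines read: ', Lines.Count);
    if Lines.Count > 0 then
      Writeln('UTF-16 code units in first line: ', Length(Lines[0]));
  finally
    Lines.Free;
  end;
end.
```

Without the BOM, the same call would fall back to the default ANSI encoding, which is why the preamble matters.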
This tip is the 7th in a series of Unicode tips taken from my Delphi 2009 Development Essentials book published earlier this week on Lulu.com.