This example shows how to create and load XML files that should support widestrings.
Since version 3.20, all internal strings in NativeXml are UTF-8 encoded. The string type is Utf8String.
When it is time to save your document, choose the appropriate encoding for the external file or stream (SaveToStream also saves with the chosen encoding). Set the property "ExternalEncoding" to one of these values:
- seAnsi: This will result in a flat ANSI-encoded file whose characters are encoded using only the codepage defined in ExternalCodepage.
- seUTF16LE: This will result in a little-endian Unicode file with byte order mark $FF $FE. This should be used as the default for Unicode files.
- seUTF16BE: This will result in a big-endian Unicode file with byte order mark $FE $FF.
- seUTF8: This will result in a UTF-8 encoded file (the byte order mark is left out). UTF-8 is more storage friendly when the majority of characters are western/Latin. However, it uses more space for languages like Chinese or Japanese.
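The byte order marks listed above can be checked by hand when inspecting a saved file. The following is a minimal sketch and not part of NativeXml: DetectBom is a hypothetical helper written for illustration, and the UTF-8 mark $EF $BB $BF is included only for completeness, since seUTF8 output leaves it out.

```pascal
program BomSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper (not a NativeXml function): returns the encoding
// indicated by a leading byte order mark, or 'none' if no BOM is present.
// UTF-32 marks are not handled in this sketch.
function DetectBom(const Bytes: array of Byte): string;
begin
  Result := 'none';
  if (Length(Bytes) >= 3) and (Bytes[0] = $EF) and (Bytes[1] = $BB)
    and (Bytes[2] = $BF) then
    Result := 'UTF-8'
  else if (Length(Bytes) >= 2) and (Bytes[0] = $FF) and (Bytes[1] = $FE) then
    Result := 'UTF-16LE'
  else if (Length(Bytes) >= 2) and (Bytes[0] = $FE) and (Bytes[1] = $FF) then
    Result := 'UTF-16BE';
end;

begin
  // '<' stored as UTF-16LE after the little-endian mark
  WriteLn(DetectBom([$FF, $FE, $3C, $00]));  // prints UTF-16LE
  // '<' stored as UTF-16BE after the big-endian mark
  WriteLn(DetectBom([$FE, $FF, $00, $3C]));  // prints UTF-16BE
  // '<?x' with no mark, as in a file saved with seUTF8
  WriteLn(DetectBom([$3C, $3F, $78]));       // prints none
end.
```

In practice the bytes would come from the first few bytes of the saved file or stream. A result of 'none' does not by itself mean the data is ANSI, because UTF-8 output without a mark looks the same to this check.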
Here's an example of how to set the external encoding:
procedure CreateXML;
var
  ADoc: TNativeXml;
begin
  ADoc := TNativeXml.CreateName('Root');
  try
    // ..add all your creation code here
    // Save to unicode
    ADoc.ExternalEncoding := seUTF16LE;
    ADoc.EncodingString := 'UTF-16';
    ADoc.SaveToFile('c:\temp\test.xml');
  finally
    ADoc.Free;
  end;
end;
Adding widestrings to the document is easy. Each node's value can now be set to a widestring using property ValueWide, and widestrings can in general be added using the FromWidestring function.
Here's example code that adds a new node to the root, then sets the name of the node to AName and the value to AValue:
procedure AddNode(ADoc: TNativeXml; AName, AValue: widestring);
begin
  // Create a child of the root named AName and assign AValue as its value
  with ADoc.Root do
    with NodeNew(FromWidestring(AName)) do
      ValueWide := AValue;
end;
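Loading a document from a file works like this:
function CreateXMLAndLoadFromFile(AFilename: string): TNativeXml;
begin
  // The caller owns the returned document and must free it
  Result := TNativeXml.Create;
  Result.LoadFromFile(AFilename);
end;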
When reading from a stream (e.g. from a TCP connection), the stream does not always contain the byte order mark (BOM). However, if the stream is unicode, NativeXml will recognise it as such without any help, because the declaration also defines the encoding. Example: