
XML files now valid and UTF-8


The XML files which caused a problem today, and all those generated in
future (i.e. day 6 tomorrow), will now have only the standard XML entities
encoded (e.g. <, >, & and "). Everything else (such as &nbsp;) is now
stored as UTF-8. This means that today's five.xml now contains a 2-byte
UTF-8 character (U+00A0, bytes 0xC2 0xA0) instead of &nbsp;.
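
As a rough illustration of the encoding described above (not the actual
generator code, which isn't shown here), the following Python sketch
escapes only the standard XML characters and leaves a non-breaking space
as a literal character, which then takes two bytes in UTF-8. The helper
name encode_text and the sample string are made up for the example.

```python
from xml.sax.saxutils import escape

NBSP = "\u00a0"  # the character that &nbsp; stands for


def encode_text(text: str) -> str:
    # escape() handles &, < and > by default; the extra mapping adds the
    # double quote. Anything else, including NBSP, is left untouched.
    return escape(text, {'"': "&quot;"})


# Hypothetical sample text, not taken from five.xml.
sample = "Tom & Jerry" + NBSP + '<"live">'
encoded = encode_text(sample)

# Serialising as UTF-8 turns the single NBSP character into two bytes.
utf8_bytes = encoded.encode("utf-8")
nbsp_bytes = NBSP.encode("utf-8")
```

Here nbsp_bytes is the two-byte sequence 0xC2 0xA0, which is what a hex
dump of the file would show where &nbsp; used to be.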

On the user-interface side of things Perl is automagically translating
these to ISO-8859-1 for me, but others may need to check that their code
reads the files as UTF-8 and converts them where necessary.
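
For anyone who has to do that translation by hand, here is a minimal
Python sketch of one way to go from UTF-8 to ISO-8859-1. It is not what
Perl does internally; the function name utf8_to_latin1 is invented, and
falling back to numeric character references for anything outside
ISO-8859-1 is an assumption that only makes sense if the output ends up
in HTML or XML.

```python
def utf8_to_latin1(raw: bytes) -> bytes:
    # Decode the bytes read from the XML file as UTF-8 ...
    text = raw.decode("utf-8")
    # ... and re-encode as ISO-8859-1. Characters with no ISO-8859-1
    # equivalent become numeric character references (assumed acceptable
    # because the result is destined for markup).
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# The two-byte UTF-8 non-breaking space becomes the single byte 0xA0.
converted = utf8_to_latin1(b"Day\xc2\xa05")
```

If the output is plain text rather than markup, a different error
handler (such as "replace") would probably be more appropriate.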

It does mean, however, that the pre-parsing steps previously needed for
MSXML etc. are no longer necessary.
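
A quick way to see the difference, using Python's standard parser as a
stand-in for MSXML (an assumption; MSXML itself isn't tested here): a
document containing &nbsp; with no DTD declaring it is rejected, whereas
the same text with the character stored as UTF-8 parses directly. The
<title> element and its content are made up for the example.

```python
import xml.etree.ElementTree as ET


def try_parse(raw: bytes):
    """Return the root element's text, or None if parsing fails."""
    try:
        return ET.fromstring(raw).text
    except ET.ParseError:
        return None


# Old style: &nbsp; is not one of XML's predefined entities, so without
# a DTD the parser reports an undefined entity.
old_style = b'<?xml version="1.0" encoding="UTF-8"?><title>Day&nbsp;5</title>'

# New style: the non-breaking space is stored as the UTF-8 bytes C2 A0.
new_style = b'<?xml version="1.0" encoding="UTF-8"?><title>Day\xc2\xa05</title>'

old_result = try_parse(old_style)
new_result = try_parse(new_style)
```

With the old-style input try_parse gives None; with the new-style input
it gives the text with a real U+00A0 character in it.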



Andrew Flegg -- mailto:andrew@xxxxxxxx  |  http://www.bleb.org/