A matter of character.

Today at work I had to deal with a very mysterious bug. A bug was filed for incorrect characters appearing in our application when running on a Czech OS.
The issue manifested itself in the Czech and Polish versions of our Delphi application. While some of the extended characters displayed correctly, other characters such as ‘ř’ appeared as an ‘o’.
I examined the strings in question and they did have the proper hexadecimal ASCII representation. What was wrong? Other applications worked just fine.
The strings in question were being displayed on Raize panels. Were the Raize panels to blame?
Due to the lack of debugging facilities on the target machines, I followed the classical approach of creating a skeleton application and tried to reproduce the issue.
As I tested my skeleton application I became more and more puzzled. On my development box (English XP Professional) the strings appeared correctly – granted I had to change the Charset on the panels to EASTERN_EUROPEAN.
However, when I ran the same code on my Czech test environment (one machine running Czech XP Professional and the other Czech Windows 2003 Server) the characters were not displayed correctly. That’s right! The characters were different than the ones displayed by the same application on my development computer.
I decided to review the unit that contained the strings. The strings were declared as constants and added to an array. The system language identifier was the index to the array. So far so good.
Then I noticed that the arrays were declared as widestrings. At that point I started to suspect that this was the root cause of the issue. The arrays had recently been changed from strings to widestrings in order to accommodate Simplified Chinese. And the Delphi application in question was brand new. Humm…
Next I figured out that the characters that were being displayed correctly on the target machine had an ASCII code that matched their Unicode counterpart. The ones that were not being displayed properly didn’t. I changed the hexadecimal values of the latter characters to their Unicode code points and voilà! All the characters started to appear as expected in both environments.
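The match/mismatch pattern described above can be checked with a small script. This is only a sketch, written in Python rather than Delphi for brevity, and it assumes the strings were originally written with Windows-1250 (Central European) byte values. The sample letters are chosen for illustration and are not taken from the application.

```python
# Assumption: the original constants used Windows-1250 byte values.
# The sample letters below are illustrative only.
SAMPLE = "áéíýřčšžěů"


def split_by_match(chars, codepage="cp1250"):
    """Split chars into those whose code page byte equals their
    Unicode code point and those where the two values differ."""
    same, different = [], []
    for ch in chars:
        byte_value = ch.encode(codepage)[0]
        if byte_value == ord(ch):
            same.append(ch)
        else:
            different.append(ch)
    return same, different


same, different = split_by_match(SAMPLE)
print("same value in cp1250 and Unicode:", "".join(same))
print("different values:", "".join(different))
for ch in different:
    print(f"  {ch}: cp1250 0x{ch.encode('cp1250')[0]:02X}, "
          f"Unicode U+{ord(ch):04X}")
```

Under that assumption, letters such as ‘á’ or ‘é’ carry the same number in both encodings, which would be why they survived the switch to widestrings, while ‘ř’ is 0xF8 in the code page but U+0159 in Unicode.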
I learned a couple of things.
First, when it comes to converting strings to widestrings, you must use each character’s Unicode code point instead of its ASCII (code page) value.
Second, test your applications on the target OS. Using different code pages on your own OS is no guarantee that your application will display the language correctly on a localized one.
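To make the first lesson concrete, here is a hedged sketch of the difference between reusing a code page value as if it were a Unicode code point and actually converting it. It is in Python, not Delphi, and again assumes Windows-1250; the helper name `codepage_values_to_unicode` is made up for this example.

```python
def codepage_values_to_unicode(values, codepage="cp1250"):
    """Treat each value as a byte in the given code page and
    return the corresponding Unicode text."""
    return bytes(values).decode(codepage)


# 0xF8 is the Windows-1250 byte for the letter r with caron.
legacy_values = [0xF8]

# Reusing the byte value directly as a code point yields U+00F8.
reused = "".join(chr(v) for v in legacy_values)

# Decoding through the code page yields U+0159, the intended letter.
converted = codepage_values_to_unicode(legacy_values)

print(f"reused as code point: U+{ord(reused):04X}")
print(f"decoded via cp1250:   U+{ord(converted):04X}")
```

The first result is a different character from the one intended, which is the kind of mix-up the fix above removed by putting the real Unicode values into the widestring constants.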
I still need to understand why the localized widestrings worked on my development computer and did not work on the target language OS. Any ideas?
