Changing the file encoding has done something terrible.

13 years ago • updated by 13 years ago • 4

Apologies for the somewhat non-specific title.

All of my files should be in UTF-8. For source code it doesn't really matter (valid symbols are in the 7-bit ASCII range and the primary base language is English), but while editing a gettext .po symbols file to update some of the translations I noticed, too late, that the selected encoding was ISO-8859-1†. Before I changed it, all of the previously UTF-8 text looked fine, or at least I didn't notice any problems. After changing the encoding back to UTF-8, saving the file (which looked OK in the editor), and recompiling my .po file, the text in the webapp was completely garbled. Closing and re-opening the file, I get text like:

"les donnÃ©es Ã une analyse"

The webapp presented this even worse when I tried manually setting the browser encoding to ISO 8859-1; double conversion is ugly business. Luckily I version control everything, so I could roll the change back, but how Textastic handles file encoding needs a close look. So far I haven't been able to reproduce the problem, but I will try to create a basic test case for you after work.
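For anyone following along, here is a minimal Python sketch of the kind of double conversion described above. It is only an illustration of the general failure mode, not a claim about what Textastic does internally. The `original` string is an assumption reconstructed from the garbled sample, and the variable names are made up for the example.

```python
# Assumed original text; the post only shows the garbled form.
original = "les données à une analyse"

# Step 1: the UTF-8 bytes are misread as ISO 8859-1, so every multi-byte
# character turns into two Latin-1 characters ("é" becomes "Ã©").
garbled = original.encode("utf-8").decode("iso-8859-1")

# Step 2: the misread text is saved as UTF-8, so the damage is now
# baked into the bytes on disk (double-encoded).
double_encoded = garbled.encode("utf-8")

# Recovery: reverse the mistaken step. This only works if every
# character survived the round trip unchanged.
recovered = (
    double_encoded.decode("utf-8")   # back to the garbled text
    .encode("iso-8859-1")            # back to the original UTF-8 bytes
    .decode("utf-8")                 # decoded correctly this time
)

print(garbled)
print(recovered)
```

If an editor or tool has replaced or dropped any of the intermediate characters (the no-break space produced by "à" is an easy one to lose), the recovery step can fail or give wrong results, so rolling back from version control remains the safer fix.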

† Incorrectly called "ISO Latin 1" in your menus; it's "Latin 1" or "ISO 8859-1"… only Microsoft calls it by a mixed name as ISO standards are all numbered. ;) More correct would be "Western (ISO 8859-1, Latin 1)", same for all of the other ISO 8859 variants, but that's a separate issue.

Replies 4

13 years ago

Things get weirder, actually, as I try to recover from this problem.

I close the file, revert the commit, and re-open the file. The encoding is still wrong, and the symbols are still garbled. When I open the file in another editor, the encoding is correctly guessed and the symbols are fine. See screenshot, Textastic on the right: