Encoding switch does not work

Denis Vasilkovsky 12 years ago in iPad updated by Freemen Muaddib 8 months ago 11

It seems that Textastic does not switch the encoding, either from the current file menu or from the wrench menu.


I have some files in Windows CP-1251 encoding. Switching between the different encoding options has no visible effect. Also, after applying the changes, the encoding remains UTF-8.


Switching between fonts doesn't have any effect either.


What am I doing wrong?

Tap and hold the file and choose "Open with Encoding". Then you can select the encoding with which you want to open the file.


If you change the encoding in the Document Properties popover, you select the encoding that is used when saving the file.
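The difference between the two settings can be sketched in a few lines of Python. This is only an illustration of the general idea, not Textastic's code: the function names and the sample text are hypothetical. The encoding chosen at open time decides how the bytes on disk become text, and the encoding chosen in the properties decides how the text becomes bytes again.

```python
# Hypothetical file contents: Russian text stored as Windows CP-1251 bytes.
raw = "Привет".encode("cp1251")

def open_with_encoding(data: bytes, encoding: str) -> str:
    """Decode the bytes on disk using the encoding picked when opening."""
    return data.decode(encoding)

def save_with_encoding(text: str, encoding: str) -> bytes:
    """Encode the in-memory text using the encoding picked for saving."""
    return text.encode(encoding)

text = open_with_encoding(raw, "cp1251")

# Saving with the same encoding reproduces the original bytes exactly.
print(save_with_encoding(text, "cp1251") == raw)

# Saving as UTF-8 keeps the text but produces different, longer bytes.
print(len(raw), len(save_with_encoding(text, "utf-8")))
```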

Oh, I get it. I must use "Open with Encoding" when first opening the file; otherwise the encoding would be lost during autosave.


Thank you.

Changing the encoding for saving the document does not seem to work. I opened an ISO Latin document as UTF-8 by mistake, and it was converted to UTF-8. After saving to my server, the encoding was broken (UTF-8 instead of the original ISO Latin), so I changed it to ISO Latin in the document properties and saved again, but it is still UTF-8, and in the document properties ISO Latin has been reset to UTF-8 :-(

If you open it once with the wrong encoding, it will be saved with this encoding if you make changes to it.


If you can't change it to ISO-Latin 1, it means that the file now contains characters that can't be expressed in the selected encoding.


You could try to tap-and-hold the file in the Files section and select "Open with Encoding" and choose ISO-Latin 1. But, I doubt that this helps once it is saved with the wrong encoding, sorry.

I think this use case should be handled in a way close to what the Eclipse editor does: if the file contains characters that cannot be encoded with the specified encoding, the editor should refuse to save and offer an option to go to the first incompatible character, so that the user can change it.
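To make the request concrete, here is a minimal Python sketch of locating the first character that does not fit the chosen encoding. It is not how Eclipse or Textastic implement this; the function name `first_unencodable` and the sample text are made up for illustration.

```python
from typing import Optional, Tuple

def first_unencodable(text: str, encoding: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, character) of the first character that cannot
    be encoded, with 1-based positions, or None if everything fits."""
    try:
        text.encode(encoding)
        return None
    except UnicodeEncodeError as err:
        offset = err.start  # index of the first offending character
    line = text.count("\n", 0, offset) + 1
    line_start = text.rfind("\n", 0, offset) + 1
    column = offset - line_start + 1
    return line, column, text[offset]

sample = "prix: 10\ncafé 5 €\n"
print(first_unencodable(sample, "iso-8859-1"))
```

An editor could use the returned position to refuse the save and move the cursor to the character in question.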


What will happen if I open another file specifying ISO Latin, then add an incompatible character and try to save? Will it automatically switch to another encoding? Will it refuse to save? Or will it remove the offending characters?


And if I remove the offending characters from the first file, will the editor let me switch back to ISO Latin encoding?


Fantastic editor anyway, and thanks for being so fast to answer questions!

Currently, it will switch to UTF-8 when you enter a character that can't be encoded in the originally selected encoding. It will not switch back. I agree that this behavior should be improved.
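The behavior described here can be modeled roughly as follows. This is an assumption about the observable behavior only, written as a small Python sketch; `encode_for_save` is a hypothetical name and says nothing about how Textastic is actually implemented.

```python
from typing import Tuple

def encode_for_save(text: str, preferred: str) -> Tuple[bytes, str]:
    """Try the preferred encoding first; if any character does not fit,
    fall back to UTF-8. Returns the bytes and the encoding actually used."""
    try:
        return text.encode(preferred), preferred
    except UnicodeEncodeError:
        return text.encode("utf-8"), "utf-8"

# Text that fits ISO-8859-1 keeps its encoding.
print(encode_for_save("café", "iso-8859-1")[1])

# One euro sign is enough to switch the whole file to UTF-8.
print(encode_for_save("café 5 €", "iso-8859-1")[1])
```

The silent fallback is the part the thread objects to: the caller only learns about the switch by inspecting the second return value.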

Any idea when it will be improved? Until then, it makes the application totally unusable, since there is a risk of file corruption. Even if this corruption can be reversed easily, doing so means using another editor that allows switching back the character set.

 

There is only a risk of file corruption when you open a file with the wrong encoding and edit it. This is no different from other text editors and there's no easy fix for this. You can use the "Open with Encoding" option I mentioned above to open a file with a different encoding than the automatically selected one. Textastic will remember this selection for each file. No need to use another editor.


But once you have saved it with the wrong encoding, you probably can't go back (depending on the characters and the encodings involved).


Since UTF-8 can represent every character of ISO-8859-1, converting should actually be no problem in your case and the file should not be corrupt. You probably added a character that is not in the ISO-8859-1 encoding.


There's one notable character that can't be expressed in ISO-8859-1 and is often found on websites: the euro sign €. Maybe your file contains this character?

It seems to be in contradiction with what you said earlier, which would be good news!

Here is the exact use case:

I opened a file (through SSH) that was encoded in ISO Latin. I then added some code, including the Euro character. Then I saved the file (without paying attention to the encoding at that time). When I tested the file on the server, all accented characters were broken, and I realized the file had been saved in UTF-8, although I assumed it had been opened as ISO Latin. This is what makes the editor unusable in our case. It is not acceptable for the editor to change the encoding without warning the user. As I said, the Eclipse editor, in such a case, would display a dialog box saying it can't save the file because of special characters, as many (most?) other editors would do. Eclipse has one significant advantage over some editors: it offers an option to go to the first occurrence of such a character.
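The breakage in this scenario can be reproduced with a short Python sketch. The sample text is made up; the point is only what happens when bytes written as UTF-8 are read back as ISO-8859-1, which is assumed here to be what the server does.

```python
# Hypothetical page content after the euro sign was added.
original = "Café à 5 €"

# The editor silently saved the file as UTF-8 ...
saved = original.encode("utf-8")

# ... but the server still interprets the bytes as ISO-8859-1.
served = saved.decode("iso-8859-1")

# Every accented character now shows up as two wrong characters,
# and the euro sign as three.
print(repr(served))
```

Every non-ASCII character is affected, not only the newly added euro sign, which matches the report that all accented characters broke at once.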

Besides that, I had thought of an intuitive workaround to correct the problem: reopening the file, removing the erroneous characters and saving it with the old encoding. From what you said earlier, this seemed impossible, but your last message seems to imply the contrary. If possible, this would be an acceptable workaround until a better fix is available, although it has the major drawback that the server serves a broken file while the workaround is being applied. But if it is not possible, it means opening the file with another editor that can do this automatically. (The file contains hundreds of accented characters that cannot be changed manually in a tablet editor!)

 

Sorry, I misread your original comment. I thought you asked if it would *automatically* switch back to the previous encoding when you remove the character. It does not do that. But if you remove the character, you can manually switch back to the previous encoding using the File Properties popover.
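The same repair can also be done outside the editor. The Python sketch below assumes the file was saved as UTF-8 and should go back to ISO-8859-1; the helper names and the "EUR" replacement are hypothetical choices for illustration only.

```python
from typing import Dict, Set

def unencodable_chars(text: str, encoding: str) -> Set[str]:
    """Collect every distinct character that the target encoding lacks."""
    found = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            found.add(ch)
    return found

def restore_encoding(utf8_bytes: bytes, target: str,
                     replacements: Dict[str, str]) -> bytes:
    """Decode a file that was saved as UTF-8, substitute the characters
    that do not fit, and re-encode it in the target encoding.
    Raises UnicodeEncodeError if something unencodable is still left."""
    text = utf8_bytes.decode("utf-8")
    for ch, substitute in replacements.items():
        text = text.replace(ch, substitute)
    return text.encode(target)

data = "café 5 €".encode("utf-8")
print(unencodable_chars(data.decode("utf-8"), "iso-8859-1"))
print(restore_encoding(data, "iso-8859-1", {"€": "EUR"}))
```

Because the accented characters themselves exist in ISO-8859-1, only the added characters need attention; the hundreds of existing accents are converted back automatically.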

Two suggestions:

- If an encoding other than UTF-8 is detected when opening a file, Textastic should show the detected encoding (for example, Chinese GB18030) and ask whether the user wants to use a different encoding to decode the text (clarifying that Textastic works in UTF-8 internally) or to preserve the original encoding when loading and saving this file (always converting internally to and from UTF-8). Textastic should be able to detect and display the characters of any encoding. This does not happen now: Chinese text encoded in GB18030 is not detected and is shown as garbage.

- If you switch the encoding from the Document Properties the app should ask if we want to encode the current visible document in the selected encoding (so that it will be saved as it is shown but in the new encoding) or if we want to reload the original file using the selected encoding (with a warning that doing that will discard any unsaved changes to the file).
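The two choices in the second suggestion are different operations, which a small Python sketch may make clearer. The function names and the sample text are hypothetical, and this is not a description of Textastic's internals.

```python
def convert_to(visible_text: str, new_encoding: str) -> bytes:
    """Keep the text as displayed and change the bytes written on save."""
    return visible_text.encode(new_encoding)

def reload_as(raw_bytes: bytes, new_encoding: str) -> str:
    """Keep the bytes on disk and change how they are turned into text.
    Any unsaved edits to the previous text would be discarded."""
    return raw_bytes.decode(new_encoding)

# Hypothetical file: Chinese text stored as GB18030.
raw = "中文".encode("gb18030")

# Opened with the wrong encoding, the text is shown as garbage.
garbage = reload_as(raw, "iso-8859-1")
print(repr(garbage))

# Reloading with the right encoding recovers the text.
print(reload_as(raw, "gb18030"))

# Converting the correctly displayed text produces a UTF-8 file.
print(convert_to("中文", "utf-8"))
```

Converting the garbage text instead of reloading would bake the wrong interpretation into the file, which is why the suggestion asks the app to let the user pick between the two.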