How do I convert an ANSI encoded file to UTF-8 with Notepad++?

Utf 8Notepad++

Utf 8 Problem Overview


I have a website, and I can send my Turkish characters with jQuery in Firefox, but Internet Explorer doesn't send my Turkish characters. I looked at my source file in notepad, and this file's code page is ANSI.

When I convert it to UTF-8 without BOM and close the file, the file is again ANSI when I reopen.

How can I convert my file from ANSI to UTF-8?

Utf 8 Solutions


Solution 1 - Utf 8

Regarding this part:

> When I convert it to UTF-8 without bom and close file, the file is again ANSI when I reopen.

The easiest solution is to avoid the problem entirely by properly configuring Notepad++.

Try Settings -> Preferences -> New document -> Encoding -> choose UTF-8 without BOM, and check Apply to opened ANSI files.

notepad++ UTF-8 apply to opened ANSI files

That way all the opened ANSI files will be treated as UTF-8 without BOM.

For explanation what's going on, read the comments below this answer.

To fully learn about Unicode and UTF-8, read this excellent article from Joel Spolsky.

Solution 2 - Utf 8

Maybe this is not the answer you needed, but I encountered similar problem, so I decided to put it here.

I needed to convert 500 xml files to UTF8 via Notepad++. Why Notepad++? When I used the option "Encode in UTF8" (many other converters use the same logic) it messed up all special characters, so I had to use "Convert to UTF8" explicitly.


Here some simple steps to convert multiple files via Notepad++ without messing up with special characters (for ex. diacritical marks).

  1. Run Notepad++ and then open menu Plugins->Plugin Manager->Show Plugin Manager
  2. Install Python Script. When plugin is installed, restart the application.
  3. Choose menu Plugins->Python Script->New script.
  4. Choose its name, and then past the following code:

convertToUTF8.py

import os
import sys
from Npp import notepad # import it first!

filePathSrc="C:\\Users\\" # Path to the folder with files to convert
for root, dirs, files in os.walk(filePathSrc):
    for fn in files: 
        if fn[-4:] == '.xml': # Specify type of the files
            notepad.open(root + "\\" + fn)      
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            # notepad.save()
            # if you try to save/replace the file, an annoying confirmation window would popup.
            notepad.saveAs("{}{}".format(fn[:-4], '_utf8.xml')) 
            notepad.close()

After all, run the script

Solution 3 - Utf 8

If you don't have non-ASCII characters (codepoints 128 and above) in your file, UTF-8 without BOM is the same as ASCII, byte for byte - so Notepad++ will guess wrong.

What you need to do is to specify the character encoding when serving the AJAX response - e.g. with PHP, you'd do this:

header('Content-Type: application/json; charset=utf-8');

The important part is to specify the charset with every JS response - else IE will fall back to user's system default encoding, which is wrong most of the time.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionKerem BekmanView Question on Stackoverflow
Solution 1 - Utf 8jakub.gView Answer on Stackoverflow
Solution 2 - Utf 8Jun MurakamiView Answer on Stackoverflow
Solution 3 - Utf 8Piskvor left the buildingView Answer on Stackoverflow