Save text file UTF-8 encoded with VBA

VbaUtf 8

Vba Problem Overview


how can I write UTF-8 encoded strings to a textfile from vba, like

Dim fnum As Integer
fnum = FreeFile
Open "myfile.txt" For Output As fnum
Print #fnum, "special characters: äöüß" 'latin-1 or something by default
Close fnum

Is there some setting on Application level?

Vba Solutions


Solution 1 - Vba

I found the answer on the web:

Dim fsT As Object
Set fsT = CreateObject("ADODB.Stream")
fsT.Type = 2 'Specify stream type - we want To save text/string data.
fsT.Charset = "utf-8" 'Specify charset For the source text data.
fsT.Open 'Open the stream And write binary data To the object
fsT.WriteText "special characters: äöüß"
fsT.SaveToFile sFileName, 2 'Save binary data To disk

Certainly not as I expected...

Solution 2 - Vba

You can use CreateTextFile or OpenTextFile method, both have an attribute "unicode" useful for encoding settings.

object.CreateTextFile(filename[, overwrite[, unicode]])        
object.OpenTextFile(filename[, iomode[, create[, format]]])

Example: Overwrite:

CreateTextFile:
 fileName = "filename"
 Set fso = CreateObject("Scripting.FileSystemObject")
 Set out = fso.CreateTextFile(fileName, True, True)
 out.WriteLine ("Hello world!")
 ...
 out.close

Example: Append:

 OpenTextFile Set fso = CreateObject("Scripting.FileSystemObject")
 Set out = fso.OpenTextFile("filename", ForAppending, True, 1)
 out.Write "Hello world!"
 ...
 out.Close

See more on MSDN docs

Solution 3 - Vba

This writes a Byte Order Mark at the start of the file, which is unnecessary in a UTF-8 file and some applications (in my case, SAP) don't like it. Solution here: https://stackoverflow.com/q/4143524/385695

Solution 4 - Vba

Here is another way to do this - using the API function WideCharToMultiByte:

Option Explicit

Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
  ByVal CodePage As Long, _
  ByVal dwFlags As Long, _
  ByVal lpWideCharStr As Long, _
  ByVal cchWideChar As Long, _
  ByVal lpMultiByteStr As Long, _
  ByVal cbMultiByte As Long, _
  ByVal lpDefaultChar As Long, _
  ByVal lpUsedDefaultChar As Long) As Long

Private Sub getUtf8(ByRef s As String, ByRef b() As Byte)
Const CP_UTF8 As Long = 65001
Dim len_s As Long
Dim ptr_s As Long
Dim size As Long
  Erase b
  len_s = Len(s)
  If len_s = 0 Then _
    Err.Raise 30030, , "Len(WideChars) = 0"
  ptr_s = StrPtr(s)
  size = WideCharToMultiByte(CP_UTF8, 0, ptr_s, len_s, 0, 0, 0, 0)
  If size = 0 Then _
    Err.Raise 30030, , "WideCharToMultiByte() = 0"
  ReDim b(0 To size - 1)
  If WideCharToMultiByte(CP_UTF8, 0, ptr_s, len_s, VarPtr(b(0)), size, 0, 0) = 0 Then _
    Err.Raise 30030, , "WideCharToMultiByte(" & Format$(size) & ") = 0"
End Sub

Public Sub writeUtf()
Dim file As Integer
Dim s As String
Dim b() As Byte
  s = "äöüßµ@€|~{}[]²³\ .." & _
    " OMEGA" & ChrW$(937) & ", SIGMA" & ChrW$(931) & _
    ", alpha" & ChrW$(945) & ", beta" & ChrW$(946) & ", pi" & ChrW$(960) & vbCrLf
  file = FreeFile
  Open "C:\Temp\TestUtf8.txt" For Binary Access Write Lock Read Write As #file
  getUtf8 s, b
  Put #file, , b
  Close #file
End Sub

Solution 5 - Vba

I looked into the answer from Máťa whose name hints at encoding qualifications and experience. The VBA docs say CreateTextFile(filename, [overwrite [, unicode]]) creates a file "as a Unicode or ASCII file. The value is True if the file is created as a Unicode file; False if it's created as an ASCII file. If omitted, an ASCII file is assumed." It's fine that a file stores unicode characters, but in what encoding? Unencoded unicode can't be represented in a file.

The VBA doc page for OpenTextFile(filename[, iomode[, create[, format]]]) offers a third option for the format:

  • TriStateDefault 2 "opens the file using the system default."
  • TriStateTrue 1 "opens the file as Unicode."
  • TriStateFalse 0 "opens the file as ASCII."

Máťa passes -1 for this argument.

Judging from [VB.NET documentation] 3 (not VBA but I think reflects realities about how underlying Windows OS represents unicode strings and echoes up into MS Office, I don't know) the system default is an encoding using 1 byte/unicode character using an ANSI code page for the locale. UnicodeEncoding is UTF-16. The docs also describe UTF-8 is also a "Unicode encoding," which makes sense to me. But I don't yet know how to specify UTF-8 for VBA output nor be confident that the data I write to disk with the OpenTextFile(,,,1) is UTF-16 encoded. Tamalek's post is helpful.

Solution 6 - Vba

The traditional way to transform a string to a UTF-8 string is as follows:

StrConv("hello world",vbFromUnicode)

So put simply:

Dim fnum As Integer
fnum = FreeFile
Open "myfile.txt" For Output As fnum
Print #fnum, StrConv("special characters: äöüß", vbFromUnicode)
Close fnum

No special COM objects required

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionKarsten W.View Question on Stackoverflow
Solution 1 - VbaKarsten W.View Answer on Stackoverflow
Solution 2 - VbaMáťa - Stitod.czView Answer on Stackoverflow
Solution 3 - VbaPhilHibbsView Answer on Stackoverflow
Solution 4 - VbaFaloView Answer on Stackoverflow
Solution 5 - VbaBennett BrownView Answer on Stackoverflow
Solution 6 - VbaSancarnView Answer on Stackoverflow