导入带有UTF-8特殊字符的txt文件到xlsx

Import txt files with UTF-8 special characters to xlsx

我有从另一个系统自动导出给我的 txt 文件(我无法更改此系统)。当我尝试使用以下代码将这些 txt 文件转换为 excel 时(我手动创建了一个子文件夹 xlsx):

Sub all()

   Dim sourcepath As String
   Dim sDir As String
   Dim newpath As String
    
    sourcepath = "C:\Users\PC\Desktop\Test\"
    newpath = sourcepath & "xlsx\"
    
    'make sure subfolder xlsx was created before

    sDir = Dir$(sourcepath & "*.txt", vbNormal)
    Do Until Len(sDir) = 0
        Workbooks.Open (sourcepath & sDir)
        With ActiveWorkbook
            .SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
            .Close
        End With
        
        sDir = Dir$
    Loop
End Sub

它确实有效,但是某些特殊字符,如 ä、ö 和 Ü 等,无法正确显示。 IE。当我稍后打开 xlsx 文件时,我可以看到这些文件已被诸如 ä 之类的内容所取代。我可以使用变通方法,然后开始替换它们,但是我想改进我的 txt 到 xlsx 代码。根据这个post or this one,应该可以使用ADODB.Stream。但是,我不知道如何将其实现到我的代码(循环)中以使其在我的情况下在这里工作?如果有另一种方法而不是 ADOB.Stream 我也很好。我没有必要使用 ADOB.Stream.

您是否尝试过使用 Origin 参数强制转换 代码页 ?我不知道您是否需要特定的,但 UTF-8 常量可能是一个起点。我个人喜欢这个页面作为参考来源:https://docs.microsoft.com/en-us/windows/win32/intl/code-page-identifiers

所以解决方案可能会像这样简单 - 它在我的虚拟测试中有效:

Option Explicit
Private Const CP_UTF8 As Long = 65001

Public Sub RunMe()
    Dim sDir As String, sourcePath As String, fileName As String
    Dim fso As Object
    
    sourcePath = "C:\anyoldpath\"
    
    Set fso = CreateObject("Scripting.FileSystemObject")
    sDir = Dir(sourcePath & "*.txt", vbNormal)
    Do While Len(sDir) > 0
        fileName = sourcePath & "xlsx\" & fso.GetBaseName(sDir) & ".xlsx"
        Application.Workbooks.OpenText sourcePath & sDir, CP_UTF8
        ActiveWorkbook.SaveAs fileName, xlOpenXMLWorkbook
        ActiveWorkbook.Close False
        sDir = Dir()
    Loop
End Sub

假设这些 txt 文件是制表符分隔的。

字符的处理或code page它由QueryTable对象的Workbooks.OpenText method or by the TextFilePlatform propertyOrigin参数管理。

这些 txt 文件应该用 Workbooks.OpenText 方法打开,但是为了处理 Decimal.Separator 与您系统中的文件不同的问题,我建议使用QueryTable 方法也适用于以 csv 扩展名的制表符分隔文件。

我们只需要替换这些行:

sFile = Dir$(sPathSrc & "*.csv")
    sFilenameTrg = sPathTrg & Left(sFile, InStrRev(sFile, ".csv")) & "xlsx"

有了这些:

sFile = Dir$(sPathSrc & "*.txt")
    sFilenameTrg = sPathTrg & Left(sFile, InStrRev(sFile, ".txt")) & "xlsx"

过程`Open_Csv_As_Tab_Delimited_Then_Save_As_Xls 没有变化,也许是名称的变化以反映其多功能性。

使用此 tst 文件测试:

生成了这个“xlsx”文件:

希望将这些过程添加到您的项目中应该很简单,如果您对所用资源有任何问题或疑问,请告诉我。

Sub Tab_Delimited_UTF8_Files_Save_As_Xlsx()
Dim sFilenameSrc As String, sFilenameTrg As String
Dim sPathSrc As String, sPathTrg As String
Dim sFile As String
Dim bShts As Byte, exCalc As XlCalculation

    sPathSrc = "C:\Users\PC\Desktop\Test\"
    sPathTrg = sPathSrc & "xlsx\"

    Rem Excel Properties OFF
    With Application
        .EnableEvents = False
        .DisplayAlerts = False
        .ScreenUpdating = False
        exCalc = .Calculation
        .Calculation = xlCalculationManual
        .CalculateBeforeSave = False
        bShts = .SheetsInNewWorkbook
        .SheetsInNewWorkbook = 1
    End With

    Rem Validate Target Folder
    If Len(Dir$(sPathTrg, vbDirectory)) = 0 Then MkDir sPathTrg

    Rem Process Csv Files
    sFile = Dir$(sPathSrc & "*.txt")
    Do Until Len(sFile) = 0
        
        sFilenameSrc = sPathSrc & sFile
        sFilenameTrg = sPathTrg & Left(sFile, InStrRev(sFile, ".txt")) & "xlsx"
        
        Call Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sFilenameSrc, sFilenameTrg)
        
        sFile = Dir$
    
    Loop

    Rem Excel Properties OFF
    With Application
        .SheetsInNewWorkbook = bShts
        .Calculation = exCalc
        .CalculateBeforeSave = True
        .ScreenUpdating = True
        .DisplayAlerts = True
        .EnableEvents = True
    End With
    
    End Sub

Sub Open_Txt_As_Tab_Delimited_Then_Save_As_Xls(sFilenameSrc As String, sFilenameTrg As String)
Dim Wbk As Workbook
    
    Rem Workbook - Add
    Set Wbk = Workbooks.Add(Template:="Workbook")
    With Wbk
    
        Rem Txt File - Import
        With .Worksheets(1)
            
            Rem QueryTable - Add
            With .QueryTables.Add(Connection:="TEXT;" & sFilenameSrc, Destination:=.Cells(1))
                
                Rem QueryTable - Properties
                .SaveData = True
                .TextFileParseType = xlDelimited
                .TextFileDecimalSeparator = "."
                .TextFileThousandsSeparator = ","
                .TextFileConsecutiveDelimiter = False
                .TextFileTabDelimiter = True
                .TextFileTrailingMinusNumbers = True
                .TextFilePlatform = 65001       'Unicode (UTF-8)
                .Refresh BackgroundQuery:=False
                
                Rem QueryTable - Delete
                .Delete
            
        End With: End With

        Rem Workbook - Save & Close
        .SaveAs Filename:=sFilenameTrg, FileFormat:=xlOpenXMLWorkbook
        .Close
    
    End With

    End Sub