Support for UTF-16 with MS XML 6.0
I'm scraping data from a French-language site. I'm using MS XML 6.0, and some letters are not recognized correctly (for example é).
Code:
Dim xml_obj As XMLHTTP
Set xml_obj = New XMLHTTP
xml_obj.Open "GET", "http://www.emploi.nat.tn/fo/Fr/global.php?page=146&menu1=&FormLinks_Sorting=1&FormLinks_Sorted=&num_page=0&limit=500&numpage=1", False
xml_obj.send
Dim htmldoc As New HTMLDocument
htmldoc.body.innerHTML = xml_obj.responseText
The responseText is encoded in UTF-8. Is there any workaround?
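Since the page is encoded as windows-1256, you first need to decode the raw response bytes yourself. Then write the HTML directly into the document rather than into the body: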
Sub UsageExample()
    Dim req As New MSXML2.ServerXMLHTTP60 ' Microsoft XML, v6.0 '
    req.Open "GET", "http://www.emploi.nat.tn/fo/Fr/global.php?page=146&menu1=&FormLinks_Sorting=1&FormLinks_Sorted=&num_page=0&limit=500&numpage=1", False
    req.Send
    Dim doc As New MSHTML.HTMLDocument ' Microsoft HTML Object Library '
    ' Pass the raw bytes (responseBody), not the already-decoded responseText '
    WriteDocument doc, req.responseBody, "windows-1256"
End Sub
Private Sub WriteDocument(document As Object, data, charset As String)
    Dim stream As New ADODB.Stream ' Microsoft ActiveX Data Objects 6.1 Library '
    stream.Open
    stream.Type = 1 ' adTypeBinary: store the response bytes unchanged '
    stream.Write data
    stream.Position = 0 ' rewind; Type can only be changed at position 0 '
    stream.Type = 2 ' adTypeText: reinterpret the same bytes as text '
    stream.Charset = charset
    document.Open
    document.Write stream.ReadText ' decoded with the requested charset '
    document.Close
    stream.Close
End Sub
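The answer hard-codes "windows-1256". If the server declares its encoding in the Content-Type response header, the charset could be read from there instead. The following is only a sketch under that assumption: CharsetFromContentType and UsageWithDetectedCharset are hypothetical names that are not part of the original answer, and the header may carry no charset at all (the page might declare it only in a meta tag), in which case the fallback is used.

```vba
' Hypothetical helper, not part of the original answer.
' Extracts the charset from a Content-Type value such as
' "text/html; charset=windows-1256". Returns fallback if none is present.
Private Function CharsetFromContentType(contentType As String, fallback As String) As String
    Dim pos As Long
    Dim cs As String

    CharsetFromContentType = fallback
    pos = InStr(1, contentType, "charset=", vbTextCompare)
    If pos = 0 Then Exit Function

    cs = Mid$(contentType, pos + Len("charset="))
    ' Drop any parameters that follow the charset.
    If InStr(cs, ";") > 0 Then cs = Left$(cs, InStr(cs, ";") - 1)
    ' Strip optional quotes and surrounding whitespace.
    cs = Trim$(Replace(cs, """", ""))
    If Len(cs) > 0 Then CharsetFromContentType = cs
End Function

' Hypothetical variant of UsageExample that reuses WriteDocument from the answer.
Sub UsageWithDetectedCharset()
    Dim req As New MSXML2.ServerXMLHTTP60
    ' Same URL as in the answer.
    req.Open "GET", "http://www.emploi.nat.tn/fo/Fr/global.php?page=146&menu1=&FormLinks_Sorting=1&FormLinks_Sorted=&num_page=0&limit=500&numpage=1", False
    req.Send

    Dim charset As String
    ' Assumption: fall back to windows-1256 when the header names no charset.
    charset = CharsetFromContentType(req.getResponseHeader("Content-Type"), "windows-1256")

    Dim doc As New MSHTML.HTMLDocument
    WriteDocument doc, req.responseBody, charset
    Debug.Print doc.body.innerText
End Sub
```

The charset name is passed straight to ADODB.Stream, so it has to be one that the local system recognizes; an unknown name would most likely raise an error when Charset is assigned.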