Reading very large data from an Excel 2013 file using OleDB range error
I am trying to read an Excel 2013 file (.xlsx, about 100 MB) from Visual Basic .NET using OleDB. The main problem is that a System.OutOfMemoryException is thrown at the following line:
    da.Fill(dt)
in the code below.
    Imports System.Data
    Imports System.Data.OleDb

    Private Function ReadExcelFile() As DataSet
        Dim ds As New DataSet()
        Dim connectionString As String =
            "Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties=Excel 12.0 XML;Data Source=C:\file.xlsx;"
        Using connection As New OleDbConnection(connectionString)
            connection.Open()
            Dim cmd As New OleDbCommand()
            cmd.Connection = connection
            ' List the tables the provider exposes; only names ending in "$" are processed.
            Dim dtSheet As DataTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)
            For Each dr As DataRow In dtSheet.Rows
                Dim sheetName As String = dr("TABLE_NAME").ToString()
                If Not sheetName.EndsWith("$") Then
                    Continue For
                End If
                cmd.CommandText = "SELECT * FROM [" & sheetName & "];"
                Dim dt As New DataTable()
                dt.TableName = sheetName
                Dim da As New OleDbDataAdapter(cmd)
                da.Fill(dt) ' loads the entire sheet into memory; the exception is thrown here
                ds.Tables.Add(dt)
            Next
            cmd = Nothing
            connection.Close()
        End Using
        Return ds
    End Function
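However, I think the best solution is to read the data in chunks. I found that I can do this by appending a cell range to the sheet name in the SQL statement, like this:
    cmd.CommandText = "SELECT * FROM [" & sheetName & "B1:B10];"
I loop by incrementing that range, but I ran into an error. For example, this
    cmd.CommandText = "SELECT * FROM [" & sheetName & "B50000:B51000];"
still works. However, if I do this,
    cmd.CommandText = "SELECT * FROM [" & sheetName & "B70000:B70001];"
I get an error.
Note that the Excel file has 475,128 rows, so B70000:B70001 is not even halfway through the sheet.
Can anyone explain this? I think I am missing something here.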
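The chunked loop described above can be sketched roughly as follows. This only illustrates how the range queries are generated; it does not avoid the error. `BuildRangeQuery` and `ChunkQueries` are hypothetical helper names, and the sheet name, column letter, and chunk size are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module RangeChunks
    ' Hypothetical helper: builds the same kind of statement as above,
    ' e.g. SELECT * FROM [Sheet1$B1:B10];
    Function BuildRangeQuery(sheetName As String, column As String,
                             firstRow As Integer, lastRow As Integer) As String
        Return "SELECT * FROM [" & sheetName & column & firstRow.ToString() &
               ":" & column & lastRow.ToString() & "];"
    End Function

    ' Returns one query per block of chunkSize rows, covering rows 1..totalRows.
    Function ChunkQueries(sheetName As String, column As String,
                          totalRows As Integer, chunkSize As Integer) As List(Of String)
        If chunkSize <= 0 Then
            Throw New ArgumentOutOfRangeException("chunkSize")
        End If
        Dim queries As New List(Of String)()
        Dim firstRow As Integer = 1
        While firstRow <= totalRows
            ' The last chunk may be shorter than chunkSize.
            Dim lastRow As Integer = Math.Min(firstRow + chunkSize - 1, totalRows)
            queries.Add(BuildRangeQuery(sheetName, column, firstRow, lastRow))
            firstRow = lastRow + 1
        End While
        Return queries
    End Function
End Module
```

For example, `ChunkQueries("Sheet1$", "B", 25, 10)` produces queries for B1:B10, B11:B20, and B21:B25, each of which could be assigned to `cmd.CommandText` in turn. One thing to watch for: as far as I know the provider treats the first row of a range as column headers by default, so `HDR=NO` may be needed in Extended Properties to keep that row as data.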
I found a workable solution: use a DataReader instead of a DataSet. I also added a worker so that the application does not hang.
    Private Function ReadExcelFile() As DataSet
        Dim ds As New DataSet() ' stays empty here; rows are consumed one at a time instead
        Dim connectionString As String = GetConnectionString()
        Using connection As New OleDbConnection(connectionString)
            connection.Open()
            Dim cmd As New OleDbCommand()
            cmd.Connection = connection
            Dim dtSheet As DataTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)
            For Each dr As DataRow In dtSheet.Rows
                Dim sheetName As String = dr("TABLE_NAME").ToString()
                If Not sheetName.EndsWith("$") Then
                    Continue For
                End If
                cmd.CommandText = "SELECT * FROM [" & sheetName & "];"
                Dim ddr As OleDbDataReader = cmd.ExecuteReader()
                Dim counter As Integer = 0
                While (ddr.Read())
                    ' GetValue returns Object (possibly DBNull), so convert it explicitly.
                    MessageBox.Show(ddr.GetValue(0).ToString())
                End While
                ddr.Close() ' close the reader before the command is reused for the next sheet
            Next
            cmd = Nothing
            connection.Close()
        End Using
        Return ds
    End Function
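The lines:
    Dim ddr As OleDbDataReader = cmd.ExecuteReader()
    Dim counter As Integer = 0
    While (ddr.Read())
        MessageBox.Show(ddr.GetValue(0).ToString())
    End While
are the basic code, where you can access each row's value in the first column (index 0). This works because the DataReader moves forward through the results one row at a time, whereas, as I have read, a DataSet is an in-memory object that holds everything at once (which is why we can get the System.OutOfMemoryException).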
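To do something more useful than showing a message box for each row, the same reader loop could hand rows to the caller in small batches, so that only one batch is ever held in memory. The following is a minimal sketch under that assumption; `ReadInBatches` and its `processBatch` callback are hypothetical names, not part of OleDB. It accepts any `IDataReader`, which `OleDbDataReader` implements.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module StreamingRead
    ' Reads every row from the reader and passes the rows to processBatch in
    ' groups of at most batchSize. Returns the total number of rows read.
    Function ReadInBatches(reader As IDataReader, batchSize As Integer,
                           processBatch As Action(Of List(Of Object()))) As Integer
        If batchSize <= 0 Then
            Throw New ArgumentOutOfRangeException("batchSize")
        End If
        Dim total As Integer = 0
        Dim batch As New List(Of Object())()
        While reader.Read()
            ' Copy the current row's column values into a fresh array.
            Dim values(reader.FieldCount - 1) As Object
            reader.GetValues(values)
            batch.Add(values)
            total += 1
            If batch.Count = batchSize Then
                processBatch(batch)
                ' Start a new list so the processed batch can be garbage collected.
                batch = New List(Of Object())()
            End If
        End While
        ' Flush the final, possibly shorter, batch.
        If batch.Count > 0 Then
            processBatch(batch)
        End If
        Return total
    End Function
End Module
```

In the function above, this could replace the `While` loop with something like `ReadInBatches(ddr, 1000, AddressOf SaveBatch)`, where `SaveBatch` is a hypothetical routine that writes each batch somewhere and the batch size of 1000 is an arbitrary choice.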
I would still like to know why the range problem described above occurs.