Using a better way to read Excel data
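Right now I am getting data from an Excel file, looping through the rows, and processing the results based on conditions, for example storing them in objects for further processing.
The Excel sheet is about 20 MB and has close to 7,000 records. I am using Open XML to read the data from the file, as shown in the code below.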
string filePath = @"C:\weather-Data\DesignConditions_p.xlsx";
using FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
using SpreadsheetDocument doc = SpreadsheetDocument.Open(fs, false);
WorkbookPart workbookPart = doc.WorkbookPart;
SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
SharedStringTable sst = sstpart.SharedStringTable;
Sheet firstSheet = workbookPart.Workbook.Descendants<Sheet>().First();
Worksheet sheet = ((WorksheetPart)workbookPart.GetPartById(firstSheet.Id)).Worksheet;
var rows = sheet.Descendants<Row>();
var weatherDataList = new List<WeatherStation>();
// This loop takes more than 60 minutes to process the rows and reach the if block below (country.Equals("USA")).
foreach (Row row in rows.Skip(5))
{
    var weatherData = new WeatherStation();
    string country = GetCellValue(filePath, "Annual", $"B{row.RowIndex.ToString()}");
    if (country.Equals("USA"))
    {
        weatherData.CountryAbbreviation = country;
        weatherData.StateAbbreviation = GetCellValue(filePath, "Annual", $"C{row.RowIndex.ToString()}");
        weatherData.Number = GetCellValue(filePath, "Annual", $"E{row.RowIndex.ToString()}");
        ......
        .......
    }
}
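Could anyone point me in the right direction to reduce the processing time when reading data from Excel? I am using .NET Core for this application.
Thanks in advance.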
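You can use the 'SAX' approach, which reads the worksheet piece by piece as a stream instead of loading the whole part into memory, so both processing and I/O should be faster: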
// The SAX approach.
static void ReadExcelFileSAX(string fileName)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
        // The reader streams the worksheet XML one element at a time.
        OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
        string text;
        while (reader.Read())
        {
            if (reader.ElementType == typeof(CellValue))
            {
                // Raw cell value; for shared-string cells this is an index into the shared string table.
                text = reader.GetText();
                Console.Write(text + " ");
            }
        }
        Console.WriteLine();
        Console.ReadKey();
    }
}
https://docs.microsoft.com/en-us/office/open-xml/how-to-parse-and-read-a-large-spreadsheet
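To connect the SAX sample to the loop in the question, here is a minimal sketch that streams the rows, resolves shared strings, and keeps only the "USA" records. Several things here are assumptions rather than facts from the question: `GetCellValue` is not shown, but if it takes a file path and reopens the document on every call, that alone could account for much of the runtime, so this sketch opens the file once. The `WeatherStation` class below is a stand-in with only the three properties visible in the question, and `WeatherReader` / `ReadUsaStations` are hypothetical names. The sheet name "Annual" and the columns B, C and E come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Stand-in for the asker's class; only the properties shown in the question.
public class WeatherStation
{
    public string CountryAbbreviation { get; set; }
    public string StateAbbreviation { get; set; }
    public string Number { get; set; }
}

public static class WeatherReader   // hypothetical name
{
    public static List<WeatherStation> ReadUsaStations(string fileName)
    {
        var result = new List<WeatherStation>();
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;

            // Load the shared strings once; shared-string cells store an index into this table.
            string[] shared = workbookPart.SharedStringTablePart?.SharedStringTable
                .Elements<SharedStringItem>().Select(item => item.InnerText).ToArray()
                ?? Array.Empty<string>();

            // Assumption: the data lives on the sheet named "Annual", as in the question.
            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                .First(s => s.Name?.Value == "Annual");
            var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);

            int rowsSeen = 0;
            using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
            {
                while (reader.Read())
                {
                    if (reader.ElementType != typeof(Row) || !reader.IsStartElement) continue;

                    // Materialize only the current row, not the whole worksheet.
                    var row = (Row)reader.LoadCurrentElement();
                    if (rowsSeen++ < 5) continue;   // same effect as rows.Skip(5)

                    string country = null, state = null, number = null;
                    foreach (Cell cell in row.Elements<Cell>())
                    {
                        string reference = cell.CellReference?.Value;
                        if (string.IsNullOrEmpty(reference)) continue;
                        // "B12" -> "B"
                        string column = new string(reference.TakeWhile(c => char.IsLetter(c)).ToArray());
                        switch (column)
                        {
                            case "B": country = GetText(cell, shared); break;
                            case "C": state = GetText(cell, shared); break;
                            case "E": number = GetText(cell, shared); break;
                        }
                    }

                    if (country == "USA")
                    {
                        result.Add(new WeatherStation
                        {
                            CountryAbbreviation = country,
                            StateAbbreviation = state,
                            Number = number
                        });
                    }
                }
            }
        }
        return result;
    }

    // Returns the display text of a cell, looking up shared strings when needed.
    private static string GetText(Cell cell, string[] shared)
    {
        string raw = cell.CellValue?.Text ?? cell.InnerText;
        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString
            && int.TryParse(raw, out int index) && index >= 0 && index < shared.Length)
        {
            return shared[index];
        }
        return raw;
    }
}
```

A call such as `var stations = WeatherReader.ReadUsaStations(filePath);` would replace the `foreach` loop in the question. The remaining columns elided in the question would need their own `case` entries.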
Beyond that, you could look for a library or NuGet package that reads faster, because I can't see any other impactful way to tune this code further.