Really bad performance for large Lists of strings and .txt files

Question

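What you need to know:

My application uses a food database that lives in a .txt file. Each food has about 170 data values (2-3 digit numbers) separated by tab stops, and the foods themselves are separated by \n, so every line of the .txt file holds the data for one food.

The application targets Android and has to work offline. I'm using Unity and coding in C#.

My two problems are:

Accessing the .txt file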

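Since the Android application can't access the .txt file via

$"{Application.dataPath}/textFileName.txt"

I assigned the .txt file as a TextAsset (name: txtFile) in the Inspector. When the app is started for the first time, I load all the data from the TextAsset into a json (name: jsonStringList), which contains a list of strings: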

for (int i = 0; i < amountOfLinesInTextFile; i++)
{
    jsonStringList.Add(txtFile.text.Split('\n')[i]);
}

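This technically works, but unfortunately txtFile has about 15,000 lines in total, which makes it really slow (Stopwatch time for the for-loop: ≈750000 ms, which is about 12.5 minutes...)

Obviously, making the user wait that long the first time they open the app is not an option...

Searching the jsonList

In the app, users can create their own food by combining several foods. To do that, the user has to search for a food and can then add it by tapping the result.
Currently I check in a for-loop whether the input of the search bar InputField (name: searchbar) matches a food in jsonStringList, and whether that food is not already displayed.

If both are true, I add the name of the food to a List<string> (name: results), which is what I use to display the matching foods. (Since the food's data, including the name, is separated by tab stops, I use .Split('\t') to get the right field for the food's name.)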

  for (int i = 0; i < amountOfLinesInTextFile; i++)
  {
      string name = jsonStringList[i].Split('\t')[nameIndex].ToLower();
      if (name.Equals(searchBar.text.ToLower()) && !results.Contains(name))
      {
          results.Add(name);
      }
  }

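Again: it technically works, but it is also too slow, even though it is a lot faster than problem 1 (Stopwatch for the for-loop: ≈1600 ms).

I would be very glad if you could help me cut down the time of these two actions! Maybe there is a completely different way to handle such large .txt files, but every bit of time saved helps!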

Answer 1

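15,000 lines is really not a large file. You are just doing far too much unnecessary reading and transforming. In the question's loop, txtFile.text.Split('\n') runs on every iteration, so the whole file is re-split 15,000 times. You need to do it once, cache the result (in your case, keep it in a variable), and reuse it.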

var foodIndex = txtFile
  .text
  .Split('\n')                //get rows
  .Select(x=> x.Split('\t'))  //get columns for each row
  .ToDictionary(x=> x[nameIndex], StringComparer.OrdinalIgnoreCase);   //build case-insensitive search index

var myFood = foodIndex["aPpLe"];

This produces a Dictionary<string, string[]>.

Better approach

Deserialize the CSV format (your file is obviously a CSV table) into POCO rows:
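A dictionary lookup only finds a food when the whole name is typed. If the search bar should also show partial matches, the same "split once, reuse" idea still applies. The following is a minimal sketch under that assumption, not the asker's code: FoodSearch, ParseRows and Search are hypothetical names, and the substring match is an assumption (the question's loop uses an exact match).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FoodSearch
{
    // Split the text once and keep the columns of every row in memory.
    // Call this a single time at startup and reuse the returned list.
    public static List<string[]> ParseRows(string text)
    {
        return text
            .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.TrimEnd('\r').Split('\t'))
            .ToList();
    }

    // Return the distinct food names that contain the query, ignoring case.
    // No Split and no ToLower happens per search; only the cached rows are scanned.
    public static List<string> Search(List<string[]> rows, int nameIndex, string query)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var results = new List<string>();
        foreach (var row in rows)
        {
            if (row.Length <= nameIndex) continue; // skip short or malformed rows
            string name = row[nameIndex];
            if (name.IndexOf(query, StringComparison.OrdinalIgnoreCase) >= 0 && seen.Add(name))
            {
                results.Add(name);
            }
        }
        return results;
    }
}
```

In Unity this would be called as FoodSearch.ParseRows(txtFile.text) once, with the result stored in a field, and FoodSearch.Search(rows, nameIndex, searchBar.text) whenever the input changes. The HashSet replaces results.Contains(name), which would otherwise rescan the result list for every candidate.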

public class Food
{
   [DataMember(Order=1)] //here is your nameIndex
   public string Name {get;set;}
   [DataMember(Order=2)]
   public int Amount {get;set;}
   //...
}

var foodIndex = SomeCSVParse<Food>(txtFile.text)
  .ToDictionary(x=> x.Name, StringComparer.OrdinalIgnoreCase);

var myFood = foodIndex["aPpLe"];

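This produces a Dictionary<string, Food> search index, which looks better and is easier to use.

This way, all the conversion from string to int/double/datetime/etc., the column order, the separators (comma, tab, space), the culture (if you have float/double values), efficient reading, headers, and so on can be handed off to a third-party framework. Someone did this here - Parsing CSV files in C#, with header

There are also plenty of frameworks on NuGet; just pick a small or popular one, or copy-paste from its source - https://www.nuget.org/packages?q=CSV

And read more about data structures in C# - https://docs.microsoft.com/en-us/dotnet/standard/collections/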
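If adding a package is not practical, a plain tab-separated file without quoted fields can also be parsed by hand. This is only a rough sketch under those assumptions: FoodRow, TsvFoodParser and BuildIndex are hypothetical names, the column positions are passed in by the caller, and only two of the roughly 170 columns are mapped.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical row type; extend it with the remaining columns as needed.
public class FoodRow
{
    public string Name { get; set; }
    public int Amount { get; set; }
}

public static class TsvFoodParser
{
    // Parses tab-separated text into a case-insensitive name -> row index.
    // Assumes no quoted fields and no header row; malformed rows are skipped.
    public static Dictionary<string, FoodRow> BuildIndex(string text, int nameIndex, int amountIndex)
    {
        var index = new Dictionary<string, FoodRow>(StringComparer.OrdinalIgnoreCase);
        foreach (var rawLine in text.Split('\n'))
        {
            var line = rawLine.TrimEnd('\r'); // tolerate Windows line endings
            if (line.Length == 0) continue;

            var cols = line.Split('\t');
            if (cols.Length <= Math.Max(nameIndex, amountIndex)) continue;

            int amount;
            if (!int.TryParse(cols[amountIndex], NumberStyles.Integer,
                              CultureInfo.InvariantCulture, out amount)) continue;

            // Indexer assignment: a later duplicate name overwrites the earlier one.
            index[cols[nameIndex]] = new FoodRow { Name = cols[nameIndex], Amount = amount };
        }
        return index;
    }
}
```

Unlike ToDictionary, which throws an ArgumentException on a duplicate key, the indexer assignment here silently keeps the last row for a given name. Whether that is the right behavior depends on the data.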
