C# compare fields from different lines in csv
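I am trying to compare the value at index 0 of the array from one line with the value at index 0 of the array from the next line. Imagine a CSV where the first column holds a unique identifier and the second column holds a corresponding value.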
USER1, 1P
USER1, 3G
USER2, 1P
USER3, 1V
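I want to check the value at [0] on the next line (or the previous line, if that is easier) and, if the two are the same (as in the example), concatenate the values at index 1. That is, the data should read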
USER1, 1P, 3G
USER2, 1P
USER3, 1V
before it is passed on to the next function. So far I have
private void csvParse(string path)
{
    using (TextFieldParser parser = new TextFieldParser(path))
    {
        parser.Delimiters = new string[] { "," };
        while (!parser.EndOfData)
        {
            string[] parts = parser.ReadFields();
            if (parts == null)
            {
                break;
            }
            contact.ContactId = parts[0];
            long nextLine;
            nextLine = parser.LineNumber+1;
            //if line1 parts[0] == line2 parts[0] etc.
        }
    }
}
Does anyone have any suggestions? Thank you.
How about saving the array to a variable:
private void csvParse(string path)
{
    using (TextFieldParser parser = new TextFieldParser(path))
    {
        parser.Delimiters = new string[] { "," };
        string[] oldParts = new string[] { string.Empty };
        while (!parser.EndOfData)
        {
            string[] parts = parser.ReadFields();
            if (parts == null || parts.Length < 1)
            {
                break;
            }
            if (oldParts[0] == parts[0])
            {
                // concat logic goes here
            }
            else
            {
                contact.ContactId = parts[0];
            }
            long nextLine;
            nextLine = parser.LineNumber+1;
            oldParts = parts;
            //if line1 parts[0] == line2 parts[0] etc.
        }
    }
}
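One way the "concat logic" placeholder above might be filled in is sketched below. This is only an illustration: AdjacentRowMerger and MergeAdjacent are hypothetical names, and the sketch assumes that rows sharing an ID are always adjacent in the file, as they are in the sample data.

```csharp
using System;
using System.Collections.Generic;

public static class AdjacentRowMerger
{
    // Hypothetical helper: merges consecutive rows that share the same first field.
    // Assumption: rows for the same ID sit next to each other in the input.
    public static List<List<string>> MergeAdjacent(IEnumerable<string[]> rows)
    {
        var merged = new List<List<string>>();

        foreach (string[] parts in rows)
        {
            if (parts == null || parts.Length < 1)
            {
                continue; // skip blank or malformed rows
            }

            string id = parts[0].Trim();
            bool sameAsPrevious = merged.Count > 0 && merged[merged.Count - 1][0] == id;

            if (!sameAsPrevious)
            {
                // New ID: start a new merged row that begins with the ID itself.
                merged.Add(new List<string> { id });
            }

            // Append the remaining fields to the current (last) merged row.
            for (int i = 1; i < parts.Length; i++)
            {
                merged[merged.Count - 1].Add(parts[i].Trim());
            }
        }

        return merged;
    }
}
```

Inside csvParse, the arrays returned by parser.ReadFields() could be collected and passed to this helper, and each merged row could then be joined with ", " before it goes on to the next function. If the same ID can appear in non-adjacent rows, this approach would produce separate entries for it.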
The easiest way to do something like this is to convert each line to an object. You can use CsvHelper, https://www.nuget.org/packages/CsvHelper/, to do the work for you, or you can iterate over each line and parse it into an object yourself. CsvHelper is a great tool and it knows how to properly parse CSV files into a collection of objects. Then, whether you create the collection yourself or use CsvHelper, you can use LINQ to GroupBy, https://msdn.microsoft.com/en-us/library/bb534304(v=vs.100).aspx, your "key" (in this case UserId) and Aggregate, https://msdn.microsoft.com/en-us/library/bb549218(v=vs.110).aspx, the other property into a single string. You can then use the new grouped collection for your end goal (writing it to a file or using it wherever you need it).
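A rough sketch of the GroupBy plus Aggregate idea follows. To stay self-contained it parses the lines by hand instead of using CsvHelper, so the Parse method is only a stand-in for whatever produces the objects. The UserValue class, its property names, and the UserValueGrouping helper are hypothetical, chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for one CSV row; the property names are illustrative.
public class UserValue
{
    public string UserId { get; set; }
    public string Value { get; set; }
}

public static class UserValueGrouping
{
    // Hand-rolled parsing of "id, value" lines; assumes no quoted or embedded commas.
    public static List<UserValue> Parse(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(','))
            .Where(fields => fields.Length >= 2)
            .Select(fields => new UserValue
            {
                UserId = fields[0].Trim(),
                Value = fields[1].Trim()
            })
            .ToList();
    }

    // Groups by UserId and folds each group's values into one comma-separated string.
    public static List<string> Combine(IEnumerable<UserValue> records)
    {
        return records
            .GroupBy(r => r.UserId)
            .Select(g => g.Key + ", " +
                         g.Select(r => r.Value)
                          .Aggregate((acc, next) => acc + ", " + next))
            .ToList();
    }
}
```

Calling UserValueGrouping.Combine(UserValueGrouping.Parse(File.ReadLines(path))) would give one string per user. Aggregate is used here because the answer names it; string.Join(", ", ...) would do the same job and is usually easier to read.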
You are essentially finding all the unique entries, so put them in a dictionary keyed by contact ID. Like so:
private void csvParse(string path)
{
    using (TextFieldParser parser = new TextFieldParser(path))
    {
        parser.Delimiters = new string[] { "," };
        Dictionary<string, List<string>> uniqueContacts = new Dictionary<string, List<string>>();
        while (!parser.EndOfData)
        {
            string[] parts = parser.ReadFields();
            if (parts == null || parts.Length != 2)
            {
                break;
            }
            //if contact id not present in dictionary add
            if (!uniqueContacts.ContainsKey(parts[0]))
                uniqueContacts.Add(parts[0], new List<string>());
            //now there's definitely an existing contact in dic (the one
            //we've just added or a previously added one) so add to the
            //list of strings for that contact
            uniqueContacts[parts[0]].Add(parts[1]);
        }
        //now do something with that dictionary of unique user names and
        // lists of strings, for example dump them to console in the
        //format you specify:
        foreach (var contactId in uniqueContacts.Keys)
        {
            //string.Join puts ", " between the values only, so no
            //trailing separator and no problem with repeated values
            Console.WriteLine($"{contactId}, {string.Join(", ", uniqueContacts[contactId])}");
        }
    }
}
If I understand correctly, what you are essentially asking is "how do I group the values in the second column based on the values in the first column?".
A quick and concise way to do this is to group by using LINQ:
var linesGroupedByUser =
    from line in File.ReadAllLines(path)
    let elements = line.Split(',')
    where elements.Length >= 2 // skip blank or malformed lines
    let user = new { Name = elements[0].Trim(), Value = elements[1].Trim() }
    group user by user.Name into users
    select users;
foreach (var user in linesGroupedByUser)
{
    // Trimmed values joined with ", " give e.g. "USER1, 1P, 3G"
    string valuesAsString = String.Join(", ", user.Select(x => x.Value));
    Console.WriteLine(user.Key + ", " + valuesAsString);
}
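I did not use your TextFieldParser class, but you could easily use it instead. However, this approach does require that you can afford to load all of the data into memory. You don't mention whether that is viable.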
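For what it is worth, here is a sketch of how the same grouping might look with TextFieldParser feeding the LINQ query. ParserGrouping, ReadRows and GroupByUser are hypothetical names. The sketch assumes the Microsoft.VisualBasic assembly is available to the project, and it takes a TextReader rather than a path only so that it can be tried on an in-memory string.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

public static class ParserGrouping
{
    // Yields the fields of each row that has at least two columns.
    public static IEnumerable<string[]> ReadRows(TextReader reader)
    {
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.TrimWhiteSpace = true; // strips the space after each comma

            while (!parser.EndOfData)
            {
                string[] parts = parser.ReadFields();
                if (parts != null && parts.Length >= 2)
                {
                    yield return parts;
                }
            }
        }
    }

    // Groups the second column by the first and formats one line per user.
    public static List<string> GroupByUser(TextReader reader)
    {
        return ReadRows(reader)
            .GroupBy(parts => parts[0], parts => parts[1])
            .Select(g => g.Key + ", " + string.Join(", ", g))
            .ToList();
    }
}
```

For a file on disk, ParserGrouping.GroupByUser(new StreamReader(path)) would be the equivalent call. ReadRows streams the rows one at a time, but GroupBy still has to buffer every group before it can yield the first one, so the memory caveat above applies to this version too.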