What is the fastest way to compare two tables?
For example, two tables have the same schema but different contents:

Table 1
| field1 | field2 | field3 |
----------------------------
| 1      | aaaaa  | 100    |
| 2      | bbbbb  | 200    |
| 3      | ccccc  | 300    |
| 4      | ddddd  | 400    |

Table 2
| field1 | field2 | field3 |
----------------------------
| 2      | xxxxx  | 200    |
| 3      | ccccc  | 999    |
| 4      | ddddd  | 400    |
| 5      | eeeee  | 500    |
The expected comparison result is:

Deleted in B (Table 2):
| 1 | aaaaa | 100 |

Mismatched:
Table1: | 2 | bbbbb | 200 |
Table2: | 2 | xxxxx | 200 |
Table1: | 3 | ccccc | 300 |
Table2: | 3 | ccccc | 999 |

Newly added in B (Table 2):
| 5 | eeeee | 500 |
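Using C#, what is the fastest way to compare two tables?

My current implementation is:
- check whether each row in table1 has an exact match in table2;
- check whether each row in table2 has an exact match in table1.

The complexity is n*n, so for 100k rows it takes 20 minutes to run on the server.

Many thanks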
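You can try something like this. It should be fast, because each row is found through a dictionary lookup on its key instead of a scan of the other table: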

    using System.Collections.Generic;
    using System.Linq;

    class objType
    {
        public int Field1 { get; set; }
        public string Field2 { get; set; }
        public int Field3 { get; set; }

        // Field-by-field comparison of two rows.
        public bool AreEqual(object other)
        {
            var otherType = other as objType;
            if (otherType == null)
                return false;
            return Field1 == otherType.Field1 && Field2 == otherType.Field2 && Field3 == otherType.Field3;
        }
    }

    class Program
    {
        static void Main()
        {
            var tableOne = new objType[] {
                new objType { Field1 = 1, Field2 = "aaaaa", Field3 = 100 },
                new objType { Field1 = 2, Field2 = "bbbbb", Field3 = 200 },
                new objType { Field1 = 3, Field2 = "ccccc", Field3 = 300 },
                new objType { Field1 = 4, Field2 = "ddddd", Field3 = 400 }
            };
            var tableTwo = new objType[] {
                new objType { Field1 = 2, Field2 = "xxxxx", Field3 = 200 },
                new objType { Field1 = 3, Field2 = "ccccc", Field3 = 999 },
                new objType { Field1 = 4, Field2 = "ddddd", Field3 = 400 },
                new objType { Field1 = 5, Field2 = "eeeee", Field3 = 500 }
            };

            // Index both tables by key. This assumes Field1 is unique within
            // each table (ToDictionary throws on duplicate keys).
            var originalIds = tableOne.ToDictionary(o => o.Field1, o => o);
            var newIds = tableTwo.ToDictionary(o => o.Field1, o => o);

            var deleted = new List<objType>();
            // Holds pairs: the table 1 row followed by its table 2 counterpart.
            var modified = new List<objType>();

            foreach (var row in tableOne)
            {
                objType otherRow;
                if (!newIds.TryGetValue(row.Field1, out otherRow))
                {
                    // Key is missing from table 2.
                    deleted.Add(row);
                }
                else if (!otherRow.AreEqual(row))
                {
                    modified.Add(row);
                    modified.Add(otherRow);
                }
            }

            // Keys that exist only in table 2.
            var added = tableTwo.Where(t => !originalIds.ContainsKey(t.Field1)).ToList();
        }
    }

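It may be worth overriding Equals instead of using AreEqual (or making AreEqual a helper method outside the class definition), but that depends on how your project is set up.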
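
To illustrate the Equals route, here is a minimal sketch rather than a definitive implementation. The names `Row`, `TableDiff` and `WithoutExactMatch` are hypothetical and not part of the code above; `Row` simply stands in for `objType`. It assumes a compiler with value tuple support (C# 7 or later) and that rows are not mutated while they sit in a hash set. Once `Equals` and `GetHashCode` agree with each other, whole rows can be looked up in a `HashSet`, which covers the "exact match" check from the question in roughly linear time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for objType (the name Row is an assumption).
public sealed class Row : IEquatable<Row>
{
    public int Field1 { get; set; }
    public string Field2 { get; set; }
    public int Field3 { get; set; }

    // Value equality over all three fields.
    public bool Equals(Row other) =>
        other != null
        && Field1 == other.Field1
        && Field2 == other.Field2
        && Field3 == other.Field3;

    public override bool Equals(object obj) => Equals(obj as Row);

    // Must be consistent with Equals so that hash-based collections work.
    public override int GetHashCode() => (Field1, Field2, Field3).GetHashCode();
}

public static class TableDiff
{
    // Returns the rows of `left` that have no fully identical row in `right`.
    // Each membership check is a hash lookup instead of a scan of `right`.
    public static List<Row> WithoutExactMatch(IEnumerable<Row> left, IEnumerable<Row> right)
    {
        var seen = new HashSet<Row>(right);
        return left.Where(r => !seen.Contains(r)).ToList();
    }
}
```

With the sample data from the question, `TableDiff.WithoutExactMatch(tableOne, tableTwo)` would return the table 1 rows with keys 1, 2 and 3: the deleted row plus the table 1 side of the two mismatches. Splitting those into "deleted" and "mismatched" still needs the key lookup shown in the answer.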