I'm copying data from my database. How do I know which tables to copy first to avoid reference errors?
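I have a database schema with a number of tables and relationships between them, you know, the standard SQL database setup.
I want to generate insert statements to "copy" data from one database to another that has the same schema but none of the data.
The problem is that if I do this in an arbitrary order, it may not work, because data inserted first may depend on data that is not scripted until later.
How can I order the insert statements so that the data dependencies come in the right order?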
The kind of sort you want is called a topological sort. It orders elements (in your case, tables) so that dependent elements come after the elements they depend on. One common technique for this kind of sorting is to build a graph structure and apply the sorting algorithm to it. Many platforms have libraries that build graphs and provide algorithms to perform the topological sort for you (.NET, Java, Python, C++).
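To make the idea concrete, here is a minimal sketch in Python, using `graphlib.TopologicalSorter` from the standard library (Python 3.9+). The table names and the `dependencies` mapping are invented for illustration, and `insert_order` is just a hypothetical helper name; in practice you would fill the mapping from your schema's foreign keys.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables it references
# through foreign keys, i.e. the tables that must be populated before it.
dependencies = {
    "Customers": set(),
    "Products": set(),
    "Orders": {"Customers"},
    "OrderLines": {"Orders", "Products"},
}


def insert_order(deps):
    """Return the table names so that every table comes after the tables it references."""
    return list(TopologicalSorter(deps).static_order())


order = insert_order(dependencies)
print(order)
```

The exact order among unrelated tables (here `Customers` and `Products`) is not significant; the only guarantee is that a referenced table appears before every table that references it.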
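One problem you will face is whether your tables have any cyclic relationships. For example:
[a] --> [b] --> [a]
A cycle like this prevents the graph from being topologically sorted. Unless you know that none of the entities in [a] reference entities in [b] that in turn reference the same entities in [a], you cannot be sure that you will avoid reference conflicts.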
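As a small illustration of how a cycle shows up in practice, here is a sketch in Python with the standard library's `graphlib` (Python 3.9+), which raises `CycleError` when the graph cannot be sorted. The helper name `find_cycle` and the two-table mapping are assumptions made for the example.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(deps):
    """Return one dependency cycle as a list of table names, or None if there is none."""
    try:
        # prepare() checks the graph for cycles before any ordering is produced.
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        # The second element of args holds the nodes of a detected cycle;
        # its first and last entries are the same node.
        return list(err.args[1])
    return None


# The [a] --> [b] --> [a] case: each table references the other.
cyclic = {"a": {"b"}, "b": {"a"}}
print(find_cycle(cyclic))

# A plain chain with no cycle yields None.
print(find_cycle({"a": set(), "b": {"a"}}))
```

Note that this reports only one cycle at a time, so a schema with several independent cycles would need repeated checks or a strongly-connected-components pass.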
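Here is a sample script in C# (using LinqPad) which queries the relationships in a database schema, uses QuickGraph to build a graph, then topologically sorts it and lists the sorted tables (from which you can build your insert statements), or, if the graph cannot be topologically sorted, lists the tables that have dependency cycles: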
http://share.linqpad.net/47qds2.linq
// LinqPad "C# Program" script. Assumes the query references QuickGraph and
// a micro-ORM that provides conn.Query<T> (presumably Dapper), and imports
// the namespaces Dapper, QuickGraph, QuickGraph.Algorithms and
// System.Data.SqlClient.
void Main()
{
    var targetDb = "MyDb";

    // One row per foreign key: the referenced (primary key) table and the
    // referencing (foreign key) table.
    var relationSql = @"SELECT pk.TABLE_NAME as PrimaryKeyTable,
               fk.TABLE_NAME as ForeignKeyRefTable
        FROM
            INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS r
        INNER JOIN
            INFORMATION_SCHEMA.TABLE_CONSTRAINTS fk
        ON
            fk.CONSTRAINT_NAME = r.CONSTRAINT_NAME
        INNER JOIN
            INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk
        ON
            pk.CONSTRAINT_NAME = r.UNIQUE_CONSTRAINT_NAME
        ORDER BY
            ForeignKeyRefTable";

    // One row per column, so each table name appears once per column.
    var tableSql = @"select t.[object_id] as TableId, t.[name] as TableName, c.[name] as ColName
        from sys.tables t
        inner join sys.columns c
        on t.object_id = c.object_id";

    using (var conn = new SqlConnection(String.Format(@"Data Source=(localdb)\v11.0;Initial Catalog={0}", targetDb)))
    {
        var tables = conn.Query<Table>(tableSql);
        var relations = conn.Query<Relation>(relationSql);

        // Every table is a vertex, including tables with no relationships.
        var relationGraph = new QuickGraph.AdjacencyGraph<String, Edge<String>>();
        relationGraph.AddVertexRange(tables.Select(t => t.TableName).Distinct());

        // Edges point from the referenced table to the referencing table, so
        // referenced tables sort first. Self-references are skipped because
        // they would make the graph cyclic.
        var relationEdges = from r in relations
                            where r.ForeignKeyRefTable != r.PrimaryKeyTable
                            select new QuickGraph.Edge<String>(r.PrimaryKeyTable, r.ForeignKeyRefTable);
        relationGraph.AddEdgeRange(relationEdges);

        // The graph can be topologically sorted only if it is acyclic
        if (relationGraph.IsDirectedAcyclicGraph())
        {
            var inRelationOrder = relationGraph.TopologicalSort();
            inRelationOrder.Dump("Sorted Tables");
        }
        else
        {
            // Collapse strongly connected components; any component with
            // more than one table is a dependency cycle.
            var connected = AlgorithmExtensions.CondensateStronglyConnected<String, Edge<String>, AdjacencyGraph<String, Edge<String>>>(relationGraph);
            var cycles = from v in connected.Vertices
                         where v.VertexCount > 1
                         select v.Vertices;
            cycles.Dump("Dependency Cycles");
        }
    }
}

public class Table
{
    public Int32 TableId { get; set; }
    public String TableName { get; set; }
    public String ColName { get; set; }
}

public class Relation
{
    public String PrimaryKeyTable { get; set; }
    public String ForeignKeyRefTable { get; set; }
}
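If you are not on .NET, the cycle-reporting branch of the script can be approximated without QuickGraph. The following is only a sketch, in Python, of one way to do it: Tarjan's strongly connected components algorithm, keeping the components that contain more than one table. It is not what QuickGraph does internally, and the function names and sample table names are made up. Edges are (referenced table, referencing table) pairs, matching the (PrimaryKeyTable, ForeignKeyRefTable) rows of the script.

```python
def strongly_connected_components(edges, vertices=()):
    """Tarjan's algorithm: return the components of a directed graph as lists of vertices."""
    graph = {v: [] for v in vertices}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    index = {}    # discovery order of each visited vertex
    lowlink = {}  # smallest discovery index reachable from the vertex
    stack, on_stack = [], set()
    components = []

    def visit(v):
        index[v] = lowlink[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop the stack down to v.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


def dependency_cycles(edges, vertices=()):
    """Return the groups of tables that reference each other in a cycle."""
    found = strongly_connected_components(edges, vertices)
    return [sorted(c) for c in found if len(c) > 1]


# Hypothetical foreign keys: Departments and Employees reference each other.
edges = [
    ("Departments", "Employees"),
    ("Employees", "Departments"),
    ("Departments", "Offices"),
]
print(dependency_cycles(edges))
```

The recursive `visit` is fine for schema-sized graphs; a graph with thousands of tables in one long chain could hit Python's recursion limit and would need an iterative version.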