What's the best approach to prefill a Core Data store when using NSPersistentCloudKitContainer?
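I have the following scenario: I parse objects from a JSON file and store them in my Core Data store. Now I'm using NSPersistentCloudKitContainer, and when I run the app on another device, it also parses the JSON file and adds the objects to Core Data. This results in duplicate objects.
I'm wondering whether there is:
- An easy way to check whether an entity already exists remotely?
- Any other way to avoid objects being saved twice in CloudKit?
- A way to get notified when fetching data from remote has finished?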
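Maybe it's too late to answer, but I've recently been working on the same problem. After a few weeks of research, I'd like to leave what I've learned here, in the hope that it helps someone who runs into the same issue.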
An easy way if I can check that an entity already exists remotely?
Any other way to avoid objects being saved twice in CloudKit?
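Yes, we can check whether the entity already exists in iCloud, but that isn't the best way to decide whether to parse the JSON file and save it to the Core Data persistent store. The app may not be signed in to an Apple ID / iCloud, or a network problem may make it impossible to reliably check whether the entity exists remotely.
The current solution is to deduplicate the data ourselves: add a UUID field to every object imported from the JSON file, and delete objects that share the same UUID.
Most of the time I also add a lastUpdate field, so that we can keep the most recent object.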
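The rule described above (one survivor per UUID, newest lastUpdate wins) can be sketched in plain Swift, independent of Core Data. This is only an illustration: `TagRecord` and `deduplicated(_:)` are hypothetical names, and it assumes the UUID is identical on every device, for example because it is stored in the JSON file rather than generated at import time.

```swift
import Foundation

// Hypothetical plain-Swift stand-in for a Core Data entity. Only `uuid` and
// `lastUpdate` come from the answer; `name` is illustrative.
struct TagRecord {
    let uuid: UUID
    let name: String
    let lastUpdate: Date
}

/// Keeps one record per uuid: the one with the most recent lastUpdate.
/// The order of the returned array is unspecified.
func deduplicated(_ records: [TagRecord]) -> [TagRecord] {
    var winners: [UUID: TagRecord] = [:]
    for record in records {
        if let current = winners[record.uuid], current.lastUpdate >= record.lastUpdate {
            continue // the stored record is at least as new, so keep it
        }
        winners[record.uuid] = record
    }
    return Array(winners.values)
}
```

If the UUIDs were generated separately on each device, two imports of the same JSON entry would never match, so a stable identifier is what makes this rule work.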
Getting notified when fetching data from remote has finished?
We can add an observer for NSPersistentStoreRemoteChange and get notified whenever the remote store changes.
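The notification is only posted if the store description opts in to it before the stores are loaded. A minimal sketch of that setup follows; the function name `enableRemoteChangeTracking(on:)` is a hypothetical helper, while the two option keys are the ones Core Data defines.

```swift
import CoreData

/// Turns on persistent history tracking and remote change notifications
/// for a store description. Call this before loadPersistentStores.
func enableRemoteChangeTracking(on description: NSPersistentStoreDescription) {
    // Record every transaction so it can be fetched later as history.
    description.setOption(true as NSNumber, forKey: NSPersistentHistoryTrackingKey)
    // Ask the store to post .NSPersistentStoreRemoteChange when it is modified.
    description.setOption(true as NSNumber,
                          forKey: NSPersistentStoreRemoteChangeNotificationPostOptionKey)
}

// Possible usage, assuming `container` is an NSPersistentCloudKitContainer:
// if let description = container.persistentStoreDescriptions.first {
//     enableRemoteChangeTracking(on: description)
// }
// container.loadPersistentStores { _, error in /* handle error */ }
```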
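Apple provides a sample project that uses Core Data with CloudKit and explains deduplication well.
Synchronizing a Local Store to the Cloud
https://developer.apple.com/documentation/coredata/synchronizing_a_local_store_to_the_cloud
WWDC 2019 Session 202: Using Core Data With CloudKit
https://developer.apple.com/videos/play/wwdc2019/202
The whole idea is to listen for changes from the remote store, track the change history, and deduplicate our data whenever new data comes in (we need some field, of course, to decide whether two objects are duplicates). The persistent store provides history tracking, so we can fetch those transactions as they are merged into the local store and run our deduplication process. Suppose we parse the JSON and import Tags at app launch: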
// Use a custom serial queue so that only one history-processing pass runs at a time.
private lazy var historyQueue: OperationQueue = {
    let queue = OperationQueue()
    queue.maxConcurrentOperationCount = 1
    return queue
}()

lazy var persistentContainer: NSPersistentContainer = {
    let container = NSPersistentCloudKitContainer(name: "CoreDataCloudKitDemo")
    ...
    // Set the persistentStoreDescription to track history and post remote change notifications
    // (NSPersistentHistoryTrackingKey, NSPersistentStoreRemoteChangeNotificationPostOptionKey)
    // Load the persistent stores
    // Set the mergePolicy of the viewContext
    ...
    // Observe Core Data remote change notifications.
    NotificationCenter.default.addObserver(
        self, selector: #selector(type(of: self).storeRemoteChange(_:)),
        name: .NSPersistentStoreRemoteChange, object: container.persistentStoreCoordinator)
    return container
}()

@objc func storeRemoteChange(_ notification: Notification) {
    // Process persistent history to merge changes from other coordinators.
    historyQueue.addOperation {
        self.processPersistentHistory()
    }
}
// Fetch the changes since the last token, deduplicate any newly inserted Tags, and store the updated token.
private func processPersistentHistory() {
    // Run in a background context so the view context isn't blocked.
    // When the background context is saved, the changes merge into the view context according to its merge policy.
    let taskContext = persistentContainer.newBackgroundContext()
    taskContext.performAndWait {
        // Fetch the history recorded since the last token.
        // No predicate is set here, so transactions from every author are included.
        let historyFetchRequest = NSPersistentHistoryTransaction.fetchRequest!
        let request = NSPersistentHistoryChangeRequest.fetchHistory(after: lastHistoryToken)
        request.fetchRequest = historyFetchRequest
        let result = (try? taskContext.execute(request)) as? NSPersistentHistoryResult
        guard let transactions = result?.result as? [NSPersistentHistoryTransaction],
              !transactions.isEmpty
        else { return }

        // Object IDs of the Tags inserted by these transactions.
        var newTagObjectIDs = [NSManagedObjectID]()
        let tagEntityName = Tag.entity().name
        // Collect the .insert changes on Tag; these are the candidates for deduplication.
        for transaction in transactions {
            for change in transaction.changes ?? []
            where change.changedObjectID.entity.name == tagEntityName && change.changeType == .insert {
                newTagObjectIDs.append(change.changedObjectID)
            }
        }
        if !newTagObjectIDs.isEmpty {
            deduplicateAndWait(tagObjectIDs: newTagObjectIDs)
        }
        // Update the history token using the last transaction (safe: transactions is non-empty here).
        lastHistoryToken = transactions.last!.token
    }
}
Here we save the objectIDs of the added Tags so that we can deduplicate them in any other managed object context:
private func deduplicateAndWait(tagObjectIDs: [NSManagedObjectID]) {
    // Note: backgroundContext() and save(with:) are not part of Core Data itself;
    // they are assumed to be helper extensions like the ones in Apple's sample project.
    let taskContext = persistentContainer.backgroundContext()
    // Use performAndWait because each step relies on the sequence. Since historyQueue runs in the background, waiting won’t block the main queue.
    taskContext.performAndWait {
        tagObjectIDs.forEach { tagObjectID in
            self.deduplicate(tagObjectID: tagObjectID, performingContext: taskContext)
        }
        // Save the background context to trigger a notification and merge the result into the viewContext.
        taskContext.save(with: .deduplicate)
    }
}

private func deduplicate(tagObjectID: NSManagedObjectID, performingContext: NSManagedObjectContext) {
    // Look up the Tag by its objectID.
    guard let tag = performingContext.object(with: tagObjectID) as? Tag,
          let tagUUID = tag.uuid else {
        fatalError("###\(#function): Failed to retrieve a valid tag with ID: \(tagObjectID)")
    }
    // Fetch all Tags with the same uuid.
    let fetchRequest: NSFetchRequest<Tag> = Tag.fetchRequest()
    // Sort by lastUpdate, newest first, so the latest Tag comes first.
    fetchRequest.sortDescriptors = [NSSortDescriptor(key: "lastUpdate", ascending: false)]
    fetchRequest.predicate = NSPredicate(format: "uuid == %@", tagUUID)
    // Return if there are no duplicates.
    guard var duplicatedTags = try? performingContext.fetch(fetchRequest), duplicatedTags.count > 1 else {
        return
    }
    // Pick the first (most recently updated) Tag as the winner.
    guard let winner = duplicatedTags.first else {
        fatalError("###\(#function): Failed to retrieve the first duplicated tag")
    }
    duplicatedTags.removeFirst()
    remove(duplicatedTags: duplicatedTags, winner: winner, performingContext: performingContext)
}
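The hardest part (in my opinion) is handling the relationships of the duplicated objects that get deleted. Suppose our Tag object has a one-to-many relationship with a Category object (each Tag may have multiple Categories):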
private func remove(duplicatedTags: [Tag], winner: Tag, performingContext: NSManagedObjectContext) {
    duplicatedTags.forEach { tag in
        // Delete the Tag only AFTER its relationships have been handled,
        // and keep in mind that the relationship's delete rule also applies on deletion.
        defer { performingContext.delete(tag) }
        if let categorys = tag.categorys as? Set<Category> {
            for category in categorys {
                // Re-point each Category at the winning Tag; otherwise ofTag becomes nil when the duplicate Tag is deleted.
                category.ofTag = winner
            }
        }
    }
}
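One interesting thing is that if the Category objects are also added from the remote store, they might not exist yet when we handle the relationship, but that's another story.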
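One detail the code above leaves out is where lastHistoryToken lives between launches. A possible approach, sketched here under assumptions, is to archive the token to a file with NSKeyedArchiver. `HistoryTokenStore` and its file location are hypothetical; only NSPersistentHistoryToken and the archiver calls are framework API.

```swift
import CoreData

// Hypothetical helper for persisting the history token across launches.
struct HistoryTokenStore {
    /// File that holds the archived token; the caller chooses the location.
    let fileURL: URL

    /// Returns the stored token, or nil if nothing has been saved yet
    /// or the file cannot be decoded.
    func load() -> NSPersistentHistoryToken? {
        guard let data = try? Data(contentsOf: fileURL) else { return nil }
        return try? NSKeyedUnarchiver.unarchivedObject(
            ofClass: NSPersistentHistoryToken.self, from: data)
    }

    /// Archives the token and writes it atomically. A nil token is ignored.
    func save(_ token: NSPersistentHistoryToken?) throws {
        guard let token = token else { return }
        let data = try NSKeyedArchiver.archivedData(
            withRootObject: token, requiringSecureCoding: true)
        try data.write(to: fileURL, options: .atomic)
    }
}
```

With something like this, lastHistoryToken could be a property that calls load() once at startup and save(_:) in its didSet, so a relaunch resumes from the last processed transaction instead of re-reading the whole history.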