GCP Healthcare FHIR ingestion is too slow

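I am trying to ingest 1 million FHIR JSON files (each only bytes in size) into a FHIR store in a Google Cloud Healthcare dataset. The ingestion takes a long time (more than an hour). Is there any way to speed up the Healthcare API?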

Note: I also want to ingest, de-identify, and export to BigQuery, so the whole process takes more than 3 hours.

Thanks in advance.

Some performance tips for bulk FHIR import in the Google Cloud Healthcare API:

  • Make sure your input GCS bucket is in the same region as the healthcare dataset. Cross-region imports will be slower.
  • Check your project quota. The relevant quota for bulk imports is "FHIR storage ingress in bytes per minute". You can request a quota increase if this becomes the limiting factor.
  • Performance may vary depending on the overall load in the region you are using. us-central1 is a very popular region because it's referenced in the codelab; you might achieve higher throughput elsewhere (see https://cloud.google.com/healthcare/docs/concepts/regions for available regions).
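
To make the first two tips concrete, here is a rough Python sketch, not a definitive implementation. It compares the bucket and dataset locations, builds the URL and JSON body for the `fhirStores.import` REST method, and estimates a lower bound on import time from the ingress quota. The helper names and all project, dataset, store, bucket, and quota values are hypothetical placeholders. The sketch assumes the input is newline-delimited JSON with one resource per line (`contentStructure` of `RESOURCE`); adjust that to match how your files are actually laid out. It only builds the request, it does not send it.

```python
import json

# Base URL of the Cloud Healthcare REST API (v1).
HEALTHCARE_API = "https://healthcare.googleapis.com/v1"


def same_location(bucket_location, dataset_location):
    """Compare locations case-insensitively (GCS tends to report upper case)."""
    return bucket_location.strip().lower() == dataset_location.strip().lower()


def build_import_request(project, location, dataset, fhir_store, gcs_uri,
                         content_structure="RESOURCE"):
    """Return (url, body) for a fhirStores.import call. Nothing is sent."""
    url = (
        f"{HEALTHCARE_API}/projects/{project}/locations/{location}"
        f"/datasets/{dataset}/fhirStores/{fhir_store}:import"
    )
    body = {
        "contentStructure": content_structure,
        "gcsSource": {"uri": gcs_uri},
    }
    return url, body


def estimated_minutes(total_bytes, quota_bytes_per_minute):
    """Lower bound on import time if the ingress quota is the only limit."""
    return total_bytes / quota_bytes_per_minute


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    if not same_location("US-CENTRAL1", "us-central1"):
        print("Warning: bucket and dataset are in different locations")
    url, body = build_import_request(
        "my-project", "us-central1", "my-dataset", "my-fhir-store",
        "gs://my-bucket/fhir/*.ndjson",
    )
    print(url)
    print(json.dumps(body, indent=2))
    # Made-up numbers: 2 GB of input against a 500 MB/minute quota.
    print(estimated_minutes(2_000_000_000, 500_000_000), "minutes at best")
```

If the estimate from your real quota is far below the import time you observe, the quota is probably not the bottleneck, and region placement or regional load is the more likely cause.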