GCP Healthcare FHIR Ingestion is too slow
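I am trying to ingest 1 million FHIR JSON files (each only bytes in size) into a FHIR store in a Google Cloud Healthcare dataset. The ingestion takes a long time (more than an hour). Is there any way to speed up the Healthcare API?
Note: I also want to de-identify the data and export it to BigQuery after ingesting it, so the whole process takes more than 3 hours.
Thanks in advance.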
Some performance tips for bulk FHIR import in the Google Cloud Healthcare API:
- Make sure your input GCS bucket is in the same region as the healthcare dataset. Cross-region imports will be slower.
- Check your project quota. The relevant quota for bulk imports is "FHIR storage ingress in bytes per minute". You can request a quota increase if this becomes the limiting factor.
- Performance may vary depending on the overall load in the region you are using. us-central1 is a very popular region because it's referenced in the codelab; you might achieve higher throughput elsewhere (see https://cloud.google.com/healthcare/docs/concepts/regions for available regions).
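
To make the first two tips concrete, here is a minimal Python sketch. It does not call the Healthcare API. It only compares two location strings and does the arithmetic for a quota-bound import. The function names and all of the numbers (file size, quota value) are hypothetical placeholders; look up your real quota in the Cloud console.

```python
def same_location(bucket_location: str, dataset_location: str) -> bool:
    """Return True if the bucket and dataset location IDs match.

    The comparison ignores case, on the assumption that Cloud Storage
    may report locations in upper case (e.g. "US-CENTRAL1") while
    Healthcare dataset locations are written in lower case.
    """
    return bucket_location.strip().lower() == dataset_location.strip().lower()


def min_import_minutes(total_bytes: int, quota_bytes_per_minute: int) -> float:
    """Lower bound on import time if the ingress quota is the only limit."""
    if quota_bytes_per_minute <= 0:
        raise ValueError("quota_bytes_per_minute must be positive")
    return total_bytes / quota_bytes_per_minute


# Placeholder numbers: 1 million files of about 2,000 bytes each,
# and an assumed (not actual) quota of 500 MB per minute.
total_bytes = 1_000_000 * 2_000
assumed_quota = 500_000_000

print(same_location("US-CENTRAL1", "us-central1"))
print(min_import_minutes(total_bytes, assumed_quota))
```

If the lower bound comes out far below the import time you actually observe, the quota is probably not the bottleneck, and the region or load tips are more likely to help.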