Reducing cost and space for gcloud logging

Question

I created a log sink to capture the logs generated by the components used in our project. The sink details are given below:

gcloud logging sinks describe test-project-instance-activity

bigqueryOptions:
  usePartitionedTables: true
  usesTimestampColumnPartitioning: true
createTime: '2021-10-17T05:15:48.434334305Z'
description: test sink to capture the instance activities
destination: bigquery.googleapis.com/projects/test-project/datasets/test_logging
filter: |-
  resource.type = cloud_composer_environment OR
  resource.type = cloud_dataproc_cluster OR
  resource.type = gce_disk OR
  resource.type = gce_vm_instance OR
  resource.type = gke_container OR
  resource.type = k8s_cluster
name: test-project-instance-activity
updateTime: '2021-10-17T05:15:48.434334305Z'
writerIdentity: serviceAccount:p121-639060@gcp-sa-logging.iam.gserviceaccount.com

I am capturing the log details in a BigQuery dataset, where the sink creates the following list of tables:

SELECT table_id FROM `test-project.test_logging`.__TABLES__;

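I checked and found that most of the tables contain INFO logs, and they generate a huge volume of logs for any activity around these Google APIs. Do we really need this many INFO logs? What would be the best way to exclude or filter them?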

Exclusion filter(s):
resource.type="container"
severity="INFO"

As per the Google docs: Logs are excluded after they are received by the Logging API

Does it mean that I can only save space where I am keeping my excluded INFO log entries, e.g. GCS or BigQuery?

Or do I need to change my application code to reduce what it reports to logging? Or could the change be made in the airflow.cfg file?

Any pointers to SQL queries for analyzing these log tables?

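Just a summary, in case it helps: we run Airflow DAGs that ingest data from GCS buckets into BigQuery and use Spark to do some aggregation on it, and we ingest a large volume of data every 15 minutes throughout the day.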

Please suggest how to minimize the logging cost. We generate a huge volume of logs every month.

Do we get billed for the _Default log bucket as well? What am I going to miss if I disable it?

Answer 1

It's hard to answer your questions; it depends a lot on what you do and what you need!

Do we really need this many INFO logs?

I don't know. Do you use them? If not, you can skip them.

what would be the best way to exclude or filter them?

In your sink filter, you can add severity>"INFO" (excludes entries of severity INFO and below) or severity!="INFO" (excludes only the INFO entries).
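As a minimal sketch of that suggestion, the sink from the question could be updated so its filter keeps the same resource types but drops INFO and below. The sink name and resource types are taken from the question; the parentheses around the OR chain are added here for readability, and the APPLY switch is a hypothetical safety guard, not a gcloud feature. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sink name as shown in the question's `gcloud logging sinks describe` output.
SINK="test-project-instance-activity"

# Resource types copied from the sink's current filter.
RESOURCE_FILTER='resource.type = cloud_composer_environment OR
resource.type = cloud_dataproc_cluster OR
resource.type = gce_disk OR
resource.type = gce_vm_instance OR
resource.type = gke_container OR
resource.type = k8s_cluster'

# Same resource types, restricted to severities above INFO.
FILTER="(${RESOURCE_FILTER}) AND severity>\"INFO\""

# APPLY is a hypothetical guard: set APPLY=1 to really update the sink,
# otherwise the command is only printed for review.
if [ "${APPLY:-0}" = "1" ]; then
  gcloud logging sinks update "$SINK" --log-filter="$FILTER"
else
  printf "gcloud logging sinks update %s --log-filter='%s'\n" "$SINK" "$FILTER"
fi
```

Note that this replaces the sink's whole filter, so the resource types have to be repeated alongside the new severity condition.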

as per google docs: Logs are excluded after they are received by the Logging API Does it mean that I can only save the space where I am keeping my excluded INFO logs entries..

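It means that the logs arrive in Cloud Logging and are then routed. Some filters can include them, others can exclude them. In other words, exclusions are configured per route (sink)!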
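Because exclusions are set per sink, the same exclusion has to be added to every sink that should drop INFO entries. The sketch below assumes the two sinks involved are _Default and the one from the question, and that an exclusion named exclude-info (a hypothetical name) is wanted on both. It only prints the commands unless the hypothetical APPLY guard is set to 1.

```shell
#!/bin/sh
# Sinks that should drop INFO entries (assumption: these two are the relevant routes).
SINKS="_Default test-project-instance-activity"

# Hypothetical exclusion name; the filter matches INFO entries only.
EXCLUSION_NAME="exclude-info"
EXCLUSION_FILTER='severity="INFO"'

for SINK in $SINKS; do
  if [ "${APPLY:-0}" = "1" ]; then
    # Adds an exclusion to this sink without touching its inclusion filter.
    gcloud logging sinks update "$SINK" \
      --add-exclusion="name=${EXCLUSION_NAME},filter=${EXCLUSION_FILTER}"
  else
    printf "gcloud logging sinks update %s --add-exclusion='name=%s,filter=%s'\n" \
      "$SINK" "$EXCLUSION_NAME" "$EXCLUSION_FILTER"
  fi
done
```

Entries excluded this way are still received by the Logging API, as the quoted documentation says; they are simply not written to that sink's destination.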

Do we get billed for _Default log bucket as well?

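Yes, all the logs that match the default filter go into the _Default bucket. However, the pricing changed recently, and you are not billed for storage during the default retention period (30 days).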

what I am going to miss if I disable it.

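If you have your own filters and buckets, and they are correct and sufficient, you won't miss anything. It depends on what you do and what you need.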
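The question also asks for SQL pointers for analyzing the exported log tables, which the answer above does not cover. As a rough starting point, the sketch below uses the bq CLI to list the tables by size and then break one table down by severity. It assumes bq is installed and authenticated, that the exported tables carry the usual severity and timestamp columns, and that TABLE is replaced with a real table name from the dataset (your_log_table is a placeholder). The APPLY guard is hypothetical; without it the queries are only printed.

```shell
#!/bin/sh
# Placeholder: replace with one of the table_id values from the dataset.
TABLE="your_log_table"

# Which exported tables hold the most rows and bytes.
QUERY_SIZES='SELECT table_id, row_count, size_bytes
FROM `test-project.test_logging`.__TABLES__
ORDER BY size_bytes DESC'

# Entries per severity over the last 7 days in one table.
QUERY_SEVERITY="SELECT severity, COUNT(*) AS entries
FROM \`test-project.test_logging.${TABLE}\`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY severity
ORDER BY entries DESC"

# APPLY is a hypothetical guard: set APPLY=1 to run the queries with bq.
if [ "${APPLY:-0}" = "1" ]; then
  bq query --use_legacy_sql=false "$QUERY_SIZES"
  bq query --use_legacy_sql=false "$QUERY_SEVERITY"
else
  printf '%s\n\n%s\n' "$QUERY_SIZES" "$QUERY_SEVERITY"
fi
```

If the INFO rows dominate the largest tables and nobody queries them, that is a reasonable signal that they can be filtered out at the sink.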

google-cloud-platform

google-cloud-logging