Search and caching API with GCP/GAE

Question

If I use the ElasticSearch image from Bitnami in GCE, would I need a separate Memcached VM, or is caching with Memcached preferably achieved by other means (locally at the client or via a web cache), or is it even built into ElasticSearch? Should I rather extend the runtime with Elasticsearch and Memcached in a Docker container in the App Engine flexible environment, similar to this sample?

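The background is that I want to upgrade a project that started as a Python 2.7 Google App Engine webapp, but the Python 3 runtime of Google App Engine has deprecated the memcache API and the ndb search API. So I am considering instances in GCE with ElasticSearch and/or Memcached, which would let me split the services between a Python 3.8 App Engine webapp and some instances running ElasticSearch. I tried it and the experience was good.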

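For my purposes, I am also willing to consider alternatives to ElasticSearch (the web UI currently uses semantic-ui and custom JS). Migrating away from the user model of webapp2, we are going to use Firebase for user authentication and keep the data created with the Python App Engine ndb library, but we are considering dropping the NDB models, because what we mainly store is user profiles (which can now live in Firebase) and short-lived data (kept in the App Engine datastore). If this project were created from scratch today, I would probably use Firebase for everything and connect the frontend layer to it directly via APIs, but I know that if I use Firebase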

Answer 1

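I suggest optimizing your Elasticsearch setup before adding an extra caching layer. An additional caching layer raises costs as maintenance requirements grow, so the money and effort are better spent on optimizing Elasticsearch.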

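When optimizing Elasticsearch, consider how complex your queries are and how large a page size you need. Elasticsearch is very capable and can handle a large volume of requests, and with a managed Elasticsearch cluster from Google Marketplace you can easily add resilience and scalability. I suggest checking whether the pricing fits your requirements. If needed, you can now get consolidated billing through GCP Billing. See: https://console.cloud.google.com/marketplace/details/google/elasticsearch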

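I suggest loading your data into Elasticsearch and then load testing your Elasticsearch instance to see what throughput and response times you get. You can analyze your query performance with the Dev Tools in Kibana.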
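As a rough starting point for the measurement suggested above, here is a minimal Python sketch that replays one search request and summarizes the latencies. The base URL, index name, and query are placeholders, and the helper names (`summarize_latencies`, `time_search`) are hypothetical. It sends requests sequentially, so it measures response time rather than throughput under concurrent load; a dedicated load-testing tool would be needed for that.

```python
import json
import statistics
import time
import urllib.request


def summarize_latencies(latencies_ms):
    """Return median, nearest-rank 95th percentile and max of latencies in ms."""
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
        "max_ms": ordered[-1],
    }


def time_search(base_url, index, query, runs=50):
    """Send the same search `runs` times and summarize wall-clock latencies."""
    url = f"{base_url}/{index}/_search"
    payload = json.dumps(query).encode("utf-8")
    latencies = []
    for _ in range(runs):
        request = urllib.request.Request(
            url,
            data=payload,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        start = time.perf_counter()
        with urllib.request.urlopen(request) as response:
            response.read()  # include transfer time in the measurement
        latencies.append((time.perf_counter() - start) * 1000.0)
    return summarize_latencies(latencies)
```

With a cluster reachable at an assumed address, a call could look like `time_search("http://localhost:9200", "my_index", {"query": {"match_all": {}}})`.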

Elasticsearch query caching

Caching is enabled by default, but you can manage it per request through the query string. If set, it overrides the index-level setting:

GET /my_index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "colors"
      }
    }
  }
}

See: https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-request-cache.html
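From a Python 3 App Engine service, the same request could be built roughly as follows. This is a sketch using only the standard library; the function name `build_cached_search` and the base URL are assumptions, and an official Elasticsearch client could be used instead.

```python
import json
import urllib.parse
import urllib.request


def build_cached_search(base_url, index, body, use_cache=True):
    """Build a search request that sets request_cache in the query string."""
    params = urllib.parse.urlencode(
        {"request_cache": "true" if use_cache else "false"}
    )
    return urllib.request.Request(
        f"{base_url}/{index}/_search?{params}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same aggregation as in the console example above.
popular_colors = {
    "size": 0,
    "aggs": {"popular_colors": {"terms": {"field": "colors"}}},
}
request = build_cached_search("http://localhost:9200", "my_index", popular_colors)
# urllib.request.urlopen(request) would send it to a running cluster.
```

To see whether the cache is actually being hit, the shard request cache statistics are available via `GET /_stats/request_cache?human`.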

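Request compressed responses

Especially when your responses are large, you should request compressed responses, which helps improve throughput. Responses are not compressed unless the client asks for it. You can do this by adding the following header to your Elasticsearch query requests: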

Accept-Encoding: deflate, gzip
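A client that sends this header also has to decompress the body itself if its HTTP library does not do so automatically. The following sketch uses Python's standard library; `decode_body` and `search_compressed` are hypothetical helper names, the base URL is a placeholder, and the `deflate` branch assumes the server sends zlib-wrapped data.

```python
import gzip
import json
import urllib.request
import zlib


def decode_body(raw, content_encoding):
    """Decompress a response body according to its Content-Encoding header."""
    encoding = (content_encoding or "").lower()
    if encoding == "gzip":
        return gzip.decompress(raw)
    if encoding == "deflate":
        # Assumes zlib-wrapped deflate, not a raw deflate stream.
        return zlib.decompress(raw)
    return raw  # no compression was applied


def search_compressed(base_url, index, body):
    """Run a search while asking the server for a compressed response."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept-Encoding": "deflate, gzip",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        encoding = response.headers.get("Content-Encoding")
    return json.loads(decode_body(raw, encoding))
```

Higher-level HTTP clients often handle the decompression step transparently, in which case only the header needs to be set.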

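Managing shards and replicas effectively:

Depending on the kind of data you store in Elasticsearch and how you query it, you may need to optimize further. If your query performance is not good enough, you can profile and tune it. This is a good place to start: https://www.elastic.co/blog/advanced-tuning-finding-and-fixing-slow-elasticsearch-queries

Adding replicas is fairly simple, but changing the number of shards requires rebuilding (reindexing). So it is best to get this right before going live, at index creation time, e.g.: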

PUT /twitter
{
    "settings" : {
        "index" : {
            "number_of_shards" : 3, 
            "number_of_replicas" : 2 
        }
    }
}

Here is how to change the number of replicas for an index:

PUT /twitter/_settings
{
    "index" : {
        "number_of_replicas" : 2
    }
}
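If the replica count needs to be adjusted from application or deployment code rather than from a console, the same settings update can be issued from Python. This is only a sketch: `build_replica_update` is a hypothetical helper, and the base URL is a placeholder.

```python
import json
import urllib.request


def build_replica_update(base_url, index, replicas):
    """Build the PUT request that changes number_of_replicas for one index."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    body = {"index": {"number_of_replicas": replicas}}
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


update = build_replica_update("http://localhost:9200", "twitter", 2)
# urllib.request.urlopen(update) would apply the change on a running cluster.
```

Unlike the shard count, this setting can be changed on a live index, which is why only the replica update is shown as a runtime operation here.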

bitnami

elasticsearch

google-compute-engine

google-cloud-platform

google-app-engine-python