Mirror canary deployment in RTL?
I'm new to canary deployments. We are about to start doing canary deployments via Istio.
My assumption was that this is purely a deployment mechanism, perhaps with some Istio routing tests in pre-production, but that in the earlier test environments we would fence off to the version under test, just as we do today.
It has been suggested that the canary concept be applied to all test environments, so that we effectively canary-test every version of the product we want on the Route To Live.
Wondering what approach others take?
Mirroring
As mentioned here:
Using Istio, you can use traffic mirroring to duplicate traffic to another service. You can incorporate a traffic mirroring rule as part of a canary deployment pipeline, allowing you to analyze a service's behavior before sending live traffic to it.
If you are looking for best practices, I'd suggest starting with the Medium tutorial, as it is explained well there.
How traffic mirroring works
Traffic mirroring works using the steps below:
1. You deploy a new version of the application and switch on traffic mirroring.
2. The old version responds to requests like before but also sends an asynchronous copy to the new version.
3. The new version processes the traffic but does not respond to the user.
4. The operations team monitor the new version and report any issues to the development team.
As the application processes live traffic, it helps the team uncover issues that they would typically not find in a pre-production environment. You can use monitoring tools, such as Prometheus and Grafana, for recording and monitoring your test results.
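For orientation, below is a minimal sketch of what such a mirror rule might look like in an Istio VirtualService. The helloworld host and the v1/v2 subsets are illustrative assumptions (the subsets would be defined in a corresponding DestinationRule); see the official mirroring task linked at the end for a complete, working example.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
    - route:
        - destination:
            host: helloworld
            subset: v1        # live traffic keeps going to the old version
          weight: 100
      mirror:
        host: helloworld
        subset: v2            # a copy of each request is sent to the new version
      mirrorPercentage:
        value: 100.0          # mirror all requests; lower this to sample a fraction
```

Responses from the mirrored (v2) destination are discarded, so users only ever see the v1 responses while you watch the new version's metrics and logs.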
In addition, there is an nginx example that shows nicely how it should work.
Canary deployment
As mentioned here:
One of the benefits of the Istio project is that it provides the control needed to deploy canary services. The idea behind canary deployment (or rollout) is to introduce a new version of a service by first testing it using a small percentage of user traffic, and then if all goes well, increase, possibly gradually in increments, the percentage while simultaneously phasing out the old version. If anything goes wrong along the way, we abort and rollback to the previous version. In its simplest form, the traffic sent to the canary version is a randomly selected percentage of requests, but in more sophisticated schemes it can be based on the region, user, or other properties of the request.
Depending on your level of expertise in this area, you may wonder why Istio’s support for canary deployment is even needed, given that platforms like Kubernetes already provide a way to do version rollout and canary deployment. Problem solved, right? Well, not exactly. Although doing a rollout this way works in simple cases, it’s very limited, especially in large scale cloud environments receiving lots of (and especially varying amounts of) traffic, where autoscaling is needed.
There is a difference between a k8s canary deployment and an istio canary deployment:
k8s
As an example, let’s say we have a deployed service, helloworld version v1, for which we would like to test (or simply rollout) a new version, v2. Using Kubernetes, you can rollout a new version of the helloworld service by simply updating the image in the service’s corresponding Deployment and letting the rollout happen automatically. If we take particular care to ensure that there are enough v1 replicas running when we start and pause the rollout after only one or two v2 replicas have been started, we can keep the canary’s effect on the system very small. We can then observe the effect before deciding to proceed or, if necessary, rollback. Best of all, we can even attach a horizontal pod autoscaler to the Deployment and it will keep the replica ratios consistent if, during the rollout process, it also needs to scale replicas up or down to handle traffic load.
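As a rough sketch of that flow, assuming a Deployment named helloworld currently running helloworld:v1, changing the image in the pod template and re-applying the manifest is all it takes to start the rolling update:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 10                     # keep enough v1 replicas running at the start
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: helloworld
          image: helloworld:v2     # changed from helloworld:v1; applying this triggers the rollout
```

You would then pause the rollout once one or two v2 replicas have started (for example with `kubectl rollout pause deployment/helloworld`), observe, and either resume it or roll back with `kubectl rollout undo`.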
Although fine for what it does, this approach is only useful when we have a properly tested version that we want to deploy, i.e., more of a blue/green, a.k.a. red/black, kind of upgrade than a “dip your feet in the water” kind of canary deployment. In fact, for the latter (for example, testing a canary version that may not even be ready or intended for wider exposure), the canary deployment in Kubernetes would be done using two Deployments with common pod labels. In this case, we can’t use autoscaling anymore because it’s now being done by two independent autoscalers, one for each Deployment, so the replica ratios (percentages) may vary from the desired ratio, depending purely on load.
Whether we use one deployment or two, canary management using deployment features of container orchestration platforms like Docker, Mesos/Marathon, or Kubernetes has a fundamental problem: the use of instance scaling to manage the traffic; traffic version distribution and replica deployment are not independent in these systems. All replica pods, regardless of version, are treated the same in the kube-proxy round-robin pool, so the only way to manage the amount of traffic that a particular version receives is by controlling the replica ratio. Maintaining canary traffic at small percentages requires many replicas (e.g., 1% would require a minimum of 100 replicas). Even if we ignore this problem, the deployment approach is still very limited in that it only supports the simple (random percentage) canary approach. If, instead, we wanted to limit the visibility of the canary to requests based on some specific criteria, we still need another solution.
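A sketch of that two-Deployment pattern, with illustrative names and replica counts: both Deployments carry the common label app: helloworld, so the Service load-balances across all pods and the canary's share of traffic is dictated entirely by the replica ratio (here roughly 10%).

```yaml
apiVersion: v1
kind: Service
metadata:
  name: helloworld
spec:
  selector:
    app: helloworld                # matches pods from both Deployments below
  ports:
    - port: 5000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v1
spec:
  replicas: 9                      # 9 of 10 pods -> ~90% of requests
  selector:
    matchLabels:
      app: helloworld
      version: v1
  template:
    metadata:
      labels:
        app: helloworld
        version: v1
    spec:
      containers:
        - name: helloworld
          image: helloworld:v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v2
spec:
  replicas: 1                      # 1 of 10 pods -> ~10% of requests
  selector:
    matchLabels:
      app: helloworld
      version: v2
  template:
    metadata:
      labels:
        app: helloworld
        version: v2
    spec:
      containers:
        - name: helloworld
          image: helloworld:v2
```

Keeping the canary at, say, 1% would require 99 v1 replicas for every v2 replica, which is exactly the limitation described above.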
istio
With Istio, traffic routing and replica deployment are two completely independent functions. The number of pods implementing services are free to scale up and down based on traffic load, completely orthogonal to the control of version traffic routing. This makes managing a canary version in the presence of autoscaling a much simpler problem. Autoscalers may, in fact, respond to load variations resulting from traffic routing changes, but they are nevertheless functioning independently and no differently than when loads change for other reasons.
Istio’s routing rules also provide other important advantages; you can easily control fine-grained traffic percentages (e.g., route 1% of traffic without requiring 100 pods) and you can control traffic using other criteria (e.g., route traffic for specific users to the canary version). To illustrate, let’s look at deploying the helloworld service and see how simple the problem becomes.
There is an example.
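For orientation, here is a minimal sketch of the Istio routing rules for such a canary. The helloworld host and the subset labels are illustrative assumptions; the point is that only the VirtualService weights change as the canary is promoted, independently of how many pods back each version.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
    - route:
        - destination:
            host: helloworld
            subset: v1
          weight: 99               # stable version keeps 99% of requests
        - destination:
            host: helloworld
            subset: v2
          weight: 1                # canary gets 1%, regardless of replica counts
```

A match clause on headers or cookies could be added to send only specific users to the canary instead of a random percentage of requests.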
Other resources you may also want to look at regarding traffic mirroring in istio:
- https://istio.io/latest/docs/tasks/traffic-management/mirroring/
- https://itnext.io/use-istio-traffic-mirroring-for-quicker-debugging-a341d95d63f8
- https://dev.to/peterj/mirroring-traffic-with-istio-service-mesh-2cm4
- https://livebook.manning.com/book/istio-in-action/chapter-5/v-7/130
- https://istio.io/latest/docs/tasks/traffic-management/traffic-shifting/#apply-weight-based-routing