Gcloud kubernetes/docker deploy works but stops responding after 10 minutes

Question

I have a deployment on Google Container Engine, and it is showing strange behavior that I really don't know how to debug. I'm deploying a Ruby on Rails application using Docker and Kubernetes.

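I basically followed this tutorial: https://cloud.google.com/container-engine/docs/tutorials/hello-node#step_2_create_a_docker_container_image, skipping the part about spinning up replicas, and it worked. After building and deploying, I can go to the external IP and my application behaves as expected. However, after about 10 minutes it stops: requests just spin forever.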

I found the log files relatively unhelpful. The only thing I see that might be a hint is the following:

{
"log": "2015/11/10 05:35:18 Worker running nslookup kubernetes.default.svc.cluster.local localhost >/dev/null\n",
"stream": "stderr"
}

{
"log": "2015/11/10 05:35:19 Client ip xx.xxx.x.x:xxxxx requesting    /healthz probe servicing cmd nslookup kubernetes.default.svc.cluster.local     localhost >/dev/null\n",
"stream": "stderr"
}

I have gone through most of the debugging suggestions on this page: https://cloud.google.com/container-engine/docs/debugging/

kubectl logs ${pod}:

[2015-11-10 05:07:44] INFO  WEBrick 1.3.1
[2015-11-10 05:07:44] INFO  ruby 2.1.6 (2015-04-13) [x86_64-linux]
[2015-11-10 05:07:44] INFO  WEBrick::HTTPServer#start: pid=1 port=80

Disturbingly, kubectl logs $pod $instance returns:

Container "x" not found in Pod "x"

The Dockerfile is almost straight from Google:

FROM google/ruby

# [START postgres-dep]
RUN apt-get update && \
apt-get install -qy --no-install-recommends libpq-dev && \
apt-get clean
# [END postgres-dep]

ENV RACK_ENV production

WORKDIR /app
ADD Gemfile /app/Gemfile
ADD Gemfile.lock /app/Gemfile.lock
RUN /usr/bin/bundle install --deployment --without development:test
ADD . /app
RUN bundle exec rake assets:precompile
RUN bundle exec rake db:migrate
EXPOSE 8080
ENV RACK_ENV production
CMD ["/usr/bin/bundle", "exec", "rackup", "-p", "80", "/app/config.ru", "-s", "webrick", "-E", "production"]

ping returns the following:

PING xxxxx (xxxxx): 56 data bytes
64 bytes from xxxx: icmp_seq=0 ttl=49 time=48.462 ms
64 bytes from xxxx: icmp_seq=1 ttl=49 time=48.177 ms
64 bytes from 1xxxx: icmp_seq=2 ttl=49 time=48.181 ms
64 bytes from xxxx: icmp_seq=3 ttl=49 time=48.240 ms
64 bytes from 1xxxxx: icmp_seq=4 ttl=49 time=48.337 ms
64 bytes from xxxxx: icmp_seq=5 ttl=49 time=48.149 ms 
64 bytes from xxxxx: icmp_seq=6 ttl=49 time=48.053 ms
64 bytes from xxxx: icmp_seq=7 ttl=49 time=47.958 ms
64 bytes from xxxxx: icmp_seq=8 ttl=49 time=48.137 ms

The latency looks bad. After what feels like forever, the request does eventually land on the red Rails screen of death.

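Questions: Where are my darn application logs? I don't see anything Rails-related in the developer console, and I can't find them over SSH either. I'm somewhat assuming this is a balancer/pod configuration problem, but it would be nice to know either way. Why does it work initially and after a while stop functioning? Where should I start troubleshooting this kind of behavior when everything shows green and there are no critical logs?
Are rolling updates (https://cloud.google.com/container-engine/docs/rolling-updates) the process for re-deploying code changes without spinning up and down/re-creating everything? Thanks in advance.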

Answer 1

Where are my darn application logs?

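kubectl logs will grab everything written to stdout / stderr. If your application logs to a file, you will need to look at that file directly to see your logs. Try using kubectl exec to get a shell in your pod, then use your favorite tools (cat, grep, less, etc.) to look at the log file (if you haven't already seen it, check out this blog post for some neat uses of kubectl, including examples of kubectl exec).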
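
The steps above could be wrapped in a few small shell helpers. This is only a sketch under assumptions the question does not confirm: the function names are made up for illustration, the log path /app/log/production.log is a guess based on WORKDIR /app and RACK_ENV production in the Dockerfile plus Rails' default of logging to log/<environment>.log, and /bin/bash is assumed to exist in the image.

```shell
# Hypothetical helpers; names and the default log path are assumptions.
# /app comes from WORKDIR in the Dockerfile; log/production.log is the
# Rails default log location for the production environment.
RAILS_LOG_PATH="${RAILS_LOG_PATH:-/app/log/production.log}"

# Print whatever the container wrote to stdout/stderr.
pod_stdout_logs() {
  kubectl logs "$1"
}

# Print the last N lines (default 100) of the Rails log file inside the pod.
pod_rails_log_tail() {
  kubectl exec "$1" -- tail -n "${2:-100}" "$RAILS_LOG_PATH"
}

# Open an interactive shell in the pod (assumes /bin/bash is in the image).
pod_shell() {
  kubectl exec -it "$1" -- /bin/bash
}
```

After finding the pod name with kubectl get pods, something like pod_rails_log_tail <pod-name> 200 should show the most recent Rails log lines, provided the app really logs to that path.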

Why does it work initially and after a while stop functioning?

That probably depends on your application. Once you have the logs, you should be able to tell.

Are rolling updates the process for re-deploying code changes without spinning up and down/re creating everything?

Yes.
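
As a rough illustration of what a rolling update can look like from the command line, here is a sketch that assumes the app is managed by a Kubernetes Deployment. The question does not say that, and the linked Container Engine page may describe a different workflow. The function name and all of its arguments are placeholders.

```shell
# Hypothetical helper: point a Deployment's container at a new image and
# wait for the rollout to finish. Assumes a Deployment already exists.
# Arguments: <deployment-name> <container-name> <new-image>
rolling_redeploy() {
  kubectl set image "deployment/$1" "$2=$3" &&
    kubectl rollout status "deployment/$1"
}
```

Called as rolling_redeploy <deployment> <container> <image:tag>, this replaces pods gradually rather than tearing everything down and re-creating it.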

ruby-on-rails

docker

google-cloud-platform

kubernetes

google-kubernetes-engine