gcloud: Cloud Run deployment to GKE fails
I am trying to deploy a sample Angular application to GKE. I created a sample cluster with Cloud Run and Istio enabled:
gcloud beta container clusters create new-cluster \
--addons=HorizontalPodAutoscaling,HttpLoadBalancing,Istio,CloudRun \
--machine-type=n1-standard-2 \
--cluster-version=latest \
--zone=us-east1-b \
--enable-stackdriver-kubernetes --enable-ip-alias \
--scopes cloud-platform --num-nodes 4 --disk-size "10" --image-type "COS"
Below is my cloudbuild.yaml file:
steps:
  # build the container image
  - name: gcr.io/cloud-builders/docker
    args: [ build, -t, gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01, . ]
  # push the container image to Container Registry
  - name: gcr.io/cloud-builders/docker
    args: [ push, gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01 ]
  # Deploy container image to Cloud Run
  - name: gcr.io/cloud-builders/gcloud
    args: [ beta, run, deploy, feedback-ui-deploy-anthos, --image, gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01, --platform, gke, --cluster, cloudrun-angular-cluster, --cluster-location, us-central1-a ]
images:
  - gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01
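I have set the environment variable for the gcloud project. Now, whenever I try to deploy to the GKE cluster created above, I always end up with a "no ready Revision" error: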
Deploying new service... Configuration "service-1" does not have any ready Revision.
- Creating Revision...
X Routing traffic... Configuration "service-1" does not have any ready Revision.
This is the command I use to deploy to Cloud Run:
gcloud beta run deploy --platform gke --cluster new-cluster --image gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01 --cluster-location us-east1-b
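Deploying to fully managed Cloud Run works perfectly, but when I deploy to an existing GKE cluster I end up with this error.
I read through the documentation, which says that a revision is created automatically for a new service, so I am not sure why that is not happening for mine.
Edit:
Here is the kubectl describe output. I deleted all the clusters and created a new one, but the result is still the same.
This is what I get when I describe the service.
Note: I am using the default namespace. I am not sure whether that has anything to do with this issue.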
Status:
Conditions:
Last Transition Time: 2019-12-04T12:49:59Z
Message: Revision "gke-service-00001-pef" failed with message: Container failed with: nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (2: No such file or directory)
2019/12/04 12:49:40 [emerg] 1#1: open() "/var/log/nginx/error.log" failed (2: No such file or directory)
.
Reason: RevisionFailed
Status: False
Type: ConfigurationsReady
Last Transition Time: 2019-12-04T12:49:59Z
Message: Configuration "gke-service" does not have any ready Revision.
Reason: RevisionMissing
Status: False
Type: Ready
Last Transition Time: 2019-12-04T12:49:59Z
Message: Configuration "gke-service" does not have any ready Revision.
Reason: RevisionMissing
Status: False
Type: RoutesReady
Latest Created Revision Name: gke-service-00001-pef
Observed Generation: 1
URL: http://gke-service.default.example.com
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 2m21s service-controller Created Configuration "gke-service"
Normal Created 2m21s service-controller Created Route "gke-service"
Normal Updated 20s (x5 over 2m21s) service-controller Updated Service "gke-service"
Since I serve the Angular index.html file through nginx, this is my configuration:
server {
listen 8080 default_server;
sendfile on;
default_type application/octet-stream;
gzip on;
gzip_http_version 1.1;
gzip_disable "MSIE [1-6]\.";
gzip_min_length 1100;
gzip_vary on;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript;
gzip_comp_level 9;
root /usr/share/nginx/html;
location / {
try_files $uri $uri/ /index.html =404;
#proxy_pass: "http://localhost:8080/AdTechUIContent"
#uncomment to include naxsi rules
#include /etc/nginx/naxsi.rules
}
}
This works fine when I build the Docker image locally, and I am able to access the app. Just in case, here is my Dockerfile:
FROM node:12.13-alpine as app-ui-builder
#Now install angular cli globally
RUN npm install -g @angular/cli@8.3.14
#RUN npm config set registry https://registry.cnpmjs.org
#Install git and openssh because the alpine image doesn't have git, and some npm modules have dependencies that are hosted in git,
#so we need git to be able to fetch them
RUN apk add --update git openssh
RUN mkdir ./app
COPY package*.json /app/
WORKDIR ./app
COPY . .
RUN npm cache clear --force && npm i
RUN ls && $(npm bin)/ng build --prod
FROM nginx:1.17.5-alpine AS nginx-builder
RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/*
COPY app-ui-nginx.conf /etc/nginx/conf.d
RUN rm -rf /usr/share/nginx/html/*
COPY --from=app-ui-builder /app/dist/app-ui /usr/share/nginx/html
RUN ls /usr/share/nginx/html
RUN chmod -R a+r /usr/share/nginx/html
EXPOSE 8080
#
CMD ["nginx", "-g", " daemon off;"]
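@AhmetB, can you tell me why nginx is throwing an error here?
Edit:
I did try deploying the app with plain kubectl commands, using a Deployment and a Service, and it works fine. So I am not sure which Cloud Run contract it violates that makes nginx log this error, even though the file can be found.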
Does your cluster have any role-based access control or Storage permissions configured? I also suggest that you verify the permissions required to deploy to Cloud Run for Anthos.
Check that you have the Storage permission and scopes.
Deploying new service... Configuration "service-1" does not have any ready Revision.
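This error means the service was deployed, but the pod crashed or was not scheduled for some reason. This can happen for various reasons, such as insufficient CPU/memory on the nodes, a failure to pull the image from GCR, or the application crashing.
Look at the "kubectl logs" and "kubectl describe" output for your application. Try:
- kubectl get ksvc
- kubectl get pods
- kubectl describe ksvc NAME
- kubectl logs NAME -c user-container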
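The describe output tends to repeat the same condition message several times, so a small filter can make the root cause easier to spot. The following is only a sketch: `ksvc_failure_messages` is a hypothetical helper name, and it assumes the `Message:` layout shown in the describe output earlier in this thread.

```shell
# Hypothetical helper: read `kubectl describe ksvc NAME` output on stdin
# and print each distinct condition message once, in order of appearance.
ksvc_failure_messages() {
  # Keep only the text after the leading "Message:" label on each line.
  sed -n -E 's/^[[:space:]]*Message:[[:space:]]*//p' |
    # Drop repeated messages while preserving the original order.
    awk '!seen[$0]++'
}

# Usage (requires kubectl access to the cluster):
#   kubectl describe ksvc NAME | ksvc_failure_messages
```

For the output shown in the question, this would surface the nginx "could not open error log file" message ahead of the generic "does not have any ready Revision" message.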
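I found the problem. It looks like the log files (the error and access logs) need to be created in a custom folder so that Cloud Run can access them. Cloud Run checks whether these folders are available before starting the revision. With my old nginx configuration file, the custom folder was never created. I have now modified the nginx conf file and redeployed, and it works fine.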
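Since the failure comes down to nginx not being able to open its log files, it can help to confirm that every log directory named in the config actually exists in the image. Below is a minimal sketch, not part of the original fix: `nginx_log_dirs` and `check_log_dirs` are hypothetical helper names, and the script assumes that `error_log` and `access_log` directives each sit on one line with an absolute path.

```shell
# Hypothetical helper: print the parent directory of every absolute
# error_log / access_log path in the given nginx config file.
nginx_log_dirs() {
  # Commented-out directives and values such as "off" or "stderr" are skipped,
  # because only lines whose second field starts with "/" are matched.
  awk '($1 == "error_log" || $1 == "access_log") && $2 ~ /^\// {
         sub(/;$/, "", $2); print $2
       }' "$1" |
    while IFS= read -r path; do
      dirname "$path"
    done | sort -u
}

# Hypothetical helper: report whether each of those directories exists.
# Returns non-zero if at least one directory is missing.
check_log_dirs() {
  status=0
  for dir in $(nginx_log_dirs "$1"); do
    if [ -d "$dir" ]; then
      echo "ok: $dir"
    else
      echo "missing: $dir"
      status=1
    fi
  done
  return "$status"
}

# Usage, e.g. inside the built image:
#   check_log_dirs /etc/nginx/nginx.conf
```

Running the check inside the container (for example through `docker run` with a shell as the entrypoint) is one way to catch a missing directory before deploying.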
I created two files:
nginx.conf
user nginx;
worker_processes 1;
error_log /var/logs/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/logs/nginx/access.log main;
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
default.conf
server {
listen 8080;
server_name localhost;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
}
I also modified the Dockerfile:
FROM node:12.13-alpine as app-ui-builder
RUN npm install -g @angular/cli@8.3.14
RUN apk add --update git openssh
RUN mkdir ./app
COPY package*.json /app/
WORKDIR ./app
COPY . .
RUN npm cache clear --force && npm i
RUN ls && $(npm bin)/ng build --prod
FROM nginx:alpine AS nginx-builder
RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/*
#RUN rm -rf /etc/nginx/conf.d/*
RUN mkdir /var/logs
RUN mkdir /var/logs/nginx
COPY ./docker/nginx.conf /etc/nginx/
## Copy a new configuration file setting listen port to 8080
COPY ./docker/default.conf /etc/nginx/conf.d/
RUN rm -rf /usr/share/nginx/html/*
#
COPY --from=app-ui-builder /app/dist/app-ui /usr/share/nginx/html
EXPOSE 8080
CMD ["nginx", "-g", " daemon off;"]
I found the fix through this Medium post.