Traefik with Service Fabric -- failed to connect to Service Fabric server
I have deployed Traefik on my Azure Service Fabric cluster with the following configuration:
# Enable Service Fabric configuration backend
[servicefabric]
# Service Fabric Management Endpoint
clustermanagementurl = "https://localhost:19080"
# Service Fabric Management Endpoint API Version
apiversion = "3.0"
insecureSkipVerify = true
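However, when I open the Traefik dashboard I see a blank screen, because Traefik fails to map any of my Fabric applications.
Looking at the Traefik logs on one of my VMs, I see this error repeatedly: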
level=error msg="failed to connect to Service Fabric server Get https://localhost:19080/Applications/?api-version=3.0: x509: certificate is valid for <hidden>.eastus.cloudapp.azure.com, not localhost on https://localhost:19080/Applications/?api-version=3.0"
My Azure Service Fabric cluster has an SSL certificate signed by a trusted CA.
How can I solve this?
Edit 1:
In case it helps, this is the configuration Traefik loaded (according to the logs):
{
"LifeCycle": {
"RequestAcceptGraceTimeout": 0,
"GraceTimeOut": 0
},
"GraceTimeOut": 0,
"Debug": true,
"CheckNewVersion": true,
"AccessLogsFile": "",
"AccessLog": null,
"TraefikLogsFile": "",
"TraefikLog": null,
"LogLevel": "DEBUG",
"EntryPoints": {
"http": {
"Network": "",
"Address": ":80",
"TLS": null,
"Redirect": null,
"Auth": null,
"WhitelistSourceRange": null,
"Compress": false,
"ProxyProtocol": null,
"ForwardedHeaders": {
"Insecure": true,
"TrustedIPs": null
}
}
},
"Cluster": null,
"Constraints": [],
"ACME": null,
"DefaultEntryPoints": [
"http"
],
"ProvidersThrottleDuration": 2000000000,
"MaxIdleConnsPerHost": 200,
"IdleTimeout": 0,
"InsecureSkipVerify": true,
"RootCAs": null,
"Retry": null,
"HealthCheck": {
"Interval": 30000000000
},
"RespondingTimeouts": null,
"ForwardingTimeouts": null,
"Docker": null,
"File": null,
"Web": {
"Address": ":9000",
"CertFile": "",
"KeyFile": "",
"ReadOnly": false,
"Statistics": null,
"Metrics": null,
"Path": "/",
"Auth": null,
"Debug": false,
"CurrentConfigurations": null,
"Stats": null,
"StatsRecorder": null
},
"Marathon": null,
"Consul": null,
"ConsulCatalog": null,
"Etcd": null,
"Zookeeper": null,
"Boltdb": null,
"Kubernetes": null,
"Mesos": null,
"Eureka": null,
"ECS": null,
"Rancher": null,
"DynamoDB": null,
"ServiceFabric": {
"Watch": false,
"Filename": "",
"Constraints": null,
"Trace": false,
"DebugLogGeneratedTemplate": false,
"ClusterManagementURL": "https://localhost:19080",
"APIVersion": "3.0",
"UseCertificateAuth": false,
"ClientCertFilePath": "",
"ClientCertKeyFilePath": "",
"InsecureSkipVerify": true
}
}
Edit 2:
Someone suggested using my cluster's remote address instead of localhost. Doing so produces a different error:
Provider connection error: failed to connect to Service Fabric server Get https://<hidden>.eastus.cloudapp.azure.com:19080/Applications/?api-version=3.0: stream error: stream ID 1; HTTP_1_1_REQUIRED on https://<hidden>.eastus.cloudapp.azure.com:19080/Applications/?api-version=3.0; retrying in 656.765021ms
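To authenticate against the Service Fabric API you must use a certificate, and your configuration is missing that detail.
Your Traefik settings should contain something like this: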
# [serviceFabric.tls]
cert = "certs/servicefabric.crt"
key = "certs/servicefabric.key"
insecureskipverify = true
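The following post describes the process step by step.
Thanks to Diego's comment (under my question), I managed to solve this by adding the settings described below.
What was the problem?
- My SF cluster is secure and requires a client certificate to log in -- none was specified in the Traefik TOML file. (I wish the logged error had been more informative.)
Look at the Traefik logs, specifically the SF section (look for the trace that starts with Starting provider *servicefabric.Provider):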
"Watch": false,
"Filename": "",
"Constraints": null,
"Trace": false,
"DebugLogGeneratedTemplate": false,
"ClusterManagementURL": "https://localhost:19080",
"APIVersion": "3.0",
"UseCertificateAuth": false, <-------- Important
"ClientCertFilePath": "", <-------- Important
"ClientCertKeyFilePath": "", <-------- Important
"InsecureSkipVerify": false
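UseCertificateAuth -- whether Traefik uses a client certificate when querying the cluster management endpoint.
ClientCertFilePath -- path to the file containing the client certificate's public key.
ClientCertKeyFilePath -- path to the file containing the client certificate's private key.
(Both paths should be relative to traefik.exe.)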
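As a rough sketch, the three settings above might look like this in the TOML file. It assumes the TOML key names match the field names printed in the log, which may differ between Traefik builds. The certificate paths are placeholders, not values from the original setup.

```toml
[servicefabric]
clustermanagementurl = "https://localhost:19080"
apiversion = "3.0"

# Present a client certificate to the secure cluster's management endpoint.
UseCertificateAuth = true
# Hypothetical paths, resolved relative to traefik.exe.
ClientCertFilePath = "certs/client.crt"
ClientCertKeyFilePath = "certs/client.key"
```

After redeploying, the Starting provider *servicefabric.Provider trace should show UseCertificateAuth as true and both paths filled in.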
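InsecureSkipVerify
Traefik's SF configuration (shown above) includes a setting named InsecureSkipVerify.
InsecureSkipVerify -- if set to false, Traefik will reject the connection to the management endpoint unless the SSL certificate it presents is signed by a trusted CA and is valid for the host name in the URL.
- This can be a problem if the certificate was issued for the remote address while Traefik uses https://localhost as the cluster endpoint, because Traefik will then print an error like this: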
failed to connect to Service Fabric server Get https://localhost:19080/Applications/?api-version=3.0: x509: certificate is valid for .eastus.cloudapp.azure.com, not localhost
To overcome this, you can either:
- set InsecureSkipVerify = true and redeploy, or
- set the management endpoint to the remote address:
clustermanagementurl = "https://<hidden>.eastus.cloudapp.azure.com:19080"
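If you would rather not disable verification, the second option can be sketched as follows. This is an untested assumption rather than a confirmed working setup: it keeps verification on and relies on the host name in the URL matching the certificate. The `<hidden>` placeholder is kept from the original post.

```toml
[servicefabric]
# The host name matches the name the certificate was issued for,
# so the x509 "not localhost" error should not occur.
clustermanagementurl = "https://<hidden>.eastus.cloudapp.azure.com:19080"
apiversion = "3.0"
# Leave certificate verification enabled.
InsecureSkipVerify = false
```

On a secure cluster the client-certificate settings are still needed with either option.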
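Thanks again to Diego for the hint that helped me understand the issue and share the explanation above.
I know this is an old post, but we just ran into this exact situation, and this is the only place I have seen the client settings mentioned. Here is the provider section that finally seemed to work for us: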
################################################################
# Service Fabric provider
################################################################
# Enable Service Fabric configuration backend
[servicefabric]
# Service Fabric Management Endpoint
clustermanagementurl = "https://localhost:19080"
# Note: use "https://localhost:19080" if you're using a secure cluster
# Service Fabric Management Endpoint API Version
apiversion = "3.0"
# Enable TLS connection.
#
# Optional
#
[serviceFabric.tls]
cert = "certs/servicefabric.crt"
key = "certs/servicefabric.key"
insecureskipverify = true
UseCertificateAuth = true
ClientCertFilePath = "certs/traefik.crt"
ClientCertKeyFilePath = "certs/traefik.key"
InsecureSkipVerify = true