Why can't I connect to this Service Fabric cluster?

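I'm blocked by an error when using the Connect-ServiceFabricCluster PowerShell command to connect to a remote Service Fabric cluster deployed on-premises (not on Azure) and running on a networked virtual machine: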

WARNING: Failed to contact Naming Service. Attempting to contact Failover Manager Service...
WARNING: Failed to contact Failover Manager Service, Attempting to contact FMM...
False
WARNING: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 192.168.1.102:19000
Connect-ServiceFabricCluster : No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.
At Install.ps1:3 char:1
+ Connect-ServiceFabricCluster -ConnectionEndpoint "FABRICTESTSRV:19000" -WindowsCred ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Connect-ServiceFabricCluster], FabricException
    + FullyQualifiedErrorId : TestClusterConnectionErrorId,Microsoft.ServiceFabric.Powershell.ConnectCluster

The command is:

Connect-ServiceFabricCluster -ConnectionEndpoint "FABRICTESTSRV:19000" -WindowsCredential:$True

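Why isn't it working?

Here is what I have tried:

Note: this is not an Azure-hosted virtual machine. It is just a networked VM running Service Fabric Core on a vanilla, fully updated Windows 8.1 x64.

Edit: the output of Get-ServiceFabricClusterManifest is as follows: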

<ClusterManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="ComputerName-Local-Cluster" Version="1.0" xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <NodeTypes>
    <NodeType Name="NodeType0">
      <Endpoints>
        <ClientConnectionEndpoint Port="19000" />
        <LeaseDriverEndpoint Port="19001" />
        <ClusterConnectionEndpoint Port="19002" />
        <HttpGatewayEndpoint Port="19080" Protocol="http" />
        <HttpApplicationGatewayEndpoint Port="19081" Protocol="http" />
        <ServiceConnectionEndpoint Port="19006" />
        <ApplicationEndpoints StartPort="30001" EndPort="31000" />
      </Endpoints>
    </NodeType>
    <NodeType Name="NodeType1">
      <Endpoints>
        <ClientConnectionEndpoint Port="19010" />
        <LeaseDriverEndpoint Port="19011" />
        <ClusterConnectionEndpoint Port="19012" />
        <HttpGatewayEndpoint Port="19082" Protocol="http" />
        <HttpApplicationGatewayEndpoint Port="19083" Protocol="http" />
        <ServiceConnectionEndpoint Port="19016" />
        <ApplicationEndpoints StartPort="31001" EndPort="32000" />
      </Endpoints>
    </NodeType>
    <NodeType Name="NodeType2">
      <Endpoints>
        <ClientConnectionEndpoint Port="19020" />
        <LeaseDriverEndpoint Port="19021" />
        <ClusterConnectionEndpoint Port="19022" />
        <HttpGatewayEndpoint Port="19084" Protocol="http" />
        <HttpApplicationGatewayEndpoint Port="19085" Protocol="http" />
        <ServiceConnectionEndpoint Port="19026" />
        <ApplicationEndpoints StartPort="32001" EndPort="33000" />
      </Endpoints>
    </NodeType>
    <NodeType Name="NodeType3">
      <Endpoints>
        <ClientConnectionEndpoint Port="19030" />
        <LeaseDriverEndpoint Port="19031" />
        <ClusterConnectionEndpoint Port="19032" />
        <HttpGatewayEndpoint Port="19086" Protocol="http" />
        <HttpApplicationGatewayEndpoint Port="19087" Protocol="http" />
        <ServiceConnectionEndpoint Port="19036" />
        <ApplicationEndpoints StartPort="33001" EndPort="34000" />
      </Endpoints>
    </NodeType>
    <NodeType Name="NodeType4">
      <Endpoints>
        <ClientConnectionEndpoint Port="19040" />
        <LeaseDriverEndpoint Port="19041" />
        <ClusterConnectionEndpoint Port="19042" />
        <HttpGatewayEndpoint Port="19088" Protocol="http" />
        <HttpApplicationGatewayEndpoint Port="19089" Protocol="http" />
        <ServiceConnectionEndpoint Port="19046" />
        <ApplicationEndpoints StartPort="34001" EndPort="35000" />
      </Endpoints>
    </NodeType>
  </NodeTypes>
  <Infrastructure>
    <WindowsServer IsScaleMin="true">
      <NodeList>
        <Node NodeName="_Node_0" IPAddressOrFQDN="localhost" IsSeedNode="true" NodeTypeRef="NodeType0" FaultDomain="fd:/0" UpgradeDomain="0" />
        <Node NodeName="_Node_1" IPAddressOrFQDN="localhost" IsSeedNode="true" NodeTypeRef="NodeType1" FaultDomain="fd:/1" UpgradeDomain="1" />
        <Node NodeName="_Node_2" IPAddressOrFQDN="localhost" IsSeedNode="true" NodeTypeRef="NodeType2" FaultDomain="fd:/2" UpgradeDomain="2" />
        <Node NodeName="_Node_3" IPAddressOrFQDN="localhost" NodeTypeRef="NodeType3" FaultDomain="fd:/3" UpgradeDomain="3" />
        <Node NodeName="_Node_4" IPAddressOrFQDN="localhost" NodeTypeRef="NodeType4" FaultDomain="fd:/4" UpgradeDomain="4" />
      </NodeList>
    </WindowsServer>
  </Infrastructure>
  <FabricSettings>
    <Section Name="Security">
      <Parameter Name="ClusterCredentialType" Value="None" />
      <Parameter Name="ServerAuthCredentialType" Value="None" />
    </Section>
    <Section Name="FailoverManager">
      <Parameter Name="ExpectedClusterSize" Value="4" />
      <Parameter Name="TargetReplicaSetSize" Value="3" />
      <Parameter Name="MinReplicaSetSize" Value="3" />
      <Parameter Name="ReconfigurationTimeLimit" Value="20" />
      <Parameter Name="BuildReplicaTimeLimit" Value="20" />
      <Parameter Name="CreateInstanceTimeLimit" Value="20" />
      <Parameter Name="PlacementTimeLimit" Value="20" />
    </Section>
    <Section Name="ReconfigurationAgent">
      <Parameter Name="ServiceApiHealthDuration" Value="20" />
      <Parameter Name="ServiceReconfigurationApiHealthDuration" Value="20" />
      <Parameter Name="LocalHealthReportingTimerInterval" Value="5" />
      <Parameter Name="IsDeactivationInfoEnabled" Value="true" />
      <Parameter Name="RAUpgradeProgressCheckInterval" Value="3" />
    </Section>
    <Section Name="ClusterManager">
      <Parameter Name="TargetReplicaSetSize" Value="3" />
      <Parameter Name="MinReplicaSetSize" Value="3" />
      <Parameter Name="UpgradeStatusPollInterval" Value="5" />
      <Parameter Name="UpgradeHealthCheckInterval" Value="5" />
      <Parameter Name="FabricUpgradeHealthCheckInterval" Value="5" />
    </Section>
    <Section Name="NamingService">
      <Parameter Name="TargetReplicaSetSize" Value="3" />
      <Parameter Name="MinReplicaSetSize" Value="3" />
    </Section>
    <Section Name="Management">
      <Parameter Name="ImageStoreConnectionString" Value="file:C:\SfDevCluster\Data\ImageStoreShare" />
      <Parameter Name="ImageCachingEnabled" Value="false" />
      <Parameter Name="EnableDeploymentAtDataRoot" Value="true" />
    </Section>
    <Section Name="Hosting">
      <Parameter Name="EndpointProviderEnabled" Value="true" />
      <Parameter Name="RunAsPolicyEnabled" Value="true" />
      <Parameter Name="DeactivationScanInterval" Value="60" />
      <Parameter Name="DeactivationGraceInterval" Value="10" />
      <Parameter Name="EnableProcessDebugging" Value="true" />
      <Parameter Name="ServiceTypeRegistrationTimeout" Value="20" />
      <Parameter Name="CacheCleanupScanInterval" Value="300" />
    </Section>
    <Section Name="HttpGateway">
      <Parameter Name="IsEnabled" Value="true" />
    </Section>
    <Section Name="PlacementAndLoadBalancing">
      <Parameter Name="MinLoadBalancingInterval" Value="300" />
    </Section>
    <Section Name="Federation">
      <Parameter Name="NodeIdGeneratorVersion" Value="V4" />
      <Parameter Name="UnresponsiveDuration" Value="0" />
    </Section>
    <Section Name="ApplicationGateway/Http">
      <Parameter Name="IsEnabled" Value="true" />
    </Section>
    <Section Name="FaultAnalysisService">
      <Parameter Name="TargetReplicaSetSize" Value="3" />
      <Parameter Name="MinReplicaSetSize" Value="3" />
    </Section>
    <Section Name="Trace/Etw">
      <Parameter Name="Level" Value="4" />
    </Section>
    <Section Name="Diagnostics">
      <Parameter Name="ProducerInstances" Value="ServiceFabricEtlFile, ServiceFabricPerfCtrFolder" />
      <Parameter Name="MaxDiskQuotaInMB" Value="10240" />
    </Section>
    <Section Name="ServiceFabricEtlFile">
      <Parameter Name="ProducerType" Value="EtlFileProducer" />
      <Parameter Name="IsEnabled" Value="true" />
      <Parameter Name="EtlReadIntervalInMinutes" Value=" 5" />
      <Parameter Name="DataDeletionAgeInDays" Value="3" />
    </Section>
    <Section Name="ServiceFabricPerfCtrFolder">
      <Parameter Name="ProducerType" Value="FolderProducer" />
      <Parameter Name="IsEnabled" Value="true" />
      <Parameter Name="FolderType" Value="ServiceFabricPerformanceCounters" />
      <Parameter Name="DataDeletionAgeInDays" Value="3" />
    </Section>
    <Section Name="TransactionalReplicator">
      <Parameter Name="CheckpointThresholdInMB" Value="64" />
    </Section>
  </FabricSettings>
</ClusterManifest>

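There were several problems, but as @cassandrad mentioned, the biggest one was that the default deployment binds its TCP endpoints to localhost (IPAddressOrFQDN="localhost") rather than to the machine's IP address, so by default it only accepts local connections.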
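
To illustrate why a node advertised as localhost cannot be reached from another machine, here is a minimal Python sketch that checks whether a host name resolves only to loopback addresses. The helper name `resolves_to_loopback_only` is hypothetical and is not part of Service Fabric; it only demonstrates the general networking point.

```python
import ipaddress
import socket


def resolves_to_loopback_only(host: str) -> bool:
    """Return True if every address the host name resolves to is a loopback address."""
    infos = socket.getaddrinfo(host, None)
    # info[4][0] is the textual address; drop any IPv6 scope suffix such as "%eth0".
    addresses = {info[4][0].split("%")[0] for info in infos}
    return all(ipaddress.ip_address(addr).is_loopback for addr in addresses)


if __name__ == "__main__":
    # "localhost" is the value from the manifest; 192.168.1.102 is the VM's LAN address.
    for name in ("localhost", "192.168.1.102"):
        print(name, resolves_to_loopback_only(name))
```

A name that resolves only to loopback addresses is meaningful only on the machine itself, which is consistent with the cluster accepting local connections only.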

Here are the complete steps that solved my problem:

  • I first ran netstat -a | FindStr "19000" in a command prompt to check which bindings were active and confirm what @cassandrad had said.
  • After reading this guide, I decided to download the Service Fabric standalone package for Windows Server (which, by the way, works fine outside Windows Server, on Windows 8.1 x64).
  • I copied and modified ClusterConfig.Unsecure.DevCluster.json: under the nodes section, I changed the iPAddress of every node to 192.168.1.102. I named the new file ClusterConfig.Unsecure.CustomDevCluster.json.
  • I ran CreateServiceFabricCluster.ps1. It asked which JSON configuration to use, and I gave it the modified copy, ClusterConfig.Unsecure.CustomDevCluster.json.
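  • The first attempt failed with an error about loading Newtonsoft.Json version 6.0.0.0, which, judging by the trace, is a rather annoying and misleading error. The real cause was that I did not have .NET Framework 4.6.2, so I downloaded and installed it.
  • The second attempt failed because the Microsoft Azure Service Fabric MSI was already installed. This happened because I had previously installed MicrosoftAzure-ServiceFabric-CoreSDK.exe. I went to "Programs and Features" and uninstalled Microsoft Azure Service Fabric (I did not have the Microsoft Azure Service Fabric SDK installed).
  • I ran the script one last time, crossed my fingers, and it finally worked.
  • This is an unsecured cluster, so I can connect to it simply with Connect-ServiceFabricCluster "192.168.1.102:19000". If you want to enable other authentication mechanisms, modify and use one of the other .json sample configurations.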
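
The configuration edit in the steps above can be sketched as a small Python script. This is only an illustration under assumptions: it assumes the config has a top-level nodes array whose entries carry an iPAddress field, as the steps describe. The function name `set_node_addresses` and the sample node entries are hypothetical and are not copied from the real sample file.

```python
import json


def set_node_addresses(config: dict, address: str) -> dict:
    """Return a copy of a cluster config with every node's iPAddress replaced."""
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    for node in updated.get("nodes", []):
        node["iPAddress"] = address
    return updated


# Hypothetical excerpt; the real sample file contains more fields per node.
sample = {
    "nodes": [
        {"nodeName": "vm0", "iPAddress": "localhost"},
        {"nodeName": "vm1", "iPAddress": "localhost"},
    ]
}

custom = set_node_addresses(sample, "192.168.1.102")
print(json.dumps(custom, indent=2))
```

In practice you would load ClusterConfig.Unsecure.DevCluster.json with json.load, pass the result through the function, and write the output to the new file with json.dump, leaving the original sample untouched.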

Why isn't it working?

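It isn't working because you set the nodes' IP addresses to localhost, which makes them unreachable from other machines. That works for a local debugging cluster, but for on-premises and Azure clusters you must specify valid, reachable IP addresses or fully qualified names.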
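
A quick way to test whether the client connection endpoint is reachable from the client machine is a plain TCP connect. The following Python sketch assumes the address and port from the question (192.168.1.102 and 19000); the helper name `can_connect` is hypothetical. A successful TCP connect only shows that something is listening there, not that the cluster is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Address and port taken from the question; adjust for your own cluster.
    print(can_connect("192.168.1.102", 19000))
```

If this returns False from the client but True when run on the cluster machine against localhost, the endpoint is most likely bound to loopback only or blocked by a firewall.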

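Also, I'm not 100% sure right now, but if you want the cluster to be reachable by URI rather than by IP, I'd suggest specifying an FQDN instead of an IP address. I remember having trouble with this, but it's still unclear to me what actually helped: the FQDN or something else.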