How to change the OS on an existing Service Fabric cluster?
I am trying to change my VMSS from:
"imageReference": {
"publisher": "MicrosoftWindowsServer",
"offer": "WindowsServer",
"sku": "2016-Datacenter-with-Containers",
"version": "latest"
}
To:
"imageReference": {
"publisher": "MicrosoftWindowsServer",
"offer": "WindowsServerSemiAnnual",
"sku": "Datacenter-Core-1803-with-Containers-smalldisk",
"version": "latest"
}
The first thing I tried was:
Update-AzureRmVmss -ResourceGroupName "DevServiceFabric" -VMScaleSetName "HTTP" -ImageReferenceSku Datacenter-Core-1803-with-Containers-smalldisk -ImageReferenceOffer WindowsServerSemiAnnual
This gave me the error:
Update-AzureRmVmss : Changing property 'imageReference.offer' is not allowed.
ErrorCode: PropertyChangeNotAllowed
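The documentation confirms this: the offer can only be set when the scale set is created.
Next I tried Add-AzureRmServiceFabricNodeType
to add a new node type, thinking I could delete the old one afterwards. However, this command does not seem to let you set the OS image. You can only set the VM SKU (in other words, all VMs in the cluster must have the same OS).
Is there a way to change this without deleting the whole cluster and starting from scratch?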
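The distinction behind the error above can be sketched as a small check: given the current and the desired imageReference, does the change stay within the same publisher and offer? This is only an illustration. Test-InPlaceImageChange is a hypothetical helper, not part of any Azure module, and treating publisher the same way as offer is an assumption (the error message only names the offer).

```powershell
# Hypothetical helper (not part of any Azure module): reports whether moving
# from one imageReference to another keeps publisher and offer unchanged,
# so that only sku/version differ.
function Test-InPlaceImageChange {
    param(
        [Parameter(Mandatory)] [hashtable] $Current,
        [Parameter(Mandatory)] [hashtable] $Target
    )
    # The error above shows 'offer' cannot change on an existing scale set.
    # Treating 'publisher' the same way is an assumption.
    foreach ($key in 'publisher', 'offer') {
        if ($Current[$key] -ne $Target[$key]) { return $false }
    }
    return $true
}

$current = @{
    publisher = 'MicrosoftWindowsServer'
    offer     = 'WindowsServer'
    sku       = '2016-Datacenter-with-Containers'
    version   = 'latest'
}
$target = @{
    publisher = 'MicrosoftWindowsServer'
    offer     = 'WindowsServerSemiAnnual'
    sku       = 'Datacenter-Core-1803-with-Containers-smalldisk'
    version   = 'latest'
}

# The offer differs here, so an in-place SKU change is not enough.
Test-InPlaceImageChange -Current $current -Target $target
```

If the check returns False, the change needs a new scale set rather than an update of the existing one.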
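Edit: if you can stay within the current publisher and offer, switching the OS is very easy: just change the SKU.
If you do need to change the offer, you can do it like this: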
Upgrade the size and operating system of the primary node type VMs.
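Note that you need to take a lot of things into account, such as the availability level. The cluster will also be temporarily inaccessible from the outside.
Drastically shortened:
- Add a second scale set with the desired OS to the primary node type
- Disable the old scale set, then delete it
- Switch the load balancer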
# Variables.
$groupname = "sfupgradetestgroup"
$clusterloc="southcentralus"
$subscriptionID="<your subscription ID>"
# sign in to your Azure account and select your subscription
Login-AzAccount -SubscriptionId $subscriptionID
# Create a new resource group for your deployment and give it a name and a location.
New-AzResourceGroup -Name $groupname -Location $clusterloc
# Deploy the two node type cluster.
New-AzResourceGroupDeployment -ResourceGroupName $groupname -TemplateParameterFile "C:\temp\cluster\Deploy-2NodeTypes-2ScaleSets.parameters.json" `
-TemplateFile "C:\temp\cluster\Deploy-2NodeTypes-2ScaleSets.json" -Verbose
# Connect to the cluster and check the cluster health.
$ClusterName= "sfupgradetest.southcentralus.cloudapp.azure.com:19000"
$thumb="F361720F4BD5449F6F083DDE99DC51A86985B25B"
Connect-ServiceFabricCluster -ConnectionEndpoint $ClusterName -KeepAliveIntervalInSec 10 `
-X509Credential `
-ServerCertThumbprint $thumb `
-FindType FindByThumbprint `
-FindValue $thumb `
-StoreLocation CurrentUser `
-StoreName My
Get-ServiceFabricClusterHealth
# Deploy a new scale set into the primary node type. Create a new load balancer and public IP address for the new scale set.
New-AzResourceGroupDeployment -ResourceGroupName $groupname -TemplateParameterFile "C:\temp\cluster\Deploy-2NodeTypes-3ScaleSets.parameters.json" `
-TemplateFile "C:\temp\cluster\Deploy-2NodeTypes-3ScaleSets.json" -Verbose
# Check the cluster health again. All 15 nodes should be healthy.
Get-ServiceFabricClusterHealth
# Disable the nodes in the original scale set.
$nodeNames = @("_NTvm1_0","_NTvm1_1","_NTvm1_2","_NTvm1_3","_NTvm1_4")
Write-Host "Disabling nodes..."
foreach($name in $nodeNames){
    Disable-ServiceFabricNode -NodeName $name -Intent RemoveNode -Force
}
Write-Host "Checking node status..."
foreach($name in $nodeNames){
    $state = Get-ServiceFabricNode -NodeName $name
    $loopTimeout = 50
    # Poll every 5 seconds, up to 50 times, until deactivation completes.
    do{
        Start-Sleep 5
        $loopTimeout -= 1
        $state = Get-ServiceFabricNode -NodeName $name
        Write-Host "$name state: " $state.NodeDeactivationInfo.Status
    }
    while (($state.NodeDeactivationInfo.Status -ne "Completed") -and ($loopTimeout -ne 0))
    if ($state.NodeStatus -ne [System.Fabric.Query.NodeStatus]::Disabled)
    {
        # Write-Error takes a single message string, so the status is interpolated.
        Write-Error "$name node deactivation failed with state $($state.NodeStatus)"
        exit
    }
}
# Remove the scale set
$scaleSetName="NTvm1"
Remove-AzVmss -ResourceGroupName $groupname -VMScaleSetName $scaleSetName -Force
Write-Host "Removed scale set $scaleSetName"
$lbname="LB-sfupgradetest-NTvm1"
$oldPublicIpName="PublicIP-LB-FE-0"
$newPublicIpName="PublicIP-LB-FE-2"
# Store DNS settings of public IP address related to old Primary NodeType into variable
$oldprimaryPublicIP = Get-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname
$primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel
$primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn
# Remove Load Balancer related to old Primary NodeType. This will cause a brief period of downtime for the cluster
Remove-AzLoadBalancer -Name $lbname -ResourceGroupName $groupname -Force
# Remove the old public IP
Remove-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname -Force
# Replace DNS settings of Public IP address related to new Primary Node Type with DNS settings of Public IP address related to old Primary Node Type
$PublicIP = Get-AzPublicIpAddress -Name $newPublicIpName -ResourceGroupName $groupname
$PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
$PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
Set-AzPublicIpAddress -PublicIpAddress $PublicIP
# Check the cluster health
Get-ServiceFabricClusterHealth
# Remove node state for the deleted nodes.
foreach($name in $nodeNames){
    # Remove the node from the cluster
    Remove-ServiceFabricNodeState -NodeName $name -TimeoutSec 300 -Force
    Write-Host "Removed node state for node $name"
}
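The wait-for-deactivation loop in the script above is a poll-with-timeout pattern. As a rough sketch, it could be factored into a reusable helper. Wait-ForCondition is a hypothetical name, and the defaults of 50 attempts and 5 seconds are simply taken from the script; unlike the original loop, this version checks the condition before the first sleep.

```powershell
# Hypothetical helper: polls a condition until it returns $true or the
# attempt budget is used up. Returns $true on success, $false on timeout.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $MaxAttempts = 50,
        [int] $DelaySeconds = 5
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        if (& $Condition) { return $true }
        # No need to sleep after the final attempt.
        if ($attempt -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
    }
    return $false
}

# Usage with the cluster connection and $nodeNames from the script above:
# foreach ($name in $nodeNames) {
#     $done = Wait-ForCondition -Condition {
#         (Get-ServiceFabricNode -NodeName $name).NodeStatus -eq [System.Fabric.Query.NodeStatus]::Disabled
#     }
#     if (-not $done) { throw "Node $name was not disabled in time" }
# }
```

Returning a boolean instead of calling exit leaves it to the caller to decide whether a timeout should stop the whole migration.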
Here is another (simpler) answer for those who want to switch to a different OS but can use an OS image within the same publisher and offer. You can get a list of the available OS SKUs with:
Get-AzureRmVMImageSku -Location 'westus2' -PublisherName MicrosoftWindowsServer -Offer WindowsServer
You can then upgrade the cluster to use that image:
Update-AzureRmVmss -ResourceGroupName "DevServiceFabric" -VMScaleSetName "HTTP" -ImageReferenceSku 2019-Datacenter-Core-with-Containers-smalldisk
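That command will take an hour or more to run.
I also ran into some SKUs that resulted in an "Image Not Found" error, even though they appear in the list. I am not sure what the reason for that is. In this case, though, I found one that worked for me.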
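One way to narrow down the candidates before starting an hour-long update is to keep only the SKUs that actually return image versions in your region. This is a sketch under assumptions: Select-SkuWithImageVersion is a hypothetical helper, and whether missing versions are what causes the "Image Not Found" error is not confirmed here. The commented usage relies on the Az.Compute cmdlets Get-AzVMImageSku and Get-AzVMImage and needs a signed-in session.

```powershell
# Hypothetical helper: returns only the SKU names for which the supplied
# lookup returns at least one image version.
function Select-SkuWithImageVersion {
    param(
        [Parameter(Mandatory)] [string[]] $SkuName,
        # Called once per SKU name; should return that SKU's image versions.
        [Parameter(Mandatory)] [scriptblock] $GetVersion
    )
    foreach ($sku in $SkuName) {
        $versions = @(& $GetVersion $sku)
        if ($versions.Count -gt 0) { $sku }
    }
}

# Usage against Azure (requires Az.Compute and a signed-in session):
# $location = 'westus2'
# $skus = (Get-AzVMImageSku -Location $location -PublisherName MicrosoftWindowsServer -Offer WindowsServer).Skus
# Select-SkuWithImageVersion -SkuName $skus -GetVersion {
#     param($sku)
#     Get-AzVMImage -Location $location -PublisherName MicrosoftWindowsServer `
#         -Offer WindowsServer -Skus $sku -ErrorAction SilentlyContinue
# }
```

Passing the lookup as a script block keeps the filtering logic independent of Azure, so it can be tried with a stub before running it against a subscription.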