Download parquet file from ADL Gen2 using Get-AzureStorageBlobContent

I am trying to download a parquet file from ADL Gen2 to my local system using a PowerShell command. Below is the code snippet:

#this appid has access to ADL
[string] $AppID = "bbb88818-aaaa-44fb-q2345678901y"
[string] $TenantId = "ttt88888-xxxx-yyyy-q2345678901y"
[string] $SubscriptionName = "Sub Sample"
[string] $LocalTargetFilePathName = "D:\MoveToModern"


Write-Host "AppID = " $AppID
Write-Host "TenantId = " $TenantId
Write-Host "SubscriptionName = " $SubscriptionName
Write-Host "AzureDataLakeAccountName = " $AzureDataLakeAccountName
Write-Host "AzureDataLakeSrcFilePath = " $AzureDataLakeSrcFilePath
Write-Host "LocalTargetFilePathName = " $LocalTargetFilePathName



#this is the access key of the appid
$AccessKeyValue = "1234567=u-r.testabcdefaORYsw5AN5"

$azurePassword    = ConvertTo-SecureString $AccessKeyValue -AsPlainText -Force
$psCred           = New-Object System.Management.Automation.PSCredential($AppID, $azurePassword)
Login-AzureRmAccount -Credential $psCred -ServicePrincipal -Tenant $TenantId

Get-AzureRmSubscription

Get-AzureRmSubscription -SubscriptionName $SubscriptionName  | Set-AzureRmContext
Get-AzureStorageBlobContent -Container "/Test/Partner/Account/" -Blob "Account.parquet" -Destination "D:\MoveToModern"

However, I am getting an error.

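We probably have to set the storage context. Can you tell me how to set the storage context using a service principal? (I have the application ID and application key of the service principal. With respect to the ADL Gen2 source, I only have the path details. The source team has granted access to the service principal.)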

If you want to download files from Azure Data Lake Gen2, I suggest using the PowerShell module Az.Storage. As for how to do it with a service principal, you have two options.

1. Use an Azure RBAC role

If you use an Azure RBAC role, you need to assign a specific role (Storage Blob Data Reader) to the service principal.

For example

$AppID  = "" 
$AccessKeyValue  = ""
$TenantId=""
$SubscriptionName  = ""
# Step 1: assign the role at the storage account scope.
# Sign in with an account that is allowed to create role assignments (for example, an Owner).
Connect-AzAccount -Tenant $TenantId -Subscription $SubscriptionName
New-AzRoleAssignment -ApplicationId $AppID  -RoleDefinitionName "Storage Blob Data Reader" `
      -Scope "/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"

# Step 2: sign in as the service principal and download the file.

$azurePassword    = ConvertTo-SecureString $AccessKeyValue -AsPlainText -Force
$psCred           = New-Object System.Management.Automation.PSCredential($AppID, $azurePassword)
Connect-AzAccount -Credential $psCred -ServicePrincipal -Tenant $TenantId -Subscription $SubscriptionName

$AzureDataLakeAccountName  = "testadls05"

$ctx = New-AzStorageContext -StorageAccountName $AzureDataLakeAccountName -UseConnectedAccount

$LocalTargetFilePathName  = "D:\test.parquet"
$filesystemName="test"
$path="2020/10/28/test.parquet"
Get-AzDataLakeGen2ItemContent -Context $ctx -FileSystem $filesystemName -Path $path -Destination $LocalTargetFilePathName 
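
The question only has a folder path ("/Test/Partner/Account/") and a file name, while Get-AzDataLakeGen2ItemContent takes a file system name and a path inside it. Below is a minimal sketch of that mapping, assuming the first path segment is the file system (container) name; Split-AdlsPath is a hypothetical helper, not part of Az.Storage. Note that Azure container names are lowercase, so the real file system name may differ in case from what the source team wrote down.

```powershell
# Hypothetical helper: split "/<filesystem>/<folders...>/" plus a file name
# into the -FileSystem and -Path values used by Get-AzDataLakeGen2ItemContent.
# Assumption: the first path segment is the file system (container) name.
function Split-AdlsPath {
    param(
        [Parameter(Mandatory)][string] $FolderPath,
        [Parameter(Mandatory)][string] $FileName
    )
    # Drop leading/trailing slashes, then split into segments.
    $segments = @($FolderPath.Trim('/').Split('/'))
    # Everything after the first segment is the folder hierarchy.
    $folders = @($segments | Select-Object -Skip 1)
    [pscustomobject]@{
        FileSystem = $segments[0]
        Path       = (@($folders) + $FileName) -join '/'
    }
}

$target = Split-AdlsPath -FolderPath "/Test/Partner/Account/" -FileName "Account.parquet"

# With a storage context in $ctx (as created above), the download would then be:
# Get-AzDataLakeGen2ItemContent -Context $ctx -FileSystem $target.FileSystem `
#     -Path $target.Path -Destination "D:\MoveToModern\Account.parquet"
```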

2. Use access control lists (ACLs)

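If you use this method, then to grant a security principal read access to a file, you need to give the security principal Execute permission on the container and on each folder in the hierarchy of folders that leads to the file, plus Read permission on the file itself.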

For example

$AppID  = "" 
$AccessKeyValue  = ""
$TenantId=""


$azurePassword    = ConvertTo-SecureString $AccessKeyValue -AsPlainText -Force
$psCred           = New-Object System.Management.Automation.PSCredential($AppID, $azurePassword)
Connect-AzAccount -Credential $psCred -ServicePrincipal -Tenant $TenantId


$AzureDataLakeAccountName  = "testadls05"
$ctx = New-AzStorageContext -StorageAccountName $AzureDataLakeAccountName -UseConnectedAccount

$filesystemName="test"
$path="2020/10/28/test.parquet"
$LocalTargetFilePathName  = "D:\test1.parquet"
Get-AzDataLakeGen2ItemContent -Context $ctx -FileSystem $filesystemName -Path $path -Destination $LocalTargetFilePathName 
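
The download above only works once the ACL entries are in place. Here is a rough sketch of how someone with permission to change ACLs might grant them, using the Az.Storage cmdlets Get-AzDataLakeGen2Item, Set-AzDataLakeGen2ItemAclObject and Update-AzDataLakeGen2Item. Get-AncestorPath and Add-PrincipalAclEntry are hypothetical helpers introduced for illustration, and the sketch assumes the ACL entry is keyed on the object ID of the service principal rather than its application ID.

```powershell
# Hypothetical helper: list every folder above a file, from the top down.
function Get-AncestorPath {
    param([Parameter(Mandatory)][string] $Path)
    $segments = @($Path.Trim('/').Split('/'))
    $ancestors = @()
    # Stop before the last segment, which is the file itself.
    for ($i = 0; $i -lt $segments.Count - 1; $i++) {
        $ancestors += ($segments[0..$i] -join '/')
    }
    return $ancestors
}

# Hypothetical helper: add (or update) one user ACL entry on an item.
# An empty -Path targets the root directory of the file system.
function Add-PrincipalAclEntry {
    param($Context, [string] $FileSystem, [string] $Path,
          [string] $ObjectId, [string] $Permission)
    $target = @{ Context = $Context; FileSystem = $FileSystem }
    if ($Path) { $target.Path = $Path }
    # Start from the existing ACL so other entries are preserved.
    $acl = (Get-AzDataLakeGen2Item @target).ACL
    $acl = Set-AzDataLakeGen2ItemAclObject -AccessControlType user `
        -EntityId $ObjectId -Permission $Permission -InputObject $acl
    Update-AzDataLakeGen2Item @target -Acl $acl
}

# Usage, run as an account allowed to change ACLs (not as the service principal):
# $spObjectId = (Get-AzADServicePrincipal -ApplicationId $AppID).Id
# Add-PrincipalAclEntry -Context $ctx -FileSystem "test" -Path "" -ObjectId $spObjectId -Permission "--x"
# foreach ($dir in (Get-AncestorPath -Path $path)) {
#     Add-PrincipalAclEntry -Context $ctx -FileSystem "test" -Path $dir -ObjectId $spObjectId -Permission "--x"
# }
# Add-PrincipalAclEntry -Context $ctx -FileSystem "test" -Path $path -ObjectId $spObjectId -Permission "r--"
```

For the path "2020/10/28/test.parquet", this would set Execute on the root, "2020", "2020/10" and "2020/10/28", and Read on the file.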

For more details, please refer to

https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control

https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control-model

https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-powershell