30Mb limit uploading to Azure Data Lake using DataLakeStoreFileSystemManagementClient
I am getting an error when calling

_adlsFileSystemClient.FileSystem.Create(_adlsAccountName, destFilePath, stream, overwrite)

to upload files to the Data Lake. The error occurs for files over 30 MB; it works fine for smaller files.
The error is:
at Microsoft.Azure.Management.DataLake.Store.FileSystemOperations.d__16.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Management.DataLake.Store.FileSystemOperationsExtensions.d__23.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Management.DataLake.Store.FileSystemOperationsExtensions.Create(IFileSystemOperations operations, String accountName, String directFilePath, Stream streamContents, Nullable`1 overwrite, Nullable`1 syncFlag)
at AzureDataFunctions.DataLakeController.CreateFileInDataLake(String destFilePath, Stream stream, Boolean overwrite) in F:\GitHub\ZutoDW\ADF_ProcessAllFiles\ADF_ProcessAllFiles\DataLakeController.cs:line 122
Has anyone encountered this, or observed similar behaviour? I got around it by splitting my files into 30 MB pieces and uploading them one at a time.
However, this is impractical in the long term, because the original file is 380 MB and could be considerably larger. I don't want 10-15 dissected files sitting in my Data Lake; I want to upload it as a single file.
I am able to upload the exact same file to the Data Lake through the portal interface.
It is answered here.
There is currently a size limit of 30000000 bytes per request. You can work around it by creating an initial file and then appending to it, keeping each stream under that limit.
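As a rough illustration of that create-then-append workaround, here is a minimal sketch against the same file system client. The UploadInChunks helper and the 29000000-byte chunk size are assumptions for the example, not code from the original post:

using System.IO;
using Microsoft.Azure.Management.DataLake.Store;

public static class ChunkedUpload
{
    // Hypothetical helper: create the file with the first chunk, then
    // append the remaining chunks, each kept under the 30000000-byte limit.
    public static void UploadInChunks(
        DataLakeStoreFileSystemManagementClient client,
        string accountName, string sourcePath, string destPath)
    {
        const int maxChunkBytes = 29000000; // stay safely under the limit (assumption)
        var buffer = new byte[maxChunkBytes];
        using (var file = File.OpenRead(sourcePath))
        {
            var first = true;
            int read;
            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                using (var chunk = new MemoryStream(buffer, 0, read))
                {
                    if (first)
                    {
                        client.FileSystem.Create(accountName, destPath, chunk, overwrite: true);
                        first = false;
                    }
                    else
                    {
                        client.FileSystem.Append(accountName, destPath, chunk);
                    }
                }
            }
        }
    }
}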
Alternatively, please try using DataLakeStoreUploader to upload files or directories to the Data Lake; for more demo code, please refer to the github sample. I tested the demo and it works correctly for me. We can get the Microsoft.Azure.Management.DataLake.Store and Microsoft.Azure.Management.DataLake.StoreUploader SDKs from NuGet. Here are my detailed steps:
- Create a C# console application
- Add the following code
using Microsoft.Azure.Management.DataLake.Store;
using Microsoft.Azure.Management.DataLake.StoreUploader;
using Microsoft.Rest.Azure.Authentication;

var applicationId = "your application Id";
var secretKey = "secret Key";
var tenantId = "Your tenantId";
var adlsAccountName = "adls account name";

// Authenticate as an AAD service principal.
var creds = ApplicationTokenProvider.LoginSilentAsync(tenantId, applicationId, secretKey).Result;
var adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(creds);

var inputFilePath = @"c:\tom\ForDemoCode.zip";
var targetStreamPath = "/mytempdir/ForDemoCode.zip"; // the path under the account root, not the full URI
var parameters = new UploadParameters(inputFilePath, targetStreamPath, adlsAccountName,
    isOverwrite: true, maxSegmentLength: 268435456 * 2); // the default maxSegmentLength is 256 MB; we can set it ourselves
var frontend = new DataLakeStoreFrontEndAdapter(adlsAccountName, adlsFileSystemClient);
var uploader = new DataLakeStoreUploader(parameters, frontend);
uploader.Execute();
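As I understand it, this avoids the 30000000-byte limit because the uploader splits the input into segments (up to maxSegmentLength each), uploads them separately, and then concatenates them on the service side, so the result still lands in the Data Lake as a single file.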
- Debug the application
- Check the uploaded file from the Azure portal
For the SDK information, please refer to the packages.config file:
<?xml version="1.0" encoding="utf-8"?>
<packages>
<package id="Microsoft.Azure.Management.DataLake.Store" version="1.0.2-preview" targetFramework="net452" />
<package id="Microsoft.Azure.Management.DataLake.StoreUploader" version="1.0.0-preview" targetFramework="net452" />
<package id="Microsoft.IdentityModel.Clients.ActiveDirectory" version="3.13.8" targetFramework="net452" />
<package id="Microsoft.Rest.ClientRuntime" version="2.3.2" targetFramework="net452" />
<package id="Microsoft.Rest.ClientRuntime.Azure" version="3.3.2" targetFramework="net452" />
<package id="Microsoft.Rest.ClientRuntime.Azure.Authentication" version="2.2.0-preview" targetFramework="net452" />
<package id="Newtonsoft.Json" version="9.0.2-beta1" targetFramework="net452" />
</packages>