Run a DataLakeAnalyticsU-SQL script in a Data Factory pipeline
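I want to create a pipeline containing a U-SQL script that merges multiple log files in Azure Data Lake Store into a single file. I first tried saving the script as a text file in my store and pointing the pipeline's scriptPath at it, but I got an error. After searching, I found that the pipeline does not support ADL for this, so I decided to write the U-SQL script inline in the pipeline using the script property. I tried the pipeline below, but I get an error and cannot deploy it. Can anyone help me get this working?
Here is my pipeline script: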
{
  "name": "RG-GatherData",
  "properties": {
    "description": "description",
    "activities": [
      {
        "name": "DataLakeAnalyticsUSqlActivityTemplate",
        "type": "DataLakeAnalyticsU-SQL",
        "linkedServiceName": "AzureDataLakeAnalyticsLinkedService",
        "typeProperties": {
          "script": "
            @log = EXTRACT ["VersionID"] int,
                           ["NodeName"] string,
                           ["UpdateIng Area"] string,
                           ["ActionDate"] string,
                           ["UserName"] string,
                           ["Code part Type"] string,
                           ["DocCode"] string,
                           ["Header Entity Id"] string,
                           ["Common Entity Id"] string,
                           ["Attribute Name"] string,
                           ["Latest Update Value"] string,
                           ["Previous Update Value"] string
                   FROM @in
                   USING Extractors.Csv(skipFirstNRows: 1);
            OUTPUT @log
            TO @out
            USING Outputters.Csv();
          ,
          "degreeOfParallelism": 3,
          "priority": 100,
          "parameters": {
            "in": "/RowLogs/InPut/RoyalGardens/{*}.csv",
            "out": "/RowLogs/OutPut/RoyalGardens/Alllog.csv"
          }
        },
        "policy": {
          "concurrency": 1,
          "executionPriorityOrder": "OldestFirst",
          "retry": 3,
          "timeout": "10:00:00"
        },
        "scheduler": {
          "frequency": "Day",
          "interval": 1
        }
      }
    ],
    "start": "2018-09-20T00:06:00Z",
    "end": "2099-12-30T22:00:00Z"
  }
}
Store the U-SQL script in Blob storage and reference it through a Blob storage linked service.
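As a rough sketch of what that could look like, the typeProperties section of the activity would drop the inline script and instead use the scriptPath and scriptLinkedService properties. The linked service name AzureStorageLinkedService and the blob path scripts\RoyalGardens\GatherLogs.usql are placeholders invented for this example, not values from the question; substitute the name of your own Azure Storage linked service and the container and path where you uploaded the script. This also sidesteps one likely cause of the deployment error: a JSON string cannot span multiple lines or contain unescaped double quotes, so the inline script as posted is not valid JSON.

"typeProperties": {
  "scriptPath": "scripts\\RoyalGardens\\GatherLogs.usql",
  "scriptLinkedService": "AzureStorageLinkedService",
  "degreeOfParallelism": 3,
  "priority": 100,
  "parameters": {
    "in": "/RowLogs/InPut/RoyalGardens/{*}.csv",
    "out": "/RowLogs/OutPut/RoyalGardens/Alllog.csv"
  }
}

The rest of the pipeline (policy, scheduler, start, end) should be able to stay as it is. The uploaded file would contain only the U-SQL statements; the in and out parameters are passed to the script as the @in and @out variables it already uses.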