Azure Data Factory recursive copy from container
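Hello, I am using a copy activity in Azure Data Factory.
I want the copy to recurse through a container and its subfolders, which are laid out like this:
myfolder/Year/Month/Day/Hour/New_Generated_File.csv
The files I generate and upload into the folders always have different names.
The problem is that the activity seems to wait forever.
The pipeline is scheduled to run every hour.
I have attached the JSON for the dataset and the linked service.
Dataset: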
{
    "name": "Txns_In_Blob",
    "properties": {
        "structure": [
            {
                "name": "Column0",
                "type": "String"
            },
            [....Other Columns....]
        ],
        "published": false,
        "type": "AzureBlob",
        "linkedServiceName": "LinkedService_To_Blob",
        "typeProperties": {
            "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/{Custom}.csv",
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": " "
            }
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}
Linked service:
{
    "name": "LinkedService_To_Blob",
    "properties": {
        "description": "",
        "hubName": "dataorchestrationsystem_hub",
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=wizestorage;AccountKey=**********"
        }
    }
}
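The file name in the dataset's folderPath property is not mandatory. Just remove the file name and Data Factory will load all the files in the folder for you.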
{
    "name": "Txns_In_Blob",
    "properties": {
        "structure": [
            {
                "name": "Column0",
                "type": "String"
            },
            [....Other Columns....]
        ],
        "published": false,
        "type": "AzureBlob",
        "linkedServiceName": "LinkedService_To_Blob",
        "typeProperties": {
            "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/",
            "partitionedBy": [
                { "name": "Year", "value": { "type": "DateTime", "date": "SliceStart", "format": "yyyy" } },
                { "name": "Month", "value": { "type": "DateTime", "date": "SliceStart", "format": "%M" } },
                { "name": "Day", "value": { "type": "DateTime", "date": "SliceStart", "format": "%d" } },
                { "name": "Hour", "value": { "type": "DateTime", "date": "SliceStart", "format": "HH" } }
            ],
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": " "
            }
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}
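With the folderPath above, the path is resolved at run time from the slice start. For a slice starting at 07:00 UTC on 30 May 2016, it becomes
uploadtransactional/yearno=2016/monthno=5/dayno=30/hourno=07/
The format values are .NET date format strings, so %M and %d are not zero-padded (use MM and dd if your folders are named 05 rather than 5), and a 24-hour clock needs HH rather than hh.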
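Since the question is about recursion: the dataset only selects the folder, while reading the subfolders beneath it is controlled on the copy activity's source. Below is a minimal sketch of a pipeline in the same (version 1) JSON style, not the asker's actual pipeline. The pipeline name, activity name, output dataset Txns_Out, the BlobSink sink type and the start/end dates are all placeholders I made up for illustration; only Txns_In_Blob comes from the post.

```json
{
    "name": "CopyTxnsPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyHourlyTxns",
                "type": "Copy",
                "inputs": [ { "name": "Txns_In_Blob" } ],
                "outputs": [ { "name": "Txns_Out" } ],
                "typeProperties": {
                    "source": {
                        "type": "BlobSource",
                        "recursive": true
                    },
                    "sink": {
                        "type": "BlobSink"
                    }
                },
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                }
            }
        ],
        "start": "2016-05-30T00:00:00Z",
        "end": "2016-05-31T00:00:00Z"
    }
}
```

With recursive set to true, the source should also read any subfolders under the resolved hour folder. Setting it explicitly avoids relying on the default. The scheduler block is assumed to match the hourly availability of the dataset.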