Filtering data returned by an AWS Textract function
I have extracted the data returned by an AWS Textract function. The return value of this Textract function has the following shape:
{
   "AnalyzeDocumentModelVersion": "string",
   "Blocks": [
      {
         "BlockType": "string",
         "ColumnIndex": number,
         "ColumnSpan": number,
         "Confidence": number,
         "EntityTypes": [ "string" ],
         "Geometry": {
            "BoundingBox": {
               "Height": number,
               "Left": number,
               "Top": number,
               "Width": number
            },
            "Polygon": [
               {
                  "X": number,
                  "Y": number
               }
            ]
         },
         "Id": "string",
         "Page": number,
         "Relationships": [
            {
               "Ids": [ "string" ],
               "Type": "string"
            }
         ],
         "RowIndex": number,
         "RowSpan": number,
         "SelectionStatus": "string",
         "Text": "string"
      }
   ],
   "DocumentMetadata": {
      "Pages": number
   },
   "JobStatus": "string",
   "NextToken": "string",
   "StatusMessage": "string",
   "Warnings": [
      {
         "ErrorCode": "string",
         "Pages": [ number ]
      }
   ]
}
I extracted the Blocks from this data with the following code:
var d = null;
...<Some Code Here>...
d = data.Blocks;
console.log(d);
This prints an array of JSON objects. A sample of the extracted output is shown below:
[...{ BlockType: 'WORD',
Confidence: 99.7286376953125,
Text: '2000.00',
Geometry: { BoundingBox: [Object], Polygon: [Array] },
Id: '<ID here>',
Page: 1 }, ...]
I only want to extract the Text field and have that as the only output. How should I go about this?
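I may have misunderstood your question, but if you need the value of the Text field from every object in the data array, see the example below: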
const data = [
  {
    BlockType: "WORD",
    Confidence: 99.7286376953125,
    Text: "2000.00",
    Geometry: { BoundingBox: {}, Polygon: [] },
    Id: "<ID here>",
    Page: 1,
  },
];
// Destructure the Text field of each block and collect the values into a new array.
const output = data.map(({ Text: text }) => text);
console.log(output);
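One caveat with a plain map: not every block necessarily carries a Text value (a PAGE block, for instance, usually has none), so the resulting array can contain undefined entries. Below is a minimal sketch that keeps only blocks of one type that actually have text. The `blocks` sample and the `extractText` helper are hypothetical names made up for illustration and are not part of the AWS SDK; in your code you would pass `data.Blocks` instead.

```javascript
// Hypothetical sample standing in for data.Blocks; a real response has many more fields.
const blocks = [
  { BlockType: "PAGE", Id: "<ID here>", Page: 1 },
  { BlockType: "LINE", Text: "Total 2000.00", Id: "<ID here>", Page: 1 },
  { BlockType: "WORD", Text: "Total", Id: "<ID here>", Page: 1 },
  { BlockType: "WORD", Text: "2000.00", Id: "<ID here>", Page: 1 },
];

// extractText is an illustrative helper, not an AWS API.
// It returns the Text of every block of the given type, skipping blocks without text.
function extractText(blocks, blockType = "WORD") {
  return (blocks ?? [])
    .filter((block) => block.BlockType === blockType && typeof block.Text === "string")
    .map((block) => block.Text);
}

const words = extractText(blocks);         // WORD blocks only
const lines = extractText(blocks, "LINE"); // LINE blocks only
console.log(words);
console.log(lines);
```

Filtering on BlockType also avoids getting the same text twice, since in this sample the LINE block repeats the text of its WORD blocks.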