How to use desired data from Rekognition image text

Question

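I am using the following code with the AWS Rekognition service to analyze text in an image. I am looking for only one specific line item in the picture, which is a 10-digit number. I want my program to determine whether the photo contains a 10-digit number: if TRUE, give response A; if FALSE, give response B.

Step 1. How do I modify my code to print output that contains only this 10-digit number (if it is found in the photo)? How do I save this number as a variable in my program?

For reference, here is my current code and part of the output I get. I only want the entry with Id 5, but it will not always be Id 5 with the same geometry, because the angle and size of the label in the photo will vary. The image I am trying to analyze is also attached.

Here is the image:

Here is my code: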

import csv
import boto3

# Read the AWS credentials from the CSV file (skipping the header row).
with open('credentials.csv', 'r') as credentials_file:
    next(credentials_file)
    reader = csv.reader(credentials_file)
    for line in reader:
        access_key_id = line[2]
        secret_access_key = line[3]

photo = 'DSCN4898.JPG'

client = boto3.client(
    'rekognition',
    aws_access_key_id=access_key_id,
    aws_secret_access_key=secret_access_key,
    region_name='us-east-2',
)

# Run text detection on the image stored in S3.
response = client.detect_text(
    Image={'S3Object': {'Bucket': 'phare.lumberid', 'Name': photo}}
)

# detect_text returns a dict, so print it directly (it has no .text attribute).
print(response)

{'TextDetections': [{'DetectedText': 'GR', 'Type': 'LINE', 'Id': 0, 'Confidence': 98.77118682861328, 'Geometry': {'BoundingBox': {'Width': 0.004206564277410507, 'Height': 0.0020818368066102266, 'Left': 0.7285668849945068, 'Top': 0.2874905467033386}, 'Polygon': [{'X': 0.7285668849945068, 'Y': 0.2874905467033386}, {'X': 0.7327734231948853, 'Y': 0.22823381423950195}, {'X': 0.7849873304367065, 'Y': 0.2303156554698944}, {'X': 0.7807807922363281, 'Y': 0.2895723879337311}]}}, {'DetectedText': '3B', 'Type': 'LINE', 'Id': 1, 'Confidence': 99.73309326171875, 'Geometry': {'BoundingBox': {'Width': 0.0030030030757188797, 'Height': 0.0015003751032054424, 'Left': 0.5345345139503479, 'Top': 0.29482370615005493}, 'Polygon': [{'X': 0.5345345139503479, 'Y': 0.29482370615005493}, {'X': 0.5375375151634216, 'Y': 0.24456113576889038}, {'X': 0.5905905961990356, 'Y': 0.24531133472919464}, {'X': 0.5885885953903198, 'Y': 0.29632407426834106}]}}, {'DetectedText': '11/16/2020', 'Type': 'LINE', 'Id': 2, 'Confidence': 99.93339538574219, 'Geometry': {'BoundingBox': {'Width': 0.013013012707233429, 'Height': 0.0022505626548081636, 'Left': 0.3553553521633148, 'Top': 0.4043510854244232}, 'Polygon': [{'X': 0.3553553521633148, 'Y': 0.4043510854244232}, {'X': 0.36836835741996765, 'Y': 0.23255814611911774}, {'X': 0.4154154062271118, 'Y': 0.23405851423740387}, {'X': 0.402402400970459, 'Y': 0.4066016376018524}]}}, {'DetectedText': 'RO', 'Type': 'LINE', 'Id': 3, 'Confidence': 99.86873626708984, 'Geometry': {'BoundingBox': {'Width': 0.0030030030757188797, 'Height': 0.0015003751032054424, 'Left': 0.5195195078849792, 'Top': 0.4606151580810547}, 'Polygon': [{'X': 0.5195195078849792, 'Y': 0.4606151580810547}, {'X': 0.522522509098053, 'Y': 0.40735185146331787}, {'X': 0.575575590133667, 
'Y': 0.408852219581604}, {'X': 0.5725725889205933, 'Y': 0.4621155261993408}]}}, {'DetectedText': '10', 'Type': 'LINE', 'Id': 4, 'Confidence': 99.67726135253906, 'Geometry': {'BoundingBox': {'Width': 0.0010010009864345193, 'Height': 0.0, 'Left': 0.35035035014152527, 'Top': 0.5011252760887146}, 'Polygon': [{'X': 0.35035035014152527, 'Y': 0.5011252760887146}, {'X': 0.3513513505458832, 'Y': 0.46436607837677}, {'X': 0.3953953981399536, 'Y': 0.46436607837677}, {'X': 0.3943943977355957, 'Y': 0.5011252760887146}]}}, **{'DetectedText': '0000014819', 'Type': 'LINE', 'Id': 5,** 'Confidence': 98.70645904541016, 'Geometry': {'BoundingBox': {'Width': 0.026078984141349792, 'Height': 0.003469466231763363, 'Left': 0.41238895058631897, 'Top': 0.5906790494918823}, 'Polygon': [{'X': 0.41238895058631897, 'Y': 0.5906790494918823}, {'X': 0.43846791982650757, 'Y': 0.2778533399105072}, {'X': 0.5125654935836792, 'Y': 0.28132280707359314}, {'X': 0.4864864945411682, 'Y': 0.5941485166549683}]}}, {'DetectedText': '04', 'Type': 'LINE', 'Id': 6, 'Confidence': 98.94153594970703, 'Geometry': {'BoundingBox': {'Width': 0.0037548313848674297, 'Height': 0.002500824397429824, 'Left': 0.5068372488021851, 'Top': 0.5926163792610168}, 'Polygon': [{'X': 0.5068372488021851, 'Y': 0.5926163792610168}, {'X': 0.5105921030044556, 'Y': 0.5481716394424438}, {'X': 0.5632959604263306, 'Y': 0.5506724715232849}, {'X': 0.5595411062240601, 'Y': 0.5951172113418579}]}}, {'DetectedText': 'PACKS', 'Type': 'LINE', 'Id': 7, 'Confidence':

Answer 1

If you want to look at the data with pandas (in IPython or another IDE), you can do this:

import pandas as pd
df = pd.DataFrame(response['TextDetections'])

In [192]: pd.DataFrame(response['TextDetections'])
     ...:
Out[192]:
              DetectedText  Type  Id  Confidence                                           Geometry  ParentId
0                       GR  LINE   0   99.072647  {'BoundingBox': {'Width': 0.005556055344641209...       NaN
1                       3B  LINE   1   99.752129  {'BoundingBox': {'Width': 0.002002001972869038...       NaN
2               11/16/2020  LINE   2   99.937080  {'BoundingBox': {'Width': 0.013013012707233429...       NaN
3                       RO  LINE   3   99.875778  {'BoundingBox': {'Width': 0.003003003075718879...       NaN
4                       10  LINE   4   99.780312  {'BoundingBox': {'Width': 0.001001000986434519...       NaN
5               0000014819  LINE   5   99.577316  {'BoundingBox': {'Width': 0.02471441961824894,...       NaN
6                       04  LINE   6   99.152405  {'BoundingBox': {'Width': 0.003001040546223521...       NaN
7                    PACKS  LINE   7   99.902817  {'BoundingBox': {'Width': 0.010937293991446495...       NaN
8   River Valley Hardwoods  LINE   8   99.724754  {'BoundingBox': {'Width': 0.014965030364692211...       NaN
9                       3B  WORD  10   99.752129  {'BoundingBox': {'Width': 0.0503024198114872, ...       1.0
10                      GR  WORD   9   99.072647  {'BoundingBox': {'Width': 0.05872828885912895,...       0.0
11              11/16/2020  WORD  11   99.937080  {'BoundingBox': {'Width': 0.17228509485721588,...       2.0
12                      RO  WORD  12   99.875778  {'BoundingBox': {'Width': 0.05334790423512459,...       3.0
13                      10  WORD  13   99.780312  {'BoundingBox': {'Width': 0.03677281737327576,...       4.0
14               Hardwoods  WORD  19   99.487328  {'BoundingBox': {'Width': 0.10112638026475906,...       8.0
15              0000014819  WORD  14   99.577316  {'BoundingBox': {'Width': 0.31150540709495544,...       5.0
16                      04  WORD  15   99.152405  {'BoundingBox': {'Width': 0.045055754482746124...       6.0
17                  Valley  WORD  18   99.796745  {'BoundingBox': {'Width': 0.05704939365386963,...       8.0
18                   PACKS  WORD  16   99.902817  {'BoundingBox': {'Width': 0.10187213867902756,...       7.0
19                   River  WORD  17   99.890190  {'BoundingBox': {'Width': 0.05252266675233841,...       8.0
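The table lists every piece of text twice: once as a LINE row and once as one or more WORD rows whose ParentId points back to the line's Id. As a minimal sketch, assuming a response shaped like the one above, the LINE entries can be separated out in plain Python. The helper name `line_texts`, the `min_confidence` threshold, and the trimmed `sample_response` are illustrative assumptions, not part of the original code.

```python
def line_texts(response, min_confidence=90.0):
    """Return the DetectedText of LINE entries at or above min_confidence."""
    return [
        detection["DetectedText"]
        for detection in response.get("TextDetections", [])
        if detection.get("Type") == "LINE"
        and detection.get("Confidence", 0.0) >= min_confidence
    ]


# Trimmed sample using a few rows from the table above (Geometry omitted).
sample_response = {
    "TextDetections": [
        {"DetectedText": "GR", "Type": "LINE", "Id": 0, "Confidence": 99.072647},
        {"DetectedText": "0000014819", "Type": "LINE", "Id": 5, "Confidence": 99.577316},
        {"DetectedText": "0000014819", "Type": "WORD", "Id": 14,
         "Confidence": 99.577316, "ParentId": 5},
    ]
}

print(line_texts(sample_response))
```

Filtering on Type first means each piece of text is considered only once, which avoids duplicate matches later on.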

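You can see DetectedText and the confidence level at a glance, but to answer your question about finding the 10-digit number programmatically, you can loop over the DetectedText values and check each one individually. If you know the text always follows a certain format, you can test for that format. This should help you get started: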

In [198]: for text in response['TextDetections']:
     ...:     # print(text['DetectedText'])
     ...:     if len(text['DetectedText']) == 10:
     ...:         print(text['DetectedText'])
     ...:         print(text['DetectedText'].isnumeric())
     ...:
11/16/2020
False
0000014819
True
11/16/2020
False
0000014819
True
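The loop above prints each match twice because the same text appears as both a LINE and a WORD entry, and the length check alone also lets through strings such as the date. One possible way to tighten this, sketched here under the assumption that the number is always exactly ten ASCII digits, is a regular expression combined with a filter on Type. The function name `find_ten_digit_number` and the trimmed `sample_response` are hypothetical and only for illustration.

```python
import re

# Exactly ten ASCII digits, not embedded in a longer run of digits.
TEN_DIGITS = re.compile(r"(?<![0-9])[0-9]{10}(?![0-9])")


def find_ten_digit_number(response):
    """Return the first 10-digit number found in a LINE entry, or None."""
    for detection in response.get("TextDetections", []):
        # Skip WORD entries so each piece of text is checked only once.
        if detection.get("Type") != "LINE":
            continue
        match = TEN_DIGITS.search(detection.get("DetectedText", ""))
        if match:
            return match.group()
    return None


# Trimmed sample shaped like the detect_text output shown in the question.
sample_response = {
    "TextDetections": [
        {"DetectedText": "11/16/2020", "Type": "LINE", "Id": 2},
        {"DetectedText": "0000014819", "Type": "LINE", "Id": 5},
        {"DetectedText": "0000014819", "Type": "WORD", "Id": 14, "ParentId": 5},
    ]
}

# Keep the result as a string so the leading zeros are preserved.
number = find_ten_digit_number(sample_response)
if number is not None:
    print("Response A:", number)
else:
    print("Response B: no 10-digit number found")
```

With the real `response` from `detect_text` in place of `sample_response`, `number` holds the value as a variable for the rest of the program, and the `if`/`else` branch covers the response A / response B decision.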

python

amazon-web-services

amazon-rekognition