Extracting data from specific image locations using the Google Vision OCR API
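I am using Google's Vision OCR API to try to extract two types of data from an image: 1) handwritten text from text boxes, marked with red circles below, and 2) checkboxes or 'x' marks, marked with green circles below. I will be entering this data into a database, so I need a string returned for each of the two types of data.
Currently, when I pass this image to the API, I get back a single string containing all of the data: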
Secondary School Study Student Perception of Computers LO 13 . Are any of your family members working >in computing / IT ? If so , what family member ( s ) is it ( eg , parent , guardian , brother , sister >etc . ) brother 14 . Have you any previous computing experience ( even attended a single day ) ? Select >one or many areas : U CODER DOJO IN SCHOOL CAMP VSELF TAUGHT JOTHER If you selected any from Q14 , was >the general experience : GOOD NEITHER GOOD OR BAD BAD BAD And why ( short answer , under 4 words ) >learned new skills To be completed after the camp . NewsLRY 1 . I would now consider a career in >computing / IT . Strongly Agree Agree No Opinion Disagree Strongly Disagree 2 . The camp showed me what >a career in computing / IT really was . ? Strongly Agree Agree No Opinion Disagree Strongly Disagree 3 >. The camp showed / highlighted that I was no good at programming or computing . Strongly Agree Agree >No Opinion Disagree Strongly Disagree 4 . Give two things that you did not know about computing / >programming until after the camp ? java Language Eclipse IDE va 5 . I was better than I first thought ( >before the camp ) at programming / computing . ? Agree No Opinion Disagree Strongly Disagree ? O >Strongly Agree 6 . Any feedback / comments about the camp ( good or bad ) ? good camp , Learned a lot . >Thank you for taking this survey . Page 2 of 2
My code:
using System;
using System.Linq;
using Google.Cloud.Vision.V1;

public static class Program
{
    public static void Main(string[] args)
    {
        string credential_path = @"C:\Users385\nodal.json";
        System.Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", credential_path);
        // Instantiates a client
        var client = ImageAnnotatorClient.Create();
        // Load the image file into memory
        var image = Image.FromFile("stack.jpg");
        // Performs text detection on the image file
        var response = client.DetectDocumentText(image);
        string words = "";
        foreach (var page in response.Pages)
        {
            foreach (var block in page.Blocks)
            {
                // Block bounding box as "(x, y) - (x, y) - ..."; computed but not used yet
                string box = string.Join(" - ", block.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
                foreach (var paragraph in block.Paragraphs)
                {
                    // Overwritten with the paragraph bounding box; also not used yet
                    box = string.Join(" - ", paragraph.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
                    foreach (var word in paragraph.Words)
                    {
                        // Rebuild each word from its symbols and append it to one flat string
                        words += $" {string.Join("", word.Symbols.Select(s => s.Text))}";
                    }
                }
            }
        }
        // Prints every detected word; all position information is lost at this point
        Console.WriteLine(words);
    }
}
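So my questions:
- How can I extract the data from each red box (i.e. the first text box should return 'brother' and the second should return 'learned new skills')?
- How can I extract which checkbox was marked for each green question (i.e. question 13 should return 'YES', question 14 should return 'SELF TAUGHT', etc.)?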
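I have only used the API from a PHP script, but I don't think your question depends on the programming language.
You need to use the coordinates of the detected words (more precisely, their bounding boxes, each with four vertices). With those you can work out which questionnaire element a participant's writing belongs to.
This script was a good entry point for me: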
https://www.leanx.eu/tutorials/use-google-cloud-vision-api-to-process-invoices-and-receipts
You can use it "as is" on any PHP-enabled web space, and it gives you a well-structured overview of what the API returns.
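To make the coordinate idea a bit more concrete, here is a minimal C# sketch of looking up the words that fall inside one handwriting box. It assumes you have measured the pixel rectangle of each box once on a blank copy of the form, that every scan is aligned the same way, and that each box holds a single line of text. `WordBox`, `Region` and `FieldExtractor` are hypothetical names made up for this example, not Vision client types; you would fill a `WordBox` from the minimum and maximum X and Y of a word's `BoundingBox.Vertices`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one OCR word: its text plus the axis-aligned
// extent of its four-vertex bounding box (not a Vision client type).
public record WordBox(string Text, int Left, int Top, int Right, int Bottom)
{
    public double CenterX => (Left + Right) / 2.0;
    public double CenterY => (Top + Bottom) / 2.0;
}

// Hypothetical form field: a pixel rectangle measured by hand on a blank form.
public record Region(string Name, int Left, int Top, int Right, int Bottom)
{
    // A word counts as inside when the centre of its box lies in the rectangle.
    public bool Contains(WordBox word) =>
        word.CenterX >= Left && word.CenterX <= Right &&
        word.CenterY >= Top && word.CenterY <= Bottom;
}

public static class FieldExtractor
{
    // Joins the words inside the region from left to right.
    // Assumption: the field is a single line, so sorting by Left is enough.
    public static string TextIn(Region region, IEnumerable<WordBox> words) =>
        string.Join(" ", words
            .Where(w => region.Contains(w))
            .OrderBy(w => w.Left)
            .Select(w => w.Text));
}
```

Calling `FieldExtractor.TextIn(region, words)` once per field gives one string per box. Testing the centre of the word rather than all four corners is a deliberate choice here, since handwriting often spills slightly over the printed border.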
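With these boxes, and knowing the text of the questionnaire, it should be easy to find the check marks, provided Google detects them. Detection of check marks may not always work with Google Vision, because Google's OCR does not always pick up a single "character".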
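For the checkboxes, one possible way to go from a detected mark to an answer is to pick the option whose checkbox lies closest to the mark. The C# sketch below assumes you already know the centre of every checkbox on the form and that the OCR did return some symbol at the position of the tick. `Point`, `Choice` and `CheckboxMatcher` are hypothetical names for this example, and the distance threshold is something you would have to tune for your own scans.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; not part of the Vision client library.
public record Point(double X, double Y);

// One printed answer option and the known centre of its checkbox.
public record Choice(string Question, string Label, Point BoxCenter);

public static class CheckboxMatcher
{
    private static double Distance(Point a, Point b)
    {
        double dx = a.X - b.X;
        double dy = a.Y - b.Y;
        return Math.Sqrt(dx * dx + dy * dy);
    }

    // Returns the choice whose checkbox centre is nearest to the mark,
    // or null when no checkbox lies within maxDistance pixels.
    public static Choice? Nearest(Point mark, IEnumerable<Choice> choices, double maxDistance) =>
        choices
            .Select(c => (Choice: c, Distance: Distance(mark, c.BoxCenter)))
            .Where(pair => pair.Distance <= maxDistance)
            .OrderBy(pair => pair.Distance)
            .Select(pair => pair.Choice)
            .FirstOrDefault();
}
```

A `null` result only means that no symbol was found near any checkbox, which, given the caveat above, may just as well mean the OCR missed the tick as that the question was left blank.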