Why are the bounding boxes from Tesseract not aligned on the image text?

Question

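I am using the tesseract R package to recognize text in an image file. However, when I plot the bounding box of a word, the coordinates appear to be wrong.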

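Why is the bounding box for the word "This" not aligned with the text "This" in the image?
Is there an easier way to draw all of the bounding-box rectangles on the image?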

library(tesseract)
library(magick)
library(tidyverse)

text <- tesseract::ocr_data("http://jeroen.github.io/images/testocr.png")
image <- image_read("http://jeroen.github.io/images/testocr.png")

text <- text %>% 
  separate(bbox, c("x1", "y1", "x2", "y2"), ",") %>% 
  mutate(
    x1 = as.numeric(x1),
    y1 = as.numeric(y1),
    x2 = as.numeric(x2),
    y2 = as.numeric(y2)
  )

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = text$y1[1], 
  xright = text$x2[1], 
  ytop = text$y2[1])

Answer 1

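This happens because Tesseract reports bounding boxes in image coordinates, with y measured down from the top-left corner, whereas rect() draws in plot coordinates, with y measured up from the bottom-left. The image is 480 pixels high, so subtracting each y value from 480 flips the box into place: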

plot(image)
# Flip y: in image coordinates y2 is the box's lower edge and y1 its upper edge
rect(
  xleft = text$x1[1], 
  ybottom = 480 - text$y2[1], 
  xright = text$x2[1], 
  ytop = 480 - text$y1[1])
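
The flip can also be wrapped in a small helper so the bbox strings do not have to be split by hand. This is only a sketch in base R: flip_bbox is a hypothetical name, not part of tesseract or magick, and it assumes every bbox string has the "x1,y1,x2,y2" form that the question's code splits. The sample strings are illustrative values, not guaranteed OCR output.

```r
# Hypothetical helper: convert "x1,y1,x2,y2" strings (origin at the top-left)
# into the xleft/ybottom/xright/ytop values that rect() expects
# (origin at the bottom-left), given the image height in pixels.
flip_bbox <- function(bbox, height) {
  parts <- strsplit(bbox, ",", fixed = TRUE)
  stopifnot(all(lengths(parts) == 4))  # assumption: exactly four numbers per box
  m <- matrix(as.numeric(unlist(parts)), ncol = 4, byrow = TRUE)
  data.frame(
    xleft   = m[, 1],
    ybottom = height - m[, 4],  # lower edge is y2 in image coordinates
    xright  = m[, 3],
    ytop    = height - m[, 2]   # upper edge is y1 in image coordinates
  )
}

# Illustrative input for a 480-pixel-high image
boxes <- flip_bbox(c("36,92,96,116", "109,92,129,116"), height = 480)
print(boxes)
```

With the image already plotted, the result can be passed on directly, for example rect(boxes$xleft, boxes$ybottom, boxes$xright, boxes$ytop).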

Or, to generalize this to every word and read the height from the image instead of hard-coding it:

plot(image)

# Read the height from the image rather than hard-coding 480
height <- magick::image_info(image)$height

rect(
  xleft = text$x1, 
  ybottom = height - text$y2, 
  xright = text$x2, 
  ytop = height - text$y1,
  border = sample(128, nrow(text)))  # one random palette colour index per box
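
On the question of an easier way: one option that avoids the flip entirely is to draw on the image itself. The sketch below assumes, as the magick documentation describes, that image_draw() opens a graphics device in pixel coordinates with the origin at the top-left, so Tesseract's values can be passed to rect() unchanged. draw_boxes is a hypothetical helper name, and the usage lines assume the image and the text data frame from the question.

```r
library(magick)

# Hypothetical helper: draw boxes given in image coordinates directly onto
# a magick image and return the annotated image.
draw_boxes <- function(image, x1, y1, x2, y2, border = "red") {
  canvas <- image_draw(image)  # graphics device layered over the image
  rect(x1, y1, x2, y2, border = border, lwd = 2)  # no y flip applied here
  dev.off()                    # close the device to finish drawing
  canvas
}

# Usage with the objects from the question (needs network access):
# annotated <- draw_boxes(image, text$x1, text$y1, text$x2, text$y2)
# print(annotated)
```

Because the result is itself a magick image, it can be saved with image_write() or processed further, which a base plot overlay does not allow.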

ocr

tesseract

r

bounding-box