Tesseract error while building training tools related to uchar.h file and ICU version

I am trying to build the training tools for Tesseract by following the tutorial here, and I get this error:

depbase=`echo boxchar.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
/bin/sh ../../libtool  --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I. -I../..  -DNDEBUG -DPANGO_ENABLE_ENGINE -I../../src/api -I../../src/api -I../../src/ccmain -I../../src/ccutil -I../../src/ccstruct -I../../src/lstm -I../../src/arch -I../../src/viewer -I../../src/textord -I../../src/dict -I../../src/classify -I../../src/wordrec -I../../src/cutil  -I/usr/local/include/leptonica   -I/usr/include/pango-1.0 -I/usr/include/fribidi -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include   -I/usr/include/cairo -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/libpng15 -I/usr/include/uuid -I/usr/include/libdrm    -g -O2 -std=c++11 -MT boxchar.lo -MD -MP -MF $depbase.Tpo -c -o boxchar.lo boxchar.cpp &&\
mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../.. -DNDEBUG -DPANGO_ENABLE_ENGINE -I../../src/api -I../../src/api -I../../src/ccmain -I../../src/ccutil -I../../src/ccstruct -I../../src/lstm -I../../src/arch -I../../src/viewer -I../../src/textord -I../../src/dict -I../../src/classify -I../../src/wordrec -I../../src/cutil -I/usr/local/include/leptonica -I/usr/include/pango-1.0 -I/usr/include/fribidi -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/cairo -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/libpng15 -I/usr/include/uuid -I/usr/include/libdrm -g -O2 -std=c++11 -MT boxchar.lo -MD -MP -MF .deps/boxchar.Tpo -c boxchar.cpp  -fPIC -DPIC -o .libs/boxchar.o
boxchar.cpp: In member function 'void tesseract::BoxChar::GetDirection(int*, int*) const':
boxchar.cpp:66:42: error: 'U_RIGHT_TO_LEFT_ISOLATE' was not declared in this scope
         dir == U_ARABIC_NUMBER || dir == U_RIGHT_TO_LEFT_ISOLATE) {
                                          ^
make[1]: *** [boxchar.lo] Error 1
make[1]: Leaving directory `/home/vmuser/ocrd-train-master/tesseract-fd492062d08a2f55001a639f2015b8524c7e9ad4/src/training'
make: *** [training] Error 2

I looked at this GitHub comment here, which states:

http://source.icu-project.org/repos/icu/tags/release-52-1/icu4c/source/common/unicode/uchar.h
http://source.icu-project.org/repos/icu/tags/release-50-1-2/icu4c/source/common/unicode/uchar.h

The training tools need icu version 52 and up. The icu version in RHEL & CentOS 7 is 50.

Running locate uchar.h shows that the file exists in two places: /usr/include/uchar.h and /usr/include/unicode/uchar.h

I am not sure which file I should change. Apologies if this is a silly question.

Thanks in advance.

So, I finally figured it out myself:

Download the ICU source package from: http://download.icu-project.org/files/icu4c/

I used this file: http://download.icu-project.org/files/icu4c/59.1/icu4c-59_1-src.tgz

After that, unpack, configure, build, and install it:

tar zxf icu4c-59_1-src.tgz
cd icu/source
./configure
make
sudo make install
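
To confirm which ICU versions are now present, the version macro in ICU's uvernum.h header can be inspected. The following is a minimal sketch, not part of the original steps: the helper name icu_major_from_header is hypothetical, and the two header paths are assumptions (/usr/include for the distribution package, /usr/local/include for a source build with the default prefix).

```shell
#!/bin/sh
# Hypothetical helper: print the ICU major version declared in a uvernum.h header.
icu_major_from_header() {
    # Extracts <n> from the line "#define U_ICU_VERSION_MAJOR_NUM <n>".
    sed -n 's/^#define[[:space:]]\{1,\}U_ICU_VERSION_MAJOR_NUM[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p' "$1"
}

# Assumed locations: distribution package first, then the source build.
for header in /usr/include/unicode/uvernum.h /usr/local/include/unicode/uvernum.h; do
    if [ -r "$header" ]; then
        echo "$header: ICU major version $(icu_major_from_header "$header")"
    fi
done
```

According to the comment quoted in the question, the training tools need a major version of 52 or higher. GCC normally searches /usr/local/include before /usr/include, which would explain why the newly installed headers are picked up without editing either uchar.h file.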

Then, in the Tesseract directory:

make training
sudo make training-install
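
As a quick sanity check after the install, one could verify that some of the training tools are reachable on PATH. This is only a sketch: the helper name missing_tools is hypothetical, and the list of tool names is illustrative rather than complete.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# A few of the Tesseract training tools; this list is illustrative, not complete.
missing=$(missing_tools text2image lstmtraining combine_tessdata unicharset_extractor)

if [ -z "$missing" ]; then
    echo "All listed training tools are on PATH"
else
    echo "Not found on PATH:"
    echo "$missing"
fi
```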