Can Windows OCR recognize custom symbols/fonts?

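I'm developing for UWP, and Windows ships an OCR engine: Windows.Media.Ocr

My question: does anyone know whether the Windows OCR engine can be trained to recognize new characters, or to work with a custom font? If so, how do I do it?

What I want to achieve is recognizing non-alphabetic symbols. For example, I want to recognize the character ⌰ (Unicode: U+2330) or ⌖ (U+2316).

The characters I want to recognize are symbols that do not belong to any language.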

I used the Windows.Media.Ocr library in a UWP application; here are some test results with different fonts.

Arial

Font - Arial
Test word - Hello @ World
Expected result - Hello @ World
Raw result - Hello @ World
Accuracy - 100%

Agency FB

Font - Agency FB
Test word - Hello @ World
Expected result - Hello @ World
Raw result - Hello World
Accuracy - 84.6% (missed the @ symbol and one space)

Modern

Font - Modern
Test word - Hello @ World
Expected result - Hello @ World
Raw result - Hello @ world
Accuracy - 92.3% (W was recognized as w)

Lucida Handwriting

Font - Lucida Handwriting
Test word - Hello @ World
Expected result - Hello @ World
Raw result - HeUe@worw
Accuracy - 46.1%
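
The answer does not say how the accuracy percentages were computed. One plausible reading, sketched below, is "matched characters divided by the length of the expected string", with the test string assumed to be the 13-character "Hello @ World". The function name `char_accuracy` and the use of Python's `difflib` are illustrative choices, not something taken from the answer.

```python
from difflib import SequenceMatcher


def char_accuracy(expected: str, recognized: str) -> float:
    """Percentage of the expected string covered by difflib's matching blocks."""
    if not expected:
        # Avoid dividing by zero: an empty expectation only matches empty output.
        return 100.0 if not recognized else 0.0
    matcher = SequenceMatcher(None, expected, recognized, autojunk=False)
    # Each matching block is a run of characters found in both strings, in order.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return round(100 * matched / len(expected), 1)


expected = "Hello @ World"  # assumed test string (13 characters)
print(char_accuracy(expected, "Hello @ World"))  # 100.0
print(char_accuracy(expected, "Hello World"))    # 84.6
print(char_accuracy(expected, "Hello @ world"))  # 92.3
```

The comparison is case-sensitive, which is why the lowercase "w" costs one character. The Lucida Handwriting figure was probably scored differently (by hand, or ignoring case), so this sketch may not reproduce it.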

Update

Arial Unicode MS

Font - Arial Unicode MS
Test symbols - ⌰ ⌖
Expected result - ⌰ ⌖
Raw result - (not recognized)
Accuracy - 0%

Update 2

Hope this helps.

I think the short answer to your question is no. As the Supported languages section of the Windows.Media.Ocr namespace documentation says:

There are 25 supported languages. Based on recognition accuracy and performance, supported languages are divided into three groups:

Excellent: Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Serbian Cyrillic, Serbian Latin, Slovak, Spanish and Swedish.

Very good: Chinese Simplified, Greek, Japanese, Russian and Turkish.

Good: Chinese Traditional and Korean.

The language is required information for correct text recognition. Every language uses some language-specific resources, so it must be specified in advance.

Note Only languages installed on the device can be used. A user can install new languages through the Settings app.

So if your symbols do not belong to any of these languages, the OCR engine will not be able to recognize them.
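
As a rough illustration of why the two symbols from the question fall outside any language, the sketch below looks up their Unicode general category with Python's standard `unicodedata` module. The helper name `describe` is hypothetical, and this says nothing about how the OCR engine works internally; it only shows that Unicode classifies these characters as symbols rather than letters.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, Unicode general category, and letter status of a character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "category": unicodedata.category(ch),  # categories starting with "S" are symbols
        "is_letter": ch.isalpha(),
    }


# The two symbols from the question, plus an ordinary Latin letter for comparison.
for ch in ("\u2330", "\u2316", "A"):
    print(describe(ch))
```

A letter such as "A" reports a letter category, while the two technical symbols report a symbol category, so a recognizer built around per-language alphabets has no obvious place for them.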

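As for custom fonts, as Vineet Choudhary's answer shows, the OCR engine may recognize some of them; recognition accuracy depends on the font. For handwriting-style or cursive fonts, accuracy is likely to be low.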