Which writing systems does a given character set cover?

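What is the simplest way to find out which writing systems (such as Latin, Hebrew, Arabic, Katakana, or Chinese characters) a given set of Unicode characters supports?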

Check the Script and Script_Extensions properties of each character in the set, as described in UAX #24.
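As a rough illustration of that per-character check, the sketch below looks up the Script value of each character in a small range table. As far as I know, Python's standard `unicodedata` module does not expose the Script property, so `SCRIPT_RANGES` is a hand-copied excerpt of a few ranges from the UCD file Scripts.txt, and `script_of` and `scripts_in` are hypothetical helper names. A fuller version would load the complete file and also consult the Script_Extensions data.

```python
from bisect import bisect_right

# Hand-copied excerpt of Scripts.txt ranges (start, end, script), sorted by start.
# Assumption: a real table would be loaded from the full UCD file.
SCRIPT_RANGES = [
    (0x0030, 0x0039, "Common"),    # ASCII digits
    (0x0041, 0x005A, "Latin"),     # A..Z
    (0x0061, 0x007A, "Latin"),     # a..z
    (0x05D0, 0x05EA, "Hebrew"),    # alef..tav
    (0x30A1, 0x30FA, "Katakana"),
]
_STARTS = [start for start, _, _ in SCRIPT_RANGES]


def script_of(ch):
    """Return the Script value for ch, or None if the excerpt does not cover it."""
    cp = ord(ch)
    i = bisect_right(_STARTS, cp) - 1  # last range starting at or before cp
    if i >= 0:
        start, end, script = SCRIPT_RANGES[i]
        if start <= cp <= end:
            return script
    return None


def scripts_in(chars):
    """Collect the distinct Script values seen in an iterable of characters."""
    return {s for s in map(script_of, chars) if s is not None}
```

For example, `scripts_in("A\u05d1\u30ab7")` collects Latin, Hebrew, Katakana, and Common from the four characters.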

Unicode characters are divided into non-overlapping ranges called blocks [Blocks]. Many of these blocks have a name derived from a script name, because characters of that script are primarily encoded in that block. However, blocks and scripts differ in the following ways:

Blocks are simply ranges, and often contain code points that are unassigned.

Characters from the same script may be encoded in several different blocks.

Characters from different scripts may be encoded in the same block.

As a result, using the block names as a simplistic substitute for script identity generally leads to poor results. For example, see Annex A, Character Blocks, in Unicode Technical Standard #18, "Unicode Regular Expressions" [UTS18].

In the latter document [UTS18], pay particular attention to "Writing Systems Versus Blocks" in Annex A: Character Blocks.
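The three differences between blocks and scripts can be seen on a handful of code points. The sketch below is only an illustration: `BLOCKS` and `SCRIPTS` are small hand-copied excerpts of Blocks.txt and Scripts.txt, and `lookup` and `describe` are hypothetical helper names.

```python
import unicodedata

# Two block ranges as listed in Blocks.txt.
BLOCKS = [
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x1F00, 0x1FFF, "Greek Extended"),
]

# Hand-copied excerpt of Scripts.txt ranges; not the complete data.
SCRIPTS = [
    (0x038E, 0x03A1, "Greek"),
    (0x03E2, 0x03EF, "Coptic"),
    (0x1F00, 0x1F15, "Greek"),
]


def lookup(table, cp):
    """Return the name of the first range in table containing cp, else None."""
    for start, end, name in table:
        if start <= cp <= end:
            return name
    return None


def describe(cp):
    """Return (block, script, is_assigned) for a code point."""
    # General category "Cn" marks an unassigned code point.
    assigned = unicodedata.category(chr(cp)) != "Cn"
    return lookup(BLOCKS, cp), lookup(SCRIPTS, cp), assigned
```

Here `describe(0x0378)` reports an unassigned code point inside the Greek and Coptic block, `describe(0x0391)` and `describe(0x03E2)` report two different scripts in that same block, and `describe(0x1F00)` reports the Greek script again in a second block.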

At this point, I am inclined to test whether enough glyphs from a script occur in the character set.

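This approach requires two preparatory steps:

(1) Compile a list of the writing systems (scripts) that Unicode supports.
(2) For each script, define a character set containing that script's characters.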

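I can then answer the question "does character set A support script X?" by testing whether enough of the characters in script X's character set are also members of character set A. If I do this for every script from step (1), I get a list of supported scripts.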
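A minimal sketch of that test, under stated assumptions: the post only says "enough", so the `threshold` of 0.9 is an arbitrary placeholder, the per-script sets in `SCRIPT_SETS` are toy data rather than the full Unicode repertoire, and `coverage` and `supported_scripts` are hypothetical names.

```python
def coverage(script_chars, charset):
    """Fraction of script_chars that are also members of charset."""
    if not script_chars:
        return 0.0
    return len(script_chars & charset) / len(script_chars)


def supported_scripts(charset, script_sets, threshold=0.9):
    """Return the scripts whose coverage by charset is at least threshold, sorted."""
    return sorted(
        name for name, chars in script_sets.items()
        if coverage(chars, charset) >= threshold
    )


# Toy per-script sets (assumption: real sets would come from the Unicode data).
SCRIPT_SETS = {
    "Latin": {chr(cp) for cp in range(0x41, 0x5B)}
             | {chr(cp) for cp in range(0x61, 0x7B)},
    "Hebrew": {chr(cp) for cp in range(0x05D0, 0x05EB)},
}

# Character set A: all basic Latin letters plus two Hebrew letters.
charset_a = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
charset_a |= {"\u05d0", "\u05d1"}
```

With these toy sets, `supported_scripts(charset_a, SCRIPT_SETS)` keeps Latin and rejects Hebrew, because only 2 of the 27 Hebrew letters are present. The choice of threshold decides how strict "supports" is, and a per-script threshold may be more sensible for very large scripts such as Han.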

The link provided by 一二三 references a data file that maps Unicode characters to their respective scripts, which is invaluable for steps (1) and (2).
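Steps (1) and (2) could be derived from such a file roughly as follows. The post does not name the file, so this sketch assumes the line format of the UCD's Scripts.txt (`code point or range ; script # comment`). `SAMPLE` is an abbreviated stand-in for the real file, and `parse_scripts` is a hypothetical helper name.

```python
from collections import defaultdict

# Abbreviated sample in the Scripts.txt line format (assumption: the real
# file would be downloaded from the Unicode Character Database).
SAMPLE = """\
# Scripts (abbreviated sample)
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR
05D0..05EA    ; Hebrew # Lo  [27] HEBREW LETTER ALEF..HEBREW LETTER TAV
"""


def parse_scripts(text):
    """Map each script name to the set of code points listed for it."""
    by_script = defaultdict(set)
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        cp_field, script = (part.strip() for part in data.split(";"))
        first, _, last = cp_field.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point or range
        by_script[script].update(range(start, end + 1))
    return dict(by_script)
```

The keys of the returned dictionary give the script list for step (1), and each value is the per-script character set for step (2), stored as integer code points.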

language-agnostic

unicode