How to decode an Android binary dictionary into a human-readable format such as .xml

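I have a directory of .dict files containing the bigram files used for my personalized keyboard suggestions. From looking around the Android source, I've gathered that the files are encoded in a binary dictionary format, described here. That wiki page describes how to convert an .xml file into a .dict binary dictionary, but it does not describe how to convert a binary dictionary back into a human-readable format. Is using the functions in the Android source the only way to extract human-readable data from these files?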



I don't know if this helps, but in reference to your statement "Would be excellent to have some java code showing how to read words from the binary dictionary", maybe this would be a good start. This is the Git source.

It says it returns a list of word lists, but I'm not sure what format it returns them in or what the result looks like. This code snippet is from line 240 of this page.

>     /**
>      * Returns the list of cached files for a specific locale, one for each category.
>      *
>      * This will return exactly one file for each word list category that matches
>      * the passed locale. If several files match the locale for any given category,
>      * this returns the file with the closest match to the locale. For example, if
>      * the passed word list is en_US, and for a category we have an en and an en_US
>      * word list available, we'll return only the en_US one.
>      * Thus, the list will contain as many files as there are categories.
>      *
>      * @param locale the locale to find the dictionary files for, as a string.
>      * @param context the context on which to open the files upon.
>      * @return an array of binary dictionary files, which may be empty but may not be null.
>      */
>     private static File[] getCachedWordLists(final String locale,
>             final Context context) {
>         final File[] directoryList = getCachedDirectoryList(context);
>         if (null == directoryList) return EMPTY_FILE_ARRAY;
>         final HashMap<String, FileAndMatchLevel> cacheFiles =
>                 new HashMap<String, FileAndMatchLevel>();
>         for (File directory : directoryList) {
>             if (!directory.isDirectory()) continue;
>             final String dirLocale = getWordListIdFromFileName(directory.getName());
>             final int matchLevel = LocaleUtils.getMatchLevel(dirLocale, locale);
>             if (LocaleUtils.isMatch(matchLevel)) {
>                 final File[] wordLists = directory.listFiles();
>                 if (null != wordLists) {
>                     for (File wordList : wordLists) {
>                         final String category = getCategoryFromFileName(wordList.getName());
>                         final FileAndMatchLevel currentBestMatch = cacheFiles.get(category);
>                         if (null == currentBestMatch || currentBestMatch.mMatchLevel < matchLevel) {
>                             cacheFiles.put(category, new FileAndMatchLevel(wordList, matchLevel));
>                         }
>                     }
>                 }
>             }
>         }
>         if (cacheFiles.isEmpty()) return EMPTY_FILE_ARRAY;
>         final File[] result = new File[cacheFiles.size()];
>         int index = 0;
>         for (final FileAndMatchLevel entry : cacheFiles.values()) {
>             result[index++] = entry.mFile;
>         }
>         return result;
>     }
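
Judging by its signature, the method hands back `File` objects that point at binary dictionaries, not decoded words, so it only tells you which files to read. The selection rule it describes (keep one file per category, preferring the closest locale match) can be sketched in isolation. This is a simplified illustration, not the Android implementation: `BestMatchDemo`, `pickBest`, the scoring in `matchLevel`, and the file names are all made up for the example and are not the real `LocaleUtils` behaviour.

```java
import java.util.HashMap;
import java.util.Map;

final class BestMatchDemo {
    // Hypothetical scoring, NOT the real LocaleUtils:
    // 2 = exact locale match, 1 = same language only, 0 = no match.
    static int matchLevel(String candidate, String wanted) {
        if (candidate.equals(wanted)) return 2;
        final String wantedLanguage = wanted.split("_")[0];
        return candidate.equals(wantedLanguage) ? 1 : 0;
    }

    /**
     * Each entry is {category, locale, fileName}.
     * Returns a map from category to the file name with the best locale match.
     */
    static Map<String, String> pickBest(String[][] entries, String wanted) {
        final Map<String, String> bestFile = new HashMap<>();
        final Map<String, Integer> bestLevel = new HashMap<>();
        for (String[] entry : entries) {
            final String category = entry[0];
            final int level = matchLevel(entry[1], wanted);
            if (level == 0) continue; // locale does not match at all
            final Integer current = bestLevel.get(category);
            // Keep this file only if it beats what we already have for the category.
            if (current == null || current < level) {
                bestLevel.put(category, level);
                bestFile.put(category, entry[2]);
            }
        }
        return bestFile;
    }

    public static void main(String[] args) {
        final String[][] entries = {
            {"main", "en", "main_en.dict"},
            {"main", "en_US", "main_en_US.dict"},
            {"contacts", "en", "contacts_en.dict"},
            {"main", "fr", "main_fr.dict"},
        };
        System.out.println(pickBest(entries, "en_US"));
    }
}
```

With the sample entries, the `main` category resolves to the `en_US` file and `contacts` falls back to the `en` file, which mirrors the en / en_US example in the Javadoc above.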

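As for how to convert the .dict binary files into a human-readable form, I know this isn't exactly what you're looking for, but it may give you a good start. It looks like you may have to write something yourself to do the conversion, as they did here. They wrote this script to handle the process.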

"Lingoes Converter is a script written in PHP that can convert .LD2/.LDX dictionaries of Lingoes into human-readable text files. The script is based on Xiaoyun Zhu analysis (lingoes-extractor) on the LD2/LDX dictionary format ."

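If you do end up writing your own converter, a reasonable first step is to look at the raw bytes at the start of one of your .dict files and compare them with the header layout in the format description. The sketch below only dumps bytes as hex and assumes nothing about the dictionary format itself; the class name `DictHexDump` and its methods are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

final class DictHexDump {
    /** Formats up to maxBytes of data as two-digit hex values, 16 bytes per line. */
    static String hexDump(byte[] data, int maxBytes) {
        final int count = Math.min(data.length, maxBytes);
        final StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            // Separate bytes with a space, and start a new line every 16 bytes.
            if (i > 0) sb.append(i % 16 == 0 ? '\n' : ' ');
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Reads at most maxBytes from the beginning of the given file. */
    static byte[] readHead(Path file, int maxBytes) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            final byte[] buffer = new byte[maxBytes];
            int total = 0;
            while (total < maxBytes) {
                final int read = in.read(buffer, total, maxBytes - total);
                if (read < 0) break; // end of file reached
                total += read;
            }
            return Arrays.copyOf(buffer, total);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DictHexDump <path-to-dict-file>");
            return;
        }
        // Print the first 64 bytes, which is usually enough to spot a header.
        System.out.println(hexDump(readHead(Paths.get(args[0]), 64), 64));
    }
}
```

Run it against one of the files in your dictionary directory and check whether the leading bytes line up with what the format page says a header should contain; that tells you whether the documented layout applies to your files before you invest in a full parser.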