How to decode an Android binary dictionary to a human readable format like .xml

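I have a .dict directory containing the bigram files used for my personalized keyboard suggestions. From looking around the Android source I've gathered that the files are encoded in a binary dictionary format, described here. That wiki page describes how to convert an .xml file into a .dict binary dictionary, but not how to convert a binary dictionary back into a human-readable format. Is using the functions in the Android source the only way to extract human-readable data from these files?

Here are the files in question:

Thanks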

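I don't know whether this helps, but regarding your statement "Would be excellent to have some java code showing how to read words from the binary dictionary", maybe this would be a good start. This is the GIT

It says it returns a list of word lists, but I'm not sure what format it returns them in, or what the result looks like. This snippet is from line 240 of this page.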

>     /**
>      * Returns the list of cached files for a specific locale, one for each category.
>      *
>      * This will return exactly one file for each word list category that matches
>      * the passed locale. If several files match the locale for any given category,
>      * this returns the file with the closest match to the locale. For example, if
>      * the passed word list is en_US, and for a category we have an en and an en_US
>      * word list available, we'll return only the en_US one.
>      * Thus, the list will contain as many files as there are categories.
>      *
>      * @param locale the locale to find the dictionary files for, as a string.
>      * @param context the context on which to open the files upon.
>      * @return an array of binary dictionary files, which may be empty but may not be null.
>      */
>     private static File[] getCachedWordLists(final String locale,
>             final Context context) {
>         final File[] directoryList = getCachedDirectoryList(context);
>         if (null == directoryList) return EMPTY_FILE_ARRAY;
>         final HashMap<String, FileAndMatchLevel> cacheFiles =
>                 new HashMap<String, FileAndMatchLevel>();
>         for (File directory : directoryList) {
>             if (!directory.isDirectory()) continue;
>             final String dirLocale = getWordListIdFromFileName(directory.getName());
>             final int matchLevel = LocaleUtils.getMatchLevel(dirLocale, locale);
>             if (LocaleUtils.isMatch(matchLevel)) {
>                 final File[] wordLists = directory.listFiles();
>                 if (null != wordLists) {
>                     for (File wordList : wordLists) {
>                         final String category = getCategoryFromFileName(wordList.getName());
>                         final FileAndMatchLevel currentBestMatch = cacheFiles.get(category);
>                         if (null == currentBestMatch || currentBestMatch.mMatchLevel < matchLevel) {
>                             cacheFiles.put(category, new FileAndMatchLevel(wordList, matchLevel));
>                         }
>                     }
>                 }
>             }
>         }
>         if (cacheFiles.isEmpty()) return EMPTY_FILE_ARRAY;
>         final File[] result = new File[cacheFiles.size()];
>         int index = 0;
>         for (final FileAndMatchLevel entry : cacheFiles.values()) {
>             result[index++] = entry.mFile;
>         }
>         return result;
>     }
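
To make the selection rule in that Javadoc concrete, here is a minimal, self-contained sketch of the same "keep the closest locale match per category" idea. It is not the Android code: `BestMatchSketch`, `matchLevel`, `pickBest`, and the file names are hypothetical, and `matchLevel` is only a simplified stand-in for `LocaleUtils.getMatchLevel`, whose real scoring I have not checked.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class BestMatchSketch {
    // Hypothetical stand-in for LocaleUtils.getMatchLevel:
    // higher means closer, 0 means no match at all.
    public static int matchLevel(String candidate, String requested) {
        if (candidate.equals(requested)) return 2;            // e.g. en_US vs en_US
        if (requested.startsWith(candidate + "_")) return 1;  // e.g. en vs en_US
        return 0;
    }

    // Each entry is {locale, category, fileName}. Returns category -> best file name.
    public static Map<String, String> pickBest(String[][] entries, String requested) {
        Map<String, String> bestFile = new TreeMap<>();
        Map<String, Integer> bestLevel = new HashMap<>();
        for (String[] e : entries) {
            int level = matchLevel(e[0], requested);
            if (level == 0) continue; // locale does not match, skip it
            Integer current = bestLevel.get(e[1]);
            // Same test as in the quoted method: replace only on a strictly better match.
            if (current == null || current < level) {
                bestLevel.put(e[1], level);
                bestFile.put(e[1], e[2]);
            }
        }
        return bestFile;
    }

    public static void main(String[] args) {
        String[][] entries = {
            {"en", "main", "main_en.dict"},
            {"en_US", "main", "main_en_US.dict"},
            {"en", "contacts", "contacts_en.dict"},
            {"fr", "main", "main_fr.dict"},
        };
        System.out.println(pickBest(entries, "en_US"));
    }
}
```

With `en_US` requested, the `main` category resolves to the `en_US` file and `contacts` falls back to the `en` file, so the result has one entry per category. Note that the method returns files, not words, so something still has to parse each file.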

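As for how to convert the .dict binary into a human-readable form, I know this isn't exactly what you're looking for, but maybe it will give you a good start. It looks like you may have to write something yourself to do the conversion, as they did here. They wrote this script to handle the process.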

"Lingoes Converter is a script written in PHP that can convert .LD2/.LDX dictionaries of Lingoes into human-readable text files. The script is based on Xiaoyun Zhu analysis (lingoes-extractor) on the LD2/LDX dictionary format ."
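
If you do end up writing your own converter, a reasonable first step is just looking at the raw bytes. The sketch below does not decode the dictionary format at all; it only prints the size and the first 16 bytes of every file in a directory, so you can compare the header against the format description. `DictHeaderDump`, `toHex`, and `readHead` are names I made up for illustration, and the directory path is whatever you pass on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class DictHeaderDump {
    // Formats at most `limit` leading bytes as space-separated lowercase hex.
    public static String toHex(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    // Reads up to `limit` bytes from the start of the file (fewer if the file is shorter).
    public static byte[] readHead(Path file, int limit) throws IOException {
        byte[] buf = new byte[limit];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while (total < limit && (read = in.read(buf, total, limit - total)) != -1) {
                total += read;
            }
        }
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DictHeaderDump <directory with .dict files>");
            return;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(Paths.get(args[0]))) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) continue; // skip subdirectories
                byte[] head = readHead(file, 16);
                System.out.println(file.getFileName() + " (" + Files.size(file) + " bytes): "
                        + toHex(head, 16));
            }
        }
    }
}
```

If every file starts with the same few bytes, that is probably the format's magic number and version, which should tell you which part of the Android source to read next.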

I hope some of this at least gives you a start. This is a niche need that definitely deserves a good solution. Hope you figure it out!