How to decode an Android binary dictionary to a human readable format like .xml
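I have a directory of .dict files containing the bigrams used for my personalized keyboard suggestions. From looking around the Android source I've gathered that the files are encoded in a binary dictionary format, described here. That wiki page describes how to convert an .xml file into a .dict binary dictionary, but it does not describe how to convert a binary dictionary back into a human readable format. Is using the functions in the Android source the only way to extract human readable data from these files?
Here are the files in question:
Thanks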
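I don't know if this will help, but in reference to your statement "Would be excellent to have some java code showing how to read words from the binary dictionary", maybe this would be a good start. This is the GIT
It says it returns a list of words, but I'm not sure what format it returns them in or what the result looks like. This code snippet is from line 240 of this page.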
> /**
>  * Returns the list of cached files for a specific locale, one for each category.
>  *
>  * This will return exactly one file for each word list category that matches
>  * the passed locale. If several files match the locale for any given category,
>  * this returns the file with the closest match to the locale. For example, if
>  * the passed word list is en_US, and for a category we have an en and an en_US
>  * word list available, we'll return only the en_US one.
>  * Thus, the list will contain as many files as there are categories.
>  *
>  * @param locale the locale to find the dictionary files for, as a string.
>  * @param context the context on which to open the files upon.
>  * @return an array of binary dictionary files, which may be empty but may not be null.
>  */
> private static File[] getCachedWordLists(final String locale,
>         final Context context) {
>     final File[] directoryList = getCachedDirectoryList(context);
>     if (null == directoryList) return EMPTY_FILE_ARRAY;
>     final HashMap<String, FileAndMatchLevel> cacheFiles =
>             new HashMap<String, FileAndMatchLevel>();
>     for (File directory : directoryList) {
>         if (!directory.isDirectory()) continue;
>         final String dirLocale = getWordListIdFromFileName(directory.getName());
>         final int matchLevel = LocaleUtils.getMatchLevel(dirLocale, locale);
>         if (LocaleUtils.isMatch(matchLevel)) {
>             final File[] wordLists = directory.listFiles();
>             if (null != wordLists) {
>                 for (File wordList : wordLists) {
>                     final String category = getCategoryFromFileName(wordList.getName());
>                     final FileAndMatchLevel currentBestMatch = cacheFiles.get(category);
>                     if (null == currentBestMatch || currentBestMatch.mMatchLevel < matchLevel) {
>                         cacheFiles.put(category, new FileAndMatchLevel(wordList, matchLevel));
>                     }
>                 }
>             }
>         }
>     }
>     if (cacheFiles.isEmpty()) return EMPTY_FILE_ARRAY;
>     final File[] result = new File[cacheFiles.size()];
>     int index = 0;
>     for (final FileAndMatchLevel entry : cacheFiles.values()) {
>         result[index++] = entry.mFile;
>     }
>     return result;
> }
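Note that the method above returns `File` objects rather than decoded words, so it only tells you which dictionary files would be loaded. To make the selection rule in its doc comment concrete, here is a minimal standalone sketch of "keep the closest locale match per category". It is not the Android code: `matchLevel` is a hypothetical stand-in for `LocaleUtils.getMatchLevel`, and the file names and the `{locale, category, fileName}` layout are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class BestMatchPerCategory {
    /**
     * Hypothetical stand-in for LocaleUtils.getMatchLevel (an assumption, not the real rule):
     * 0 = no match, 1 = language-only match, 2 = exact match.
     */
    static int matchLevel(String candidate, String wanted) {
        if (candidate.equals(wanted)) return 2;            // e.g. en_US vs en_US
        if (wanted.startsWith(candidate + "_")) return 1;  // e.g. en vs en_US
        return 0;
    }

    /**
     * Each entry is {locale, category, fileName}. Returns category -> file name,
     * keeping only the entry with the highest match level in each category.
     */
    static Map<String, String> pickBest(String[][] entries, String wanted) {
        Map<String, String> bestFile = new HashMap<>();
        Map<String, Integer> bestLevel = new HashMap<>();
        for (String[] entry : entries) {
            int level = matchLevel(entry[0], wanted);
            if (level == 0) continue;                      // locale does not match at all
            Integer current = bestLevel.get(entry[1]);
            if (current == null || current < level) {      // same test as in the quoted method
                bestLevel.put(entry[1], level);
                bestFile.put(entry[1], entry[2]);
            }
        }
        return bestFile;
    }

    public static void main(String[] args) {
        String[][] entries = {
            {"en", "main", "main_en.dict"},
            {"en_US", "main", "main_en_us.dict"},
            {"en", "contacts", "contacts_en.dict"},
            {"fr", "main", "main_fr.dict"},
        };
        // One file per category; "main" resolves to the en_US file.
        System.out.println(pickBest(entries, "en_US"));
    }
}
```

The result has one entry per matching category, which mirrors the "as many files as there are categories" remark in the comment.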
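As for how to convert a .dict binary file into a human readable form, I know this isn't exactly what you are looking for, but maybe it will give you a good start. It looks like you may have to write something yourself to do the conversion, like they did Here. They wrote this script to handle the process.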
"Lingoes Converter is a script written in PHP that can convert
.LD2/.LDX dictionaries of Lingoes into human-readable text files. The
script is based on Xiaoyun Zhu analysis (lingoes-extractor) on the
LD2/LDX dictionary format ."
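If you do end up writing your own converter, a reasonable first step is to confirm that your files really are in the format you think they are before trying to parse them. The sketch below only reads the first four bytes of a file as a big-endian integer and prints them as hex, so you can compare the value against whatever magic number the format description or the Android source gives. It assumes the format starts with a four-byte big-endian magic number; the value `0x9BC13AFE` in the usage note is my recollection of the constant in the Android source and should be verified there, not taken as given.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DictHeaderPeek {
    /** Reads the first four bytes as a big-endian int and formats them as hex. */
    static String readMagicHex(InputStream in) throws IOException {
        // DataInputStream.readInt() is big-endian and throws EOFException on short input.
        int magic = new DataInputStream(in).readInt();
        return String.format("0x%08X", magic);
    }

    /** Returns true if the stream starts with the expected magic number (caller-supplied). */
    static boolean startsWithMagic(InputStream in, int expectedMagic) throws IOException {
        return new DataInputStream(in).readInt() == expectedMagic;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DictHeaderPeek path/to/file.dict
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(readMagicHex(in));
        }
    }
}
```

For example, `startsWithMagic(stream, 0x9BC13AFE)` would tell you whether a file begins with that value. If the printed value does not match anything in the format description, the file may be compressed, encrypted, or in a different version of the format, and a decoder written from the wiki page would not work on it as-is.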
I hope some of this at least gives you a start. This is a niche need that definitely deserves a good solution. Hope you figure it out!