How do I allow search in Android that also works with accented characters?

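I have implemented a search mechanism in my app so that when I search for a name or email, it displays the strings with matching characters. However, some strings in my list contain accented characters, and they are not found when I search with the plain character that corresponds to the accented one. For example, if I have the string "àngela" and I search for "angela", it won't show that string unless I search with the exact string "àngela".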

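I am trying to get it to work regardless of accents: if I type "à", it should show all strings that contain "à" or "a", and vice versa. Any idea how to go about it? I looked up a bunch of articles online, such as "How to ignore accent in SQLite query (Android)", and tried Normalizer, but it only partially works. If I search for "a", it does show the accented letters along with the plain letter, but if I search with an accented letter, it doesn't show anything.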

Here's my filter code:

 @Override
    public Filter getFilter() {
        return new Filter() {
            @Override
            protected FilterResults performFiltering(CharSequence charSequence) {
                String charString = charSequence.toString();
                if (charString.isEmpty()) {
                    mSearchGuestListResponseListFiltered = mSearchGuestListResponseList;
                } else {
                    List<RegisterGuestList.Guest> filteredList = new ArrayList<>();
                    for (RegisterGuestList.Guest row : mSearchGuestListResponseList) {

                        // name match condition. this might differ depending on your requirement
                        // here we are looking for name or phone number match
                        String firstName = row.getGuestFirstName().toLowerCase();
                        String lastName = row.getGuestLastName().toLowerCase();
                        String name = firstName + " " +lastName;
                        String email = row.getGuestEmail().toLowerCase();
                        if ( name.trim().contains(charString.toLowerCase().trim()) ||
                                email.trim().contains(charString.toLowerCase().trim())){
                            filteredList.add(row);
                            searchText = charString.toLowerCase();
                        }
                    }

                    mSearchGuestListResponseListFiltered = filteredList;
                }

                FilterResults filterResults = new FilterResults();
                filterResults.values = mSearchGuestListResponseListFiltered;
                return filterResults;
            }

            @Override
            protected void publishResults(CharSequence charSequence, FilterResults filterResults) {
                mSearchGuestListResponseListFiltered = (ArrayList<RegisterGuestList.Guest>) filterResults.values;
                notifyDataSetChanged();
            }
        };
    }

Here's the full adapter class if anyone is interested: https://pastebin.com/VxsWWMiS. And here is the corresponding activity view:

searchView.setOnQueryTextListener(new SearchView.OnQueryTextListener() {
            @Override
            public boolean onQueryTextSubmit(String query) {
                mSearchGuestListAdapter.getFilter().filter(query);

                return false;
            }

            @Override
            public boolean onQueryTextChange(String newText) {
                mSearchGuestListAdapter.getFilter().filter(newText);
                mSearchGuestListAdapter.notifyDataSetChanged();
                mSearchGuestListAdapter.setFilter(newText);

                if(mSearchGuestListAdapter.getItemCount() == 0){


                    String sourceString = "No match found for <b>" + newText + "</b> ";
                    mNoMatchTextView.setText(Html.fromHtml(sourceString));
                } else {
                    mEmptyRelativeLayout.setVisibility(View.GONE);
                    mRecyclerView.setVisibility(View.VISIBLE);
                }
                return false;
            }
        });

Happy to share any details if necessary. Also, I randomly get an IndexOutOfBoundsException in the onBind() method while searching (the list uses a RecyclerView):

java.lang.IndexOutOfBoundsException: Index: 7, Size: 0
        at java.util.ArrayList.get(ArrayList.java:437)

Any idea how to fix this?

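In general, I would recommend using a Collator with its strength set to Collator.PRIMARY to compare strings that contain accents and differing case (for example, N vs n and é vs e). Unfortunately, Collator has no contains() function.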

So we will make our own.

private static boolean contains(String source, String target) {
    if (target.length() > source.length()) {
        return false;
    }

    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);

    int end = source.length() - target.length() + 1;

    for (int i = 0; i < end; i++) {
        String sourceSubstring = source.substring(i, i + target.length());

        if (collator.compare(sourceSubstring, target) == 0) {
            return true;
        }
    }

    return false;
}

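This iterates through the source string and checks whether each substring with the same length as the search target is equal to the search target, as far as the Collator is concerned.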

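For example, let's imagine that our source string is "This is a Tèst" and that we are searching for the word "test". This method will iterate through every four-letter substring: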

This
his 
is i
s is
 is 
is a
s a 
 a T
a Tè
 Tès
Tèst

As soon as it finds a match, it returns true. Since the strength is set to Collator.PRIMARY, the collator considers "Tèst" and "test" to be equal, so our method returns true.
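To try the walkthrough above outside of Android, here is a minimal, self-contained sketch in plain Java. The class name CollatorContainsDemo and the pinned Locale.US are assumptions made to keep the example reproducible; the method above uses the default locale instead.

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorContainsDemo {

    // Same sliding-window idea as contains() above. The locale is pinned to
    // Locale.US (an assumption) so the result does not depend on the device.
    static boolean contains(String source, String target) {
        if (target.length() > source.length()) {
            return false;
        }

        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(Collator.PRIMARY); // ignore accent and case differences

        int end = source.length() - target.length() + 1;
        for (int i = 0; i < end; i++) {
            String window = source.substring(i, i + target.length());
            if (collator.compare(window, target) == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Accented letters are written as escapes (U+00E8, U+00E0) to avoid
        // any dependence on the source file encoding.
        System.out.println(contains("This is a T\u00e8st", "test"));   // accented source, plain query
        System.out.println(contains("angela", "\u00e0ngela"));         // plain source, accented query
        System.out.println(contains("This is a T\u00e8st", "toast"));  // no match expected
    }
}
```

Because the comparison is symmetric, this covers both directions asked about in the question: a plain query against accented text and an accented query against plain text.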

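It's quite possible that there are more optimizations to be made to this method, but this should be a reasonable starting point.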
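One caveat worth checking before optimizing: the fixed-width window assumes that an accented letter occupies the same number of chars as its plain counterpart. That holds for precomposed characters such as "à" (U+00E0), but not for decomposed text, where "à" is stored as "a" followed by a combining grave accent (U+0300). A possible workaround, sketched below, is to normalize both strings to NFC with java.text.Normalizer before searching. The class and helper names are hypothetical, Locale.US is pinned only for reproducibility, and NFC only helps where Unicode defines a precomposed form.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

public class NormalizedContainsDemo {

    // Hypothetical helper: composes base letter + combining mark pairs into
    // single precomposed characters where Unicode defines one.
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Sliding-window search as before, but run on NFC-normalized copies so
    // that window lengths line up for accented and plain letters.
    static boolean containsNormalized(String source, String target) {
        String src = toNfc(source);
        String tgt = toNfc(target);
        if (tgt.length() > src.length()) {
            return false;
        }

        Collator collator = Collator.getInstance(Locale.US); // assumption: fixed locale
        collator.setStrength(Collator.PRIMARY);

        int end = src.length() - tgt.length() + 1;
        for (int i = 0; i < end; i++) {
            if (collator.compare(src.substring(i, i + tgt.length()), tgt) == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String decomposed = "a\u0300ngela"; // 'a' + combining grave accent
        System.out.println(decomposed.length());          // one char longer than "angela"
        System.out.println(toNfc(decomposed).length());   // same length as "angela"
        System.out.println(containsNormalized(decomposed, "angela"));
    }
}
```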

Edit: One possible optimization is to leverage collation keys, along with known implementation details of RuleBasedCollator and RuleBasedCollationKey (assuming you have Google's Guava in your project):

private static boolean containsBytes(String source, String target) {
    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);

    byte[] sourceBytes = dropLastFour(collator.getCollationKey(source).toByteArray());
    byte[] targetBytes = dropLastFour(collator.getCollationKey(target).toByteArray());

    return Bytes.indexOf(sourceBytes, targetBytes) >= 0;
}

private static byte[] dropLastFour(byte[] in) {
    return Arrays.copyOf(in, in.length - 4);
}

This is considerably more fragile (and likely doesn't work for all locales), but in my tests it was somewhere between 2x and 10x faster.

Edit: To support highlighting, you should convert contains() into indexOf() and then use that information:

private static int indexOf(String source, String target) {
    if (target.length() > source.length()) {
        return -1;
    }

    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);

    int end = source.length() - target.length() + 1;

    for (int i = 0; i < end; i++) {
        String sourceSubstring = source.substring(i, i + target.length());

        if (collator.compare(sourceSubstring, target) == 0) {
            return i;
        }
    }

    return -1;
}

You could then apply it like this:

String guestWholeName = guest.getGuestFirstName() + " " + guest.getGuestLastName();
int wholeNameIndex = indexOf(guestWholeName, searchText);

if (wholeNameIndex > -1) {
    Timber.d("guest name first : guest.getGuestFirstName() %s", guest.getGuestFirstName());
    Timber.d("guest name last : guest.getGuestLastName() %s", guest.getGuestLastName());

    int endPos = wholeNameIndex + searchText.length();

    Spannable spannable = new SpannableString(guestWholeName);
    Typeface firstNameFont = Typeface.createFromAsset(context.getAssets(), "fonts/Graphik-Semibold.otf");
    spannable.setSpan(new CustomTypefaceSpan("", firstNameFont), wholeNameIndex, endPos, Spannable.SPAN_EXCLUSIVE_EXCLUSIVE);
    Objects.requireNonNull(guestName).setText(spannable);
} else {
    Objects.requireNonNull(guestName).setText(guestWholeName);
}
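
The span positions used above are simply the half-open range from the match index to the match index plus searchText.length(). To check that range without an Android device, here is a small plain-Java sketch. The class name HighlightRangeDemo and the helper highlightRange() are hypothetical, and Locale.US is pinned only to keep the example reproducible.

```java
import java.text.Collator;
import java.util.Locale;

public class HighlightRangeDemo {

    // Same logic as indexOf() above, with the locale fixed (an assumption).
    static int indexOf(String source, String target) {
        if (target.length() > source.length()) {
            return -1;
        }

        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(Collator.PRIMARY);

        int end = source.length() - target.length() + 1;
        for (int i = 0; i < end; i++) {
            if (collator.compare(source.substring(i, i + target.length()), target) == 0) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical helper: returns {start, end} of the first match, with end
    // exclusive (the same convention setSpan() uses), or null if no match.
    static int[] highlightRange(String source, String query) {
        int start = indexOf(source, query);
        return start < 0 ? null : new int[] {start, start + query.length()};
    }

    public static void main(String[] args) {
        String name = "Mar\u00eda Jos\u00e9"; // "María José" with escaped accents
        int[] range = highlightRange(name, "jose");
        if (range != null) {
            // Prints the part of the original name that would be highlighted.
            System.out.println(name.substring(range[0], range[1]));
        }
    }
}
```

The range refers to positions in the original, accented string, so the highlighted text keeps its accents even when the query has none.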