How do I allow search in Android that also works with accented characters?
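I implemented a search mechanism in my app so that when I search for a name or email, it shows the strings with matching characters. However, some strings in my list contain accented characters, and searching with the plain character for that accent does not find them. For example, if I have the string "àngela" and I search for "angela", the string is not shown unless I search with the exact string "àngela".
I'm trying to make it work regardless of accents: if I type "à", it should show all strings that contain "à" or "a", and vice versa. Any idea how to go about it? I looked up a bunch of articles online, such as "How to ignore accent in SQLite query (Android)", and tried Normalizer, but it only partially works: if I search for "a", it also shows the accented letters alongside the regular ones, but if I search with an accented letter, it shows nothing.
Here is my filter code: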
@Override
public Filter getFilter() {
    return new Filter() {
        @Override
        protected FilterResults performFiltering(CharSequence charSequence) {
            String charString = charSequence.toString();
            if (charString.isEmpty()) {
                mSearchGuestListResponseListFiltered = mSearchGuestListResponseList;
            } else {
                List<RegisterGuestList.Guest> filteredList = new ArrayList<>();
                for (RegisterGuestList.Guest row : mSearchGuestListResponseList) {
                    // name match condition. this might differ depending on your requirement
                    // here we are looking for name or phone number match
                    String firstName = row.getGuestFirstName().toLowerCase();
                    String lastName = row.getGuestLastName().toLowerCase();
                    String name = firstName + " " + lastName;
                    String email = row.getGuestEmail().toLowerCase();
                    if (name.trim().contains(charString.toLowerCase().trim()) ||
                            email.trim().contains(charString.toLowerCase().trim())) {
                        filteredList.add(row);
                        searchText = charString.toLowerCase();
                    }
                }
                mSearchGuestListResponseListFiltered = filteredList;
            }
            FilterResults filterResults = new FilterResults();
            filterResults.values = mSearchGuestListResponseListFiltered;
            return filterResults;
        }

        @Override
        protected void publishResults(CharSequence charSequence, FilterResults filterResults) {
            mSearchGuestListResponseListFiltered = (ArrayList<RegisterGuestList.Guest>) filterResults.values;
            notifyDataSetChanged();
        }
    };
}
If anyone is interested, here is the full adapter class: https://pastebin.com/VxsWWMiS
Here is the corresponding code in the activity:
searchView.setOnQueryTextListener(new SearchView.OnQueryTextListener() {
    @Override
    public boolean onQueryTextSubmit(String query) {
        mSearchGuestListAdapter.getFilter().filter(query);
        return false;
    }

    @Override
    public boolean onQueryTextChange(String newText) {
        mSearchGuestListAdapter.getFilter().filter(newText);
        mSearchGuestListAdapter.notifyDataSetChanged();
        mSearchGuestListAdapter.setFilter(newText);
        if (mSearchGuestListAdapter.getItemCount() == 0) {
            String sourceString = "No match found for <b>" + newText + "</b> ";
            mNoMatchTextView.setText(Html.fromHtml(sourceString));
        } else {
            mEmptyRelativeLayout.setVisibility(View.GONE);
            mRecyclerView.setVisibility(View.VISIBLE);
        }
        return false;
    }
});
Happy to share any further details if necessary. Also, I randomly get an IndexOutOfBoundsException in the onBind() method while searching (I'm using a RecyclerView list):
java.lang.IndexOutOfBoundsException: Index: 7, Size: 0
at java.util.ArrayList.get(ArrayList.java:437)
Any idea how to go about it?
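In general, I recommend using a Collator with its strength set to Collator.PRIMARY to compare strings that contain accents and different cases (for example, N vs. n and é vs. e). Unfortunately, Collator has no contains() function.
So we'll write our own.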
private static boolean contains(String source, String target) {
    if (target.length() > source.length()) {
        return false;
    }
    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);
    int end = source.length() - target.length() + 1;
    for (int i = 0; i < end; i++) {
        String sourceSubstring = source.substring(i, i + target.length());
        if (collator.compare(sourceSubstring, target) == 0) {
            return true;
        }
    }
    return false;
}
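This iterates over the source string and checks whether each substring with the same length as the search target is equal to the target, as far as the Collator is concerned.
For example, suppose our source string is "This is a Tèst" and we are searching for the word "test". The method will iterate over every four-character substring: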
"This"
"his "
"is i"
"s is"
" is "
"is a"
"s a "
" a T"
"a Tè"
" Tès"
"Tèst"
As soon as a match is found, the method returns true. Because the strength is set to Collator.PRIMARY, the collator considers "Tèst" and "test" equal, so our method returns true.
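To make the effect of the strength setting more concrete, here is a small sketch that compares the same pairs of strings at different strengths. The class name CollatorStrengthDemo and the helper equalAt are hypothetical, and Locale.ENGLISH is pinned only as an assumption so the result does not depend on the default locale.

```java
import java.text.Collator;
import java.util.Locale;

final class CollatorStrengthDemo {

    // Returns true if the collator treats a and b as equal at the given strength.
    // Locale.ENGLISH is an assumption made for reproducibility.
    static boolean equalAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // PRIMARY looks at base letters only, so accents and case are ignored.
        System.out.println("PRIMARY   Tèst/test: " + equalAt(Collator.PRIMARY, "Tèst", "test"));
        // SECONDARY also takes accents into account, but still ignores case.
        System.out.println("SECONDARY Tèst/test: " + equalAt(Collator.SECONDARY, "Tèst", "test"));
        System.out.println("SECONDARY Test/test: " + equalAt(Collator.SECONDARY, "Test", "test"));
        // TERTIARY takes case into account as well.
        System.out.println("TERTIARY  Test/test: " + equalAt(Collator.TERTIARY, "Test", "test"));
    }
}
```

Only PRIMARY gives the "ignore both accents and case" behaviour the question asks for, which is why contains() sets it before comparing.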
It's quite possible to optimize this method further, but this should be a reasonable starting point.
Edit: One possible optimization is to use collation keys together with known implementation details of RuleBasedCollator and RuleBasedCollationKey (assuming you have Google's Guava in your project):
private static boolean containsBytes(String source, String target) {
    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);
    byte[] sourceBytes = dropLastFour(collator.getCollationKey(source).toByteArray());
    byte[] targetBytes = dropLastFour(collator.getCollationKey(target).toByteArray());
    return Bytes.indexOf(sourceBytes, targetBytes) >= 0;
}

private static byte[] dropLastFour(byte[] in) {
    return Arrays.copyOf(in, in.length - 4);
}
This is rather fragile (it may not work for all locales), but in my tests it was 2 to 10 times faster.
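If depending on collation-key internals feels too fragile, a different technique (not the Collator approach above) is to fold accents away with java.text.Normalizer, which the question mentions trying. The partial behaviour described there is what you would see if only the list entries were normalized and the query was not; that is an assumption about the asker's code, so the sketch below simply folds both sides. The names AccentFolding, fold, and containsFolded are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.regex.Pattern;

final class AccentFolding {

    // Matches Unicode combining marks, which is what accents become after NFD decomposition.
    private static final Pattern COMBINING_MARKS = Pattern.compile("\\p{M}+");

    // Decomposes the input, removes the combining marks, and lower-cases the result.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return COMBINING_MARKS.matcher(decomposed).replaceAll("").toLowerCase(Locale.ROOT);
    }

    // Accent- and case-insensitive containment: both strings are folded before comparing.
    static boolean containsFolded(String source, String query) {
        return fold(source).contains(fold(query));
    }
}
```

With both sides folded, containsFolded("àngela", "angela") and containsFolded("angela", "à") both match, so the search works in either direction.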
Edit: To support highlighting, you should convert contains() to indexOf() and then use that information:
private static int indexOf(String source, String target) {
    if (target.length() > source.length()) {
        return -1;
    }
    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);
    int end = source.length() - target.length() + 1;
    for (int i = 0; i < end; i++) {
        String sourceSubstring = source.substring(i, i + target.length());
        if (collator.compare(sourceSubstring, target) == 0) {
            return i;
        }
    }
    return -1;
}
Then you can apply it like this:
String guestWholeName = guest.getGuestFirstName() + " " + guest.getGuestLastName();
int wholeNameIndex = indexOf(guestWholeName, searchText);
if (wholeNameIndex > -1) {
    Timber.d("guest name first : guest.getGuestFirstName() %s", guest.getGuestFirstName());
    Timber.d("guest name last : guest.getGuestLastName() %s", guest.getGuestLastName());
    int endPos = wholeNameIndex + searchText.length();
    Spannable spannable = new SpannableString(guestWholeName);
    Typeface firstNameFont = Typeface.createFromAsset(context.getAssets(), "fonts/Graphik-Semibold.otf");
    spannable.setSpan(new CustomTypefaceSpan("", firstNameFont), wholeNameIndex, endPos, Spannable.SPAN_EXCLUSIVE_EXCLUSIVE);
    Objects.requireNonNull(guestName).setText(spannable);
} else {
    Objects.requireNonNull(guestName).setText(guestWholeName);
}
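To connect this back to the filter in the question, here is a plain-Java sketch of the matching step, written without Android classes so it can be run on its own. GuestNameFilter, filter, and the use of plain name strings in place of RegisterGuestList.Guest are assumptions for illustration, and Locale.ENGLISH is pinned only for reproducibility. In the adapter, the same contains() call would take the place of name.trim().contains(...) inside performFiltering.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class GuestNameFilter {

    // Same sliding-window check as contains() above, but reusing a collator passed in by the caller.
    static boolean contains(Collator collator, String source, String target) {
        if (target.length() > source.length()) {
            return false;
        }
        int end = source.length() - target.length() + 1;
        for (int i = 0; i < end; i++) {
            if (collator.compare(source.substring(i, i + target.length()), target) == 0) {
                return true;
            }
        }
        return false;
    }

    // Returns the names that contain the query, ignoring accents and case.
    // An empty query keeps every name, mirroring the isEmpty() branch in the question.
    static List<String> filter(List<String> names, String query) {
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>(names);
        }
        // Locale.ENGLISH is an assumption; the answer uses the default locale.
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        List<String> matches = new ArrayList<>();
        for (String name : names) {
            if (contains(collator, name, trimmed)) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```

Creating the collator once per filter pass, rather than once per row, avoids repeating the setup for every guest in the list.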