String Radix Sort - StringIndexOutOfBoundsException
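I am writing my own Radix Sort method to sort the words in a string (the big black cat sat on the beautiful brown mat would be sorted as beautiful big black brown cat mat on sat the the). The method takes a list of the individual words (using my own List interface) and reorders the list in place.
This is my method so far: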
public static void stringRadixSort(List<String> list, int letters) {
    List<String>[] buckets = (List<String>[]) Array.newInstance(List.class, 26);
    int letterNumber = 1; //Sorts list by 1st letter of each word, then 2nd etc.
    for (int i = 0; i < letters; i++) {
        while (!list.isEmpty()) {
            String word = list.remove(list.first());
            if (word.length() > letters) throw new UnsortableException("The list contains a word that holds more letters than the given maximum number of letters."
                    + "\nMax Letters: " + letters + "\nWord: " + word);
            String letter = word.substring(letterNumber - 1, letterNumber); //EXCEPTION THROWN
            char ch = letter.charAt(0);
            int index = ch - 'a'; //gets index of each letter ('a' = buckets[0], 'z' = buckets[25])
            if (buckets[index] == null) {
                buckets[index] = new LinkedList<String>();
            }
            buckets[index].insertLast(word);
        }
        for (int j = 0; j < buckets.length; j++) {
            if (buckets[j] != null) {
                while (!buckets[j].isEmpty()) {
                    list.insertLast(buckets[j].remove(buckets[j].first()));
                }
            }
        }
        letterNumber++;
    }
}
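The (only, I hope) problem with my method is that, to read each character of a word, I create a single-letter substring of the word. The outer for loop runs letters times (where letters is the maximum length of a word in the list), so once the loop's iteration number exceeds the length of the current word, i.e. letterNumber > word.length(), it tries to create a substring using an index greater than the string's length.
How can I adjust my method so that it only creates substrings for each word until letterNumber == word.length(), and still applies the sorting algorithm to these shorter words, so that "a" comes before "aa"?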
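The failure described above can be reproduced in isolation. The following is a minimal sketch, not part of the original method: the class name SubstringBoundsDemo and the helper throwsForLetter are hypothetical and only wrap the same substring call so that the exception can be observed for a two-letter word.

```java
class SubstringBoundsDemo {
    // Hypothetical helper: returns true if reading the 1-based letter
    // `letterNumber` of `word` with the same substring call as in the
    // question throws StringIndexOutOfBoundsException.
    static boolean throwsForLetter(String word, int letterNumber) {
        try {
            word.substring(letterNumber - 1, letterNumber);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "on" has 2 letters, so letter 2 exists but letter 3 does not.
        System.out.println(throwsForLetter("on", 2)); // prints false
        System.out.println(throwsForLetter("on", 3)); // prints true
    }
}
```

The exception appears as soon as the end index (letterNumber) exceeds word.length(), which is exactly the condition letterNumber > word.length() named in the question.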
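Why don't you replace
    String letter = word.substring(letterNumber - 1, letterNumber);
    char ch = letter.charAt(0);
with
    char ch = word.charAt(letterNumber - 1);
which gives you the char directly? This does not solve the StringIndexOutOfBoundsException, though.
You should of course catch the exception and handle it. It may be a good idea to create an extra bucket for this case: when a word is too short for the current iteration, it is sorted into that bucket. When merging the buckets back into the list, take the elements of this bucket first.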
public static void stringRadixSort(List<String> list, int letters) {
    List<String>[] buckets = (List<String>[]) Array.newInstance(List.class, 27);
    int letterNumber = 1; //Sorts list by 1st letter of each word, then 2nd etc.
    for (int i = 0; i < letters; i++) {
        while (!list.isEmpty()) {
            String word = list.remove(list.first());
            if (word.length() > letters) throw new UnsortableException("The list contains a word that holds more letters than the given maximum number of letters."
                    + "\nMax Letters: " + letters + "\nWord: " + word);
            int index;
            if (word.length() >= letterNumber) { //the word has a letter at (1-based) position letterNumber
                char ch = word.charAt(letterNumber - 1);
                index = ch - 'a' + 1; //gets index of each letter ('a' = buckets[1], 'z' = buckets[26], buckets[0] is for short words)
            } else {
                index = 0; //word is too short for this iteration
            }
            if (buckets[index] == null) {
                buckets[index] = new LinkedList<String>();
            }
            buckets[index].insertLast(word);
        }
        for (int j = 0; j < buckets.length; j++) {
            if (buckets[j] != null) {
                while (!buckets[j].isEmpty()) {
                    list.insertLast(buckets[j].remove(buckets[j].first()));
                }
            }
        }
        letterNumber++;
    }
}
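The bucket choice used in this answer can be isolated into a small function, which makes the boundary case easier to check. This is only a sketch under assumptions: the class name BucketIndex and the method bucketIndex are hypothetical, the position is 0-based, and every word is assumed to contain only the lowercase letters 'a' to 'z'.

```java
class BucketIndex {
    // Hypothetical helper: maps the character at 0-based position `pos`
    // to a bucket in the range 0..26. Bucket 0 is reserved for words
    // that have no character at that position, so they are merged back
    // before every word that does have one.
    static int bucketIndex(String word, int pos) {
        if (pos >= word.length()) {
            return 0; // word is too short for this position
        }
        return word.charAt(pos) - 'a' + 1; // 'a' -> 1, ..., 'z' -> 26
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex("a", 1));   // prints 0 (no second letter)
        System.out.println(bucketIndex("aa", 1));  // prints 1 ('a')
        System.out.println(bucketIndex("zoo", 0)); // prints 26 ('z')
    }
}
```

Because "a" lands in bucket 0 and "aa" in bucket 1 for the second position, "a" is placed before "aa" when the buckets are merged.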
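Simply group the words that are too short to have a character at the current position into an extra group. You also need to sort by the least significant (relevant) character first. The following code uses Java collections instead of whatever data structure you were using: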
public static void stringRadixSort(List<String> list, int letters) {
    if (list.size() <= 1) {
        return;
    }
    // buckets[0] holds words with no character at the current position,
    // buckets[1..26] hold 'a'..'z'
    List<String>[] buckets = new List[27];
    for (int i = 0; i < buckets.length; i++) {
        buckets[i] = new LinkedList<>();
    }
    // find the longest and second-longest word lengths
    int largestLength = -1;
    int secondLargestLength = 0;
    for (String s : list) {
        int length = s.length();
        if (length >= largestLength) {
            secondLargestLength = largestLength;
            largestLength = length;
        } else if (secondLargestLength < length) {
            secondLargestLength = length;
        }
    }
    if (largestLength > letters) {
        throw new IllegalArgumentException("one of the strings is too long");
    }
    // positions after the starting index hold characters of at most one word,
    // so they cannot affect the order and are skipped
    for (int i = secondLargestLength == largestLength ? secondLargestLength-1 : secondLargestLength; i >= 0; i--) {
        for (String word : list) {
            int index = (word.length() <= i) ? 0 : word.charAt(i) - ('a' - 1);
            buckets[index].add(word);
        }
        list.clear();
        for (List<String> lst : buckets) {
            if (lst != null) {
                list.addAll(lst);
                lst.clear();
            }
        }
    }
}
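In all of my attempts I had been sorting the words by the most significant letter first (the first letter of each word), then the next most significant letter, and so on. Of course, Radix Sort relies on sorting by the least significant digit/letter first (the last digit/letter of the number/word). So, instead of iterating through my outer for loop starting at letterNumber = 1 and incrementing it after each iteration, I start at letterNumber = maxWordLength and decrement it after each iteration, so that each iteration compares the next most significant letter.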
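To see why the direction of the outer loop matters, the same stable bucket pass can be applied in both orders to a small input. The sketch below is illustrative only: the class PassOrderDemo and its methods are hypothetical names, it uses java.util lists rather than the custom List interface, and it assumes lowercase 'a' to 'z' words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PassOrderDemo {
    // One stable bucket pass on 0-based position `pos`.
    // Bucket 0 holds words that have no character at that position.
    static List<String> pass(List<String> words, int pos) {
        List<List<String>> buckets = new ArrayList<>();
        for (int b = 0; b < 27; b++) {
            buckets.add(new ArrayList<>());
        }
        for (String w : words) {
            int index = pos < w.length() ? w.charAt(pos) - 'a' + 1 : 0;
            buckets.get(index).add(w);
        }
        List<String> out = new ArrayList<>();
        for (List<String> bucket : buckets) {
            out.addAll(bucket);
        }
        return out;
    }

    // Passes from the first letter to the last (the original attempts).
    static List<String> firstToLast(List<String> words, int maxLen) {
        List<String> result = words;
        for (int pos = 0; pos < maxLen; pos++) {
            result = pass(result, pos);
        }
        return result;
    }

    // Passes from the last letter to the first (least significant first).
    static List<String> lastToFirst(List<String> words, int maxLen) {
        List<String> result = words;
        for (int pos = maxLen - 1; pos >= 0; pos--) {
            result = pass(result, pos);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cat", "big", "on", "black");
        System.out.println(firstToLast(words, 5)); // prints [on, big, cat, black]
        System.out.println(lastToFirst(words, 5)); // prints [big, black, cat, on]
    }
}
```

Going first to last, each later pass overrides the ordering established by the earlier, more significant letters, so the final order is decided by the last position. Going last to first, the most significant letter is handled in the final pass, and the stability of each pass preserves the ordering of the less significant letters among ties.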
@SuppressWarnings("unchecked")
public static void stringRadixSort(List<String> list) {
    List<String>[] buckets = (List<String>[]) Array.newInstance(List.class, 27);
    //Find longest word in list
    int maxWordLength = 0;
    for (String word : list) {
        if (word.length() > maxWordLength) {
            maxWordLength = word.length();
        }
    }
    //Sorts list based on least significant letter (last letter of word) to most significant
    int letterNumber = maxWordLength;
    for (int i = 0; i < maxWordLength; i++) {
        while (!list.isEmpty()) {
            String word = list.remove(list.first());
            int index = 0;
            if (word.length() >= letterNumber) {
                char ch = word.charAt(letterNumber - 1);
                index = ch - 'a' + 1; //gets index of each letter ('a' = buckets[1], 'z' = buckets[26], buckets[0] is for words shorter than 'letterNumber')
            }
            if (buckets[index] == null) {
                buckets[index] = new LinkedList<String>();
            }
            buckets[index].insertLast(word);
        }
        for (int j = 0; j < buckets.length; j++) {
            if (buckets[j] != null) {
                while (!buckets[j].isEmpty()) {
                    list.insertLast(buckets[j].remove(buckets[j].first()));
                }
            }
        }
        letterNumber--;
    }
}
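For readers without the custom List interface, the same least-significant-letter-first approach can be tried on the sentence from the question using java.util collections. This is a rough sketch rather than the method above: the class RadixSortUsage and the method sortWords are hypothetical, the sentence is split on whitespace, and only lowercase 'a' to 'z' words are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RadixSortUsage {
    // Splits the sentence into words and sorts them with passes from the
    // last letter position to the first. Bucket 0 holds words that are
    // too short to have a letter at the current position.
    static List<String> sortWords(String sentence) {
        List<String> words = new ArrayList<>(Arrays.asList(sentence.trim().split("\\s+")));
        int maxLen = 0;
        for (String w : words) {
            maxLen = Math.max(maxLen, w.length());
        }
        for (int pos = maxLen - 1; pos >= 0; pos--) {
            List<List<String>> buckets = new ArrayList<>();
            for (int b = 0; b < 27; b++) {
                buckets.add(new ArrayList<>());
            }
            for (String w : words) {
                int index = pos < w.length() ? w.charAt(pos) - 'a' + 1 : 0;
                buckets.get(index).add(w);
            }
            // Merge the buckets back in order, keeping each bucket's insertion order.
            words.clear();
            for (List<String> bucket : buckets) {
                words.addAll(bucket);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String sentence = "the big black cat sat on the beautiful brown mat";
        // prints: beautiful big black brown cat mat on sat the the
        System.out.println(String.join(" ", sortWords(sentence)));
    }
}
```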