Find the three most common words in a dictionary
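The task is to find the three most common words in a dictionary. I came up with the code below, but for some reason it does not work: when I try to run it in Eclipse, it takes me straight to the debug perspective, even though the compiler reports no errors. After debugging I still cannot find the cause. Can you help me find the problem?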
Exception in thread "main" java.lang.NullPointerException
    at java.util.PriorityQueue.offer(Unknown Source)
    at java.util.PriorityQueue.add(Unknown Source)
    at generalquestions.MostCommonWords.mostCommonStringFinder(MostCommonWords.java:41)
    at generalquestions.MostCommonWords.main(MostCommonWords.java:61)
public static Queue<Integer> mostCommonStringFinder(String document, int k) {
    if (document == null) {
        throw new IllegalArgumentException();
    }
    if (document.isEmpty()) {
        throw new IllegalArgumentException("Document is empty");
    }
    String[] wordHolder = document.split(" ");
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    for (String s : wordHolder) {
        if (!map.containsKey(s)) {
            map.put(s, 1);
        } else {
            int value = map.get(s);
            value++;
            map.put(s, value);
        }
    }
    Queue<Integer> minHeap = new PriorityQueue<>();
    for (int i = 0; i < k; i++) {
        minHeap.add(map.get(i));
    }
    for (int j = k; j < map.size(); j++) {
        if (map.get(j) > minHeap.peek()) {
            minHeap.poll();
            minHeap.add(map.get(j));
        }
    }
    return minHeap;
}
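map is a HashMap<String, Integer>, so its keys are strings, namely the words in the text. map.get(i) will always return null, because the map contains no Integer keys. PriorityQueue does not accept null elements, which is why minHeap.add(map.get(i)) throws the NullPointerException shown in the stack trace.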
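To see the failure in isolation, here is a minimal sketch. The class name NullLookupDemo and the helper addingMissingKeyThrows are made up for illustration and are not part of the original code; only HashMap.get and PriorityQueue.add come from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical demo class, not part of the original question.
public class NullLookupDemo {

    // Looks up an int key in a String-keyed map and tries to add the result to a heap.
    // Returns true if PriorityQueue.add rejected the value with a NullPointerException.
    static boolean addingMissingKeyThrows(Map<String, Integer> counts, int key) {
        Queue<Integer> heap = new PriorityQueue<>();
        // Map.get takes an Object, so the int is autoboxed to an Integer and the call
        // compiles, but no String key can equal an Integer, so the result is null.
        Integer value = counts.get(key);
        try {
            heap.add(value); // PriorityQueue does not permit null elements
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("lorem", 1);
        System.out.println(counts.get(0));                     // prints "null"
        System.out.println(addingMissingKeyThrows(counts, 0)); // prints "true"
    }
}
```

This also explains why the compiler stays silent: Map.get is declared to accept any Object, so passing an int is legal even though it can never match a String key.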
Since you are returning a Queue<Integer>, I assume the expected result is the k highest word counts, so replace everything from Queue<Integer> minHeap onward with:
List<Integer> counts = new ArrayList<>(map.values());
Collections.sort(counts, Collections.reverseOrder());
return counts.subList(0, k);
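and change the return type to List<Integer>. Note that subList(0, k) throws an IndexOutOfBoundsException if the text contains fewer than k distinct words.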
Test
String text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod " +
"tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, " +
"quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo " +
"consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse " +
"cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat " +
"non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
System.out.println(mostCommonStringFinder(text, 3));
Output
[3, 2, 2]
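The title asks for the words themselves rather than their counts. The following is one possible sketch of the min-heap approach the question was attempting, keeping the map entries in the heap so that each word stays attached to its count. The class name TopKWords and method topKWords are illustrative names, not from the original post. The sketch assumes whitespace-separated, case-sensitive tokens with punctuation left attached, and the order among words with equal counts is unspecified.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper class, not part of the original question or answer.
public class TopKWords {

    // Returns up to k words, ordered from most frequent to least frequent.
    static List<String> topKWords(String document, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : document.split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the existing count
            }
        }

        // Min-heap ordered by count: the head is the least frequent entry kept so far.
        PriorityQueue<Map.Entry<String, Integer>> minHeap =
                new PriorityQueue<>(Map.Entry.<String, Integer>comparingByValue());
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            minHeap.add(entry);
            if (minHeap.size() > k) {
                minHeap.poll(); // drop the current least frequent entry
            }
        }

        // Polling yields ascending counts, so reverse to get most frequent first.
        List<String> result = new ArrayList<>();
        while (!minHeap.isEmpty()) {
            result.add(minHeap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topKWords("a b a c b a", 2)); // prints "[a, b]"
    }
}
```

Iterating over entrySet() instead of calling map.get with a loop index avoids the null lookups, and capping the heap at k entries keeps the cost at O(n log k) for n distinct words, compared with O(n log n) for sorting all counts.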