Failed to add to a TreeMap that is inside another map (to create an inverted index)
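I am building an inverted index for a list of words in Java. Essentially, for each word it creates a list containing the index of every document the word appears in, together with the word's frequency in that document. The desired output should look like this: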
[word1:[FileNo:frequency],[FileNo:frequency],[FileNo:frequency],word2:[FileNo:frequency],[FileNo:frequency]...etc]
The code is as follows:
package assigenment2;
import java.io.*;
import java.util.*;
public class invertedIndex {
    public static Map<String, Map<Integer,Integer>> wordTodocumentMap;
    public static BufferedReader buffer;
    public static BufferedReader br;
    public static BufferedReader reader;
    public static List<String> files = new ArrayList<String>();
    public static List<String>[] tokens;
    public static void main(String[] args) throws IOException {
        //read the token file and store the token in list
        String tokensPath="/Users/Manal/Documents/workspace/Information Retrieval/tokens.txt";
        int k=0;
        String[] tokens = new String[8500];
        String sCurrentLine;
        try
        {
            FileReader fr=new FileReader(tokensPath);
            BufferedReader br= new BufferedReader(fr);
            while ((sCurrentLine = br.readLine()) != null)
            {
                tokens[k]=sCurrentLine;
                k++;
            }
            System.out.println("the number of token are:"+k+" words");
            br.close();
        }
        catch(Exception ex)
        {System.out.println(ex);}
Up to this point it works fine. I think the problem is in how the nested map is manipulated in the following part:
        TreeMap<Integer,Integer> documentToCount = new TreeMap<Integer,Integer>();
        //read files
        System.out.print("Enter the path of files you want to process:\n");
        Scanner InputPath = new Scanner(System.in);
        String cranfield = InputPath.nextLine();
        File cranfieldFiles = new File(cranfield);
        for (File file: cranfieldFiles.listFiles())
        {
            int fileno = files.indexOf(file.getPath());
            if (fileno == -1) //the current file isn't in the files list
            {
                files.add(file.getPath());// add file to the files list
                fileno = files.size() - 1;//the index of file will start from 0 to size-1
            }
            int frequency = 0;
            BufferedReader reader = new BufferedReader(new FileReader(file));
            for (String line = reader.readLine(); line != null; line = reader.readLine())
            {
                for (String _word : line.split(" "))
                {
                    String word = _word.toLowerCase();
                    if (Arrays.asList(tokens).contains(word))
                        if (wordTodocumentMap.get(word) == null)//check whether word is new word
                        {
                            documentToCount = new TreeMap<Integer,Integer>();
                            wordTodocumentMap.put(word, documentToCount);
                        }
                    documentToCount.put(fileno, frequency+1);//add the location and frequency
                }
            }
        }
        reader.close();
    }
}
The error I get is:
Exception in thread "main" java.lang.NullPointerException
at assigenment2.invertedIndex.main(invertedIndex.java:65)
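You never instantiate wordTodocumentMap, so it always remains null. As a result, the line if (wordTodocumentMap.get(word) == null)//check whether word is new word throws a NullPointerException as soon as .get() is called on the null reference, that is, before there is any value to compare against null. One possible fix is to instantiate the map at its declaration: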
public static Map<String, Map<Integer,Integer>> wordTodocumentMap = new HashMap<>();
There may be other problems in your code, but this should get you further.
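As an illustration of the fix, here is a minimal sketch of building the nested map with the outer map initialized at its declaration. It is not the asker's program: it indexes in-memory strings instead of files, and the names InvertedIndexSketch, addDocument and the sample token set are hypothetical. It also assumes you want a real per-document count; in the posted code frequency is set to 0 and never incremented, so every stored value would be 1.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class InvertedIndexSketch {
    // word -> (file number -> frequency). Created at declaration, so it is never null.
    static final Map<String, Map<Integer, Integer>> wordToDocumentMap = new TreeMap<>();

    // Hypothetical helper: counts every known token of one document.
    static void addDocument(int fileNo, String text, Set<String> tokens) {
        for (String raw : text.split("\\s+")) {
            String word = raw.toLowerCase();
            if (!tokens.contains(word)) {
                continue; // ignore words that are not in the token list
            }
            wordToDocumentMap
                .computeIfAbsent(word, w -> new TreeMap<>()) // inner map created on first sight of the word
                .merge(fileNo, 1, Integer::sum);             // start at 1, otherwise add 1
        }
    }

    public static void main(String[] args) {
        // Sample data, made up for this sketch.
        Set<String> tokens = new HashSet<>(Arrays.asList("flow", "wing"));
        addDocument(0, "Flow over a wing and flow again", tokens);
        addDocument(1, "wing design", tokens);
        System.out.println(wordToDocumentMap); // prints {flow={0=2}, wing={0=1, 1=1}}
    }
}
```

Looking up the inner map through computeIfAbsent on every word also avoids reusing a documentToCount variable left over from a previous word, and keeping the tokens in a HashSet avoids rebuilding a list with Arrays.asList(tokens) for each word.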