Overly complicated directory iteration structure devastating program continuity and comprehensibility
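I am trying to read in many files from a directory that has an underlying substructure of the form /train, under which there are /atheism, /politics, /science and /sports, each containing many files.
I need to read in all of the words in all of the files to create a global "dictionary", with each word in each file represented once (I am not too worried about stemming or any of that fancy stuff at this point!).
The problem is that the complicated iteration structure I am working with confuses me every time I try to think clearly about what I have to do. How can I simplify and subdue this unwieldy beast?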
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class FileDictCreateur
{
    static String PATH = "/home/Workbench/SUTD/ISTD_50.570/assignments/data/train";

    //the global list of all words across all articles
    static Set<String> GLOBO_DICT = new HashSet<String>();

    public static void main(String[] args) throws IOException
    {
        //each of the different categories
        String[] categories = { "/atheism", "/politics", "/science", "/sports"};

        //cycle through all categories once to populate the global dict
        for(int cycle = 0; cycle <= 3; cycle++)
        {
            String general_data_partition = PATH + categories[cycle];
            File directory = new File( general_data_partition );
            iterateDirectory( directory );
        }
    }

    private static void iterateDirectory(File directory) throws IOException
    {
        for (File file : directory.listFiles())
        {
            if (file.isDirectory())
            {
                //recurse into the subdirectory itself, not the parent again
                iterateDirectory(file);
            }
            else
            {
                System.out.println(file);
                String line;
                //try-with-resources closes the reader once the file is read
                try (BufferedReader br = new BufferedReader(new FileReader( file )))
                {
                    while ((line = br.readLine()) != null)
                    {
                        String[] words = line.split(" ");//those are your words
                        //here is where I will populate that
                        //globo dict
                    }
                }
            }
        }
    }
}
I'm pretty sure you need a user folder after /home. Also, you could use the File(String, String) constructor and a for-each loop. Putting that together, I think you wanted:
static String PATH = "Workbench/SUTD/ISTD_50.570/assignments/data/train";

// the global list of all words across all articles
static Set<String> GLOBO_DICT = new HashSet<String>();

public static void main(String[] args) throws IOException {
    // each of the different categories
    String[] categories = { "/atheism", "/politics", "/science", "/sports" };
    File trainpath = new File(System.getProperty("user.home"), PATH);
    // cycle through all categories once to populate the global dict
    for (String cycle : categories) {
        File directory = new File(trainpath, cycle);
        iterateDirectory(directory);
    }
}
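If the hand-written recursion is the part that keeps getting in the way, one option is to let java.nio walk the tree for you. The following is only a rough sketch, not a drop-in replacement: the class name GlobalDictSketch and the helpers collectWords and wordsIn are made up for illustration, and it assumes Java 8 or later, UTF-8 text files, and that splitting on whitespace is good enough for the "dictionary". Because Files.walk descends into every subdirectory under the root, the category array is no longer needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, used only for this sketch.
class GlobalDictSketch {

    // Walks root recursively and returns every distinct whitespace-separated token.
    static Set<String> collectWords(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .flatMap(GlobalDictSketch::wordsIn)
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    // Reads one file (assumed to be UTF-8) and splits it on runs of whitespace.
    private static Stream<String> wordsIn(Path file) {
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return Arrays.stream(text.split("\\s+")).filter(w -> !w.isEmpty());
        } catch (IOException e) {
            // Streams cannot throw checked exceptions, so wrap and rethrow.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Root directory: first argument if given, otherwise the path from the answer above.
        Path train = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"),
                        "Workbench/SUTD/ISTD_50.570/assignments/data/train");
        if (!Files.isDirectory(train)) {
            System.err.println("Not a directory: " + train);
            return;
        }
        Set<String> dict = collectWords(train);
        System.out.println(dict.size() + " distinct words");
    }
}
```

The set takes care of "each word represented once", so there is no per-category bookkeeping left to get confused by. If you only want the four named categories and nothing else under /train, you could call collectWords once per category directory and addAll the results into GLOBO_DICT instead.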