Saving scraped data to file
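I am using Jsoup to scrape data from multiple web pages. How can I save the scraped data to a file without overwriting the data from previously scraped pages?
I have searched Stack Overflow and the Jsoup documentation for a solution.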
int j = 0;
int i = 0;
String URL = ("https://www.ufc.com/athletes/all?gender=All&search=&page="+j);
Document doc = Jsoup.connect(URL).userAgent("mozilla/70.0.1").get();
Elements temp = doc.select("div.c-listing-athlete__text");
for (Element fighterList:temp) {
i++;
System.out.println(i + " " + fighterList.getElementsByClass("c-listing-athlete__name").first().text());
}
j++;
URL = ("https://www.ufc.com/athletes/all?gender=All&search=&page="+j);
doc = Jsoup.connect(URL).userAgent("mozilla/70.0.1").get();
temp = doc.select("div.c-listing-athlete__text");
for (Element fighterList:temp) {
i++;
System.out.println(i + " " + fighterList.getElementsByClass("c-listing-athlete__name").first().text());
}
If you need to save the data from your code, write each name to a file instead of printing it. Take a look at this; it may help:
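import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

int pagesNumber = 10; // number of listing pages to fetch
int count = 0;        // running fighter counter across all pages
// The timestamp in the file name gives every run its own file, so earlier output is not overwritten.
// try-with-resources closes the writer even if a request fails.
try (BufferedWriter out = new BufferedWriter(new FileWriter(System.currentTimeMillis() + "out.txt"))) {
    for (int page = 0; page < pagesNumber; page++) {
        String url = "https://www.ufc.com/athletes/all?gender=All&search=&page=" + page;
        Document doc = Jsoup.connect(url).userAgent("mozilla/70.0.1").get();
        Elements fighters = doc.select("div.c-listing-athlete__text");
        for (Element fighter : fighters) {
            Element name = fighter.getElementsByClass("c-listing-athlete__name").first();
            if (name == null) continue; // skip entries without a name element
            count++;
            out.write(count + " " + name.text());
            out.newLine(); // one fighter per line
        }
    }
} catch (IOException e) { // network or file error
    System.err.println("Error: " + e.getMessage());
}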
Hope this helps :)
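If the goal is to keep adding to a single file across pages or runs, rather than creating a new file each time, opening the file in append mode is one option. The following is a minimal sketch of that idea using only the standard library. The class name `AppendScrapedNames`, the method `appendNames`, the file name `fighters.txt`, and the sample names are all hypothetical; in a real scraper the list would hold the names extracted with Jsoup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

class AppendScrapedNames {

    // Appends one line per name to the file, creating the file if it does not exist.
    // Existing content is kept because the file is opened with APPEND.
    static void appendNames(Path file, List<String> names) throws IOException {
        Files.write(file, names, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("fighters.txt"); // hypothetical output file

        // Placeholder data standing in for the names scraped from two pages.
        appendNames(file, Arrays.asList("Name A", "Name B"));
        appendNames(file, Arrays.asList("Name C"));

        // The second call adds to the file instead of replacing the first batch.
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Running the program again adds the same lines once more, since nothing is truncated. With `FileWriter`, the equivalent is the two-argument constructor `new FileWriter(fileName, true)`, where `true` requests append mode.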