Having trouble reading in content of url using InputStream
When I run the code below, it prints "!DOCTYPE html". How do I get the full content of the URL, i.e. the HTML?
public static void main(String[] args) throws IOException {
    URL u = new URL("https://www.whitehouse.gov/");
    InputStream ins = u.openStream();
    InputStreamReader isr = new InputStreamReader(ins);
    BufferedReader websiteText = new BufferedReader(isr);
    System.out.println(websiteText.readLine());
}
According to the Java documentation at https://docs.oracle.com/javase/tutorial/networking/urls/readingURL.html: "When you run the program, you should see, scrolling by in your command window, the HTML commands and textual content from the HTML file located at "... Why am I not seeing that?
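You are only reading one line of the text. Each call to readLine() returns a single line.
Try this and you will see that you get two lines: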
System.out.println(websiteText.readLine());
System.out.println(websiteText.readLine());
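Read in a loop to get all of the text.
Your program has no while loop, so it stops after the first line. readLine() returns null once the end of the stream is reached, which is what ends the loop below: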
URL u = new URL("https://www.whitehouse.gov/");
InputStream ins = u.openStream();
InputStreamReader isr = new InputStreamReader(ins);
BufferedReader websiteText = new BufferedReader(isr);
String inputLine;
while ((inputLine = websiteText.readLine()) != null) {
    System.out.println(inputLine);
}
websiteText.close();
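The loop above can also be written with try-with-resources, so the reader is closed even if readLine() throws. The following is a minimal sketch, not the answerer's code: the class name ReadUrlSketch and the helper readAll are hypothetical, the UTF-8 charset is an assumption about the page's encoding, and wrapping IOException in UncheckedIOException is just one possible design choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.StringJoiner;

class ReadUrlSketch {

    // Hypothetical helper: reads every line from the source and joins them with '\n'.
    static String readAll(Reader source) {
        StringJoiner text = new StringJoiner("\n");
        // try-with-resources closes the reader when the block exits, even on an exception
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of stream, which ends the loop
            while ((line = reader.readLine()) != null) {
                text.add(line);
            }
        } catch (IOException e) {
            // Design choice for this sketch: rethrow as an unchecked exception
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        URL u = new URL("https://www.whitehouse.gov/");
        // Assumption: the page is UTF-8 encoded; without an explicit charset,
        // InputStreamReader falls back to the platform default
        System.out.println(readAll(new InputStreamReader(u.openStream(), StandardCharsets.UTF_8)));
    }
}
```

Taking a Reader rather than a URL keeps the helper usable with any character source, for example a StringReader when trying it out without a network connection.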
Since Java 8, BufferedReader has a method named lines(). Its return type is Stream<String>. To read the whole page, you could do this:
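// The one-argument reduce returns Optional<String>, so orElse(null) applies;
// the two-argument form reduce("", ...) returns a plain String and would not compile here.
String htmlText = websiteText.lines()
        .reduce((text, nextLine) -> text + "\n" + nextLine)
        .orElse(null);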
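As an alternative to reduce, the stream from lines() can be collected with Collectors.joining, which avoids building a new intermediate String for every line. This is a sketch under stated assumptions: the class JoinLinesSketch and the helper joinLines are hypothetical names, and a StringReader stands in for the network stream so the example runs offline. Note that for empty input this version yields an empty string rather than null.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.stream.Collectors;

class JoinLinesSketch {

    // Hypothetical helper: joins all remaining lines of the reader with '\n'.
    static String joinLines(BufferedReader reader) {
        return reader.lines().collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Assumption: a StringReader replaces the InputStreamReader over u.openStream()
        BufferedReader websiteText =
                new BufferedReader(new StringReader("<!DOCTYPE html>\n<html>\n</html>"));
        System.out.println(joinLines(websiteText));
    }
}
```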