Can't get desired data from HTML source string in C#
I need to get the currency values from https://www.bcr.ro/en/exchange-rates, but fetching the HTML string with either of these methods:
WebRequest req = HttpWebRequest.Create("https://www.bcr.ro/en/exchange-rates");
req.Method = "GET";
string source;
using (StreamReader reader = new StreamReader(req.GetResponse().GetResponseStream()))
{
source = reader.ReadToEnd();
}
WebClient wc = new WebClient();
string s = wc.DownloadString("https://www.bcr.ro/en/exchange-rates");
returns a strange HTML string that does not contain the desired data:
<!DOCTYPE html>
<html lang="en" class="no-js false_EAM isEmil_false">
<!-- Version: 2.16.7.0 (gportals2m1pvm1-044457035960000082024075) Date: 24.10.2015 18:19:59 -->
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>Exchange rates | BCR</title>
<link rel="shortcut icon" type="image/x-icon" href="https://www.bcr.ro/content/8ea9dd8a/-3b9c-429b-9f72-34e75b7512e3/favicon.ico">
<meta name="author" content="Banca Comerciala Romana (BCR): loans, cards, deposits, Internet Banking, current account">
<meta name="description" content="Banca Comerciala Romana (BCR), a member of Erste Group, is a universal bank serving both retail and corporate clients. ">
<meta name="generator" content="Group Portal - 2.16.7.0"><meta name="keywords" content=" loans, cards, deposits, Internet Banking, current account">
How can I get the desired result?
Try creating a RegEx that matches this tag ('<table class="overview glaze fullsize">') and then grabs everything inside that tag in the HTML page. Then use the result wherever you need it.
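A minimal sketch of that suggestion, assuming the table really is present in the downloaded markup (if the page fills it in with JavaScript, the match will simply fail). The class and method names are made up for illustration, and the non-greedy pattern stops at the first closing </table>, so it would cut a nested table short:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TableRegexSketch
{
    // Returns the markup between the opening tag quoted in the answer and
    // the first closing </table>, or null when the tag is not in the input.
    public static string ExtractTable(string html)
    {
        Match match = Regex.Match(
            html,
            @"<table class=""overview glaze fullsize"">(.*?)</table>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Inline sample input so the sketch runs without network access.
        string sample = "<p>x</p><table class=\"overview glaze fullsize\">"
                      + "<tr><td>1</td></tr></table><p>y</p>";
        Console.WriteLine(ExtractTable(sample) ?? "table not found");
    }
}
```

For anything beyond a quick one-off, an HTML parser is usually more robust than a regular expression.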
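After some quick research, the answer turned out to be simple:
WebRequest and WebClient both request the initial link and return only what is in the page source (what Ctrl+U shows), which does not contain the desired data.
After a quick look in the browser's developer tools (F12), it was clear that the required data is loaded dynamically. So I looked in the Network tab and found the request (data) that returns exactly the needed data as nicely formatted JSON (perfect).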
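A rough sketch of that approach, assuming .NET Core 3.0 or later (for System.Text.Json). The thread does not give the URL of the JSON request or the shape of its response, so the URL is taken from the command line and the helper only lists the top-level property names; the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class ExchangeRateJsonSketch
{
    // Lists the property names of the top-level JSON object, a first step
    // in discovering the (unknown) structure of the response.
    public static List<string> TopLevelKeys(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var keys = new List<string>();
        if (doc.RootElement.ValueKind == JsonValueKind.Object)
        {
            foreach (JsonProperty property in doc.RootElement.EnumerateObject())
                keys.Add(property.Name);
        }
        return keys;
    }

    public static async Task Main(string[] args)
    {
        if (args.Length == 0)
        {
            // The real URL must be copied from the browser's Network tab.
            Console.WriteLine("usage: pass the JSON request URL as the first argument");
            return;
        }
        using var client = new HttpClient();
        string json = await client.GetStringAsync(args[0]);
        foreach (string key in TopLevelKeys(json))
            Console.WriteLine(key);
    }
}
```

Once the structure is known, the rates can be read from the relevant elements with JsonElement.GetProperty or deserialized into a matching class.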