HtmlAgilityPack in combination with PowerShell and Windows authentication
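I have a tool called Lansweeper that runs on a local server. I want to scrape a page from it, but it uses Windows authentication.
I use PowerShell as my scripting language.
I mostly use HtmlAgilityPack for scraping, but I have never scraped a page that uses Windows authentication.
Does anyone know how I can pass my credentials to it, so that it opens the page under specific credentials (for example my admin account instead of my normal account)?
(Yes, I could add my normal user to the allowed users in Lansweeper, but that is not the solution I want to use.)
I tried the following, but it does not work.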
[Reflection.Assembly]::LoadFile("C:\Scraping\HtmlAgilityPack\lib\Net45\HtmlAgilityPack.dll")
[HtmlAgilityPack.HtmlWeb]$web = @{}
$webclient = new-object System.Net.WebClient
$username = "user"
$password = "passw0rd-"
$domain = "mydomain"
$webclient.Credentials = new-object System.Net.NetworkCredential($username, $password, $domain)
[HtmlAgilityPack.HtmlDocument]$doc = $web.Load("http://lansweeper:81/user.aspx?username=sam&userdomain=mydomain","","",$webclient.Credentials)
[HtmlAgilityPack.HtmlNodeCollection]$nodes = $doc.DocumentNode.SelectNodes("//body")
I have been looking at the available methods and found two possibilities:
TypeName : HtmlAgilityPack.HtmlWeb
Name : Load
HtmlAgilityPack.HtmlDocument Load(string url),
HtmlAgilityPack.HtmlDocument Load(string url, string proxyHost, int proxyPort, string userId, string password),
HtmlAgilityPack.HtmlDocument Load(string url, string method),
HtmlAgilityPack.HtmlDocument Load(string url, string method, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials)
Name : Get
MemberType : Method
void Get(string url, string path),
void Get(string url, string path, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials),
void Get(string url, string path, string method),
void Get(string url, string path, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials, string method)
But I cannot get either of them to work. Has anyone done this with PowerShell?
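The four-argument Load overload in the listing above expects a System.Net.WebProxy and a System.Net.NetworkCredential, so one thing to try is building both as typed objects before the call. The following is only a minimal sketch: New-ScrapeCredential is a hypothetical helper (not part of HtmlAgilityPack), the empty WebProxy is assumed to mean "no proxy configured", and the final Load call is left commented out because it needs HtmlAgilityPack loaded and has not been tested against Lansweeper.

```powershell
# Hypothetical helper: wraps the user name, password and domain in a NetworkCredential.
function New-ScrapeCredential {
    param(
        [Parameter(Mandatory = $true)][string]$UserName,
        [Parameter(Mandatory = $true)][string]$Password,
        [string]$Domain = ''
    )
    New-Object System.Net.NetworkCredential($UserName, $Password, $Domain)
}

# Placeholder values taken from the question.
$cred  = New-ScrapeCredential -UserName 'user' -Password 'passw0rd-' -Domain 'mydomain'

# A WebProxy object with no address set (assumption: treated as "no proxy").
$proxy = New-Object System.Net.WebProxy

# With HtmlAgilityPack loaded, both objects now have the types named in the overload
# Load(string url, string method, WebProxy proxy, NetworkCredential credentials):
# $doc = $web.Load($url, 'GET', $proxy, $cred)
```

Passing the objects themselves, rather than empty strings or a quoted variable, lets PowerShell pick the overload by parameter type.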
I found a way, and I hope it helps someone in the future.
It was not obvious to figure out, but it is easy once you see it.
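# Load the HtmlAgilityPack assembly; adjust the path to your own copy
[Reflection.Assembly]::LoadFile("C:\temp\HtmlAgilityPack\lib\Net45\HtmlAgilityPack.dll") | Out-Null
[HtmlAgilityPack.HtmlWeb]$web = @{}
$url = "http://lansweeper:81/user.aspx?username=sam&userdomain=mydomain"
$webclient = new-object System.Net.WebClient
$cred = new-object System.Net.NetworkCredential
# NetworkCredential exposes no UseDefaultCredentials property, so this is $null unless strict mode is on
$defaultCredentials = $cred.UseDefaultCredentials
# Read the proxy server configured for the current user, if any
$proxyAddr = (get-itemproperty 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Internet Settings').ProxyServer
$proxy = new-object System.Net.WebProxy
$proxy.Address = $proxyAddr
# Let the proxy use the logged-on user's credentials
$proxy.useDefaultCredentials = $true
# Print the proxy settings for inspection
$proxy
# Overload used: Load(string url, string method, WebProxy proxy, NetworkCredential credentials)
[HtmlAgilityPack.HtmlDocument]$doc = $web.Load($url,"GET","$proxy",$defaultCredentials )
# Select the body element and print it
[HtmlAgilityPack.HtmlNodeCollection]$nodes = $doc.DocumentNode.SelectNodes("//html[1]/body[1]")
$nodes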
<# USER RESOURCES
https://msdn.microsoft.com/en-us/library/system.net.webclient.usedefaultcredentials(v=vs.110).aspx
https://forums.asp.net/t/2027997.aspx?HtmlAgilityPack+Stuck+trying+to+understand+HtmlWeb+Load+NetworkCredential
https://msdn.microsoft.com/en-us/library/system.net.webclient.usedefaultcredentials.aspx
TypeName : HtmlAgilityPack.HtmlWeb
Name : Load
HtmlAgilityPack.HtmlDocument Load(string url, string proxyHost, int proxyPort, string userId, string password),
HtmlAgilityPack.HtmlDocument Load(string url, string method, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials)
#>
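Two details in the script above may be worth checking. System.Net.NetworkCredential has no UseDefaultCredentials property (that property exists on WebClient and on WebProxy), so `$cred.UseDefaultCredentials` evaluates to $null outside strict mode, and `"$proxy"` in quotes is a string rather than the WebProxy object. The sketch below only inspects those values. It also shows `[System.Net.CredentialCache]::DefaultNetworkCredentials`, which does return a NetworkCredential for the current security context. Whether passing that object and the unquoted `$proxy` to Load behaves the same against Lansweeper is an assumption and has not been tested.

```powershell
$cred = New-Object System.Net.NetworkCredential

# Reading a property that does not exist yields $null when strict mode is off.
$fromProperty = $cred.UseDefaultCredentials
$hasProperty  = $null -ne $cred.PSObject.Properties['UseDefaultCredentials']

# A NetworkCredential that stands for the credentials of the current security context.
$default = [System.Net.CredentialCache]::DefaultNetworkCredentials

# Quoting the variable converts the object to its string form.
$proxy       = New-Object System.Net.WebProxy
$proxyAsText = "$proxy"

"UseDefaultCredentials present on NetworkCredential: $hasProperty"
"Value read from the missing property is null: $($null -eq $fromProperty)"
"DefaultNetworkCredentials type: $($default.GetType().FullName)"
"Quoted proxy type: $($proxyAsText.GetType().FullName)"
```

If the explicit form is preferred, the call would look like `$web.Load($url, "GET", $proxy, $default)`, again under the assumption stated above.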