PowerShell web scraping doesn't work in background

I wrote a script that works fine:

# Use Internet Explorer
$ie = New-Object -ComObject 'internetExplorer.Application'
$ie.Visible= $true # Make it visible
# Set Credentials
$username="name.surname@mail.com"
$password="password"
#Navigate to URL
$ie.Navigate("https://service.post.ch/zopa/dlc/app/?service=dlc-web&inMobileApp=false&inIframe=false&lang=fr#!/main")
While ($ie.Busy -eq $true) {Start-Sleep -Seconds 3;}
# Login 
$usernamefield = $ie.document.getElementByID('isiwebuserid')
$usernamefield.value = "$username"
$passwordfield = $ie.document.getElementByID('isiwebpasswd')
$passwordfield.value = "$password"
$Link = $ie.document.getElementByID('actionLogin')
$Link.click()
Start-Sleep -seconds 5
# Find file to download
$link = $ie.Document.getElementsByTagName('A') | where-object {$_.innerText -like 'post_adressdaten*'}
$link.click()
Start-Sleep -seconds 3
# Press "Alt + s" on the download dialog  
Add-Type -AssemblyName System.Windows.Forms
[System.Windows.Forms.SendKeys]::SendWait("%s")
Start-Sleep -seconds 3
# Quit Internet Explorer
$ie.Quit()

But if I change $ie.Visible= $true to $ie.Visible= $false, the script no longer works.

Why?

Because of these two lines:

Add-Type -AssemblyName System.Windows.Forms
[System.Windows.Forms.SendKeys]::SendWait("%s")

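These two lines handle Internet Explorer's download dialog, and if the browser is running in the background, the script cannot click it.

How can I send input in the background, or how can I keep Internet Explorer always on top?

The closest thing I have found is this, and it is ugly as hell: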

# Start Internet Explorer on top
Add-Type -TypeDefinition @"
    using System;
    using System.Runtime.InteropServices;

    public class Win32SetWindow {
        [DllImport("user32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        public static extern bool SetForegroundWindow(IntPtr hWnd);
    }
"@

$ie = new-object -comobject InternetExplorer.Application;
$ie.visible = $true;

[Win32SetWindow]::SetForegroundWindow($ie.HWND) # <-- Bring the Internet Explorer window to the front

# Set Credentials
$username="name.surname@mail.com"
$password="password"
#Navigate to URL
$ie.Navigate("https://service.post.ch/zopa/dlc/app/?service=dlc-web&inMobileApp=false&inIframe=false&lang=fr#!/main")
While ($ie.Busy -eq $true) {Start-Sleep -Seconds 3;}
# Login 
$usernamefield = $ie.document.getElementByID('isiwebuserid')
$usernamefield.value = "$username"
$passwordfield = $ie.document.getElementByID('isiwebpasswd')
$passwordfield.value = "$password"
$Link = $ie.document.getElementByID('actionLogin')
$Link.click()
Start-Sleep -seconds 5
# Find file to download
$link = $ie.Document.getElementsByTagName('A') | where-object {$_.innerText -like 'post_adressdaten*'}
$link.click()
Start-Sleep -seconds 3
# Send "Alt + n", Tab, Enter to the download dialog (or send "Alt + s" instead)
Add-Type -AssemblyName System.Windows.Forms
[System.Windows.Forms.SendKeys]::SendWait("%n{TAB}{ENTER}")     # or use SendWait("%s")
Start-Sleep -seconds 3
# Quit Internet Explorer
$ie.Quit()

As your own answer implies, in order to send keystrokes to an application with [System.Windows.Forms.SendKeys]::SendWait(), the application must have a window that is (a) visible and (b) has the (input) focus.
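The two conditions can be checked before any keystrokes are sent. The following is only a sketch: the helper names Test-CanReceiveKeys and Test-IECanReceiveKeys and the wrapper class Win32Foreground are hypothetical, and it assumes that the window handle reported by $ie.HWND is the one the WinAPI function GetForegroundWindow returns when Internet Explorer has the focus.

```powershell
# Hypothetical helper: pure check of the two conditions,
# (a) the window is visible and (b) it is the foreground window.
function Test-CanReceiveKeys {
    param(
        [bool]   $Visible,
        [IntPtr] $WindowHandle,
        [IntPtr] $ForegroundHandle
    )
    return $Visible -and ($WindowHandle -eq $ForegroundHandle)
}

# Hypothetical helper: apply the check to a live InternetExplorer.Application
# COM object (Windows only).
function Test-IECanReceiveKeys {
    param([Parameter(Mandatory)] $IE)

    # Compile the P/Invoke wrapper only once per session.
    if (-not ('Win32Foreground' -as [type])) {
        Add-Type -TypeDefinition @"
using System;
using System.Runtime.InteropServices;

public class Win32Foreground {
    [DllImport("user32.dll")]
    public static extern IntPtr GetForegroundWindow();
}
"@
    }

    Test-CanReceiveKeys -Visible $IE.Visible `
                        -WindowHandle ([IntPtr] $IE.HWND) `
                        -ForegroundHandle ([Win32Foreground]::GetForegroundWindow())
}
```

Calling Test-IECanReceiveKeys $ie right before SendWait() lets the script fail early with a clear message instead of sending keystrokes to whatever window happens to have the focus.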

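A simpler and faster alternative to the technique shown in your answer (which uses Add-Type to compile C# code on demand that wraps a WinAPI function via a P/Invoke declaration) is the following: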

# Create an Internet Explorer instance and make it visible.
$ie = New-Object -ComObject 'internetExplorer.Application'; $ie.Visible= $true

# Activate it (give it the focus), via its PID (process ID).
(New-Object -ComObject WScript.Shell).AppActivate(
  (Get-Process iexplore | Where-Object MainWindowHandle -eq $ie.hWnd).Id
)
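AppActivate() returns a Boolean that indicates whether the window was activated, so the result can be checked before any keystrokes are sent. Below is a minimal sketch that retries the activation a few times and only then calls SendWait(). The helper names Invoke-UntilTrue and Send-KeysToIE, the attempt count, and the delay are all assumptions for illustration.

```powershell
# Hypothetical helper: run an action repeatedly until it returns $true
# or the attempts run out. Returns $true on success, $false otherwise.
function Invoke-UntilTrue {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,
        [int] $MaxAttempts = 10,
        [int] $DelayMilliseconds = 500
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        if (& $Action) { return $true }
        Start-Sleep -Milliseconds $DelayMilliseconds
    }
    return $false
}

# Hypothetical helper: give the IE window the focus, then send keystrokes.
function Send-KeysToIE {
    param(
        [Parameter(Mandatory)] $IE,             # InternetExplorer.Application COM object
        [Parameter(Mandatory)] [string] $Keys   # SendKeys syntax, e.g. '%s' for Alt+S
    )
    Add-Type -AssemblyName System.Windows.Forms
    $shell = New-Object -ComObject WScript.Shell

    # Find the process that owns the IE window, as in the snippet above.
    $iePid = (Get-Process iexplore | Where-Object MainWindowHandle -eq $IE.HWND).Id

    # Retry the activation, because the window may not be ready immediately.
    $activated = Invoke-UntilTrue -Action { $shell.AppActivate($iePid) }
    if (-not $activated) {
        throw 'Could not give the Internet Explorer window the focus.'
    }
    [System.Windows.Forms.SendKeys]::SendWait($Keys)
}
```

With these helpers, the download step of the original script would become Send-KeysToIE -IE $ie -Keys '%s'. This narrows the window for focus problems but does not remove it: the user can still click elsewhere between the activation and the keystrokes.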

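Taking a step back:

  • GUI scripting (automating tasks by simulating user input to a GUI) is inherently unreliable; for example, the user may click away from the window that is expected to have the focus.

  • While there is no built-in solution, it sounds like Selenium offers robust programmatic browser control.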
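To give an idea of what the Selenium route might look like from PowerShell, here is an untested sketch. It rests on several assumptions that are not part of the answer above: the Selenium .NET bindings (WebDriver.dll, plus any dependencies it needs) and a matching chromedriver.exe have been downloaded into one folder, Chrome is installed, and the page behaves the same in Chrome as in Internet Explorer. The function name Save-FileViaSelenium and all of its parameters are hypothetical. The element IDs and the URL are taken from the question's script.

```powershell
# Hypothetical function: log in and click the download link via Selenium,
# without simulating keystrokes.
function Save-FileViaSelenium {
    param(
        [Parameter(Mandatory)] [string] $SeleniumDirectory,  # assumed to contain WebDriver.dll and chromedriver.exe
        [Parameter(Mandatory)] [string] $DownloadDirectory,
        [Parameter(Mandatory)] [string] $UserName,
        [Parameter(Mandatory)] [string] $Password
    )
    # Load the Selenium .NET bindings.
    Add-Type -Path (Join-Path $SeleniumDirectory 'WebDriver.dll')

    $options = New-Object OpenQA.Selenium.Chrome.ChromeOptions
    # Run without a visible window; whether downloads work in headless
    # mode depends on the Chrome version (assumption to verify).
    $options.AddArgument('--headless')
    $options.AddUserProfilePreference('download.default_directory', $DownloadDirectory)

    $driver = New-Object OpenQA.Selenium.Chrome.ChromeDriver($SeleniumDirectory, $options)
    try {
        $driver.Navigate().GoToUrl('https://service.post.ch/zopa/dlc/app/?service=dlc-web&inMobileApp=false&inIframe=false&lang=fr#!/main')

        # Log in, using the element IDs from the question's script.
        $driver.FindElement([OpenQA.Selenium.By]::Id('isiwebuserid')).SendKeys($UserName)
        $driver.FindElement([OpenQA.Selenium.By]::Id('isiwebpasswd')).SendKeys($Password)
        $driver.FindElement([OpenQA.Selenium.By]::Id('actionLogin')).Click()
        Start-Sleep -Seconds 5

        # Click the first link whose text contains 'post_adressdaten'.
        $driver.FindElement([OpenQA.Selenium.By]::PartialLinkText('post_adressdaten')).Click()

        # Crude fixed wait for the download to finish before closing the browser.
        Start-Sleep -Seconds 10
    }
    finally {
        $driver.Quit()
    }
}
```

Because the browser is driven through the WebDriver protocol rather than through simulated keystrokes, no window has to be visible or focused. The fixed Start-Sleep calls mirror the original script and would be better replaced by explicit waits in real use.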