How to web-scrape VirusTotal to get results in Python

Question

I want to get the malicious-detection count of a scan using Python and bs4. Below are my code and the result I would like it to show:

What I want:

8

Code:

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
r = requests.get('https://www.virustotal.com/gui/file/43175f0c9423853dcd38ee0077f1600dace535ed593d46f9f88ef3dda4e84761', headers=headers)

soup = BeautifulSoup(r.content, 'html.parser')
item = soup.find('div', class_="positives")
print(item.get_text(strip=True, separator=' '))

But none of this works. Is there any way I can make it work?

Answer 1

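The results you see on the page are loaded dynamically via JavaScript, so they are not in the HTML that requests downloads and BeautifulSoup finds no matching element. To simulate the Ajax request the page makes, you can use the following example: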

import json
import requests


url = "https://www.virustotal.com/ui/files/43175f0c9423853dcd38ee0077f1600dace535ed593d46f9f88ef3dda4e84761"
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
    "X-Tool": "vt-ui-main",
    "X-VT-Anti-Abuse-Header": "MTA3OTM2NjUwMjctWkc5dWRDQmlaU0JsZG1scy0xNjMxMTE3NzQyLjY1",
    "Accept-Ianguage": "en-US,en;q=0.9,es;q=0.8",
}

data = requests.get(url, headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

# print some data:
for k, v in data["data"]["attributes"]["last_analysis_results"].items():
    print("{:<30} {:<30}".format(k, str(v["result"])))

Prints:

Bkav                           None                          
Lionic                         None                          
MicroWorld-eScan               None                          
VBA32                          None                          
FireEye                        None                          
CAT-QuickHeal                  None                          
Qihoo-360                      None                          
ALYac                          None                          
Cylance                        None                          
Zillya                         None                          
Paloalto                       None                          
Sangfor                        None                          
K7AntiVirus                    None                          
Alibaba                        None                          
K7GW                           None                          
Cybereason                     None                          
Arcabit                        None                          
TrendMicro                     None                          
Baidu                          None                          
Cyren                          None                          
SymantecMobileInsight          None                          
Symantec                       None                          
TotalDefense                   None                          
APEX                           Malicious                     
Avast                          None                          
ClamAV                         None                          
Kaspersky                      None                          
BitDefender                    None                          
NANO-Antivirus                 None                          
SUPERAntiSpyware               None                          
Rising                         None                          
Endgame                        None                          
Trustlook                      None                          
Emsisoft                       None                          
Comodo                         Heur.Corrupt.PE@1z141z3       
F-Secure                       None                          
DrWeb                          None                          
VIPRE                          None                          
Invincea                       heuristic                     
McAfee-GW-Edition              None                          
Trapmine                       malicious.high.ml.score       
CMC                            None                          
Sophos                         None                          
SentinelOne                    DFI - Suspicious PE           
F-Prot                         W32/Damaged_File.B.gen!Eldorado
Jiangmin                       None                          
Webroot                        W32.Malware.Gen               
Avira                          None                          
Fortinet                       None                          
Antiy-AVL                      None                          
Kingsoft                       None                          
Microsoft                      None                          
ViRobot                        None                          
ZoneAlarm                      None                          
Avast-Mobile                   None                          
TACHYON                        None                          
AhnLab-V3                      None                          
Acronis                        None                          
McAfee                         None                          
MAX                            None                          
Ad-Aware                       None                          
Malwarebytes                   None                          
Zoner                          None                          
ESET-NOD32                     None                          
TrendMicro-HouseCall           None                          
Tencent                        None                          
Yandex                         None                          
Ikarus                         None                          
eGambit                        None                          
GData                          None                          
BitDefenderTheta               None                          
AVG                            None                          
Panda                          None                          
CrowdStrike                    win/malicious_confidence_80% (D)
MaxSecure                      None
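
To narrow the listing above to the engines that actually reported something, one option is to drop the entries whose result is None. This is only a minimal sketch: it assumes the same shape as data["data"]["attributes"]["last_analysis_results"], and both the helper name flagged_engines and the small sample dictionary are made up for illustration.

```python
def flagged_engines(results):
    """Return {engine: result} for engines whose "result" is not None."""
    return {
        engine: info["result"]
        for engine, info in results.items()
        if info.get("result") is not None
    }


# Hypothetical sample mimicking the structure of last_analysis_results.
sample = {
    "Bkav": {"result": None},
    "APEX": {"result": "Malicious"},
    "Webroot": {"result": "W32.Malware.Gen"},
}

for engine, result in flagged_engines(sample).items():
    print("{:<30} {:<30}".format(engine, result))
```

With the real response, flagged_engines(data["data"]["attributes"]["last_analysis_results"]) would be the equivalent call.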

EDIT: To print the analysis statistics:

print(data["data"]["attributes"]["last_analysis_stats"])

Prints:

{
    "harmless": 0,
    "type-unsupported": 2,
    "suspicious": 0,
    "confirmed-timeout": 0,
    "timeout": 0,
    "failure": 0,
    "malicious": 8,
    "undetected": 65,
}
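
The single number asked for in the question (8) is the "malicious" entry of the statistics above. A small sketch for pulling it out defensively, so that a response without these keys does not raise KeyError; the helper name malicious_count and the trimmed sample response are assumptions for illustration, not part of the original answer.

```python
def malicious_count(data):
    """Return the "malicious" counter from a response dict, or 0 if absent."""
    stats = (
        data.get("data", {})
        .get("attributes", {})
        .get("last_analysis_stats", {})
    )
    return stats.get("malicious", 0)


# Hypothetical, trimmed-down response with the same nesting as above.
sample_response = {
    "data": {
        "attributes": {
            "last_analysis_stats": {"malicious": 8, "undetected": 65}
        }
    }
}

print(malicious_count(sample_response))  # prints 8
```

With the real response this would be called as malicious_count(data).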

python

beautifulsoup

web-scraping

python-3.9