How to web-scrape VirusTotal to get scan results with Python
I want to get the number of malicious detections for a scan using Python and bs4. Here is my code and what I want it to print:
What I want:
8
Code:
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
r = requests.get('https://www.virustotal.com/gui/file/43175f0c9423853dcd38ee0077f1600dace535ed593d46f9f88ef3dda4e84761', headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
item = soup.find('div', class_="positives")
print(item.get_text(strip=True, separator=' '))
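But none of this works. Is there any way to make it work?
The result you see is loaded dynamically via JavaScript. To simulate the Ajax request, you can use the following example: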
import json
import requests

url = "https://www.virustotal.com/ui/files/43175f0c9423853dcd38ee0077f1600dace535ed593d46f9f88ef3dda4e84761"
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
    "X-Tool": "vt-ui-main",
    "X-VT-Anti-Abuse-Header": "MTA3OTM2NjUwMjctWkc5dWRDQmlaU0JsZG1scy0xNjMxMTE3NzQyLjY1",
    "Accept-Ianguage": "en-US,en;q=0.9,es;q=0.8",
}
data = requests.get(url, headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

# print some data:
for k, v in data["data"]["attributes"]["last_analysis_results"].items():
    print("{:<30} {:<30}".format(k, str(v["result"])))
Prints:
Bkav None
Lionic None
MicroWorld-eScan None
VBA32 None
FireEye None
CAT-QuickHeal None
Qihoo-360 None
ALYac None
Cylance None
Zillya None
Paloalto None
Sangfor None
K7AntiVirus None
Alibaba None
K7GW None
Cybereason None
Arcabit None
TrendMicro None
Baidu None
Cyren None
SymantecMobileInsight None
Symantec None
TotalDefense None
APEX Malicious
Avast None
ClamAV None
Kaspersky None
BitDefender None
NANO-Antivirus None
SUPERAntiSpyware None
Rising None
Endgame None
Trustlook None
Emsisoft None
Comodo Heur.Corrupt.PE@1z141z3
F-Secure None
DrWeb None
VIPRE None
Invincea heuristic
McAfee-GW-Edition None
Trapmine malicious.high.ml.score
CMC None
Sophos None
SentinelOne DFI - Suspicious PE
F-Prot W32/Damaged_File.B.gen!Eldorado
Jiangmin None
Webroot W32.Malware.Gen
Avira None
Fortinet None
Antiy-AVL None
Kingsoft None
Microsoft None
ViRobot None
ZoneAlarm None
Avast-Mobile None
TACHYON None
AhnLab-V3 None
Acronis None
McAfee None
MAX None
Ad-Aware None
Malwarebytes None
Zoner None
ESET-NOD32 None
TrendMicro-HouseCall None
Tencent None
Yandex None
Ikarus None
eGambit None
GData None
BitDefenderTheta None
AVG None
Panda None
CrowdStrike win/malicious_confidence_80% (D)
MaxSecure None
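Most rows in that table are None, so it may be handier to keep only the engines that reported something. A minimal sketch, assuming `results` has the same shape as `data["data"]["attributes"]["last_analysis_results"]` above. The helper name `flagged_engines` and the `sample` dict are made up for illustration, with values copied from the output above:

```python
def flagged_engines(results):
    """Return {engine: result} for engines whose "result" is not None."""
    return {
        engine: verdict["result"]
        for engine, verdict in results.items()
        if verdict.get("result") is not None
    }


# Hypothetical sample shaped like last_analysis_results (not a live response).
sample = {
    "Bkav": {"result": None},
    "APEX": {"result": "Malicious"},
    "Webroot": {"result": "W32.Malware.Gen"},
}

for engine, result in flagged_engines(sample).items():
    print("{:<30} {:<30}".format(engine, result))
```

With the real response you would pass `data["data"]["attributes"]["last_analysis_results"]` instead of `sample`.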
Edit: to print the analysis statistics:
print(data["data"]["attributes"]["last_analysis_stats"])
Prints:
{
"harmless": 0,
"type-unsupported": 2,
"suspicious": 0,
"confirmed-timeout": 0,
"timeout": 0,
"failure": 0,
"malicious": 8,
"undetected": 65,
}
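To get just the single number asked for in the question (8), you can read the "malicious" key of those stats. A small sketch, assuming the response keeps the structure shown above. `malicious_count` is a hypothetical helper, and `sample_payload` is hand-written from the printed stats rather than fetched from the site:

```python
def malicious_count(payload):
    """Return the "malicious" counter from a response dict, or None if it is missing."""
    stats = payload.get("data", {}).get("attributes", {}).get("last_analysis_stats", {})
    return stats.get("malicious")


# Hand-written sample mirroring the stats printed above (not a live response).
sample_payload = {
    "data": {"attributes": {"last_analysis_stats": {"malicious": 8, "undetected": 65}}}
}

print(malicious_count(sample_payload))  # 8
```

With the real response this would be `malicious_count(data)`. Returning None instead of raising KeyError is a design choice for the case where the site changes its response format.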