Using Python modules like BeautifulSoup and requests in Visual Studio C#
I am trying to run an external Python script from C# in Visual Studio. The script uses modules such as BeautifulSoup and requests, but I get the following error:
No module named requests
Earlier I got the same error for BeautifulSoup. Adding the following line to my Python script resolved it:
sys.path.append(r"[Path to Python]\Python\Python35-32\Lib\site-packages")
I am using IronPython in Visual Studio 2015. Is there a way to overcome this error? If not, is there any other way to run a Python script (one that uses the modules above) from a C# environment?
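A "No module named ..." error usually means the interpreter that runs the script is not searching the directory where the package was installed. The following is a minimal diagnostic sketch, not part of the original post; the helper name module_location is hypothetical. It prints which interpreter is running and where (if anywhere) each module would be loaded from.

```python
import importlib.util
import sys


def module_location(name):
    """Return where a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter actually executing this script.
print("interpreter:", sys.executable)

# The directories searched for imports, in order.
for path in sys.path:
    print("search path:", path)

# The two third-party modules mentioned in the question.
for name in ("requests", "bs4"):
    print(name, "->", module_location(name))
```

If a module prints None here, that interpreter cannot see it, and either the package needs to be installed for that interpreter or its site-packages directory needs to be on sys.path.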
I tried the solution given by denfromufa, but then ran into the following error.
Here is my Python code:
import sys
import requests
import re
import io
from bs4 import BeautifulSoup
from math import floor
r = requests.get("https://www.google.com/")
data = r.text
soup = BeautifulSoup(data, 'html.parser')
result = []
for item in soup.find_all(attrs={'class': 'something'}):
    for m in item.select('a[href^="something"]'):
        m1 = m['href'].replace("something", "", 1)
        m2 = re.sub(r'&.*$', "", m1)
        m3 = re.sub(r'%3F.*$', "", m2)
        m4 = m3.replace("%2F", "/")
        m5 = m4.replace("%3A", ":")
        result.append(m5)
        result.append(m.get_text())
    for image in item.find_all('img'):
        k1 = re.sub(r'&cfs.*$', "", image['src'])
        k2 = re.sub(r'^https://something.*$', "", k1)
        k3 = re.sub(r'.*url=', "", k2)
        k4 = re.sub(r'%3F.*$', "", k3)
        k5 = k4.replace("%2F", "/")
        k6 = k5.replace("%3A", ":")
        k7 = re.sub(r'.*\.gif', "", k6)
        result.append(k7)
seen = set()
result_final = []
for item in result:
    if item not in seen:
        seen.add(item)
        result_final.append(item)
result_final = list(result_final)
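The link clean-up and de-duplication steps in the script can also be written with the standard library alone. This is a sketch under assumptions, not the asker's code: clean_link and unique_in_order are hypothetical helper names, and urllib.parse.unquote decodes every percent-escape, not only %2F and %3A as the chained replace calls do.

```python
import re
from urllib.parse import unquote


def clean_link(href, prefix):
    """Strip a leading prefix, drop trailing parameters, and percent-decode the rest."""
    link = href.replace(prefix, "", 1)
    link = re.sub(r"&.*$", "", link)    # drop everything from the first '&'
    link = re.sub(r"%3F.*$", "", link)  # drop an encoded '?' and what follows
    return unquote(link)                # decodes %2F, %3A and any other escapes


def unique_in_order(items):
    """Remove duplicates while keeping the first occurrence of each item."""
    return list(dict.fromkeys(items))


# Hypothetical input, shaped like the redirect links the script handles.
raw = "/l.php?u=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D1&h=abc"
print(clean_link(raw, "/l.php?u="))
print(unique_in_order(["a", "b", "a", "c", "b"]))
```

dict.fromkeys keeps insertion order, so it does the same job as the seen set plus result_final list in the script.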
My C# code is as follows:
using (Py.GIL())
{
    dynamic sys = Py.Import("sys");
    dynamic requests = Py.Import("requests");
    dynamic re = Py.Import("re");
    dynamic io = Py.Import("io");
    dynamic BeautifulSoup = Py.Import("bs4");
    dynamic math = Py.Import("math");
    Console.WriteLine(5);
    dynamic r = requests.get("https://www.google.com/");
    dynamic data = r.text;
    dynamic soup = BeautifulSoup.BeautifulSoup(data, "html.parser");
}
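I used
var divExp = new { _class = "something" };
var item = soup.find_all(Py.kw("class", divExp._class));
and this returns results. But when I try to call the select method on the item variable, I get an error saying that the Python object does not contain a definition for 'select':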
item.select("a[href^='https://www.google.com/']");
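One likely reading of this error, offered as an assumption rather than a confirmed diagnosis: find_all returns a list-like ResultSet, while select is defined on individual tags (and on the soup itself). That is why the Python script above calls item.select inside a loop over the results. The sketch below shows the same pattern in Python; the sample HTML and the name links_in_boxes are hypothetical, and it assumes bs4 is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page being scraped.
HTML = """
<div class="box"><a href="https://example.com/a">A</a></div>
<div class="box"><a href="https://example.com/b">B</a><a href="/local">L</a></div>
"""


def links_in_boxes(html, css_class, href_prefix):
    """Collect hrefs that start with href_prefix from every element with css_class."""
    soup = BeautifulSoup(html, "html.parser")
    boxes = soup.find_all(attrs={"class": css_class})  # list-like ResultSet, not a Tag
    selector = 'a[href^="{}"]'.format(href_prefix)
    # select() is called on each Tag in the ResultSet, never on the ResultSet itself.
    return [a["href"] for box in boxes for a in box.select(selector)]


print(links_in_boxes(HTML, "box", "https://example.com/"))
```

On the C# side the equivalent would be to index into item (or iterate over it) and call select on each element, or to call select directly on soup.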
Final answer
// Requires: using Python.Runtime; using System.Text.RegularExpressions;
using (Py.GIL())
{
    dynamic sys = Py.Import("sys");
    dynamic requests = Py.Import("requests");
    dynamic re = Py.Import("re");
    dynamic io = Py.Import("io");
    dynamic BeautifulSoup = Py.Import("bs4");
    dynamic math = Py.Import("math");
    Console.WriteLine(5);
    dynamic r = requests.get(url);
    dynamic data = r.text;
    dynamic soup = BeautifulSoup.BeautifulSoup(data, "html.parser");
    var divExp = new { _class = "className" };
    var item = soup.find_all(Py.kw("class", divExp._class));
    dynamic tag = soup.select("a[href^='https://something.com/']");
    for (var i = 1; i < item.Length(); i++)
    {
        // Extracting the required info using regex
        String input = Convert.ToString(item[i]);
        // Verbatim string: "\/" and "\?" are not valid escapes in a regular C# string literal
        string pattern_link = @"(.*href=""https:[/][/]something.com[/]a.php\?u=)|(&.*)";
        string replacement_link = " ";
        Regex rgx_link = new Regex(pattern_link);
        string result_link = rgx_link.Replace(input, replacement_link);
        .
        .
        .
        .
        string pattern_link_1 = "(http|https)%.*";
        Regex rgx_link_1 = new Regex(pattern_link_1);
        Match result_link_1 = rgx_link_1.Match(result_link);
        String input_1_1 = Convert.ToString(result_link_1.Value);
        result_link_2 = result_link_2.Replace("%2F", "/").Replace("%3A", ":");
    }
}
Why not use the HTML Agility Pack? It is the C# equivalent.
You can import it into your solution.
For external Python scripts, see:
https://www.codeproject.com/articles/121374/step-by-step-guidance-of-calling-iron-python-funct
- Install CPython, one of versions 2.7 or 3.4+
- Run pip install pythonnet
- Reference the installed Python.Runtime.dll in your .NET project
- Follow the tutorial on www.python4.net, embedding section.
```
> scriptcs (ctrl-c to exit or :help for help)
> #r "C:\Python\Anaconda3_64b\Lib\site-packages\Python.Runtime.dll"
> using Python.Runtime;
> dynamic bs4;
> using (Py.GIL()) {bs4=Py.Import("bs4");}
> bs4.__file__.ToString()
C:\Python\Anaconda3_64b\lib\site-packages\bs4\__init__.py
> dynamic rq;
> using (Py.GIL()) {rq=Py.Import("requests");}
> dynamic r=rq.get("https://www.google.com/")
> dynamic soup = bs4.BeautifulSoup(r.text,"html.parser");
> soup.ToString()
```