Right way to use an eval statement in a pandas DataFrame map function
I have a pandas DataFrame in which one column is 'organization'. Each value in that column is a string containing a list:
data['organization'][0]
Out[6] "['loony tunes']"
data['organization'][1]
Out[7] "['the three stooges']"
I want to replace each string with the list it contains. I tried using map, with eval as the mapped function:
data['organization'] = data['organization'].map(eval)
But this is what I get:
Traceback (most recent call last):
  File "C:\Users\xxx\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-3dbc0abf8c2e>", line 1, in <module>
    data['organization'] = data['organization'].map(eval)
  File "C:\Users\xxx\Anaconda3\lib\site-packages\pandas\core\series.py", line 2015, in map
    mapped = map_f(values, arg)
  File "pandas\src\inference.pyx", line 1046, in pandas.lib.map_infer (pandas\lib.c:56983)
TypeError: eval() arg 1 must be a string, bytes or code object
So I resorted to the following block of code, which is extremely inefficient:
for index, line in data['organization'].iteritems():
    print(index)
    if type(line) != str:
        data['organization'][index] = []
    try:
        data['organization'][index] = eval(data['organization'][index])
    except:
        continue
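What am I doing wrong? How can I use eval (or a vectorized implementation) instead of the clunky loop above?
I thought the problem might be that some elements of the pd.Series data['organization'] are not strings, so I implemented the following: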
def is_string(x):
    if type(x) != str:
        x = ''
data['organization'] = data['organization'].map(is_string)
But I still get the same error when I try:
data['organization'] = data['organization'].map(eval)
Thanks in advance.
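Using eval is generally frowned upon because it allows arbitrary Python code to be run, so you should try hard to avoid it.
In this case you don't need to evaluate an expression, you only need to parse the value. That means you can use literal_eval from ast: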
In [11]: s = pd.Series(["['loony tunes']", "['the three stooges']"])
In [12]: from ast import literal_eval
In [13]: s.apply(literal_eval)
Out[13]:
0          [loony tunes]
1    [the three stooges]
dtype: object
In [14]: s.apply(literal_eval)[0] # look, it works!
Out[14]: ['loony tunes']
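The TypeError in the question suggests that the real column also holds non-string values (missing values, for example), and literal_eval will fail on those as well. Note too that the is_string helper in the question has no return statement, so map replaces every element with None, which is why eval still fails afterwards. A minimal sketch of one way to handle this, assuming that non-strings and unparseable strings should become an empty list as in the question's loop; the name parse_list and the sample data are made up for illustration:

```python
from ast import literal_eval

import pandas as pd


def parse_list(value):
    """Parse a string holding a list literal; fall back to [] otherwise.

    Hypothetical helper, not part of pandas or ast.
    """
    # Non-strings (e.g. NaN for missing data) cannot be parsed at all.
    if not isinstance(value, str):
        return []
    try:
        parsed = literal_eval(value)
    except (ValueError, SyntaxError):
        # The string was not a valid Python literal.
        return []
    # Keep only actual lists; other literals are treated as bad input here.
    return parsed if isinstance(parsed, list) else []


# Sample data: two good rows, one missing value, one malformed string.
s = pd.Series(["['loony tunes']", float("nan"), "['the three stooges']", "not a list"])
result = s.apply(parse_list)
print(result.tolist())
```

With a DataFrame like the one in the question, the same helper would be applied as data['organization'] = data['organization'].apply(parse_list).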
From the docs:
ast.literal_eval(node_or_string)
Safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
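To illustrate the restriction described in the docs, here is a small sketch showing that literal_eval accepts a list literal but refuses a string containing a function call, which eval would have executed. The helper name is_safe_literal is hypothetical and exists only for this demonstration:

```python
from ast import literal_eval


def is_safe_literal(text):
    """Return True if text parses as a plain Python literal (hypothetical helper)."""
    try:
        literal_eval(text)
    except (ValueError, SyntaxError):
        # ValueError: valid syntax, but not a pure literal (e.g. a call).
        # SyntaxError: not valid Python at all.
        return False
    return True


print(is_safe_literal("['loony tunes']"))            # a list literal
print(is_safe_literal("__import__('os').getcwd()"))  # a function call
```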