Pandas DataFrame shows cells to be strings, but returns an error when I try to split cells
I have a Pandas DataFrame df with a column df['auc_all'] that contains a tuple with two values (e.g. (0.54, 0.044)).
When I run:
type(df['auc_all'][0])
>>> str
However, when I run:
def convert_str_into_tuple(self, string):
    splitted_tuple = string.split(',')
    value1 = float(splitted_tuple[0][1:])
    value2 = float(splitted_tuple[1][1:-1])
    return (value1, value2)

df['auc_all'] = df['auc_all'].apply(convert_str_into_tuple)
I get the following error:
df = full_df.create_full()
Traceback (most recent call last):
  File "<ipython-input-437-34fc05204bad>", line 18, in create_full
    df['auc_all'] = df['auc_all'].apply(self.convert_str_into_tuple)
  File "C:\Users200016\Anaconda3\lib\site-packages\pandas\core\series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "C:\Users200016\Anaconda3\lib\site-packages\pandas\core\apply.py", line 1043, in apply
    return self.apply_standard()
  File "C:\Users200016\Anaconda3\lib\site-packages\pandas\core\apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas\_libs\lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "<ipython-input-437-34fc05204bad>", line 63, in convert_str_into_tuple
    splitted_tuple = string.split(',')
AttributeError: 'tuple' object has no attribute 'split'
This seems to indicate that the cell contains a tuple.
But:
df['auc'][0][0]
>>> '('
It looks as if the variable's type changes depending on where I use it. Is that really happening?
If your column contains tuples stored as strings, use pd.eval:
df['auc_all'] = pd.eval(df['auc_all'])
Example:
# df = pd.DataFrame({'auc_all': ['(0.54, 0.044)']})
>>> df
         auc_all
0  (0.54, 0.044)
>>> type(df['auc_all'][0])
str

# df['auc_all'] = pd.eval(df['auc_all'])
>>> df
         auc_all
0  [0.54, 0.044]
>>> type(df['auc_all'][0])
list
The downside is that your tuples are converted to lists, but you can use literal_eval from the ast module instead:
# import ast
# df['auc_all'] = df['auc_all'].apply(ast.literal_eval)
>>> df
         auc_all
0  (0.54, 0.044)
>>> type(df['auc_all'][0])
tuple
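
One possible explanation for the AttributeError in the question, which the post itself does not confirm, is that the column holds a mix of types: some cells are still strings while others are already tuples (for example because the conversion ran on the column more than once). Under that assumption, a minimal sketch could first list the types present and then parse only the string cells. The helper name to_tuple and the sample data are hypothetical, made up for illustration.

```python
import ast

import pandas as pd


def to_tuple(value):
    """Parse string cells into tuples; return any other value unchanged."""
    if isinstance(value, str):
        # literal_eval only evaluates Python literals, not arbitrary code
        return ast.literal_eval(value)
    return value


# Hypothetical data: the first cell is a string, the second is already a tuple
df = pd.DataFrame({'auc_all': ['(0.54, 0.044)', (0.61, 0.03)]})

# Collect the Python types that actually occur in the column
types_found = set(map(type, df['auc_all']))

# Convert only the cells that still need converting
df['auc_all'] = df['auc_all'].apply(to_tuple)
```

Because cells that are not strings pass through untouched, running the conversion a second time should not raise the error shown in the traceback.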