Why is the pandas explode method not working on my DataFrame?
Here is my code:
df = pd.read_csv(r'my_path\zzounds.csv')  # raw string so the backslash is not read as an escape
df.head()
variation_type main_image
['yellow', 'orange'] ['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']
I tried this code:
df.explode(['variation_type','main_image'])
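But it returns the original DataFrame.
I believe this is because pandas has trouble exploding multiple columns this way (passing a list of columns to explode requires pandas 1.3 or later).
You can use this code to get the result I believe you are expecting: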
import pandas as pd
data = {
'variation_type' : [['yellow', 'orange']],
'main_image' : [['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']]
}
df = pd.DataFrame(data)
df.apply(pd.Series.explode)
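Note: this only works when all the list fields in a row have the same length.
Since you are reading from a CSV file, you need to convert the stringified lists into real lists first: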
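To make the same-length caveat concrete, here is a minimal sketch that finds rows whose lists differ in length before exploding. The data and the short file names (a.jpg and so on) are hypothetical placeholders, not taken from the question, and skipping the mismatched rows is just one possible way to handle them.

```python
import pandas as pd

# Hypothetical data: the second row has lists of different lengths.
df = pd.DataFrame({
    "variation_type": [["yellow", "orange"], ["red"]],
    "main_image": [["a.jpg", "b.jpg"], ["c.jpg", "d.jpg"]],
})
list_cols = ["variation_type", "main_image"]

# Length of the list in every cell, one column per list column.
lengths = df[list_cols].apply(lambda col: col.map(len))

# A row is safe to explode only if all of its lists share one length.
mismatched = lengths.nunique(axis=1) > 1
print(df[mismatched])

# Explode only the consistent rows (multi-column explode needs pandas >= 1.3).
exploded = df[~mismatched].explode(list_cols)
print(exploded)
```

Rows flagged in mismatched would need to be fixed or padded separately; how to do that depends on what the data is supposed to mean.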
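import numpy as np

# Parse each cell from its string form back into a list; a bare nan maps to np.nan.
# (On pandas 2.1+, DataFrame.map replaces the deprecated applymap.)
df[['variation_type','main_image']] = df[['variation_type','main_image']].applymap(lambda x: pd.eval(x, local_dict={'nan': np.nan}))
# or
df[['variation_type','main_image']] = df[['variation_type','main_image']].applymap(lambda x: eval(x, {'nan': np.nan}))
df = df.explode(['variation_type','main_image'])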
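If the CSV might contain untrusted text, ast.literal_eval from the standard library is a safer way to parse the cells than eval, because it only accepts Python literals. The following is a minimal sketch of that swapped-in approach, assuming the cells hold plain list literals; the in-memory CSV, the short file names, and the parse_list helper are hypothetical and only there to make the example self-contained.

```python
import ast
import io

import pandas as pd

# Simulate the round trip through CSV: lists are written out as text.
original = pd.DataFrame({
    "variation_type": [["yellow", "orange"]],
    "main_image": [["a.jpg", "b.jpg"]],
})
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
df = pd.read_csv(buffer)  # every cell is now a string such as "['yellow', 'orange']"


def parse_list(cell):
    """Turn a stringified list back into a list; leave non-strings (e.g. NaN) alone."""
    if isinstance(cell, str):
        return ast.literal_eval(cell)
    return cell


list_cols = ["variation_type", "main_image"]
for col in list_cols:
    df[col] = df[col].map(parse_list)

# Multi-column explode needs pandas >= 1.3.
exploded = df.explode(list_cols, ignore_index=True)
print(exploded)
```

Unlike the eval variants, this does not handle a bare nan inside a list; such cells would need separate treatment.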
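To narrow down the problem, note that the DataFrame shown in your question does work with explode(). If your values are strings that merely look like lists, then as @Ynjxsjmh suggested, you may need to convert them to list values first.
Sample test code: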
import pandas as pd
df = pd.DataFrame({
'variation_type':[['yellow', 'orange']],
'main_image':[['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']]})
print(df.to_string())
df = df.explode(['variation_type','main_image'])
print(df.to_string())
Input:
variation_type main_image
0 [yellow, orange] [https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg, https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg]
Output:
variation_type main_image
0 yellow https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg
0 orange https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg
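To tell which case you are in, it can help to inspect the Python type of a cell. The sketch below uses hypothetical stand-in values to show that a column of list-like strings comes back from explode() unchanged, which matches the behaviour described in the question.

```python
import pandas as pd

# Hypothetical stand-in for what read_csv produces: strings, not lists.
df_str = pd.DataFrame({
    "variation_type": ["['yellow', 'orange']"],
    "main_image": ["['a.jpg', 'b.jpg']"],
})

# The cell prints like a list but is really a str.
first_cell = df_str.loc[0, "variation_type"]
print(type(first_cell))

# explode() treats a string as a single scalar, so nothing is split.
result = df_str.explode(["variation_type", "main_image"])
print(len(result))
```

If the printed type is str rather than list, converting the column first is the step that is missing.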