Apply function to create string with multiple columns as argument
I have a dataframe like this:
name . size . type . av_size_type
0 John . 23 . Qapra' . 22
1 Dan . 21 . nuk'neH . 12
2 Monica . 12 . kahless . 15
I want to create a new column containing a sentence, like this:
name . size . type . av_size_type . sentence
0 John . 23 . Qapra' . 22 . "John has size 23, above the average of Qapra' type (22)"
1 Dan . 21 . nuk'neH . 12 . "Dan has size 21, above the average of nuk'neH type (12)"
2 Monica . 12 . kahless . 15 . "Monica has size 12, above the average of kahless type (15)"
It would be something like this:
def func(x):
    string="{0} has size {1}, above the average of {2} type ({3})".format(x[0],x[1],x[2],x[3])
    return string
df['sentence']=df[['name','size','type','av_size_type']].apply(func)
But apparently this syntax doesn't work.
Does anyone have a suggestion for this?
Use a splat to unpack each row:
string = lambda x: "{} has size {}, above the average of {} type ({})".format(*x)
df.assign(sentence=df.apply(string, 1))
name size type av_size_type sentence
0 John 23 Qapra' 22 John has size 23, above the average of Qapra' ...
1 Dan 21 nuk'neH 12 Dan has size 21, above the average of nuk'neH ...
2 Monica 12 kahless 15 Monica has size 12, above the average of kahle...
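For reference, here is a self-contained sketch of the same idea. The positional `1` passed to `apply` above is `axis=1`, which hands the function one row at a time; the attempt in the question omitted it, so `apply` used its default `axis=0` and passed each column instead. The names `template`, `make_sentence`, `cols`, and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["John", "Dan", "Monica"],
    "size": [23, 21, 12],
    "type": ["Qapra'", "nuk'neH", "kahless"],
    "av_size_type": [22, 12, 15],
})

template = "{} has size {}, above the average of {} type ({})"

def make_sentence(row):
    # With axis=1, `row` is a Series holding one row's values in column order.
    return template.format(*row)

# Selecting the columns explicitly fixes the order the placeholders are filled in.
cols = ["name", "size", "type", "av_size_type"]
result = df.assign(sentence=df[cols].apply(make_sentence, axis=1))

print(result["sentence"].tolist())
```

`assign` returns a new dataframe, so `df` itself is left unchanged here.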
If you prefer, you can use dictionary unpacking:
string = lambda x: "{name} has size {size}, above the average of {type} type ({av_size_type})".format(**x)
df.assign(sentence=df.apply(string, 1))
name size type av_size_type sentence
0 John 23 Qapra' 22 John has size 23, above the average of Qapra' ...
1 Dan 21 nuk'neH 12 Dan has size 21, above the average of nuk'neH ...
2 Monica 12 kahless 15 Monica has size 12, above the average of kahle...
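One reason to prefer named placeholders is that they do not depend on column order. The sketch below deliberately builds the dataframe with the columns shuffled to illustrate this; converting each row with `to_dict()` is my own choice to make the mapping explicit, not something the answer requires.

```python
import pandas as pd

# Same data as the question, but with the columns in a different order.
df = pd.DataFrame({
    "av_size_type": [22, 12, 15],
    "type": ["Qapra'", "nuk'neH", "kahless"],
    "size": [23, 21, 12],
    "name": ["John", "Dan", "Monica"],
})

template = "{name} has size {size}, above the average of {type} type ({av_size_type})"

# Each row becomes a plain dict whose keys are the column names,
# so the placeholders are matched by name rather than by position.
sentences = df.apply(lambda row: template.format(**row.to_dict()), axis=1)

print(sentences.tolist())
```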
Use a list comprehension as a fast alternative, since you are forced to iterate anyway:
string = "{0} has size {1}, above the average of {2} type ({3})"
df['sentence'] = [string.format(*r) for r in df.values.tolist()]
df
name size type av_size_type \
0 John 23 Qapra' 22
1 Dan 21 nuk'neH 12
2 Monica 12 kahless 15
sentence
0 John has size 23, above the average of Qapra' ...
1 Dan has size 21, above the average of nuk'neH ...
2 Monica has size 12, above the average of kahle...
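A variation on the list comprehension, sketched under the assumption that the dataframe might hold other columns too: zipping only the columns you need keeps the placeholder order explicit. The `cols` list and the `template` name are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["John", "Dan", "Monica"],
    "size": [23, 21, 12],
    "type": ["Qapra'", "nuk'neH", "kahless"],
    "av_size_type": [22, 12, 15],
})

template = "{0} has size {1}, above the average of {2} type ({3})"
cols = ["name", "size", "type", "av_size_type"]

# zip walks the selected columns in parallel, yielding one tuple per row.
df["sentence"] = [
    template.format(*values) for values in zip(*(df[c] for c in cols))
]

print(df["sentence"].tolist())
```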
You can build the sentence directly with apply:
df['sentence'] = (
    df.apply(lambda x: "{} has size {}, above the average of {} type ({})"
             .format(*x), axis=1)
)
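If you want to reference the columns explicitly, use bracket lookup, because x.name and x.size are built-in Series attributes rather than the columns of the same name:
df['sentence'] = (
    df.apply(lambda x: "{} has size {}, above the average of {} type ({})"
             .format(x['name'], x['size'], x['type'], x['av_size_type']), axis=1)
)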
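One thing to watch when referencing columns by name on a row: two of the column names in this dataframe collide with built-in `Series` attributes. The short check below, using the question's data, shows what attribute access returns for a single row compared with bracket lookup.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["John", "Dan", "Monica"],
    "size": [23, 21, 12],
    "type": ["Qapra'", "nuk'neH", "kahless"],
    "av_size_type": [22, 12, 15],
})

row = df.iloc[1]  # the row for Dan, as a Series

# Attribute access hits Series attributes, not the columns.
print(row.name)  # the row's index label
print(row.size)  # the number of elements in the row

# Bracket lookup always returns the column values.
print(row["name"])
print(row["size"])
```

Bracket lookup is the safer habit whenever a column name could shadow an existing attribute or method.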