Assigning pandasql output to a new column in a DataFrame
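I am using pandasql to pull data from df1.
Can I assign the output of the query to a new column in df2? I tried df2['grade'] = ps.sqldf(sqlcode, locals()),
but that did not work, which is expected because the query output is not directly a Series. Is there a way to do this? Thanks in advance!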
import pandas as pd
import pandasql as ps

df1 = pd.DataFrame({"min": [10, 10, 21],
                    "max": [20, 20, 30],
                    "grade": ["low", "medium", "high"],
                    "class": ["english", "math", "english"]})
df2 = pd.DataFrame({"score": [15, 16, 25],
                    "class": ["english", "math", "english"]})

sqlcode = '''
select
df1.grade
from df2
inner join df1
on df2.score between df1.min and df1.max and df1.class = df2.class
'''
newdf = ps.sqldf(sqlcode, locals())
newdf
There is no need to assign a new column. You can get the desired output directly by adjusting the SQL query slightly:
select df2.*, df1.grade -- Notice the change
from df2
left join df1 -- Notice the change
on (df2.score between df1.min and df1.max) and (df1.class = df2.class)
   score    class   grade
0     15  english     low
1     16     math  medium
2     25  english    high
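If the goal is still to attach the grade to the existing df2 rather than build a new frame, the same range lookup can be sketched in plain pandas. This is an alternative to the SQL above, not the answerer's method, and it assumes each score falls into at most one min/max range per class. The names pairs and in_range are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "min": [10, 10, 21],
    "max": [20, 20, 30],
    "grade": ["low", "medium", "high"],
    "class": ["english", "math", "english"],
})
df2 = pd.DataFrame({
    "score": [15, 16, 25],
    "class": ["english", "math", "english"],
})

# Pair every df2 row with the df1 rows of the same class. reset_index()
# keeps df2's original row label in a column named "index".
pairs = df2.reset_index().merge(df1, on="class", how="left")

# Keep only the pairs whose score lies inside the inclusive [min, max] range.
in_range = pairs[pairs["score"].between(pairs["min"], pairs["max"])]

# Assigning a Series aligns on the index, so each grade lands on the df2 row
# it was matched from; rows without a match would get NaN.
# Assumption: at most one range matches per row, otherwise the index of
# this Series has duplicates and the assignment raises an error.
df2["grade"] = in_range.set_index("index")["grade"]

print(df2)
```

With the pandasql result from the question, the analogous step would be to select the single column, for example newdf['grade'], before assigning it. That relies on the result rows coming back in df2's order, which SQL does not guarantee without an ORDER BY, so the index-aligned assignment above is the safer pattern.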