Difference between giving pandas a python iterable vs a pd.Series for column
What are the differences between creating a new DataFrame column from a list and creating it from a pd.Series? For example, from trial and error I have noticed:
import random
import pandas as pd

# (1d) We can also give it a Series, which is quite similar to giving it a list
df['cost1'] = pd.Series([random.choice([1.99, 2.99, 3.99]) for _ in range(len(df))])
df['cost2'] = [random.choice([1.99, 2.99, 3.99]) for _ in range(len(df))]
df['cost3'] = pd.Series([1, 2, 3])  # <== will pad length with `NaN`
df['cost4'] = [1, 2, 3]  # <== this one will fail because not the same size
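Are there any other ways in which passing a pd.Series differs from passing a standard Python list? Can a DataFrame accept any Python iterable, or are there restrictions on what can be passed to it? Finally, is using pd.Series the 'correct' way to add a column, or can it be used interchangeably with other types?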
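List: assigning a list to a DataFrame column requires the list to have the same length as the DataFrame.
pd.Series: on assignment, pandas uses the index as the key. It matches the Series index against the original DataFrame index and fills in the values from the Series that carry the same index label. Labels with no match get NaN.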
df = pd.DataFrame([1, 2, 3], index=[9, 8, 7])
df['New'] = pd.Series([1, 2, 3])
# the default index is a RangeIndex, which runs from 0 to n-1
# since the DataFrame index does not match the Series index, the result is NaN
df
Out[88]:
   0  New
9  1  NaN
8  2  NaN
7  3  NaN
Matching index with a different length:
df['New'] = pd.Series([1, 2], index=[9, 8])
df
Out[90]:
   0  New
9  1  1.0
8  2  2.0
7  3  NaN
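To put the two behaviours side by side, here is a minimal sketch. The column names (`value`, `from_list`, `aligned`, `positional`) are made up for illustration and are not from the question. It also shows one possible way, assuming you want positional rather than label-based assignment, to drop the Series index with `.to_numpy()` before assigning.

```python
import pandas as pd

# Hypothetical frame with a non-default index, as in the example above.
df = pd.DataFrame({"value": [1, 2, 3]}, index=[9, 8, 7])

# A list is assigned by position, so its length must equal len(df).
try:
    df["from_list"] = [1, 2]
    list_error = None
except ValueError as exc:
    list_error = exc

# A Series is aligned on index labels; its default RangeIndex (0, 1, 2)
# shares no labels with df.index (9, 8, 7), so every row becomes NaN.
df["aligned"] = pd.Series([10, 20, 30])

# Converting to a NumPy array discards the index, so the values are
# assigned by position (the length must then match, as with a list).
df["positional"] = pd.Series([10, 20, 30]).to_numpy()

print(df)
```

Which form to use depends on intent: a Series is the safer choice when the labels carry meaning, while a list or array fits when the values are already in row order.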