Distinguish digits from letters and create a new column in Python

Question

The DataFrame df looks like this:

    item_id value 
0   101002  1.008665
1   101004  2.025818
2   101005  0.530516
3   101007  0.732918
4   101010  1.787264
... ... ...
621 ZB005   3.464102
622 ZB007   2.345208
623 ZB008   3.464102
624 ZBD002  2.592055
625 ZBD005  2.373321

I want to create a new column newcol based on the first character of the item_id column: if the first character of item_id is a letter, return "alphabet"; if the first character is a digit, return "number".

Expected output:

    item_id value     newcol
0   101002  1.008665  number
1   101004  2.025818  number
2   101005  0.530516  number
3   101007  0.732918  number
4   101010  1.787264  number
... ... ...
621 ZB005   3.464102  alphabet
622 ZB007   2.345208  alphabet
623 ZB008   3.464102  alphabet
624 ZBD002  2.592055  alphabet
625 ZBD005  2.373321  alphabet

I tried:

df['new_component'] = [lambda x: 'alphabet' if x.isalpha() else 'number' for x in df.item_id]

which returned:

    item_id value       new_component
0   101002  1.008665    <function <listcomp>.<lambda> at 0x000002663B04E948>
1   101004  2.025818    <function <listcomp>.<lambda> at 0x000002663B04E828>
2   101005  0.530516    <function <listcomp>.<lambda> at 0x000002663B04EAF8>
3   101007  0.732918    <function <listcomp>.<lambda> at 0x000002663B04EB88>
4   101010  1.787264    <function <listcomp>.<lambda> at 0x000002663B04EC18>
... ... ...

What is wrong with the code? Is there a better way?

Answer 1

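Inside the list comprehension you are creating a list of lambda functions: each lambda is stored as an element and never called, which is why the column contains function objects.

You only need to define the lambda function first, outside the list: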

y=lambda x, ...

Then call it inside the list comprehension.
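The idea can be sketched as follows. This is only an illustration, not the answerer's exact code: the name `classify` and the small sample frame are assumptions made for the example. The sketch also checks only the first character, `x[0]`, because calling `isalpha()` on the whole string returns False for mixed IDs such as "ZB005".

```python
import pandas as pd

# Hypothetical sample data mirroring a few rows from the question
df = pd.DataFrame({
    "item_id": ["101002", "101004", "ZB005", "ZBD002"],
    "value": [1.008665, 2.025818, 3.464102, 2.592055],
})

# Define the function once, outside the list comprehension.
# Only the first character is inspected: "ZB005".isalpha() is False
# for the whole string because it also contains digits.
classify = lambda x: "alphabet" if x[0].isalpha() else "number"

# Call the function on each element instead of storing a new lambda per row
df["newcol"] = [classify(x) for x in df.item_id]
```

A plain `def classify(x): ...` would work the same way and is usually easier to read than a lambda bound to a name.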

Answer 2

First create the column with the default value 'alphabet', then change the rows whose item_id starts with a digit:

df['newcol'] = 'alphabet'
df.loc[df.item_id.str[0].str.isdigit(),'newcol'] = 'number'

If you prefer to do it the way you tried, with a list comprehension, you can write:

df['newcol'] = ['number' if x[0].isdigit() else 'alphabet' for x in df.item_id]

or, equivalently for this data:

df['newcol'] = ['alphabet' if x[0].isalpha() else 'number' for x in df.item_id]

Output:

    item_id     value    newcol
0    101002  1.008665    number
1    101004  2.025818    number
2    101005  0.530516    number
3    101007  0.732918    number
4    101010  1.787264    number
621   ZB005  3.464102  alphabet
622   ZB007  2.345208  alphabet
623   ZB008  3.464102  alphabet
624  ZBD002  2.592055  alphabet
625  ZBD005  2.373321  alphabet
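Since the question is also tagged numpy, another vectorized option is `numpy.where`. The following is a minimal sketch, not part of the original answer; the sample frame and the name `starts_with_digit` are assumptions for illustration, and `astype(str)` is added only as a guard in case item_id was not read as strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring a few rows from the question
df = pd.DataFrame({
    "item_id": ["101002", "101005", "ZB005", "ZBD005"],
    "value": [1.008665, 0.530516, 3.464102, 2.373321],
})

# Boolean Series: True where the first character of item_id is a digit
starts_with_digit = df["item_id"].astype(str).str[0].str.isdigit()

# Pick "number" where the mask is True, "alphabet" otherwise
df["newcol"] = np.where(starts_with_digit, "number", "alphabet")
```

This builds the column in one step instead of assigning a default and then overwriting part of it with `df.loc`.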
