Pandas Equivalent of Excel Concatenation for Column Values - Python 3

Question

I have a pandas DataFrame df like this:

    A           length
0   648702831   9
1    26533315   8
2   366073121   9
3   354701058   9
4    05708239   8
5   705542215   9
6     1574512   7
7   397015500   9

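Now I need to check the length column and create a new column based on a condition. If length = 9, I need the first five characters of A; if length = 8, I need "0" followed by the first four characters of A, and so on. In other words, for length 8 a "0" has to be prepended.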

For example,

for i in df['length']:
    if i == 9:
       df['new_column'] = df['A'].astype(str).str[0:5]  # to take 5 characters for a df with 10000 rows takes a lot of time
    elif i == 8:
       df['new_column'] = "0" & df['A'].astype(str).str[0:4] ## Need help here

My desired output:

            A       length      new_column
    0   648702831   9           64870
    1    26533315   8           02653
    2   366073121   9           36607
    3   354701058   9           35470
    4    05708239   8           00570
    5   705542215   9           70554
    6     1574512   7           00157
    7   397015500   9           39701

In Excel Power Query, it is done like this:

if Text.Length([length]) = 8
   then "0" & Text.Start([length],4)

How can I do this in Python 3?

Answer 1

IIUC, use zfill together with string slicing:

[x[:5-9+y].zfill(5) for x,y in zip(df.A.astype(str),df.length)]
Out[356]: ['64870', '02653', '36607', '35470', '05708', '70554', '00157', '39701']
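
For anyone who wants to run this end to end, here is a minimal sketch of the same zfill-plus-slicing idea. It assumes A is stored as strings (so the leading zero in row 4 is preserved) and that the full width is 9; the frame is rebuilt by hand from the question's sample, and new_column is the name used in the question. With string storage, row 4 comes out as '00570', matching the desired output. The '05708' in the listing above is what you would get if A had been parsed as integers and lost its leading zero.

```python
import pandas as pd

# Sample data from the question. Keeping A as strings is an assumption made
# here so that the leading zero in "05708239" is not dropped.
df = pd.DataFrame({
    "A": ["648702831", "26533315", "366073121", "354701058",
          "05708239", "705542215", "1574512", "397015500"],
})
df["length"] = df["A"].str.len()

# Keep the first 5 - (9 - length) characters of each value, then left-pad
# the result with zeros back to a width of 5.
df["new_column"] = [
    a[: 5 - (9 - n)].zfill(5) for a, n in zip(df["A"], df["length"])
]

print(df["new_column"].tolist())
```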

Answer 2

Use pad from the str accessor:

df['A'].astype(str).str.pad(9, side='left', fillchar='0').str[:5]

0    64870
1    02653
2    36607
3    35470
4    00570
5    70554
6    00157
7    39701
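
A closely related sketch, in case it helps: left-padding every value to the full 9-character width and then slicing makes the length column unnecessary. This version uses str.zfill instead of str.pad and assumes the full width is always 9; the sample frame is rebuilt by hand from the question.

```python
import pandas as pd

# Sample data from the question, with A held as strings (an assumption).
df = pd.DataFrame({
    "A": ["648702831", "26533315", "366073121", "354701058",
          "05708239", "705542215", "1574512", "397015500"],
})

# Left-pad every value with zeros to 9 characters, then keep the first 5.
# Both steps are vectorized, so no Python-level loop over the rows is needed.
padded = df["A"].astype(str).str.zfill(9)
df["new_column"] = padded.str[:5]

print(df)
```

Because the padding already accounts for short values, the same expression handles length 7, 8, and 9 without any per-length branching.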
