Replace column values using a mapping logic in pandas (problem with implementing a function)

Question

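I have a dataframe as shown below. I want to generate another column (freq) whose rows get their values according to the following logic:

If the value in the Mode column starts with the digit m, fill in the number n in the freq column.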
```
- m: 1, n: 12
- m: 6, n: 4
- m: 7, n: 2
- m: 8, n: 1
```

DataFrame

Here is the logic I tried to implement, but somehow it does not seem to work. An alternative solution that does not use my code would be fine as well.

def check_mode(Mode):
    freq = ''
    if (Mode.str.startswith('8')).any(): 
        freq = 1
    elif (Mode.startswith("7")).all():  
        freq = 2
    elif (Mode.startswith("6")).any():  
        freq = 4
    elif (Mode.startswith("1")).any(): 
        freq = 12
    return freq

df['freq']=check_mode(df_ia['Mode'].values)

Some observations

If I use:

if (Mode.str.startswith('8')).any():

I get the error:

AttributeError: 'numpy.ndarray' object has no attribute 'str'

If I use:

if (Mode.startswith('8')).any():

I get:

AttributeError: 'numpy.ndarray' object has no attribute 'startswith'

Any help would be greatly appreciated. Thank you.

Answer 1

Is this what you want?

print(df1)

    Mode
0    602
1    603
2    700
3    100
4    100
5    100
6    802
7    100
8    100
9    100
10   100



import numpy as np

s = df1['Mode'].astype(str)  # cast once so the .str methods work on the integer column
c = [s.str.startswith('8'), s.str.startswith('7'), s.str.startswith('6'), s.str.startswith('1')]
ch = [1, 2, 4, 12]
df1['newcol'] = np.select(c, ch, 0)  # 0 is used where no condition matches

Result

   Mode  newcol
0    602       4
1    603       4
2    700       2
3    100      12
4    100      12
5    100      12
6    802       1
7    100      12
8    100      12
9    100      12
10   100      12

Answer 2

Try np.select

import numpy as np

Mode = df.Mode.astype(str)  # compare as strings
cond1 = Mode.str.startswith('8')
cond2 = Mode.str.startswith("7")
cond3 = Mode.str.startswith("6")
cond4 = Mode.str.startswith("1")
freq = [1, 2, 4, 12]
df['new'] = np.select([cond1, cond2, cond3, cond4], freq)
df
   Mode  new
0   602    4
1   603    4
2   700    2
3   100   12
4   100   12
5   100   12
6   802    1
7   100   12
8   100   12
9   100   12
10  100   12
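
One detail worth knowing about np.select: rows that match none of the conditions receive the default value, which is 0 unless you pass something else. A minimal sketch, assuming a made-up frame in which 900 is not covered by any rule (the column names new and new_flagged are only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 900 does not start with 8, 7, 6 or 1.
df = pd.DataFrame({"Mode": [602, 700, 802, 100, 900]})
mode = df["Mode"].astype(str)

conditions = [mode.str.startswith(p) for p in ("8", "7", "6", "1")]
choices = [1, 2, 4, 12]

# Without an explicit default, unmatched rows are filled with 0.
df["new"] = np.select(conditions, choices)

# An explicit sentinel makes unmatched rows easier to spot.
df["new_flagged"] = np.select(conditions, choices, default=-1)
```

If 0 could be confused with a real frequency in your data, passing a sentinel such as -1 is probably the safer choice.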

Answer 3

'startswith' is a pandas method, reached through the .str accessor of a Series. You are passing a numpy array to check_mode(), and that is the reason for the error below:

AttributeError: 'numpy.ndarray' object has no attribute 'str'

To avoid this problem, pass a pandas Series instead, like this:

df['freq']=check_mode(df_ia['Mode'])

Note: keep in mind that a Series object has no 'startswith' of its own, so you need to use str.startswith, and the data also needs to be of string type.
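
There is a second issue in the original function: .any() and .all() collapse the whole column into a single True/False, so check_mode returns one scalar and every row would end up with the same freq. A minimal sketch of one way around that, keeping the if/elif shape but applying it to one value at a time (the sample frame is made up, and returning None for unmatched values is my assumption):

```python
import pandas as pd


def check_mode(mode):
    """Map a single Mode value to its freq by its first digit."""
    text = str(mode)  # works for both int and str inputs
    if text.startswith("8"):
        return 1
    elif text.startswith("7"):
        return 2
    elif text.startswith("6"):
        return 4
    elif text.startswith("1"):
        return 12
    return None  # assumption: no rule applies, leave the value missing


# Hypothetical sample frame standing in for df_ia from the question.
df_ia = pd.DataFrame({"Mode": [602, 603, 700, 100, 802]})

# Series.apply calls check_mode once per row, so each row gets its own value.
df_ia["freq"] = df_ia["Mode"].apply(check_mode)
```

This runs a Python function per row, so on large frames a vectorized approach such as np.select will likely be faster.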

Answer 4

Try this. A one-liner.

df['freq'] = df.Mode.astype(str).str.get(0).replace({'8': 1, '7': 2, '6': 4, '1': 12})

Now let's unpack what it does:

# You can run this cell and check the result as well

(df.Mode.astype(str)   # convert the "Mode" column to the str data type
   .str.get(0)         # use the .str accessor to take the first character (`.get(0)`)
   # replace each leading digit using a dictionary that
   # maps it to its replacement value
   .replace({'8': 1, '7': 2, '6': 4, '1': 12}))

Code

import pandas as pd

df = pd.DataFrame([602, 603, 700, 100, 100, 100, 802, 100, 100, 100, 100], columns=['Mode'])
df['freq'] = df.Mode.astype(str).str.get(0).replace({'8': 1, '7': 2, '6': 4, '1': 12})
df

## Output
#     Mode  freq
# 0    602     4
# 1    603     4
# 2    700     2
# 3    100    12
# 4    100    12
# 5    100    12
# 6    802     1
# 7    100    12
# 8    100    12
# 9    100    12
# 10   100    12
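
A possible variation on the one-liner: .replace leaves any value that is not a key in the dictionary unchanged, so a Mode starting with, say, 9 would keep the string '9' in the freq column. Series.map turns such values into NaN instead, which may be easier to detect. A minimal sketch, assuming a made-up frame that contains one value (900) with no rule:

```python
import pandas as pd

# Hypothetical sample: 900 has no entry in the mapping.
df = pd.DataFrame({"Mode": [602, 603, 700, 100, 802, 900]})

first_digit = df["Mode"].astype(str).str[0]  # leading character of each value
mapping = {"8": 1, "7": 2, "6": 4, "1": 12}

# Series.map looks each value up in the dict; values without a key become NaN.
df["freq"] = first_digit.map(mapping)
```

Because of the NaN, the resulting column is floating point here; if every Mode is covered by the mapping, you can cast it back with astype(int).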

python

numpy

dataframe

pandas

numpy-ndarray
