Rewriting column cell values in a dataframe based on when the value changes, without using if statements
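I have a column that contains wrong values. It is supposed to count cycles, but the device the data comes from resets the count after 50, so I am left with, for example, [1,1,1,1,2,2,2,3,3,3,3,...,50,50,50,1,1,1,2,2,2,2,3,3,3,...,50,50,.....,50]
My attempt is below, and I could not even get it to work (for simplicity, I made the data reset after 10 cycles):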
import pandas as pd

data = {'Cyc-Count':[1,1,2,2,2,3,4,5,6,7,7,7,8,9,10,1,1,1,2,3,3,3,3,
                     4,4,5,6,6,6,7,8,8,8,8,9,10]}
df = pd.DataFrame(data)
x=0
count=0
old_value=df.at[x,'Cyc-Count']
for x in range(x,len(df)-1):
    if df.at[x,'Cyc-Count']==df.at[x+1,'Cyc-Count']:
        old_value=df.at[x+1,'Cyc-Count']
        df.at[x+1,'Cyc-Count']=count
    else:
        old_value=df.at[x+1,'Cyc-Count']
        count+=1
        df.at[x+1,'Cyc-Count']=count
I need to solve this, preferably without using if statements.
The desired output for the example above is:
data = {'Cyc-Count':[1,1,2,2,2,3,4,5,6,7,7,7,8,9,10,11,11,11,12,13,13,13,13,
14,14,15,16,16,16,17,18,18,18,18,19,20]}
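Hint: one big problem with my approach is that the value at the last index is hard to update, because when it is compared with index + 1, that row does not even exist.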
IIUC, you want to keep counting whenever the counter decreases.
You can use vectorized code:
s = df['Cyc-Count'].shift()
df['Cyc-Count2'] = (df['Cyc-Count']
                    + s.where(s.gt(df['Cyc-Count']))
                       .fillna(0, downcast='infer')
                       .cumsum()
                    )
Or, to modify the column in place:
s = df['Cyc-Count'].shift()
df['Cyc-Count'] += (s.where(s.gt(df['Cyc-Count']))
                     .fillna(0, downcast='infer').cumsum()
                    )
Output:
Cyc-Count Cyc-Count2
0 1 1
1 1 1
2 1 1
3 1 1
4 2 2
5 2 2
6 2 2
7 3 3
8 3 3
9 3 3
10 3 3
11 4 4
12 5 5
13 5 5
14 5 5
15 1 6
16 1 6
17 1 6
18 2 7
19 2 7
20 2 7
21 2 7
22 3 8
23 3 8
24 3 8
25 4 9
26 5 10
27 5 10
28 1 11
29 2 12
30 2 12
31 3 13
32 4 14
33 5 15
34 5 15
Input used:
l = [1,1,1,1,2,2,2,3,3,3,3,4,5,5,5,1,1,1,2,2,2,2,3,3,3,4,5,5,1,2,2,3,4,5,5]
df = pd.DataFrame({'Cyc-Count': l})
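To check the shift/where/cumsum idea against the 10-cycle data from the question, here is a minimal sketch. The helper name `unwrap_counter` is hypothetical, and `.fillna(0).astype(int)` is swapped in for `fillna(0, downcast='infer')` on the assumption that newer pandas versions may warn about the `downcast` keyword.

```python
import pandas as pd


def unwrap_counter(counts: pd.Series) -> pd.Series:
    """Keep counting across resets (hypothetical helper, not from the answer)."""
    prev = counts.shift()  # previous row's value; NaN for the first row
    # Keep the previous value only where the counter dropped (a reset),
    # treat every other row as 0, then accumulate the offsets.
    offset = prev.where(prev > counts).fillna(0).astype(int).cumsum()
    return counts + offset


data = [1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 10, 1, 1, 1, 2, 3, 3, 3, 3,
        4, 4, 5, 6, 6, 6, 7, 8, 8, 8, 8, 9, 10]
expected = [1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 10, 11, 11, 11, 12, 13,
            13, 13, 13, 14, 14, 15, 16, 16, 16, 17, 18, 18, 18, 18, 19, 20]

df = pd.DataFrame({'Cyc-Count': data})
df['Cyc-Count'] = unwrap_counter(df['Cyc-Count'])
print(df['Cyc-Count'].tolist() == expected)
```

Because each offset is the value seen just before the reset, this does not depend on the reset always happening at the same count (10 here, 50 on the real device), and the last row needs no special handling.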
You can use df.loc
Access a group of rows and columns by label(s) or a boolean array.
Syntax: df.loc[df['column name'] condition, 'column name or the new one'] = 'value if condition is met'
For example:
import pandas as pd
numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,0,0]}
df = pd.DataFrame(numbers,columns=['set_of_numbers'])
print (df)
df.loc[df['set_of_numbers'] == 0, 'set_of_numbers'] = 999
df.loc[df['set_of_numbers'] == 5, 'set_of_numbers'] = 555
print (df)
Before: 'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,0,0]
After: 'set_of_numbers': [1,2,3,4,555,6,7,8,9,10,999,999]
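The example above replaces fixed values, so it does not by itself handle the resetting counter. One way df.loc with a boolean mask might be applied to that problem is sketched below; the column names `offset` and `Cyc-Fixed` and the short sample list are assumptions made for illustration.

```python
import pandas as pd

# Small made-up sample: the counter resets after 3.
df = pd.DataFrame({'Cyc-Count': [1, 1, 2, 3, 3, 1, 2, 2, 3, 1, 2, 3]})

prev = df['Cyc-Count'].shift()   # previous row's value (NaN for row 0)
reset = prev > df['Cyc-Count']   # boolean mask: True where the counter dropped

df['offset'] = 0                                   # no correction by default
df.loc[reset, 'offset'] = prev[reset].astype(int)  # value just before each reset
df['Cyc-Fixed'] = df['Cyc-Count'] + df['offset'].cumsum()

print(df['Cyc-Fixed'].tolist())
# [1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 8, 9]
```

The boolean mask plays the role of the condition in the syntax line above, so no explicit if statement is needed.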