根据行索引将列中的值更改为 np.nan
Change values in a column to np.nan based upon row index
我想有选择地将列值更改为 np.nan。
我有一列有很多零 (0) 值。
我正在获取总数的一个子集的行索引。
我将索引放入变量 (s0)。
然后我使用它为索引在 s0 中的行将列值设置为 np.nan。
它运行,但它正在将每一行(即整列)更改为 np.nan。
这是我的代码:
print((df3['amount_tsh'] == 0).sum()) # 41639 <-- there are this many zeros to start
# print(df3['amount_tsh'].value_counts()[0])
s0 = df3['amount_tsh'][df3['amount_tsh'].eq(0)].sample(37322).index # grab 37322 row indexes
print(len(s0)) # 37322
df3['amount_tsh'] = df3.loc[df3.index.isin(s0), 'amount_tsh'] = np.nan # change the value in the column to np.nan if it's index is in s0
print(df3['amount_tsh'].isnull().sum())
让我们试试
s0 = df3.loc[df3['amount_tsh'].eq(0), ['amount_tsh']].sample(37322)
df3.loc[df3.index.isin(s0.index), 'amount_tsh'] = np.nan
为了快速修复,我使用了笔记本中的这些数据,它对我有用
import pandas as pd
import numpy as np
data = pd.DataFrame({'Symbol': {0: 'ABNB', 1: 'DKNG', 2: 'EXPE', 3: 'MPNGF', 4: 'RDFN', 5: 'ROKU', 6: 'VIACA', 7: 'Z'},
'Number of Buys': {0: np.nan, 1: 2.0, 2: np.nan, 3: 1.0, 4: 2.0, 5: 1.0, 6: 1.0, 7: np.nan},
'Number of Sell s': {0: 1.0, 1: np.nan, 2: 1.0, 3: np.nan, 4: np.nan, 5: np.nan, 6: np.nan, 7: 1.0},
'Gains/Losses': {0: 2106.0, 1: -1479.2, 2: 1863.18, 3: -1980.0, 4: -1687.7, 5: -1520.52, 6: -1282.4, 7: 1624.59}, 'Percentage change': {0: 0.0, 1: 2.0, 2: 0.0, 3: 0.0, 4: 1.5, 5: 0.0, 6: 0.0, 7: 0.0}})
rows = ['ABNB','DKNG','EXPE']
data
Symbol Number of Buys Number of Sell s Gains/Losses \
0 ABNB NaN 1.0 2106.00
1 DKNG 2.0 NaN -1479.20
2 EXPE NaN 1.0 1863.18
3 MPNGF 1.0 NaN -1980.00
4 RDFN 2.0 NaN -1687.70
5 ROKU 1.0 NaN -1520.52
6 VIACA 1.0 NaN -1282.40
7 Z NaN 1.0 1624.59
Percentage change
0 0.0
1 2.0
2 0.0
3 0.0
4 1.5
5 0.0
6 0.0
7 0.0
按照你的方法
(data['Number of Buys']==1.0).sum()
s0= data.loc[(data['Number of Buys']==1.0),['Number of Buys']].sample(2)
data.loc[data.index.isin(s0.index),'Number of Buys'] =np.nan
Symbol Number of Buys Number of Sell s Gains/Losses \
0 ABNB NaN 1.0 2106.00
1 DKNG 2.0 NaN -1479.20
2 EXPE NaN 1.0 1863.18
3 MPNGF 1.0 NaN -1980.00
4 RDFN 2.0 NaN -1687.70
5 ROKU NaN NaN -1520.52
6 VIACA NaN NaN -1282.40
7 Z NaN 1.0 1624.59
Percentage change
0 0.0
1 2.0
2 0.0
3 0.0
4 1.5
5 0.0
6 0.0
7 0.0
嗯...
我删除了 re-assignment 并且它起作用了??
s0 = df3['amount_tsh'][df3['amount_tsh'].eq(0)].sample(37322).index
df3.loc[df3.index.isin(s0), 'amount_tsh'] = np.nan
第二行是:
df3['amount_tsh'] = df3.loc[df3.index.isin(s0), 'amount_tsh'] = np.nan
我想有选择地将列值更改为 np.nan。
我有一列有很多零 (0) 值。
我正在获取总数的一个子集的行索引。
我将索引放入变量 (s0)。
然后我使用它为索引在 s0 中的行将列值设置为 np.nan。
它运行,但它正在将每一行(即整列)更改为 np.nan。
这是我的代码:
print((df3['amount_tsh'] == 0).sum()) # 41639 <-- there are this many zeros to start
# print(df3['amount_tsh'].value_counts()[0])
s0 = df3['amount_tsh'][df3['amount_tsh'].eq(0)].sample(37322).index # grab 37322 row indexes
print(len(s0)) # 37322
df3['amount_tsh'] = df3.loc[df3.index.isin(s0), 'amount_tsh'] = np.nan # change the value in the column to np.nan if it's index is in s0
print(df3['amount_tsh'].isnull().sum())
让我们试试
s0 = df3.loc[df3['amount_tsh'].eq(0), ['amount_tsh']].sample(37322)
df3.loc[df3.index.isin(s0.index), 'amount_tsh'] = np.nan
为了快速修复,我使用了笔记本中的这些数据,它对我有用
import pandas as pd
import numpy as np
data = pd.DataFrame({'Symbol': {0: 'ABNB', 1: 'DKNG', 2: 'EXPE', 3: 'MPNGF', 4: 'RDFN', 5: 'ROKU', 6: 'VIACA', 7: 'Z'},
'Number of Buys': {0: np.nan, 1: 2.0, 2: np.nan, 3: 1.0, 4: 2.0, 5: 1.0, 6: 1.0, 7: np.nan},
'Number of Sell s': {0: 1.0, 1: np.nan, 2: 1.0, 3: np.nan, 4: np.nan, 5: np.nan, 6: np.nan, 7: 1.0},
'Gains/Losses': {0: 2106.0, 1: -1479.2, 2: 1863.18, 3: -1980.0, 4: -1687.7, 5: -1520.52, 6: -1282.4, 7: 1624.59}, 'Percentage change': {0: 0.0, 1: 2.0, 2: 0.0, 3: 0.0, 4: 1.5, 5: 0.0, 6: 0.0, 7: 0.0}})
rows = ['ABNB','DKNG','EXPE']
data
Symbol Number of Buys Number of Sell s Gains/Losses \
0 ABNB NaN 1.0 2106.00
1 DKNG 2.0 NaN -1479.20
2 EXPE NaN 1.0 1863.18
3 MPNGF 1.0 NaN -1980.00
4 RDFN 2.0 NaN -1687.70
5 ROKU 1.0 NaN -1520.52
6 VIACA 1.0 NaN -1282.40
7 Z NaN 1.0 1624.59
Percentage change
0 0.0
1 2.0
2 0.0
3 0.0
4 1.5
5 0.0
6 0.0
7 0.0
按照你的方法
(data['Number of Buys']==1.0).sum()
s0= data.loc[(data['Number of Buys']==1.0),['Number of Buys']].sample(2)
data.loc[data.index.isin(s0.index),'Number of Buys'] =np.nan
Symbol Number of Buys Number of Sell s Gains/Losses \
0 ABNB NaN 1.0 2106.00
1 DKNG 2.0 NaN -1479.20
2 EXPE NaN 1.0 1863.18
3 MPNGF 1.0 NaN -1980.00
4 RDFN 2.0 NaN -1687.70
5 ROKU NaN NaN -1520.52
6 VIACA NaN NaN -1282.40
7 Z NaN 1.0 1624.59
Percentage change
0 0.0
1 2.0
2 0.0
3 0.0
4 1.5
5 0.0
6 0.0
7 0.0
嗯...
我删除了 re-assignment 并且它起作用了??
s0 = df3['amount_tsh'][df3['amount_tsh'].eq(0)].sample(37322).index df3.loc[df3.index.isin(s0), 'amount_tsh'] = np.nan
第二行是: df3['amount_tsh'] = df3.loc[df3.index.isin(s0), 'amount_tsh'] = np.nan