Conditional transformation of Pandas DataFrame into negative, zero, and positive with epsilon precision

Question

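I want a DataFrame in which each numeric value is replaced by one of three symbols, depending on whether it is negative, zero, or positive. The check should also take an epsilon value into account, which controls which values are treated as zero.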

csv = pd.read_csv('filename.csv')
df = csv.iloc[:, :].diff()
df = df.iloc[1:,:] # remove the first row of nans

I tried the following:

neg = df < -eps
zer = abs(df) <= eps
pos = df > eps
df[neg] = 'neg'
df[zer] = 'zer'
df[pos] = 'pos'

This worked for a while, but once eps reached a certain value, the following error was raised: TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

Then I tried the following:

df.transform(lambda x: ('neg' if x < -eps else 'zer') if abs(x) <= eps else 'pos')

which raises the error ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')

I have two questions:

Why do I get the TypeError only when eps becomes large, while it works for eps = 0?
How can I perform this transformation?

Answer 1

FWIW, I would probably use where to push the values close to zero to exactly zero, use np.sign to get a frame of 0s, 1s, and -1s, and then map the result:

In [132]: df = pd.DataFrame(np.random.uniform(-1, 1, (5,5)))

In [133]: df
Out[133]: 
          0         1         2         3         4
0  0.108927 -0.728913 -0.369125 -0.670461  0.941319
1 -0.075262  0.412293  0.893267 -0.911717 -0.489222
2 -0.363191 -0.019171  0.541484  0.933258 -0.742260
3 -0.943218 -0.326041 -0.817188  0.339880  0.830269
4 -0.374525  0.895200 -0.792452 -0.725313  0.190894

In [134]: np.sign(df.where(df.abs() > 0.3, 0)).replace({0: "zer", 1: "pos", -1: "neg"})
Out[134]: 
     0    1    2    3    4
0  zer  neg  neg  neg  pos
1  zer  pos  pos  neg  neg
2  neg  zer  pos  pos  neg
3  neg  neg  neg  pos  pos
4  neg  pos  neg  neg  zer
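
The same idea can be wrapped in a small function so that eps is a parameter rather than the hard-coded 0.3 used above. This is only a sketch: the name label_signs and the sample data are made up for illustration, and it maps the signs column by column with Series.map rather than using replace. Because it builds a new frame instead of assigning strings into the numeric one in place, it does not go through the in-place boolean assignment that raised the TypeError in the question.

```python
import numpy as np
import pandas as pd


def label_signs(df: pd.DataFrame, eps: float) -> pd.DataFrame:
    """Label each value 'neg', 'zer', or 'pos'; |x| <= eps counts as zero.

    Hypothetical helper, not part of pandas.
    """
    # Keep values whose magnitude exceeds eps and set everything else to 0.
    # NaN fails the comparison, so NaN ends up labelled 'zer' as well.
    clipped = df.where(df.abs() > eps, 0)
    # np.sign yields -1.0, 0.0, or 1.0; look each one up in the dict.
    labels = {-1.0: "neg", 0.0: "zer", 1.0: "pos"}
    return np.sign(clipped).apply(lambda col: col.map(labels))


# Made-up sample data, only to exercise the function.
df = pd.DataFrame({"a": [-0.5, 0.1, 0.9], "b": [0.3, -0.31, 0.0]})
result = label_signs(df, eps=0.3)
print(result)
```

Note that a value exactly equal to eps (0.3 in column b) is treated as zero, which matches the abs(df) <= eps condition in the question.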

