Error while trying to define a function to create price buckets for apps
I have a CSV dataset that I imported in Jupyter and stored as inp0.
I am trying to use .loc to create price buckets for the apps, but pandas raises the error below.
My code:
inp0.loc[inp0.price==0.00, 'Price_Bucket'] = 'Free App'
inp0.loc[[inp0.price>0.00 and inp0.price<3.00],'Price_Bucket'] = 'Apps that cost <3'
inp0.loc[[inp0.price>=3.00 and inp0.price<5.00],'Price_Bucket'] = 'Apps that cost <5'
inp0.loc[inp0.price>=5.00,'Price_Bucket'] = 'Apps that cost >=5'
inp0.price_bucket.value_counts()
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I fix this?
Try np.where, which works like an if/else applied element-wise across columns/vectors:
import numpy as np
inp0['Price_Bucket'] = np.where(inp0['price']==0.00, 'Free App', np.where(inp0['price']<3.00, 'Apps that cost <3', np.where(inp0['price']<5.00, 'Apps that cost <5', 'Apps that cost >=5')))
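For reference, the original .loc approach can also be made to work. The error comes from `and`, which asks Python for a single True/False value for a whole Series; the element-wise operator `&` with parenthesized comparisons avoids that, and the extra list brackets around the condition are not needed. A minimal sketch, assuming a made-up `price` column in place of the real dataset:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's inp0
inp0 = pd.DataFrame({"price": [0.0, 0.99, 2.99, 3.0, 4.5, 5.0, 19.99]})

# `&` combines the two boolean Series element-wise. The parentheses are
# required because `&` binds more tightly than the comparison operators.
inp0.loc[inp0["price"] == 0.00, "Price_Bucket"] = "Free App"
inp0.loc[(inp0["price"] > 0.00) & (inp0["price"] < 3.00), "Price_Bucket"] = "Apps that cost <3"
inp0.loc[(inp0["price"] >= 3.00) & (inp0["price"] < 5.00), "Price_Bucket"] = "Apps that cost <5"
inp0.loc[inp0["price"] >= 5.00, "Price_Bucket"] = "Apps that cost >=5"

# The column name is case-sensitive: it is Price_Bucket, not price_bucket
bucket_counts = inp0["Price_Bucket"].value_counts()
print(bucket_counts)
```

Note that the last line of the question's code, inp0.price_bucket.value_counts(), would still fail after this fix, because the column was created as Price_Bucket.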
You can use the pandas cut function instead of writing multiple if/else or np.where conditions:
import pandas as pd
import numpy as np
import math
bins_defined = [0, 0.000001, 3, 5, math.inf]  # price == 0 --> 'Free App': the tiny first bin [0, 0.000001) isolates free apps
labels_defined = ['Free App', 'Apps that cost <3', 'Apps that cost <5', 'Apps that cost >=5']
inp0['Price_Bucket'] = pd.cut(inp0['price'], bins=bins_defined, labels=labels_defined, right=False)
# `right` indicates whether each bin includes its rightmost edge.
# If right=True (the default), the bins [1, 2, 3, 4] mean (1,2], (2,3], (3,4].
# With right=False, as used here, they mean [1,2), [2,3), [3,4).
For a better understanding, see the pandas.cut documentation.
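To check how the bin edges behave at the boundaries, here is a small self-contained sketch. The prices are made-up values chosen to sit on or next to each edge; they are not from the asker's dataset.

```python
import math
import pandas as pd

# Hypothetical prices placed on and around the bin edges
prices = pd.Series([0.0, 0.99, 2.99, 3.0, 4.5, 5.0, 19.99])

bins_defined = [0, 0.000001, 3, 5, math.inf]
labels_defined = ['Free App', 'Apps that cost <3', 'Apps that cost <5', 'Apps that cost >=5']

# right=False makes every bin closed on the left and open on the right,
# so 3.0 falls in [3, 5) and 5.0 falls in [5, inf)
buckets = pd.cut(prices, bins=bins_defined, labels=labels_defined, right=False)

for price, bucket in zip(prices, buckets):
    print(price, "->", bucket)
```

Values outside every bin, such as a negative price, come back as NaN rather than raising an error, so it may be worth checking for missing values in Price_Bucket afterwards.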