Set column = some operation on other column values in a pandas dataframe
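I am trying to perform a simple operation on a pandas dataframe using the logic below. The values in the columns of interest are decimals (at most 1 decimal place). The result of the operation cannot be negative, so if it is, I want 0 instead. I tried 2 approaches to accomplish this, but both lead to the same error.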
Approach 1:
def compute_size(frame):
    for x in list(reversed(range(14, len(frame.columns), 2))):
        tmp_value = frame.iloc[:, x] - frame.iloc[:, x-2]
        if tmp_value < 0:
            frame.iloc[:, x] = 0
        else:
            frame.iloc[:, x] = tmp_value
Approach 2:
def compute_size(frame):
    for x in list(reversed(range(14, len(frame.columns), 2))):
        frame.iloc[:, x] = max(0, frame.iloc[:, x] - frame.iloc[:, x-2])
When I call either function above, I get the following error:
C:\Python27\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
690 raise ValueError("The truth value of a {0} is ambiguous. "
691 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 692 .format(self.__class__.__name__))
693
694 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
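The error can be reproduced in isolation. The sketch below assumes a small hypothetical Series `s` standing in for `frame.iloc[:, x] - frame.iloc[:, x-2]`. Comparing it with 0 yields one boolean per row, so neither `if` nor the built-in `max` can reduce it to a single True/False. An element-wise `clip(lower=0)` is one possible replacement for the if/else.

```python
import pandas as pd

# Hypothetical stand-in for frame.iloc[:, x] - frame.iloc[:, x-2]
s = pd.Series([1.5, -0.5, 2.0])

# `s < 0` is a boolean Series (one flag per row), not a single bool.
mask = s < 0

# Forcing it into a single truth value is what `if tmp_value < 0:` does,
# and what max(0, series) does internally when it compares its arguments.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# Element-wise alternative: negative entries become 0, others are unchanged.
clipped = s.clip(lower=0)
print(raised, clipped.tolist())
```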
Update 1:
Here is some sample data:
df = pd.DataFrame({
'BlahBlah0' : ['','','',''],
'BlahBlah1' : ['','','',''],
'BlahBlah2' : ['','','',''],
'BlahBlah3' : ['','','',''],
'BlahBlah4' : ['','','',''],
'BlahBlah5' : ['A','C','E','G'],
'BlahBlah6' : ['B','D','F','H'],
'BlahBlah7' : ['','','',''],
'BlahBlah8' : ['','','',''],
'BlahBlah9' : ['','','',''],
'BlahBlah10' : ['','','',''],
'BlahBlah11' : ['','','',''],
'Size1':[1,1,1,1],
'Price1':[50,50,50,50],
'Size2':[3,3,3,3],
'Price2':[75,75,75,75],
'Size3':[7,7,7,7],
'Price3':[100,100,100,100],
'Size4':[15,15,15,15],
'Price4':[125,125,125,125],
'Size5':[25,25,25,25],
'Price5':[200,200,200,200],
'Size6':[30,30,30,30],
'Price6':[250,250,250,250],
'Size7':[40,40,40,40],
'Price7':[300,300,300,300]
},columns=['BlahBlah0',
'BlahBlah1',
'BlahBlah2',
'BlahBlah3',
'BlahBlah4',
'BlahBlah5',
'BlahBlah6',
'BlahBlah7',
'BlahBlah8',
'BlahBlah9',
'BlahBlah10',
'BlahBlah11',
'Size1',
'Price1',
'Size2',
'Price2',
'Size3',
'Price3',
'Size4',
'Price4',
'Size5',
'Price5',
'Size6',
'Price6',
'Size7',
'Price7'] )
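Now, once the dataframe above is loaded into Python, the column order gets scrambled. For some reason pandas groups the Price columns together and the Size columns together. That is not intended. The dataframe should look exactly as I have shown it, and I am not sure how to get it back into the order shown above.
Assuming you can produce the exact dataframe shown above, I want to perform the following operations: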
Size1 = Size1
Size2 = Max(0,Size2 - Size1)
Size3 = Max(0,Size3 - Size2)
Size4 = Max(0,Size4 - Size3)
Size5 = Max(0,Size5 - Size4)
Size6 = Max(0,Size6 - Size5)
Size7 = Max(0,Size7 - Size6)
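So the logic is not to subtract column x-2 from column x everywhere in the frame. The operation applies only to every other column, from column index 14 through the last column.
Update 2:
I fixed the part about the dataframe column ordering (see above).
The desired output, based on the logic above, is the following dataframe: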
df = pd.DataFrame({
'BlahBlah0' : ['','','',''],
'BlahBlah1' : ['','','',''],
'BlahBlah2' : ['','','',''],
'BlahBlah3' : ['','','',''],
'BlahBlah4' : ['','','',''],
'BlahBlah5' : ['A','C','E','G'],
'BlahBlah6' : ['B','D','F','H'],
'BlahBlah7' : ['','','',''],
'BlahBlah8' : ['','','',''],
'BlahBlah9' : ['','','',''],
'BlahBlah10' : ['','','',''],
'BlahBlah11' : ['','','',''],
'Size1':[1,1,1,1],
'Price1':[50,50,50,50],
'Size2':[2,2,2,2],
'Price2':[75,75,75,75],
'Size3':[4,4,4,4],
'Price3':[100,100,100,100],
'Size4':[8,8,8,8],
'Price4':[125,125,125,125],
'Size5':[10,10,10,10],
'Price5':[200,200,200,200],
'Size6':[5,5,5,5],
'Price6':[250,250,250,250],
'Size7':[10,10,10,10],
'Price7':[300,300,300,300]
},columns=['BlahBlah0',
'BlahBlah1',
'BlahBlah2',
'BlahBlah3',
'BlahBlah4',
'BlahBlah5',
'BlahBlah6',
'BlahBlah7',
'BlahBlah8',
'BlahBlah9',
'BlahBlah10',
'BlahBlah11',
'Size1',
'Price1',
'Size2',
'Price2',
'Size3',
'Price3',
'Size4',
'Price4',
'Size5',
'Price5',
'Size6',
'Price6',
'Size7',
'Price7'] )
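I am computing the difference between the current size and the previous size, which essentially captures the net additional size available at the new price.
Give this a try. It runs apply() along axis 1 and uses a list comprehension to handle the subtraction.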
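cols_to_update = ['Size2','Size3','Size4','Size5','Size6','Size7']
cols_to_subtract = ['Size1','Size2','Size3','Size4','Size5','Size6','Size7']
# For each row, take consecutive differences of the original Size values, floored at 0.
# .iloc makes the positional lookup explicit, since each row is labeled by column name.
df[cols_to_update] = df[cols_to_subtract].apply(
    lambda x: pd.Series([max(x.iloc[i] - x.iloc[i-1], 0) for i in range(1, len(x))]), axis=1)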
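As a possible alternative that avoids the row-wise apply(), the same result can be sketched with column-wise operations. This is not part of the answer above. It assumes a reduced frame holding only the Size columns from the sample data, and the name `size_cols` is introduced here for illustration. `diff(axis=1)` subtracts the previous column from each column using the original values, and `clip(lower=0)` floors negatives at 0.

```python
import pandas as pd

# Reduced version of the sample data: only the Size columns matter here.
df = pd.DataFrame({
    'Size1': [1, 1, 1, 1],
    'Size2': [3, 3, 3, 3],
    'Size3': [7, 7, 7, 7],
    'Size4': [15, 15, 15, 15],
    'Size5': [25, 25, 25, 25],
    'Size6': [30, 30, 30, 30],
    'Size7': [40, 40, 40, 40],
})

# Illustrative name: the Size columns in left-to-right order.
size_cols = ['Size1', 'Size2', 'Size3', 'Size4', 'Size5', 'Size6', 'Size7']

# Column-wise difference with the previous column, floored at 0.
# The first column has no predecessor, so its difference is NaN.
deltas = df[size_cols].diff(axis=1).clip(lower=0)

# Size1 is kept as-is; only Size2..Size7 are overwritten.
# The overwritten columns come back as floats.
df[size_cols[1:]] = deltas[size_cols[1:]]
print(df)
```

Because every difference is computed from the original values before anything is written back, there is no need to iterate over the columns in reverse order.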