Swap parts of columns in Pandas dataframe
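I have to manage a dataset that looks like this when I plot it:
[image: line plot of the three columns]
As the plot shows, around x=17 the "orange" column starts taking the values that the "green" column should hold. Likewise, the "blue" data take the place of the "orange" data, and the "green" data take the place of the "blue" data. Later in the plot (around x=24) the swap is a different one. My question is how to get the data back into the correct columns. The swap points are not always the same, so I can't just iteratively swap parts of the columns. My idea is to check the difference between two consecutive points: when the difference is greater than some value, that may be a swap point. That is not always the case, though, because most of the plots behave nonlinearly.
A typical dataset contains many more rows, so I'm looking for a solution that is as parameterized as possible.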
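The difference-based idea described above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `candidate_swap_rows`, the toy data, and the threshold of 20.0 are all hypothetical, and a fixed threshold may flag ordinary jumps in nonlinear data as swaps.

```python
import pandas as pd


def candidate_swap_rows(frame: pd.DataFrame, threshold: float) -> list:
    """Return index labels where any column jumps by more than `threshold`.

    Hypothetical helper: a large jump is only a hint of a swap, not proof.
    """
    # Absolute change of every column relative to the previous row
    # (the first row has no predecessor, so it is NaN and never flagged).
    jumps = frame.diff().abs()
    # A row is a candidate when at least one column exceeds the threshold.
    mask = (jumps > threshold).any(axis=1)
    return frame.index[mask].tolist()


# Hypothetical toy data: columns "A" and "B" trade places at row 3.
toy = pd.DataFrame({
    "A": [10.0, 10.5, 9.8, 50.2, 49.9],
    "B": [50.0, 49.7, 50.3, 10.1, 10.4],
})
swap_rows = candidate_swap_rows(toy, threshold=20.0)
print(swap_rows)
```

The threshold would have to be tuned per dataset, which is exactly the weakness mentioned above.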
Here is the dataset for the plot above:
import pandas as pd

col1 = [8724.96757035, 8720.86855769, 8713.03560178, 8711.77188717,
8723.40967556, 8717.95864342, 8719.46206709, 8716.15746255,
8715.83456161, 8722.05038594, 8721.822529 , 8714.29076839,
8721.68118216, 8714.94677413, 8706.33839393, 8719.94888389,
8715.71175774, 8480.37544428, 9151.63757245, 9138.71268152,
9127.43234993, 9146.51437639, 9148.00997757, 9130.06677617,
9151.43128313, 8481.34668127, 8482.40548913, 8481.96440291,
8481.39530663, 8482.7611363 , 8481.26267875, 8480.71911933,
8481.02279341]
col2 = [8718.4606092 , 9150.29254687, 9130.86473512, 9140.34929925,
9142.43843709, 9158.33993226, 9148.70914607, 9164.89441174,
9145.08470894, 9147.82723909, 9132.61236281, 9200.58503831,
9129.96054189, 9135.65207477, 9165.43826932, 9145.35463759,
9134.02400092, 8481.58635709, 8480.90717793, 8479.96295137,
8483.73891949, 8481.93224816, 8482.40478411, 8481.96627135,
8481.34086757, 8722.99646005, 8736.61137791, 8724.85719973,
8721.86321039, 8723.91810368, 8720.82987529, 8720.19864748,
8720.00514769]
col3 = [9157.20772734, 8481.17028812, 8479.95897581, 8481.66854465,
8481.12688288, 8481.30670312, 8480.84656953, 8483.54011535,
8481.81742774, 8479.23373517, 8480.44659188, 8480.90515565,
8481.35596211, 8479.94614036, 8480.12735803, 8482.70698043,
8481.50464731, 8725.55716505, 8712.41651697, 8737.46352274,
8719.20402175, 8710.77791026, 8721.07604204, 8718.88881952,
8720.0611123 , 9158.13239686, 9158.70309418, 9185.89920375,
9189.72527817, 9153.04424809, 9152.17774172, 9148.59275477,
9133.33557359]
df = pd.DataFrame({"A":col1, "B":col2, "C":col3})
Any suggestions would be appreciated. Thanks in advance.
Try:
df.plot(y=["A", "B","C"])
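A simple solution is to sort the data within each row, so that the smallest value is always in column "A" and the largest value is always in column "C".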
df2 = pd.DataFrame(df.apply(sorted, axis=1).to_list()).rename(columns={0:'A', 1:'B', 2:'C'})
df2.plot()
The resulting plot looks like this:
[image: line plot of df2, columns A, B and C]
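For larger datasets, the same row-wise sort can probably be done without `apply`. The sketch below is one possible variant using `numpy.sort`, not the method from the answer above; the helper name `sort_rows` and the toy frame are made up for illustration. Like the `apply` version, it assumes the three underlying series never actually cross, since sorting cannot tell a swap from a genuine crossing.

```python
import numpy as np
import pandas as pd


def sort_rows(frame: pd.DataFrame) -> pd.DataFrame:
    """Sort the values inside each row, keeping column names and index.

    Hypothetical helper: after sorting, the first column holds the row
    minimum and the last column holds the row maximum.
    """
    # np.sort with axis=1 sorts each row independently and returns a copy.
    values = np.sort(frame.to_numpy(), axis=1)
    return pd.DataFrame(values, columns=frame.columns, index=frame.index)


# Hypothetical toy data in which the three levels are shuffled per row.
toy = pd.DataFrame({
    "A": [2.0, 1.0, 3.0],
    "B": [1.0, 3.0, 2.0],
    "C": [3.0, 2.0, 1.0],
})
sorted_toy = sort_rows(toy)
print(sorted_toy)
```

Calling `sort_rows(df)` on the frame from the question should give the same values as `df2`, while keeping the original index and column labels.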