Dataframe slice after every nth column in Python
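My dataframe has 6 rows and 1488 columns, shape (6, 1488), and I need to slice it so that every slice/chunk has shape (6, 22).
So I want a slice after every 22 columns. Finally, I want to append all these slices one below another, so that I end up with a final dataframe of shape (~405, 22).
Any help would be appreciated.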
I'm not sure exactly what your dataframe looks like, but something like this should work.
import numpy as np
import pandas as pd

# create an example dataframe
df = pd.DataFrame(np.random.random((6, 1488)))
df
0 1 2 3 4 5 6 7 8 ... 1479 1480 1481 1482 1483 1484 1485 1486 1487
0 0.202945 0.764556 0.935441 0.811226 0.813502 0.218969 0.612307 0.501421 0.654886 ... 0.849323 0.179219 0.383729 0.453096 0.515090 0.042625 0.157411 0.738439 0.866627
1 0.284549 0.631829 0.562288 0.122613 0.678792 0.494868 0.896530 0.928943 0.740604 ... 0.212852 0.947779 0.993973 0.394951 0.678237 0.590767 0.690921 0.792253 0.748520
2 0.233059 0.349914 0.966794 0.005431 0.051786 0.002843 0.677197 0.557434 0.858027 ... 0.127492 0.324699 0.793800 0.327186 0.619923 0.871256 0.494916 0.487993 0.368654
3 0.862628 0.114289 0.663868 0.929045 0.796207 0.386012 0.097557 0.700127 0.719978 ... 0.535595 0.400371 0.375005 0.509740 0.412794 0.399939 0.414794 0.769017 0.591004
4 0.719133 0.130646 0.438649 0.921081 0.384160 0.393997 0.338588 0.120220 0.115953 ... 0.060460 0.297115 0.823037 0.299341 0.923836 0.111853 0.256940 0.344354 0.745989
5 0.686776 0.711688 0.232884 0.403817 0.311352 0.581365 0.942824 0.787317 0.212746 ... 0.049652 0.872466 0.437506 0.727937 0.119991 0.707848 0.178063 0.464412 0.587901
# create the 6x22 dataframes we will append together
# renaming is important so that each chunk's columns line up with the others
chunks = [
    df.iloc[:, i:i+22].rename(columns=lambda c: c % 22)
    for i in range(0, 1488, 22)
]
final_df = pd.concat(chunks, ignore_index=True)
final_df
0 1 2 3 4 5 6 7 8 ... 13 14 15 16 17 18 19 20 21
0 0.202945 0.764556 0.935441 0.811226 0.813502 0.218969 0.612307 0.501421 0.654886 ... 0.683138 0.241730 0.127795 0.290902 0.342813 0.806268 0.739551 0.545052 0.485129
1 0.284549 0.631829 0.562288 0.122613 0.678792 0.494868 0.896530 0.928943 0.740604 ... 0.517114 0.937569 0.028149 0.097362 0.047555 0.755910 0.339539 0.513563 0.861521
2 0.233059 0.349914 0.966794 0.005431 0.051786 0.002843 0.677197 0.557434 0.858027 ... 0.335635 0.256579 0.547100 0.607310 0.925894 0.952812 0.999725 0.687252 0.465104
3 0.862628 0.114289 0.663868 0.929045 0.796207 0.386012 0.097557 0.700127 0.719978 ... 0.670078 0.593592 0.631335 0.917056 0.737024 0.932694 0.547243 0.514497 0.237268
4 0.719133 0.130646 0.438649 0.921081 0.384160 0.393997 0.338588 0.120220 0.115953 ... 0.213295 0.625206 0.570912 0.368144 0.715152 0.024020 0.400959 0.992156 0.328769
.. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
403 0.662156 0.909833 0.106109 0.630261 0.415084 0.212852 0.947779 0.993973 0.394951 ... 0.748520 NaN NaN NaN NaN NaN NaN NaN NaN
404 0.280660 0.324690 0.089441 0.695034 0.040087 0.127492 0.324699 0.793800 0.327186 ... 0.368654 NaN NaN NaN NaN NaN NaN NaN NaN
405 0.299956 0.111437 0.332434 0.312539 0.866787 0.535595 0.400371 0.375005 0.509740 ... 0.591004 NaN NaN NaN NaN NaN NaN NaN NaN
406 0.801716 0.993745 0.653756 0.415967 0.479453 0.060460 0.297115 0.823037 0.299341 ... 0.745989 NaN NaN NaN NaN NaN NaN NaN NaN
407 0.937215 0.811213 0.643623 0.686690 0.843001 0.049652 0.872466 0.437506 0.727937 ... 0.587901 NaN NaN NaN NaN NaN NaN NaN NaN
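The NaN values in the last rows appear because 1488 is not a multiple of 22, so the final chunk is narrower than the others and concat pads its missing columns. The arithmetic can be checked with a short sketch (the variable names here are illustrative, not part of the code above):

```python
import math

n_rows, n_cols, width = 6, 1488, 22

# number of column chunks, counting the final partial one
n_chunks = math.ceil(n_cols / width)

# how many columns the final, partial chunk actually has
last_width = n_cols - (n_chunks - 1) * width

# each chunk contributes n_rows rows to the stacked result
total_rows = n_chunks * n_rows

print(n_chunks, last_width, total_rows)  # 68 14 408
```

So the result has 408 rows rather than roughly 405, and the last 6 rows hold real values only in columns 0 to 13.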
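If your dataframe's column names are not consecutive numbers as in this example, you will need to come up with your own mapper so that the columns in each chunk match up. Otherwise, the concat operation will create a dataframe containing the superset of all column names.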
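One possible mapper, as a rough sketch: rename each chunk's columns by their position within the chunk instead of by their value. The `col_0`, `col_1`, ... names and the `width` variable below are made up for illustration, and the approach assumes the column names are unique.

```python
import numpy as np
import pandas as pd

# hypothetical dataframe whose column names are strings, not consecutive integers
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((6, 1488)),
    columns=[f"col_{i}" for i in range(1488)],
)

width = 22
chunks = []
for start in range(0, df.shape[1], width):
    chunk = df.iloc[:, start:start + width]
    # map each column name to its position inside the chunk (0, 1, 2, ...)
    mapper = {name: pos for pos, name in enumerate(chunk.columns)}
    chunks.append(chunk.rename(columns=mapper))

# stack the chunks vertically; the narrower last chunk is padded with NaN
final_df = pd.concat(chunks, ignore_index=True)
print(final_df.shape)
```

Because the mapper depends only on position, it should work the same way for integer, string, or mixed column labels.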