Combining crosstab-pivot-groupby for Pandas dataframe
I think this is a fairly simple question, but I could not find another post that addresses a similar case.
I have a Pandas dataframe that looks like this:
group1 group2 meandiff lower upper reject
0 bacc dry_sed 2575.1697 2033.6713 3116.6681 True
1 bacc junc_hal -81.8513 -555.8132 392.1106 False
2 bacc other_trees -1.2333 -512.6246 510.1579 False
3 bacc phrag 613.2256 0.4309 1226.0204 True
4 bacc water -1074.4667 -1687.2614 -461.6719 True
5 bacc wet_sed -437.1854 -943.2217 68.8508 False
6 dry_sed junc_hal -2657.0210 -3068.3186 -2245.7234 True
7 dry_sed other_trees -2576.4030 -3030.3269 -2122.4792 True
8 dry_sed phrag -1961.9441 -2527.6677 -1396.2204 True
9 dry_sed water -3649.6364 -4215.3600 -3083.9127 True
10 dry_sed wet_sed -3012.3551 -3460.2374 -2564.4728 True
11 junc_hal other_trees 80.6179 -290.1464 451.3823 False
12 junc_hal phrag 695.0769 193.6165 1196.5373 True
13 junc_hal water -992.6154 -1494.0758 -491.1550 True
14 junc_hal wet_sed -355.3341 -718.6767 8.0084 False
15 other_trees phrag 614.4590 77.4825 1151.4354 True
16 other_trees water -1073.2333 -1610.2098 -536.2569 True
17 other_trees wet_sed -435.9521 -846.9253 -24.9788 True
18 phrag water -1687.6923 -2321.9951 -1053.3895 True
19 phrag wet_sed -1050.4111 -1582.2901 -518.5320 True
20 water wet_sed 637.2812 105.4022 1169.1603 True
I would like to create a kind of contingency table between group1 and group2, but with the value from the reject column in each cell.
It should look like this:
bacc dry_sed junc_hal other_trees phrag water wet_sed
bacc NA 1 0 0 1 1 0
dry_sed 1 NA 1 1 1 1 1
junc_hal 0 1 NA 0 1 1 0
other_trees 0 1 0 NA 1 1 1
phrag 1 1 1 1 NA 1 1
water 1 1 1 1 1 NA 1
wet_sed 0 1 0 1 1 1 NA
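The NA is only a placeholder; any value could go there.
Is there a straightforward way to summarize the data like this? Before I start looping over the table, I want to be sure there is no simple, direct way to do it.
Thanks in advance.
You can pivot the dataframe.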
df.pivot(index='group1', columns='group2', values='reject')
group2 dry_sed junc_hal other_trees phrag water wet_sed
group1
bacc True False False True True False
dry_sed None True True True True True
junc_hal None None False True True False
other_trees None None None True True True
phrag None None None None True True
water None None None None None True
Assuming your dataframe is named df, you can do:
df['reject_flag'] = df['reject'].astype(int)
output = df.pivot_table(index='group1', columns='group2', values='reject_flag')
This gives you the following:
group2 dry_sed junc_hal other_trees phrag water wet_sed
group1
bacc 1.0 0.0 0.0 1.0 1.0 0.0
dry_sed NaN 1.0 1.0 1.0 1.0 1.0
junc_hal NaN NaN 0.0 1.0 1.0 0.0
other_trees NaN NaN NaN 1.0 1.0 1.0
phrag NaN NaN NaN NaN 1.0 1.0
water NaN NaN NaN NaN NaN 1.0
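Both results above contain only the upper triangle, because each pair appears once in the input, whereas the table in the question is square and symmetric. A minimal sketch of one way to fill in the other half, assuming every pair occurs only once and that mirroring the value is what is wanted. It uses a three-group subset of the data, and the names upper, labels and symmetric are illustrative only:

```python
import pandas as pd

# A small subset of the pairs from the question (three groups only).
df = pd.DataFrame({
    "group1": ["bacc", "bacc", "dry_sed"],
    "group2": ["dry_sed", "junc_hal", "junc_hal"],
    "reject": [True, False, True],
})

df["reject_flag"] = df["reject"].astype(int)
upper = df.pivot(index="group1", columns="group2", values="reject_flag")

# Reindex rows and columns to the full set of labels so the table is square.
labels = sorted(set(df["group1"]) | set(df["group2"]))
upper = upper.reindex(index=labels, columns=labels)

# Fill each missing cell from the transposed table; the diagonal stays NaN.
symmetric = upper.combine_first(upper.T)
print(symmetric)
```

The diagonal is left as NaN, which matches the NA placeholder in the requested layout; fillna could replace it with another value if needed.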