Comparison of two dataframes

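I have an Excel table with 15,200 rows, each corresponding to a tree whose structures were analyzed. The columns hold all the structures (48 of them), counted on each tree. For example, tree 12607 has 3 CV11 structures, 1 IN12 structure, and none (0) of all the others. So the table is huge and mostly zeros, with a few counts of structures per tree. The last column is the value given to the tree based on the structures found on it (each structure contributes a number of points to the tree when it is present).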

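The question is: are there structures, or combinations of structures, that give a tree a high value? Of course, from the value of each structure we can see which one is worth more than the others (for example, structure CV11 is worth 15 and structure IN12 is worth 4). What I want to know is this: if we take all trees with a final value above 100 (a new dataframe "data100") and compare them with the trees whose final value is below 100 (another dataframe "data0"), can we find a significant difference in the number and occurrence of the structures found on those trees? A high-value structure might occur only on trees valued below 100, for example because it does not allow other structures to be found on the same tree.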

Voilà, I hope I have given enough detail... If you have any ideas or suggestions for tackling this problem, that would be great!

Below is my script.

    > data100
      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
1        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2        0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
4        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
5        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
6        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1
7        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
8        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
11       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
12       0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0
13       0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
14       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
15       0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
1        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2        0    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0    0
3        0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
4        0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
5        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
6        0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0
7        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
8        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
11       0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    2    0    0    0    0    0
12       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    3    0    0
13       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    3    0    0
14       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    3    0    0
15       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
1        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
2        1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     56
3        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     10
4        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     10
5        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      4
6        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     24
7        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
8        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
10       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
11       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     18
12       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     63
13       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     77
14       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     54
15       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     20
 [ reached getOption("max.print") -- omitted 60749 rows ]
> sortdata100<-data100[order(data100[,64],decreasing=T),]

> rsortdata100<-sortdata100[sortdata100$ecoval>100,]
> rsortdata100<-na.omit(rsortdata100) # 181 rows
> rsortdata100
      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
1291     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1083     0    4    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3919     0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
14685    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
4021     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
5452     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
14686    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0
4022     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0
1013     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2895     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
4719     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0
682      0    3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0
3444     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1299     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
2713     0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    1    0
      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
1291     0    0    0    0    0    0    0    0   30    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1083     3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3919     0    0    1    0    2    0    0    0    2    0    0    0    3    0    0    0    0    0    0   11    0    0    0
14685    0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
4021     0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
5452     0    0    1    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
14686    0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    2
4022     0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1013     0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2895     0    0    0    1    0    0    0    0    4    0    0    3    0    4    3    0    0    0    0    0    0    0    0
4719     0    0    0    0    0    0    0    0   10    0    0    0    0    0    0    0    0    0    0    0    0    0    0
682      0    0    0    0    0    0    0    0    0    0    0    0    0    2    1    0    0    0    0    0    0    0    0
3444     0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1299     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0
2713     0    0    0    2    0    3    0    0    2    0    0    0    1    5    1    0    0    0    0    0    0    0    0
      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
1291     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0   1192
1083     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    424
3919     1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    380
14685    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    370
4021     0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    358
5452     0    0    0    0    0    0    1    0    0   11    0    0    0    0    1    0    0    356
14686    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    354
4022     0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0    346
1013     0    8    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    326
2895     0    1    0    0    0    1    0    1    0    0    0    0    0    0    0    1    0    325
4719     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    324
682      0    0    0    6    0    0    0    0    0    0    0    0    0    0    0    0    0    311
3444     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    306
1299     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    302
2713     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    302
 [ reached getOption("max.print") -- omitted 166 rows ]
> data0<-sortdata100[sortdata100$ecoval<100,]
> data0<-na.omit(data0)
> data0
      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
4728     0    0    0    1    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0
5339     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
11766    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
796      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3561     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
10581    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
10618    0    0    0    0    0    0    0    0    0    0    0    1    0    1    0    1    0    1    0    0    0    0    0
14376    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
14389    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0
790      0    0    0    1    0    0    0    0    1    0    0    2    0    0    0    0    0    0    0    0    1    0    0
3974     0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
4739     0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    1    0    0    0    0    0    0
156      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2740     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2950     0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    1    0
      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
4728     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
5339     1    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
11766    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0    0    0    0    0
796      1    1    0    0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3561     0    0    0    0    0    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10581    0    0    0    1    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
10618    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
14376    1    0    0    0    0    0    0    0    1    0    0    0    0    2    0    0    0    0    0    0    0    0    0
14389    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    1    0    0    0    0    0    0    0
790      0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0
3974     0    0    0    0    0    0    0    0    1    0    0    0    4    0    0    0    1    0    0    0    0    0    0
4739     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
156      0    0    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0
2740     0    0    0    0    0    0    0    0    0    0    0    0    0    6    2    0    0    0    0    0    0    0    0
2950     0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
4728     0    0    1    0    0    1    0    0    0    0    0    0    0    0    0    0    0     99
5339     0    1    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0     99
11766    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    1     99
796      1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
3561     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
10581    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    1    0     98
10618    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0     98
14376    2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
14389    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
790      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     97
3974     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     97
4739     0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    1    0     97
156      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     96
2740     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0     96
2950     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     96
 [ reached getOption("max.print") -- omitted 14984 rows ]

Maybe something like this?

    library(dplyr)
    # `data` stands for the full table (the one printed as data100 in the question)
    data %>% group_by(ecoval > 100) %>% summarize_all(mean)

This should give you the mean of each column for the trees with ecoval > 100 and for those with ecoval <= 100.
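Group means only describe the two groups. To go one step further and check whether a structure is present significantly more often in one group than in the other, one option is a per-structure Fisher exact test on presence/absence, with a multiple-testing correction. The following is a minimal base R sketch under stated assumptions, not a definitive analysis: the toy data frame, its three columns, and the scoring formula are made up so the example runs on its own, and `compare_structure` is a hypothetical helper name.

```r
# Toy stand-in for the real table: 3 structure columns plus ecoval.
# All values and the scoring rule below are invented for illustration.
set.seed(1)
n <- 200
data <- data.frame(
  CV11 = rpois(n, 0.3),
  IN12 = rpois(n, 0.5),
  BA11 = rpois(n, 0.2)
)
data$ecoval <- 60 * data$CV11 + 20 * data$IN12 + 80 * data$BA11

high <- data$ecoval > 100                      # TRUE for high-value trees
structures <- setdiff(names(data), "ecoval")   # every column except the score

# For one structure, compare presence/absence between the two groups.
compare_structure <- function(col) {
  present <- factor(data[[col]] > 0, levels = c(FALSE, TRUE))
  group   <- factor(high, levels = c(FALSE, TRUE))
  tab <- table(present, group)                 # 2 x 2 contingency table
  data.frame(
    structure  = col,
    share_low  = mean(data[[col]][!high] > 0), # share of low-value trees with it
    share_high = mean(data[[col]][high] > 0),  # share of high-value trees with it
    p_value    = fisher.test(tab)$p.value
  )
}

result <- do.call(rbind, lapply(structures, compare_structure))
# Many structures are tested at once, so adjust the p-values (Holm method).
result$p_adjusted <- p.adjust(result$p_value, method = "holm")
print(result)
```

On the real table, `data` would be the full dataframe and `structures` all of its structure columns. Keep in mind that ecoval is computed from the structures themselves, so a significant difference partly reflects the scoring rule. Looking at combinations of structures would need something more, for example cross-tabulating pairs of presence indicators within each group.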