Why does my new data frame create two new variables when I combine two data frames using the cbind() function in R?
I am currently working on a data analysis project, but the new data frame I created looks strange.
mycob1 <- read.csv("MYCOB_1.csv")
mycob1
Date Direction RFU Ct
1 Lot_210927 0 6.3588 9.164329
2 Lot_210927 0 5.0394 11.350701
3 Lot_210927 0 4.9946 37.334669
4 Lot_210927 0 4.8604 8.168337
5 Lot_210927 0 4.9032 37.306613
6 Lot_210927 0 4.9502 22.176353
7 Lot_210927 0 4.7858 23.713427
8 Lot_210927 0 5.2778 10.496994
9 Lot_210927 1 1021.8458 32.119668
10 Lot_210927 1 1020.1998 31.500716
11 Lot_210927 1 1065.8000 31.979674
12 Lot_210927 1 988.0452 31.019754
13 Lot_210927 1 1085.2206 31.557973
14 Lot_210927 1 1072.8540 31.745491
15 Lot_210927 1 1020.6496 31.218151
16 Lot_210927 1 983.4106 31.981162
mycob2 <- read.csv("MYCOB_2.csv")
mycob2
Date Direction RFU Ct
1 Lot_211020 0 0.6876 47.72087
2 Lot_211020 0 40.1056 38.37418
3 Lot_211020 0 97.0882 37.72917
4 Lot_211020 0 10.3170 36.18236
5 Lot_211020 0 67.3742 37.39050
6 Lot_211020 0 10.2540 40.16776
7 Lot_211020 0 6.9624 28.07575
8 Lot_211020 0 9.5718 28.84626
9 Lot_211020 0 13.0306 38.87375
10 Lot_211020 1 860.3956 29.15746
11 Lot_211020 1 884.9338 30.03665
12 Lot_211020 1 1552.2462 27.90839
13 Lot_211020 1 738.2328 29.22760
14 Lot_211020 1 1419.6448 29.13627
15 Lot_211020 1 1441.6212 29.35351
16 Lot_211020 1 424.9774 31.56446
mycob12 <- cbind(mycob1, mycob2, by.x = "Lot_210927", by.y = "Lot_211020")
mycob12
Date Direction RFU Ct Date Direction RFU Ct by.x by.y
1 Lot_210927 0 6.3588 9.164329 Lot_211020 0 0.6876 47.72087 Lot_210927 Lot_211020
2 Lot_210927 0 5.0394 11.350701 Lot_211020 0 40.1056 38.37418 Lot_210927 Lot_211020
3 Lot_210927 0 4.9946 37.334669 Lot_211020 0 97.0882 37.72917 Lot_210927 Lot_211020
4 Lot_210927 0 4.8604 8.168337 Lot_211020 0 10.3170 36.18236 Lot_210927 Lot_211020
5 Lot_210927 0 4.9032 37.306613 Lot_211020 0 67.3742 37.39050 Lot_210927 Lot_211020
6 Lot_210927 0 4.9502 22.176353 Lot_211020 0 10.2540 40.16776 Lot_210927 Lot_211020
7 Lot_210927 0 4.7858 23.713427 Lot_211020 0 6.9624 28.07575 Lot_210927 Lot_211020
8 Lot_210927 0 5.2778 10.496994 Lot_211020 0 9.5718 28.84626 Lot_210927 Lot_211020
9 Lot_210927 1 1021.8458 32.119668 Lot_211020 0 13.0306 38.87375 Lot_210927 Lot_211020
10 Lot_210927 1 1020.1998 31.500716 Lot_211020 1 860.3956 29.15746 Lot_210927 Lot_211020
11 Lot_210927 1 1065.8000 31.979674 Lot_211020 1 884.9338 30.03665 Lot_210927 Lot_211020
12 Lot_210927 1 988.0452 31.019754 Lot_211020 1 1552.2462 27.90839 Lot_210927 Lot_211020
13 Lot_210927 1 1085.2206 31.557973 Lot_211020 1 738.2328 29.22760 Lot_210927 Lot_211020
14 Lot_210927 1 1072.8540 31.745491 Lot_211020 1 1419.6448 29.13627 Lot_210927 Lot_211020
15 Lot_210927 1 1020.6496 31.218151 Lot_211020 1 1441.6212 29.35351 Lot_210927 Lot_211020
16 Lot_210927 1 983.4106 31.981162 Lot_211020 1 424.9774 31.56446 Lot_210927 Lot_211020
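To clarify, "Direction" only indicates whether the sample is positive or negative. I want to find out whether RFU is correlated with the Ct and Direction variables, but I can't seem to work out how. The odd part of the new data frame, "mycob12", is that it has two new variables at the end named "by.x" and "by.y", and I am not sure how to remove them. Is there a way to remove these variables?
Edit: I want to use these data frames to create plots that explore any patterns among Direction, RFU, and Ct. I have considered dropping Date and stacking the data frames on top of each other.
Thanks!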
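I'm not sure exactly what you are trying to do, but looking at your data, it seems to make more sense to simply stack the two data frames and then sort them by the Date variable.
Following your data frames above: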
df1 <- data.frame(Date = c("Lot_210927","Lot_210927","Lot_210927"),
Direction = c(0,0,0),
RFU = c(6.3588,5.0394,4.9946),
Ct = c(9.164329,11.350701,37.334669))
df2 <- data.frame(Date = c("Lot_211020","Lot_211020","Lot_211020"),
Direction = c(0,0,0),
RFU = c(0.6876,40.1056,97.0882),
Ct = c(47.72087,38.37418,37.72917))
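You can stack them with bind_rows from the tidyverse. (Note that this simply stacks the two data frames on top of each other. I would only recommend it when both data frames have exactly the same column names and data types, e.g. numeric, character, etc.; otherwise you should use left_join from the tidyverse.)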
library(tidyverse)
df_merged <- bind_rows(df1,df2)
df_merged
Date Direction RFU Ct
1 Lot_210927 0 6.3588 9.164329
2 Lot_210927 0 5.0394 11.350701
3 Lot_210927 0 4.9946 37.334669
4 Lot_211020 0 0.6876 47.720870
5 Lot_211020 0 40.1056 38.374180
6 Lot_211020 0 97.0882 37.729170
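As for the by.x and by.y columns in the question: those are arguments of merge(), not of cbind(). cbind() turns any named argument into a new column and recycles its value down the rows, which is where the two extra variables come from. The sketch below uses base R only; the data frames a and b are small hypothetical stand-ins for mycob1 and mycob2, with values copied from the question.

```r
# Small stand-ins for mycob1 and mycob2 (values copied from the question)
a <- data.frame(Date = rep("Lot_210927", 2), RFU = c(6.3588, 5.0394))
b <- data.frame(Date = rep("Lot_211020", 2), RFU = c(0.6876, 40.1056))

# cbind() has no by.x/by.y arguments; named arguments simply become columns
wide <- cbind(a, b, by.x = "Lot_210927", by.y = "Lot_211020")
names(wide)

# Drop the two unwanted columns by name
wide_clean <- wide[, !(names(wide) %in% c("by.x", "by.y"))]

# Stacking with base R gives the same shape as bind_rows() for these inputs
stacked <- rbind(a, b)
stacked
```

Leaving the by.x and by.y arguments out of the cbind() call avoids the extra columns in the first place, although the side-by-side result still has duplicated column names, which is another reason stacking tends to be easier to work with.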
From the stacked data frame, you can then generate a correlation matrix like this:
df_num <- df_merged[, c(2:4)]
df_cor <- round(cor(df_num),2)
df_cor %>%
head()
Direction RFU Ct
Direction 1 NA NA
RFU NA 1.00 0.29
Ct NA 0.29 1.00
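Just isolate the numeric variables and build the correlation matrix from them. Obviously, 6 data points with Direction always equal to 0 is not very interesting (a constant column has zero variance, which is why its correlations come out as NA), but it should be a good starting point for your full dataset.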
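To see what this looks like once Direction actually varies, here is a minimal sketch in base R. It uses eight rows copied from the question (two per lot and Direction value), so the numbers it prints are illustrative only and say nothing about the full dataset. The names df, cor_all, and cor_by_dir are made up for this example.

```r
# A few rows copied from the question, with both Direction values present
df <- data.frame(
  Date      = c("Lot_210927", "Lot_210927", "Lot_210927", "Lot_210927",
                "Lot_211020", "Lot_211020", "Lot_211020", "Lot_211020"),
  Direction = c(0, 0, 1, 1, 0, 0, 1, 1),
  RFU       = c(6.3588, 5.0394, 1021.8458, 1020.1998,
                0.6876, 40.1056, 860.3956, 884.9338),
  Ct        = c(9.164329, 11.350701, 32.119668, 31.500716,
                47.72087, 38.37418, 29.15746, 30.03665)
)

# Keep numeric columns only, selected by type rather than by position
num_cols <- sapply(df, is.numeric)
cor_all <- round(cor(df[, num_cols]), 2)
cor_all

# Direction is binary, so also look at RFU vs Ct within each group
cor_by_dir <- sapply(split(df, df$Direction), function(g) cor(g$RFU, g$Ct))
cor_by_dir
```

Selecting columns with is.numeric avoids hard-coding positions such as 2:4, which can silently pick the wrong columns if the layout of the CSV files changes.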