Create new column based on matches in another table
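I have 2 data frames that look like this (they are count tables). data1 has a column named "Method" that I would like to add to data2. I only want to match on Col1, Col2, and Col3 (the Count column does not need to match). If there is a match, take Method from data1 and put it in data2. If there is no match, set the value to "Not Determined". The data frame named data_final below shows the result I am after.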
data1 <- data.frame("Col1" = c("ABC", "ABC", "EFG", "XYZ"), "Col2" = c("AA", "AA",
"AA", "BB"), "Col3" = c("Al", "B", "Al", "Al"), "Count" = c(1, 4, 6, 2), "Method" =
c("Sample", "Dry", "Sample", "Sample"))
data2 <- data.frame("Col1" = c("ABC", "ABC", "ABC", "EFG", "XYZ", "XYZ"), "Col2" =
c("AA", "AA","CC", "AA", "BB", "CC"), "Col3" = c("Al", "B", "C", "Al", "Al", "C"),
"Count" = c(1, 4, 5, 6, 2, 1))
I would like to create a new data frame that looks like what I described above:
data_final <- data.frame("Col1" = c("ABC", "ABC", "ABC", "EFG", "XYZ", "XYZ"), "Col2"
= c("AA", "AA","CC", "AA", "BB", "CC"), "Col3" = c("Al", "B", "C", "Al", "Al", "C"),
"Count" = c(1, 4, 5, 6, 2, 1), "Method" = c("Sample", "Dry", "Not Determined",
"Sample", "Sample", "Not Determined"))
Thanks for your help!
In base R, you can do a merge:
data3 <- merge( data1, data2, all.y = TRUE )
Then replace the NA values with a string of your choice:
data3[ is.na( data3[ 5 ] ), 5 ] <- "Not Determined"
This gives you
> data3
  Col1 Col2 Col3 Count         Method
1  ABC   AA   Al     1         Sample
2  ABC   AA    B     4            Dry
3  ABC   CC    C     5 Not Determined
4  EFG   AA   Al     6         Sample
5  XYZ   BB   Al     2         Sample
6  XYZ   CC    C     1 Not Determined
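One thing to be aware of: when no `by` argument is given, `merge()` joins on every column name the two data frames share, which here includes Count. That happens to work for the sample data because the counts agree, but the question asks to match on Col1, Col2, and Col3 only. A minimal sketch of that variant follows; the names `keys` and `data_keyed` are illustrative and not part of the original answer.

```r
# Sample data from the question
data1 <- data.frame(
  Col1 = c("ABC", "ABC", "EFG", "XYZ"),
  Col2 = c("AA", "AA", "AA", "BB"),
  Col3 = c("Al", "B", "Al", "Al"),
  Count = c(1, 4, 6, 2),
  Method = c("Sample", "Dry", "Sample", "Sample"),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  Col1 = c("ABC", "ABC", "ABC", "EFG", "XYZ", "XYZ"),
  Col2 = c("AA", "AA", "CC", "AA", "BB", "CC"),
  Col3 = c("Al", "B", "C", "Al", "Al", "C"),
  Count = c(1, 4, 5, 6, 2, 1),
  stringsAsFactors = FALSE
)

# Columns to match on (Count is deliberately left out)
keys <- c("Col1", "Col2", "Col3")

# Left join: keep every row of data2 and bring over only Method from data1
data_keyed <- merge(data2, data1[, c(keys, "Method")], by = keys, all.x = TRUE)

# Rows of data2 without a match in data1 have NA in Method
data_keyed$Method[is.na(data_keyed$Method)] <- "Not Determined"
```

Note that `merge()` sorts the result by the key columns by default, so the row order can differ from data2, and that duplicated key combinations in data1 would produce extra rows in the result.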
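Note: if you are using an older version of R (< 4.0), you may be dealing with factors. In that case you need to add the extra factor level before doing the replacement, with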
levels( data3$Method ) <- c( levels( data3$Method ), "Not Determined" )
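To see why the extra level matters, here is a small sketch on a made-up two-row data frame (`toy` is hypothetical, not the question's data), with `stringsAsFactors = TRUE` set to mimic the pre-4.0 default. Assigning a value that is not among a factor's levels yields NA with a warning, so the level has to exist first; converting the column to character is an alternative that avoids the bookkeeping.

```r
# Hypothetical toy data; stringsAsFactors = TRUE mimics the R < 4.0 default
toy <- data.frame(
  Col1 = c("ABC", "XYZ"),
  Method = c("Sample", NA),
  stringsAsFactors = TRUE
)

# Option 1: register the new level, then assign it to the NA entries
levels(toy$Method) <- c(levels(toy$Method), "Not Determined")
toy$Method[is.na(toy$Method)] <- "Not Determined"

# Option 2: work on a character vector instead, so no levels are involved
method_chr <- as.character(factor(c("Sample", NA)))
method_chr[is.na(method_chr)] <- "Not Determined"
```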