Filter rows by minimum value relative to a factor column
I need to filter a data frame by the minimum value of one column within each level of another column. Example:
RowNumber | Some_Factor | Value | One_of_many_random_columns |
---|---|---|---|
1 | A | 10 | Hello World! |
2 | A | 15 | Hello World! |
3 | A | 8 | Hello World! |
4 | B | 20 | Hello Again! |
5 | B | 18 | Hello Again! |
6 | B | 25 | Hello Again! |
In this example, I want to keep only rows 3 and 5, because they have the minimum DF$Value within their DF$Some_Factor group.
Thanks in advance.
library(dplyr)
# Within each Some_Factor group, keep the rows whose Value equals the group minimum
df %>%
  group_by(Some_Factor) %>%
  filter(Value == min(Value))
We can use slice_min after group_by:
library(dplyr)
df %>%
  group_by(Some_Factor) %>%
  slice_min(Value) %>%
  ungroup()
RowNumber Some_Factor Value One_of_many_random_columns
<int> <chr> <int> <chr>
1 3 A 8 Hello World!
2 5 B 18 Hello Again!
Use ave inside subset (base R):
subset(dat, Value == ave(Value, Some_Factor, FUN = min))
#   RowNumber Some_Factor Value One_of_many_random_columns
# 3         3           A     8               Hello World!
# 5         5           B    18               Hello Again!
Data:
dat <- data.frame(
  RowNumber = 1:6,
  Some_Factor = c("A", "A", "A", "B", "B", "B"),
  Value = c(10L, 15L, 8L, 20L, 18L, 25L),
  One_of_many_random_columns = rep(c("Hello World!", "Hello Again!"), each = 3)
)
A data.table option:
library(data.table)
setDT(df)[ , .SD[which.min(Value)], by = Some_factor]
Output:
Some_factor RowNumber Value One_of_many_random_columns
1: A 3 8 Hello World!
2: B 5 18 Hello Again!
Data:
df <- data.frame(RowNumber = c(1,2,3,4,5,6),
Some_factor = c("A", "A", "A", "B", "B", "B"),
Value = c(10,15,8,20,18,25),
One_of_many_random_columns = c("Hello World!", "Hello World!", "Hello World!", "Hello Again!", "Hello Again!", "Hello Again!"))
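A note on ties, since the approaches above do not all behave the same way: filter(Value == min(Value)), slice_min() (whose with_ties argument defaults to TRUE) and the ave() approach keep every row that shares the group minimum, whereas which.min() returns only the first one. The following is a minimal base R sketch of that difference. The data frame `ties` is made up for illustration and is not from the question; group A deliberately has two rows with the minimum.

```r
# Hypothetical data: group A has two rows tied at the minimum Value (8).
ties <- data.frame(
  RowNumber   = 1:5,
  Some_Factor = c("A", "A", "A", "B", "B"),
  Value       = c(8, 15, 8, 20, 18)
)

# Keeps every row whose Value equals its group minimum (ties included).
all_min <- subset(ties, Value == ave(Value, Some_Factor, FUN = min))

# Keeps only the first minimum per group, which is what which.min() does.
first_idx <- unname(unlist(lapply(
  split(seq_len(nrow(ties)), ties$Some_Factor),
  function(i) i[which.min(ties$Value[i])]
)))
first_min <- ties[first_idx, ]

all_min$RowNumber    # 1 3 5
first_min$RowNumber  # 1 5
```

If only one row per group is wanted with dplyr, slice_min(Value, with_ties = FALSE) should give the same first-row-only behaviour.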