How to compare two factors in a single column in a dataframe based on the values in another column and delete them if they don't match

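I'm trying to compare two factor levels based on the values in another column (in this case, the date). If a date does not have both levels, I want to delete its rows.

Example: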

> data
 light date
1 0    20190314
2 0    20190317
3 1    20190314
4 0    20190318
5 1    20190316
6 1    20190318
7 1    20190314

So I'd like the result to be:

>head(data)

 light date
1 0    20190314
2 1    20190314
3 0    20190318
4 1    20190318
5 1    20190314

Thanks in advance.

Here is one solution.

Input

library(tibble)  # provides tribble()

d <- tribble(
  ~light, ~date,
  "0", "20190314",
  "0", "20190317",
  "1", "20190314",
  "0", "20190318",
  "1", "20190316",
  "1", "20190318",
  "1", "20190314"
)

Code

library(dplyr)

d %>%
  group_by(date) %>%   # one group per date
  mutate(is_keep = if_else("0" %in% light & "1" %in% light, 1, 0)) %>%   # temporary flag: 1 if this date has both "0" and "1"
  filter(is_keep == 1) %>%   # keep only the flagged rows
  select(-is_keep) %>%   # drop the temporary column
  ungroup()   # remove the grouping

Output

  light date    
  <chr> <chr>   
1 0     20190314
2 1     20190314
3 0     20190318
4 1     20190318
5 1     20190314
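
The temporary column is not strictly needed. As a minimal sketch of the same idea (not the code from the answer above), the condition can go straight into a grouped filter(), which keeps or drops each date group as a whole. The name result is only for illustration, and the light values are assumed to be the strings "0" and "1" as in the sample input.

```r
library(dplyr)
library(tibble)

# Same sample data as in the answer above
d <- tribble(
  ~light, ~date,
  "0", "20190314",
  "0", "20190317",
  "1", "20190314",
  "0", "20190318",
  "1", "20190316",
  "1", "20190318",
  "1", "20190314"
)

# Keep a date only if its group contains both light levels.
# The condition is a single TRUE/FALSE per group, so it applies to every row of that group.
result <- d %>%
  group_by(date) %>%
  filter(all(c("0", "1") %in% light)) %>%
  ungroup()

print(result)
```

Rows keep their original order, so this should give the same five rows as the output shown above.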

You can filter your data frame by checking whether a value is present in a specific column of some other data frame:

data <- data %>%
  filter(date %in% unique(other_df$reference_column))

An option using subset:

subset(data, date %in% unique(other_df$reference_column))
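
Neither snippet says where other_df comes from. One possible reading, assuming the reference set should be the dates that occur with both light values, is to build that set from the data itself in base R. The names dates_0, dates_1, keep_dates and result are hypothetical, and the data frame below just recreates the sample from the question with light as a factor.

```r
# Sample data from the question; light is stored as a factor, date as character
data <- data.frame(
  light = factor(c("0", "0", "1", "0", "1", "1", "1")),
  date  = c("20190314", "20190317", "20190314", "20190318",
            "20190316", "20190318", "20190314"),
  stringsAsFactors = FALSE
)

# Dates seen with each light level
dates_0 <- data$date[data$light == "0"]
dates_1 <- data$date[data$light == "1"]

# Dates that appear with both levels (intersect() also removes duplicates)
keep_dates <- intersect(dates_0, dates_1)

# Keep only the rows whose date is in that set
result <- subset(data, date %in% keep_dates)
print(result)
```

Here keep_dates plays the role of unique(other_df$reference_column) in the snippets above, so no second data frame is required.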