Summarize with character type conditions in dplyr

Question

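I want to count how many times a country is listed alone and how many times it is listed together with other countries.

Here is part of my dataset: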

address_countries2
name_countries      n_countries
China               1                      
China               1
Usa                 1                        
Usa                 1
China France        2               
China France        2
India               1                      
India               1
Jordan Germany      2

I extracted the number of times each country appears with the following code:

publication_countries <- address_countries2 %>% 
  select(name_countries, n_countries) %>% 
  unnest_tokens(word, name_countries) %>%
  group_by(word) %>% 
  summarise(TP = n())

 head(publication_countries)
 # A tibble: 6 x 2
    word          TP
    <chr>       <int>
   1 China         4
   2 Usa           2
   3 France        2
   4 India         2
   5 Jordan        1       
   6 Germany       1

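I want to create one new column containing the number of rows in which a country is listed alone, and a second column containing the number of rows in which it is listed together with other countries.

The desired output looks like this: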

 head(publication_countries)
 # A tibble: 6 x 4
    word          TP      single_times      with_other_countries
    <chr>       <int>            <int>                     <int>   
   1 China         4                2                         2
   2 Usa           2                2                         0
   3 France        2                0                         2
   4 India         2                2                         0
   5 Jordan        1                0                         1
   6 Germany       1                0                         1

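I found a possible approach to conditional summarising in a linked post; however, in my case I think I need something other than sum(), because the object I am conditioning on is a character vector (the column word).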

summarise(TP = n() , IP = count(word[n_countries=="1"]))

But I get this error:

Error in summarise_impl(.data, dots) : 
  Evaluation error: no applicable method for 'groups' applied to an object of class "character"

Any help would be greatly appreciated :)

Thanks a lot

Answer 1

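library(dplyr)
library(tidyr)
library(tidytext)

dat %>%
  select(name_countries, n_countries) %>%
  unnest_tokens(word, name_countries) %>%                 # one row per country (lowercased by default)
  group_by(word) %>% mutate(TP = n()) %>%                 # total rows per country
  group_by(n_countries, word) %>% mutate(Tp1 = n()) %>%   # rows per country and n_countries value
  unique() %>%
  spread(n_countries, Tp1, fill = 0)                      # one column per n_countries value, 0 where absent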
# A tibble: 6 x 4
# Groups:   word [6]
     word    TP   `1`   `2`
*   <chr> <int> <dbl> <dbl>
1   china     4     2     2
2  france     2     0     2
3 germany     1     0     1
4   india     2     2     0
5  jordan     1     0     1
6     usa     2     2     0
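
If you want the column names from the question directly, a conditional summarise may also work. The error in the question most likely comes from dplyr's count(), which operates on data frames rather than on a character vector. Since the condition is on n_countries and not on word itself, summing a logical comparison counts the matching rows. The following is only a sketch: the sample data is rebuilt by hand from the question, n_countries is assumed to be numeric, and to_lower = FALSE is passed to unnest_tokens() to keep the original capitalisation.

```r
library(dplyr)
library(tidytext)

# Sample data rebuilt by hand from the question (assumed to be numeric n_countries)
address_countries2 <- tibble(
  name_countries = c("China", "China", "Usa", "Usa", "China France",
                     "China France", "India", "India", "Jordan Germany"),
  n_countries = c(1, 1, 1, 1, 2, 2, 1, 1, 2)
)

publication_countries <- address_countries2 %>%
  # one row per country; keep the original capitalisation
  unnest_tokens(word, name_countries, to_lower = FALSE) %>%
  group_by(word) %>%
  summarise(
    TP = n(),                                    # total rows mentioning the country
    single_times = sum(n_countries == 1),        # rows where it is listed alone
    with_other_countries = sum(n_countries > 1)  # rows shared with other countries
  )

publication_countries
```

Note that summarise() returns the groups sorted by word, so the row order will differ from the one shown in the question.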

Tags: r, dplyr, summarize