Struggling with row operations

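Below is the sample data. I am trying to do a row operation and am a bit confused. The desired result is shown further down, along with my first attempt at the code and the error it produces. The goal is to sum the 441 and 442 rows for each period to get a retail total for that period, and then to plot the new 44 total and its share (the last row of the result). The main objective, however, is to create the new summed row.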

library(dplyr)
library(tidyr)

area <- c("Clark","Clark","Douglas","Douglas","Clark","Clark","Douglas","Douglas","Statewide","Statewide")
industry <- c(441,442,441,442,441,442,441,442,"000","000")
employment <- c(100,50,101,65,102,52,103,67,1500,2200)
period <- c("2016-1-31","2016-1-31","2016-1-31","2016-1-31","2016-2-28","2016-2-28","2016-2-28","2016-2-28","2016-1-31","2016-2-28")

statewide <- data.frame(area, industry, employment, period)

statewide <- statewide %>% (pivot_wider(names_from = industry, values_from = employment))

Error in UseMethod("pivot_wider") : 
  no applicable method for 'pivot_wider' applied to an object of class "c('double', 'numeric')"


Desired result:

area        industry   employment   period
Clark       441        100          2016-1-31
Clark       442        50           2016-1-31
Douglas     441        101          2016-1-31
Douglas     442        65           2016-1-31
Statewide   44         316          2016-1-31
Statewide   000        1500         2016-1-31
Statewide   000        316/1500     2016-1-31

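Before getting to the final result (above), I was thinking the output of pivot_wider would look like this. From there, I would sum the columns and then pivot_longer to produce the result above.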

clark441   clark442   douglas441   douglas442   NewComputedColumn   period
100        50         101          65           316                 2016-1-31

New answer

See below for how to fix the pivot_wider() error and why it occurs.

To get the output you described in the comments, you don't want to pivot. You want group_by() and summarize().

library(dplyr)

# Start from the original long-format data frame, not the pivoted one
statewide %>%
  group_by(area, period) %>%
  summarize(employment = sum(employment))
# A tibble: 6 x 3
# Groups:   area [3]
  area      period    employment
  <chr>     <chr>          <dbl>
1 Clark     2016-1-31        150
2 Clark     2016-2-28        154
3 Douglas   2016-1-31        166
4 Douglas   2016-2-28        170
5 Statewide 2016-1-31       1500
6 Statewide 2016-2-28       2200
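
The question's stated goal (a new statewide row that sums industries 441 and 442 for each period, plus that row's share of the 000 total) could be sketched roughly as follows. This is only one possible approach, not necessarily what the answer above intended: the names `retail`, `share`, `total`, and `result` are illustrative, and it assumes `industry` is stored as character, which is what happens when 441 and "000" are mixed in one `c()` call.

```r
library(dplyr)

# Sample data from the question (industry written as character explicitly)
statewide <- data.frame(
  area = c("Clark", "Clark", "Douglas", "Douglas", "Clark", "Clark",
           "Douglas", "Douglas", "Statewide", "Statewide"),
  industry = c("441", "442", "441", "442", "441", "442",
               "441", "442", "000", "000"),
  employment = c(100, 50, 101, 65, 102, 52, 103, 67, 1500, 2200),
  period = c("2016-1-31", "2016-1-31", "2016-1-31", "2016-1-31",
             "2016-2-28", "2016-2-28", "2016-2-28", "2016-2-28",
             "2016-1-31", "2016-2-28")
)

# Sum industries 441 and 442 across all areas, one row per period
retail <- statewide %>%
  filter(industry %in% c("441", "442")) %>%
  group_by(period) %>%
  summarize(employment = sum(employment)) %>%
  mutate(area = "Statewide", industry = "44")   # "44" label is an assumption

# Share of the statewide total (industry "000") for each period
share <- retail %>%
  inner_join(
    statewide %>%
      filter(industry == "000") %>%
      select(period, total = employment),
    by = "period"
  ) %>%
  mutate(share = employment / total)

# Append the new summed rows to the original data
result <- bind_rows(statewide, retail) %>%
  arrange(period, area, industry)

retail$employment   # 316 for 2016-1-31, 324 for 2016-2-28
```

Keeping the share in a separate table avoids mixing counts and ratios in the `employment` column; if a single table is preferred, the share rows could be bound on as well.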

Old answer

The parentheses around your pivot_wider() call are unnecessary, and they prevent pivot_wider() from working.

library(dplyr)
library(tidyr)

statewide %>% 
    pivot_wider(names_from = industry, values_from = employment)
# A tibble: 6 x 5
  area      period    `441` `442` `000`
  <chr>     <chr>     <dbl> <dbl> <dbl>
1 Clark     2016-1-31   100    50    NA
2 Douglas   2016-1-31   101    65    NA
3 Clark     2016-2-28   102    52    NA
4 Douglas   2016-2-28   103    67    NA
5 Statewide 2016-1-31    NA    NA  1500
6 Statewide 2016-2-28    NA    NA  2200

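Here is why the extra parentheses cause trouble. The pipe operator %>% takes the output of one function and passes it as the first argument to the next function. With the extra parentheses, the parenthesized expression itself is treated as the next function, so R first tries to evaluate the pivot_wider() call inside it, but that call has no data to operate on. Instead, it resolves the arguments to the original vectors you defined when creating the data frame. Run the code below and you get the same error: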

pivot_wider(names_from = industry, values_from = employment)
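
The parenthesis behavior described above can be seen in isolation with a small sketch. `make_adder` is a hypothetical helper invented for this illustration; the point is only that a parenthesized right-hand side is evaluated first, and its result is then called with the left-hand side.

```r
library(dplyr)  # dplyr re-exports %>% from magrittr

# Hypothetical helper: returns a function that adds n to its input
make_adder <- function(n) function(x) x + n

# No extra parentheses: the left-hand side becomes the first argument
sqrt_result <- 16 %>% sqrt()             # same as sqrt(16)

# Extra parentheses: the right-hand side is evaluated on its own first,
# and the function it returns is then applied to the left-hand side
adder_result <- 10 %>% (make_adder(5))   # same as (make_adder(5))(10)

sqrt_result    # 4
adder_result   # 15
```

In the question, the parenthesized pivot_wider() call is evaluated on its own in the same way, which is why it never receives the data frame.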