Data manipulation in R to be converted into time series data

Question

I am downloading this dataset from the following URL:

https://files.hawaii.gov/dbedt/census/census_2020/data/redistricting/PLtable1_2020-county.xlsx

In R, I coded this as:

    library(magrittr)  # provides the %>% pipe

    url_dbedt_dicennial <- "https://files.hawaii.gov/dbedt/census/census_2020/data/redistricting/PLtable1_2020-county.xlsx"

    # download the xlsx to a temporary file
    temp <- tempfile(fileext = ".xlsx")
    download.file(url = url_dbedt_dicennial, destfile = temp, mode = "wb")

    # data from dbedt dicennial (look at each step to understand)
    data_in_dbedt_dicennial <- temp %>%
      readxl::read_excel(range = cellranger::as.cell_limits("A6:H15")) %>%
      t()  # transpose: returns a matrix, not a data frame

The resulting output looks like this:

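After transposing, what I am struggling with is how to relabel the columns as "time", "HI", "HON", "HAW", "KAU", "MAU" and then drop V1, V3, V8 and V9. I know I could delete the columns manually one by one, but is there a smarter way? The county column should be relabeled as time.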

Ultimately I want to use the mutate function on the time variable, i.e.

mutate(time)

and convert the data into a time series using

tsbox::ts_long()

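State of Hawaii should be labeled "HI", Hawaii County should be labeled "HAW", City and County of Honolulu should be labeled "HON", Kauai County should be labeled "KAU", and Maui County 1/ should be labeled "MAU".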

Answer 1

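So this was a bit more involved than I first thought, partly because of t(), which is really designed to work on matrices. Fortunately, I was able to find some guidance elsewhere, where I came across transpose_df(). While this works, I suspect it could be cleaned up a bit.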
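As a small aside on why t() is awkward here, the sketch below uses a toy data frame (the name `df` and its column names are made up for illustration; the figures are two rows from the census table). Because t() goes through a matrix, and a matrix can hold only one type, a single text column is enough to turn every value into a string.

```r
# Toy data frame with one text column and two numeric columns
# (hypothetical names, used only for this illustration)
df <- data.frame(
  county = c("HI", "HAW"),
  y1960  = c(632772L, 61332L),
  y1970  = c(769913L, 63468L)
)

# t() converts the data frame to a matrix before transposing it
m <- t(df)

is.matrix(m)      # the result is a matrix
is.data.frame(m)  # ... and no longer a data frame
is.character(m)   # the numeric columns were coerced to text
```

That coercion is why the numbers have to be converted back (for example with as.integer) after any transpose of this table.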

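library(dplyr)  # provides %>%, rename(), mutate() and across()

data_in_dbedt_dicennial <- temp %>%
  readxl::read_excel(range = cellranger::as.cell_limits("A6:H15")) %>%
  na.omit()  # drop the spreadsheet rows that contain missing values
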
transpose_df <- function(df) {
  t_df <- data.table::transpose(df)
  colnames(t_df) <- rownames(df)
  rownames(t_df) <- colnames(df)
  t_df <- t_df %>%
    tibble::rownames_to_column(.data = .) %>%
    tibble::as_tibble(.)
  return(t_df)
}

data_in_dbedt_dicennial <- transpose_df(data_in_dbedt_dicennial) %>% 
  .[-1,] %>% 
  rename(
    Year = rowname, HI = `1`, HAW = `2`, 
    HON = `3`, KAU = `4`, MAU = `5`
  ) %>% 
  mutate(across(everything(), as.integer))

Output:

# A tibble: 7 × 6
   Year      HI    HAW     HON   KAU    MAU
  <int>   <int>  <int>   <int> <int>  <int>
1  1960  632772  61332  500409 28176  42855
2  1970  769913  63468  630528 29761  46156
3  1980  964691  92053  762565 39082  70991
4  1990 1108229 120317  836231 51177 100504
5  2000 1211537 148677  876156 58463 128241
6  2010 1360301 185079  953207 67091 154924
7  2020 1455271 200629 1016508 73298 164836
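The question also asks about getting from this wide table to long time series data. The following is only a sketch of one way to do that, not part of the original answer. It rebuilds the table from the values printed above so that it runs on its own, and it uses tidyr::pivot_longer() instead of tsbox. The names `pop_wide` and `pop_long` are made up here, and treating each census year as 1 January of that year is an assumption.

```r
library(dplyr)
library(tidyr)

# Wide table rebuilt from the output printed above
pop_wide <- tibble(
  Year = c(1960L, 1970L, 1980L, 1990L, 2000L, 2010L, 2020L),
  HI   = c(632772L, 769913L, 964691L, 1108229L, 1211537L, 1360301L, 1455271L),
  HAW  = c(61332L, 63468L, 92053L, 120317L, 148677L, 185079L, 200629L),
  HON  = c(500409L, 630528L, 762565L, 836231L, 876156L, 953207L, 1016508L),
  KAU  = c(28176L, 29761L, 39082L, 51177L, 58463L, 67091L, 73298L),
  MAU  = c(42855L, 46156L, 70991L, 100504L, 128241L, 154924L, 164836L)
)

pop_long <- pop_wide %>%
  # Assumption: represent each census year by 1 January of that year
  mutate(time = as.Date(paste0(Year, "-01-01"))) %>%
  select(-Year) %>%
  # One row per (series, date) pair: series name in `id`, count in `value`
  pivot_longer(cols = -time, names_to = "id", values_to = "value") %>%
  select(id, time, value) %>%
  arrange(id, time)

print(pop_long)
```

The `id`, `time`, `value` layout is chosen to match what I understand tsbox expects for long data frames. If tsbox is installed, calling tsbox::ts_long() on the wide table (with the Date `time` column in place of `Year`) should give an equivalent result, but I have not verified that here.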

