spread() is producing NA values (R programming)
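I am using the R library tidycensus to download data from census.gov. I then reshape the data with
spread(). Each GEOID has many columns with estimate values, but spread() produces NA for the remaining columns.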
actual data
data after applying spread function
Please help me correct the data.
Output:
structure(list(GEOID = c(13001950100, 13001950100, 13001950100,
13001950100, 13001950100, 13001950100), NAME = c("Census Tract 9501, Appling County, Georgia",
"Census Tract 9501, Appling County, Georgia", "Census Tract 9501, Appling County, Georgia",
"Census Tract 9501, Appling County, Georgia", "Census Tract 9501, Appling County, Georgia",
"Census Tract 9501, Appling County, Georgia"), variable = c("S2401_C01_001",
"S2401_C01_002", "S2401_C01_003", "S2401_C01_004", "S2401_C01_005",
"S2401_C01_006"), estimate = c(1406, 271, 54, 54, 0, 0), moe = c(214,
87, 43, 43, 13, 13)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
If you want one row per ID:
library(tidyverse)
df <- df %>%
  pivot_wider(names_from = variable, values_from = c("estimate", "moe"))
An option with dcast:
library(data.table)
dcast(setDT(df), GEOID + NAME ~ variable, value.var = c("estimate", "moe"))
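As for why the NA values appear in the first place, a likely cause is that the moe column was left in the data when spread() was called. The exact spread() call is not shown in the question, so the call below is an assumption. spread() treats every column other than the key and value as an identifier, so rows with different moe values cannot collapse into one row. A minimal sketch that rebuilds the sample data above as a plain data.frame:

```r
library(tidyr)

# Sample data from the question, rebuilt as a plain data.frame
df <- data.frame(
  GEOID = rep(13001950100, 6),
  NAME = rep("Census Tract 9501, Appling County, Georgia", 6),
  variable = paste0("S2401_C01_00", 1:6),
  estimate = c(1406, 271, 54, 54, 0, 0),
  moe = c(214, 87, 43, 43, 13, 13)
)

# Assumed original call: moe stays behind as an identifier column,
# so each distinct moe value gets its own row and the gaps become NA
wide_na <- spread(df, variable, estimate)

# Dropping moe before spreading leaves only GEOID and NAME as identifiers,
# so all six variables land on a single row
wide_ok <- spread(df[, setdiff(names(df), "moe")], variable, estimate)

print(wide_na)
print(wide_ok)
```

If both the estimates and the margins of error are needed, the pivot_wider() and dcast() calls above keep them by spreading both value columns at once.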