Accessing data that does not exist in an R data frame
I have a data frame named productCheck:
prod <- c("GAS","GAS","GLP","GLP","GNV")
monthYear <- c("2016-06-01","2016-07-01","2016-06-01","2016-07-01","2016-07-01")
meanValue <- c(3,5,8,1,6)
price <- c(0,0,0,0,0)
productCheck <- data.frame(prod,monthYear,meanValue,price)
productCheck$prod <- as.factor(productCheck$prod)
productCheck$monthYear <- as.factor(productCheck$monthYear)
When I run the following loop, I get an error:
for (j in levels(productCheck$prod))
{
  firstPeriod <- NA
  for (k in levels(productCheck$monthYear))
  {
    if (!is.na(firstPeriod))
    {
      secondPeriod <- k
      productCheck[productCheck$monthYear==secondPeriod & productCheck$prod==j,]$price <-
        100*(productCheck[productCheck$monthYear==secondPeriod & productCheck$prod==j,]$meanValue -
             productCheck[productCheck$monthYear==firstPeriod & productCheck$prod==j,]$meanValue) /
        productCheck[productCheck$monthYear==firstPeriod & productCheck$prod==j,]$meanValue
    }
    firstPeriod <- k
  }
}
Error in `$<-.data.frame`(`*tmp*`, "price", value = numeric(0)) :
  replacement has 0 rows, data has 1
The problem is that the GNV product has no data for the "2016-06-01" period. How can I avoid this error?
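I think the for loops make your code unnecessarily long and error-prone, as you have shown. Reshaping to wide format puts each period in its own column, so a missing period becomes NA instead of a zero-length vector. I can see several options; one of them is: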
library(tidyverse)
productCheck %>%
  pivot_wider(names_from = monthYear, values_from = meanValue) %>%
  mutate(price = 100 * (`2016-07-01` - `2016-06-01`) / `2016-06-01`)
# A tibble: 3 x 4
  prod  price `2016-06-01` `2016-07-01`
  <fct> <dbl>        <dbl>        <dbl>
1 GAS    66.7            3            5
2 GLP   -87.5            8            1
3 GNV    NA             NA            6
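The wide reshape hard-codes the two period names. If there may be more than two periods, another option is to stay in long format and compare each row with the previous row of the same product. The following is only a sketch under that assumption: it uses `dplyr::lag()`, and the name `result` is made up for illustration. Note that `lag()` compares against the previous available row, not necessarily the previous calendar month, so a gap in the middle of a series would be compared across the gap.

```r
library(dplyr)

# Same values as in the question, kept as plain character/numeric columns.
productCheck <- data.frame(
  prod      = c("GAS", "GAS", "GLP", "GLP", "GNV"),
  monthYear = c("2016-06-01", "2016-07-01", "2016-06-01", "2016-07-01", "2016-07-01"),
  meanValue = c(3, 5, 8, 1, 6)
)

result <- productCheck %>%
  arrange(prod, monthYear) %>%   # ISO-formatted dates sort correctly as text
  group_by(prod) %>%
  # lag() returns NA for the first row of each group, so a product with a
  # single period (GNV) gets price = NA instead of raising an error.
  mutate(price = 100 * (meanValue - lag(meanValue)) / lag(meanValue)) %>%
  ungroup()

print(result)
```

The first period of every product has no predecessor, so its price is NA as well.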
Your original data:
prod <- c("GAS", "GAS", "GLP", "GLP", "GNV")
monthYear <- c("2016-06-01", "2016-07-01", "2016-06-01", "2016-07-01", "2016-07-01")
meanValue <- c(3, 5, 8, 1, 6)
productCheck <- data.frame(prod, monthYear, meanValue)
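If you would rather keep a loop in base R, the error can be avoided by checking that both periods exist for a product before assigning. This is a minimal sketch, not the only way to do it. It assumes at most one row per product and period, and it initialises price to NA rather than 0 so that missing comparisons stay visibly missing; the helper names `periods`, `prev` and `curr` are made up for illustration.

```r
productCheck <- data.frame(
  prod      = c("GAS", "GAS", "GLP", "GLP", "GNV"),
  monthYear = c("2016-06-01", "2016-07-01", "2016-06-01", "2016-07-01", "2016-07-01"),
  meanValue = c(3, 5, 8, 1, 6),
  price     = NA_real_   # assumption: NA marks "no previous period"
)

periods <- sort(unique(productCheck$monthYear))

for (j in unique(productCheck$prod)) {
  # Start at the second period; each one is compared with the period before it.
  for (i in seq_along(periods)[-1]) {
    prev <- productCheck$monthYear == periods[i - 1] & productCheck$prod == j
    curr <- productCheck$monthYear == periods[i]     & productCheck$prod == j
    # Assign only when both rows exist; otherwise the right-hand side would be
    # numeric(0), which is what triggers "replacement has 0 rows".
    if (sum(prev) == 1 && sum(curr) == 1) {
      productCheck$price[curr] <- 100 *
        (productCheck$meanValue[curr] - productCheck$meanValue[prev]) /
        productCheck$meanValue[prev]
    }
  }
}

print(productCheck)
```

Using logical masks with `productCheck$price[curr] <- ...` also avoids the `df[rows, ]$col <- value` pattern, which fails whenever either side selects zero rows.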