Relatively Referencing Observations in a Function in R
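When writing a function that computes a value for each observation in a vector, how can I reference cells that are a predetermined number of observations away from the one currently being operated on? If each row is i (i = 1, 2, ...), how do I reference a column in row i-1?
Here is a sample dataset that mimics my predicament: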
> letters <- c('a', 'b', 'c', 'b', 'e')
> numbers <- c('1', '', '2', '', '3')
> sample <- cbind(letters, numbers)
> sample
     letters numbers
[1,] "a"     "1"
[2,] "b"     ""
[3,] "c"     "2"
[4,] "b"     ""
[5,] "e"     "3"
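I want to fill each empty cell in sample$numbers with the value from the previous observation in sample$numbers. How can I reference an observation while the column is still being created? For example, I tried: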
> sample$numbers <- ifelse(sample$numbers == "", sample$numbers[as.numeric(rownames(sample)) - 1], sample$numbers)
Error in sample$numbers : $ operator is invalid for atomic vectors
I also tried using the recurring b in sample$letters to fill in the missing values:
> f1 <- function(df, cols, match_with, to_x = 'b'){
+ df[cols] <- lapply(df[cols], function(i)
+ ifelse(grepl(to_x, match_with, fixed = TRUE), sample$numbers[as.numeric(rownames(sample)) - 1],
+ i))
+ return(df)
+ }
> sample = f1(sample, cols = c('numbers'), match_with = sample$letters)
Error in sample$letters : $ operator is invalid for atomic vectors
5. grepl(to_x, match_with, fixed = TRUE)
4. ifelse(grepl(to_x, match_with, fixed = TRUE), sample$numbers[as.numeric(rownames(sample)) - 1], i)
3. FUN(X[[i]], ...)
2. lapply(df[cols], function(i) ifelse(grepl(to_x, match_with, fixed = TRUE), sample$numbers[as.numeric(rownames(sample)) - 1], i))
1. f1(sample, cols = c("numbers"), match_with = sample$letters)
My trouble seems to be that in both cases I am using sample$numbers[as.numeric(rownames(sample)) - 1] to reference the value of sample$numbers in the previous observation. Is there a better way?
Assuming you have a data.frame rather than the matrix used above (so that columns can be referenced with $), you can use zoo::na.locf for this:
#make a data.frame instead of a matrix
sample <- data.frame(letters, numbers)
library(zoo)
#if your data has '' empty cells then convert those to NA
sample$numbers[sample$numbers == ''] <- NA
sample$numbers <- na.locf(sample$numbers)
Output:
sample
  letters numbers
1       a       1
2       b       1
3       c       2
4       b       2
5       e       3
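If adding a package dependency is not an option, the same last-observation-carried-forward idea can be sketched in base R. The helper name `fill_down` and its `empty` argument are hypothetical (they are not part of zoo); the sketch assumes a character vector in which empty strings mark the gaps.

```r
# Hypothetical helper (not from zoo): carry the last non-empty value forward.
fill_down <- function(x, empty = "") {
  x[x %in% empty] <- NA          # treat empty strings as missing
  idx <- cumsum(!is.na(x))       # number of observed values seen so far
  c(NA, x[!is.na(x)])[idx + 1]   # idx == 0 (a leading gap) stays NA
}

numbers <- c("1", "", "2", "", "", "3")
filled <- fill_down(numbers)
print(filled)
```

Because the index is cumulative, runs of several consecutive empty cells are filled from the same earlier value, and a gap at the very start is left as NA rather than guessed.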
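# assumes sample is still the character matrix built with cbind();
# each empty cell takes the value from the row above (isolated gaps only)
sample[, "numbers"] <- sapply(seq_along(sample[, "numbers"]),
                              function(x) ifelse(sample[, "numbers"][x] == '',
                                                 sample[, "numbers"][x - 1],
                                                 sample[, "numbers"][x]))
     letters numbers
[1,] "a"     "1"
[2,] "b"     "1"
[3,] "c"     "2"
[4,] "b"     "2"
[5,] "e"     "3"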
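The error in the question comes from cbind() returning a character matrix, which does not support $. A minimal sketch of the difference follows; the variable names `ltr`, `num`, `m`, and `df` are chosen here for illustration only.

```r
ltr <- c("a", "b", "c")
num <- c("1", "", "2")

m <- cbind(letters = ltr, numbers = num)  # character matrix, not a data.frame
print(is.matrix(m))

# `$` fails on a matrix; capture the message instead of stopping
err <- tryCatch(m$numbers, error = function(e) conditionMessage(e))
print(err)

# Bracket indexing by column name works on the matrix
col_from_matrix <- m[, "numbers"]

# A data.frame supports `$`
df <- data.frame(letters = ltr, numbers = num, stringsAsFactors = FALSE)
col_from_df <- df$numbers
```

Either form gives the same character vector, so the choice mostly depends on whether the rest of the code expects a matrix or a data.frame.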
You can use the FillDown function from the DataCombine package:
library(DataCombine)
letters <- c('a', 'b', 'c', 'b', 'e')
numbers <- c('1', '', '2', '', '3')
numbers[numbers==""] <- NA # replace empty strings with NA
sample <- data.frame(letters,numbers)
FillDown(sample,"numbers")
letters <- c('a', 'b', 'c', 'b', 'e')
numbers <- c('1', '', '2', '', '3')
sample <- data.frame(letters, numbers, stringsAsFactors = FALSE)
# shift the column down by one row so position i holds the value from row i-1
sample$numbers[sample$numbers == ""] <- c(NA, sample$numbers[-nrow(sample)])[sample$numbers == ""]
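The shifted-vector idea can be generalized to "the value k observations back", which is what the question asks about. The helper `lag_by` below is a hypothetical name for this sketch, which assumes k is at least 1.

```r
# Hypothetical helper: element i of the result is x[i - k], or NA if i <= k.
lag_by <- function(x, k = 1) {
  stopifnot(k >= 1)
  c(rep(NA, k), head(x, -k))[seq_along(x)]
}

numbers <- c("1", "", "2", "", "3")

prev <- lag_by(numbers, 1)                      # value from row i-1
filled <- ifelse(numbers == "", prev, numbers)  # fill gaps from the row above
two_back <- lag_by(numbers, 2)                  # value from row i-2

print(filled)
```

A single shift only fills isolated gaps: if two empty cells are adjacent, the second one picks up the (still empty) first one, so a cumulative approach is a better fit for such data.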