R: smooth df values by columns

Question

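I have a df that I want to turn into a line chart. As you can see below, the charts have a lot of spikes, so I decided to smooth the values of each variable.

Is there a way to replace each cell of the df with a 20-row mean without using a loop?

Sample data: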

df = structure(list(Date = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 
100, 110, 120, 130, 140, 150, 160, 170, 180, 190), `0_3` = c(2.96069423175089, 
2.98934234468417, 3.0166710770045, 2.93318848451928, 2.9029582526956, 
2.93122886133033, 2.95467584624211, 2.92056074766355, 2.9673590504451, 
2.99909118448955, 3.0678648899907, 3.08758664146188, 3.16639741518578, 
3.1981536432575, 3.23886639676113, 3.32871012482663, 3.2554847841472, 
3.33575054387237, 3.25720703856234, 3.28495034377387), `0_6` = c(2.65441551812149, 
2.70340525084481, 2.75205080709182, 2.71591526344378, 2.76472214542438, 
2.73393461104848, 2.75387263339071, 2.77453271028037, 2.7299703264095, 
2.66585883065738, 2.69600247908274, 2.67800882167612, 2.7140549273021, 
2.63765248928454, 2.69905533063428, 2.66990291262136, 2.689313517339, 
2.75562001450326, 2.77049794084613, 2.78838808250573)), .Names = c("Date", 
"0_3", "0_6"), row.names = c(NA, 20L), class = "data.frame")

So far I have only managed to smooth it with a loop:

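smooth_factor = 5
# Keep only the rows that have a full trailing window of smooth_factor rows
smooth_df = df[smooth_factor:nrow(df),]
for (i in rownames(smooth_df)) {
  idx = as.numeric(i)
  # The first column contains Date, which should not be smoothed
  for (j in colnames(smooth_df)[2:ncol(smooth_df)]){
    # Index smooth_df by row name so the mean lands in the matching row
    smooth_df[i,j] = mean(df[(idx-smooth_factor+1):idx,j])
  }
}
# smooth_df already carries the matching Date values from the subset above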

When I apply this approach to a larger data set, this is what it looks like:

becomes

Answer 1

You are looking for apply (base) and rollmean (from the zoo package):

 library(zoo)
 npoints <- 5
 apply(df,2,function(x){rollmean(x,npoints)})

      Date      0_3      0_6
 [1,]   20 2.960571 2.718102
 [2,]   30 2.954678 2.734006
 [3,]   40 2.947745 2.744099
 [4,]   50 2.928522 2.748595
 [5,]   60 2.935357 2.751406
 [6,]   70 2.954583 2.731634
 [7,]   80 2.981910 2.724047
 [8,]   90 3.008493 2.708875
 [9,]  100 3.057660 2.696779
[10,]  110 3.103819 2.678316
[11,]  120 3.151774 2.684955
[12,]  130 3.203943 2.679735
[13,]  140 3.237522 2.681996
[14,]  150 3.271393 2.690309
[15,]  160 3.283204 2.716878
[16,]  170 3.292421 2.734744
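The call above also averages the Date column and returns a matrix rather than a data frame. As a hedged variant, the sketch below smooths only the value columns and keeps Date unchanged. The stand-in data, the names value_cols and smooth_df, and the choice of a right-aligned window padded with NA are assumptions made for illustration, not part of the original answer.

```r
library(zoo)

# Small stand-in data frame (hypothetical values, same layout as the question)
df <- data.frame(Date = seq(0, 90, by = 10),
                 `0_3` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 `0_6` = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20),
                 check.names = FALSE)

npoints <- 5
# Every column except Date gets smoothed
value_cols <- setdiff(names(df), "Date")

smooth_df <- df
# align = "right" averages the current row and the npoints - 1 rows before it;
# fill = NA pads the first rows so the result keeps the original row count
smooth_df[value_cols] <- lapply(df[value_cols], rollmean,
                                k = npoints, fill = NA, align = "right")
```

Because the row count is preserved, smooth_df can be plotted against the original Date column directly; the first npoints - 1 rows of each value column are NA.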

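With data.table, it would be:

library(data.table)
dt <- as.data.table(df)  # the [, j, .SDcols] syntax needs a data.table
dt[, lapply(.SD, function(x) rollmean(x, npoints)), .SDcols = names(dt)]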
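If adding a package is not an option, a centered moving average can also be sketched in base R with stats::filter. This is a swapped-in technique, not the answer's method; the helper moving_avg and the stand-in data are hypothetical names and values chosen for illustration.

```r
# Small stand-in data frame (hypothetical values, same layout as the question)
df <- data.frame(Date = seq(0, 90, by = 10),
                 `0_3` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 `0_6` = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20),
                 check.names = FALSE)

npoints <- 5
value_cols <- setdiff(names(df), "Date")

# Equal weights of 1/k give a simple moving average;
# sides = 2 centers the window on the current row
moving_avg <- function(x, k) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}

smooth_df <- df
smooth_df[value_cols] <- lapply(df[value_cols], moving_avg, k = npoints)

# Drop the edge rows where a full window is not available
smooth_df <- smooth_df[complete.cases(smooth_df), ]
```

With an odd window such as 5, the two first and two last rows are dropped, and each remaining Date keeps the mean of the rows centered on it.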

Tags: r, smoothing, dataframe