Can I get geom_smooth() to allow line breaks when there are NA values?

I'm hoping to find a way to show breaks in the line when using geom_smooth(). Is this possible?

Here is the sample data and code I'm using, along with the resulting plot:

library(tidyverse)

game_number <- c(1:52)

toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
        16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7, 
        15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)

toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
plot <- ggplot(toi_df, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(size = 0.6) +
  geom_smooth(se = FALSE, size = 1) +
  scale_y_continuous(limits = c(0, 25), expand = c(0, 0))

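The resulting plot is shown below. You can see that the geom_line() line breaks at the NA values, but the geom_smooth() line connects across them. Is there a way to make geom_smooth() behave like geom_line() in this case? Or is there some other ggplot command I should use instead? Thanks!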

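I would suggest an approach where you compute the geom_smooth() output in a separate data frame and then merge it with the original data. Here is the approach using the broom and tidyverse packages: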

library(tidyverse)
library(broom)

First, the data:

#Data
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7, 
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)

Now, we fit the smoothing model:

#Create smooth
model <- loess(toi ~ game_number, data = toi_df)

We create a data frame to hold the results:

#Augment model output in a new dataframe
toi_df2 <- augment(model, toi_df)

We merge the data:

#Merge data
toi_df3 <- merge(toi_df,
                 toi_df2[, c("player", "game_number", ".fitted")],
                 by = c("player", "game_number"), all.x = TRUE)
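For comparison, the fit, augment, and merge steps above can also be sketched in base R without broom. This is only an illustrative alternative, not the code this answer uses: the names toi_base, observed, and fit are made up here, and toi_base is a plain data.frame holding the same values as toi_df. The fitted values are written only to rows where toi is observed, so the smooth stays NA wherever toi is NA. The new column is called .fitted so that it lines up with the plotting code.

```r
# Same values as toi_df, in a plain data.frame (toi_base is a hypothetical name)
toi_base <- data.frame(
  player = "Nils Lundkvist",
  game_number = 1:52,
  toi = c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8,
          17.1, 17.8, 17.3, 16.6, 16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4,
          NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
          15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2,
          18.0, 17.3, 17.8, 18.3, 17.9)
)

# Rows that actually have a toi value
observed <- !is.na(toi_base$toi)

# Fit loess on the observed rows only
fit <- loess(toi ~ game_number, data = toi_base[observed, ])

# Start with NA everywhere, then fill in fitted values for observed rows
toi_base$.fitted <- NA_real_
toi_base$.fitted[observed] <- predict(fit, newdata = toi_base[observed, ])
```

Because geom_line() breaks at NA values, drawing geom_line(aes(y = .fitted)) from toi_base should leave the same gaps in the smooth as in the raw line.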

Finally, we plot using geom_line():

#Plot
ggplot(toi_df3, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(size = 0.6) +
  geom_line(aes(y=.fitted),size=1) +
  scale_y_continuous(limits = c(0, 25), expand = c(0, 0))

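Output:

This approach also works if you have multiple players. In that case, you can group by player (group_by() from dplyr) and use the do() function to fit a smoothing model for each player.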
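A minimal sketch of that grouped idea follows. It is written under a few assumptions: it uses group_modify() in place of do(), which recent dplyr versions mark as superseded, and it computes the fit with base R's loess() and predict() instead of broom. The helper add_smooth(), the data frame toi_two, and the second player's values (the first player's values shifted by 15) are illustrative only.

```r
library(dplyr)

toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8,
         17.1, 17.8, 17.3, 16.6, 16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4,
         NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2,
         18.0, 17.3, 17.8, 18.3, 17.9)

# Two players stacked in one data frame (illustrative data)
toi_two <- rbind(
  data.frame(player = "Nils Lundkvist", game_number = 1:52, toi = toi),
  data.frame(player = "Zach Ellenthal", game_number = 1:52, toi = toi + 15)
)

# Hypothetical helper: fit loess on observed rows, leave NA where toi is NA
add_smooth <- function(df) {
  observed <- !is.na(df$toi)
  fit <- loess(toi ~ game_number, data = df[observed, ])
  df$.fitted <- NA_real_
  df$.fitted[observed] <- predict(fit, newdata = df[observed, ])
  df
}

# Apply the helper once per player and recombine the results
smoothed <- toi_two %>%
  group_by(player) %>%
  group_modify(~ add_smooth(.x)) %>%
  ungroup()
```

The result has one .fitted column covering both players, with NA in the same rows as toi, so it can be passed to the same two geom_line() layers used above.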

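Update:

I have added code for multiple players. In this case, I created a function that is applied to each player's group in a list. After creating the function, you have to use split() to get a list with one data frame per player. The function myfunsmooth() fits the loess model. Then you bind the results and draw the plot. Here is the code: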

Dummy data:

#Data
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7, 
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
toi_df0 <- tibble(player = 'Zach Ellenthal', game_number = game_number, toi = toi)
toi_df0$toi <- toi_df0$toi+15 
toi_dfm <- rbind(toi_df,toi_df0)

The function for loess():

#Function for smoothing
myfunsmooth <- function(x)
{
  #Model
  model <- loess(toi ~ game_number, data = x)
  #Augment model output in a new dataframe
  y <- augment(model, x)
  #Merge data
  z <- merge(x, y[, c("player", "game_number", ".fitted")],
             by = c("player", "game_number"), all.x = TRUE)
  #Return
  return(z)
}

Then, we create the list:

#Create list by player
List <- split(toi_dfm,toi_dfm$player)

We apply the function and bind the results into a new data frame:

#Apply function
List2 <- lapply(List, myfunsmooth)
#Bind all
dfglobal <- do.call(rbind,List2)
rownames(dfglobal)<-NULL

Finally, we plot:

#Plot
ggplot(dfglobal, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(size = 0.6) +
  geom_line(aes(y=.fitted),size=1) +
  scale_y_continuous(limits = c(0, 45), expand = c(0, 0)) 

Output: