Dataframe only appears in console, not in environment (RStudio)

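I'm fairly inexperienced with R/RStudio, but I'm currently building a package for my work and I've run into a problem I can't solve. I've written several functions, and I use return() so that the data frame created by the code appears in my environment. With this one, however, I only get the first 38 rows of the data frame printed in the R console.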

Function code:

widen <- function(projectpath) {

  project.df <- readr::read_csv(file = projectpath)

  projectwide.df <- project.df %>%
    dplyr::select(-c(1, Detection_limit)) %>%
    tidyr::pivot_wider(names_from = Element, values_from = Concentration)

  projectwide.df <- as.data.frame(projectwide.df)

  return(projectwide.df)

}

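I've tried it both with and without as.data.frame(), and also with just data.frame(), but neither worked. When I run the code on its own (not as a function) to test it, though, it does work. Of course, in that case I'm not using return(), which seems to be where the problem lies.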

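At first my problem was that the data came out as a tibble rather than a data frame, but I believe that's no longer the issue, because the following appears above the data frame in the console, which, if I understood my earlier output correctly, means it really is a data frame: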

cols(
  X1 = col_double(),
  Sample = col_character(),
  Date.x = col_character(),
  Filter_type = col_character(),
  Filter_size = col_double(),
  Filter_box_nr = col_double(),
  Filter_blank = col_character(),
  Volume = col_double(),
  Date.y = col_date(format = ""),
  Day = col_double(),
  Treatment = col_character(),
  Element = col_character(),
  Concentration = col_double(),
  Detection_limit = col_double()
)

Here is one of my other functions that uses return(), and it works there:

importxrf <- function(datapath, infopath) {

  datafile.df <- importdata(datapath = datapath)

  infofile.df <- importinfo(infopath = infopath)

  projectfile.df <- dplyr::inner_join(datafile.df, infofile.df, by = "Sample")

  notinprojectfile.df <- dplyr::anti_join(datafile.df, infofile.df, by = "Sample")

  if(nrow(notinprojectfile.df) > 0) {
    warning("WARNING! There are samples that do not match between your raw data file and information file.")
  }

  return(projectfile.df)

}

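As far as I can tell, the only difference between these two functions is the use of as.data.frame(), but as I mentioned, it doesn't work without that either. If anyone can help me solve this, I'd really appreciate it! Thanks.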

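Edit: Here is a version of my code in which the data frame is created inside the function, so that it's reproducible. It shows only the first 5 rows of my actual dataset.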

widennn <- function() {

  Sample <- c("COM001", "COM001", "COM001", "COM001", "COM001")
  Element <- c("C", "N", "O", "Na", "Mg")
  Concentration <- c(-4.19727307987776, 0.292013243234358, 0.328051062623146, -0.0555794187038898, 0.0353942596959773)
  Detection_limit <- c(1.22193802149026, 0.312338639119395, 0.0322146560280234, 0.0362539069926691, 0.00465264605182871)

  firstrows.df <- data.frame(Sample, Element, Concentration, Detection_limit)

  projectwide.df <- firstrows.df %>%
    dplyr::select(-c(Detection_limit)) %>%
    tidyr::pivot_wider(names_from = Element, values_from = Concentration)

  projectwide.df <- as.data.frame(projectwide.df)

  return(projectwide.df)

}

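I think the problem here is related to scoping. When a function assigns an object, that object exists only in the temporary environment created for that function call, so it can't be accessed from outside, and it is lost when the function finishes. A function's return value is different: if you call the function without assigning the result, R simply prints it to the console and then discards it. You can assign an object to any environment you like; for example, the <<- operator assigns in an enclosing environment, which here ends up being the global environment. However, it's considered best practice to let the user assign the returned object themselves. That is easier to troubleshoot and gives the user more control. See the two ways of assigning below: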

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Not optimal
widennn <- function() {
  
  Sample <- c("COM001", "COM001", "COM001", "COM001", "COM001")
  Element <- c("C", "N", "O", "Na", "Mg")
  Concentration <- c(-4.19727307987776, 0.292013243234358, 0.328051062623146, -0.0555794187038898, 0.0353942596959773)
  Detection_limit <- c(1.22193802149026, 0.312338639119395, 0.0322146560280234, 0.0362539069926691, 0.00465264605182871)
  
  firstrows.df <- data.frame(Sample, Element, Concentration, Detection_limit)
  
  projectwide.df <- firstrows.df %>%
    dplyr::select(-c(Detection_limit)) %>%
    tidyr::pivot_wider(names_from = Element, values_from = Concentration)
  # This is not recommended
  projectwide.df <<- as.data.frame(projectwide.df)
  #return(projectwide.df)
}

widennn()

projectwide.df
#>   Sample         C         N         O          Na         Mg
#> 1 COM001 -4.197273 0.2920132 0.3280511 -0.05557942 0.03539426


# Better 
widennn <- function() {
  
  Sample <- c("COM001", "COM001", "COM001", "COM001", "COM001")
  Element <- c("C", "N", "O", "Na", "Mg")
  Concentration <- c(-4.19727307987776, 0.292013243234358, 0.328051062623146, -0.0555794187038898, 0.0353942596959773)
  Detection_limit <- c(1.22193802149026, 0.312338639119395, 0.0322146560280234, 0.0362539069926691, 0.00465264605182871)
  
  firstrows.df <- data.frame(Sample, Element, Concentration, Detection_limit)
  
  projectwide.df <- firstrows.df %>%
    dplyr::select(-c(Detection_limit)) %>%
    tidyr::pivot_wider(names_from = Element, values_from = Concentration)
  
  return(projectwide.df)
}

output <- widennn()

output
#> # A tibble: 1 x 6
#>   Sample     C     N     O      Na     Mg
#>   <chr>  <dbl> <dbl> <dbl>   <dbl>  <dbl>
#> 1 COM001 -4.20 0.292 0.328 -0.0556 0.0354

Created on 2020-12-07 by the reprex package (v0.3.0)
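
To isolate the scoping point from the dplyr/tidyr pipeline, here is a minimal base-R sketch. The names `make_df`, `local_df`, and `result` are hypothetical and only serve the illustration; they don't come from the package in the question.

```r
# Hypothetical helper (not from the original package): builds a small data frame.
make_df <- function() {
  # local_df exists only while this function call is running
  local_df <- data.frame(x = 1:3, y = c("a", "b", "c"))
  return(local_df)
}

# Called without assignment: the returned value is printed, then discarded.
make_df()

# The local variable never reached the global environment.
exists("local_df", envir = globalenv(), inherits = FALSE)

# Called with assignment: the returned value is stored under the name `result`,
# which is what makes it show up in RStudio's Environment pane.
result <- make_df()
class(result)
```

Assuming the function in the question behaves the same way, a call such as `wide.df <- widen(projectpath)` should store the result, whereas a bare `widen(projectpath)` would only print it.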

Please correct me if I've misunderstood the question.
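
On the tibble-versus-data-frame point from the question: rather than inferring the type from console output, `class()` can be checked directly. The sketch below assumes the tibble package is installed (it normally comes along with dplyr and tidyr); the object names and the rounded values are made up for illustration.

```r
library(tibble)

# A one-row tibble shaped like the widened output (values rounded for illustration)
wide_tbl <- tibble(Sample = "COM001", C = -4.2, N = 0.29)

# A tibble carries the classes "tbl_df" and "tbl" in addition to "data.frame"
class(wide_tbl)

# as.data.frame() drops the tibble classes and leaves a plain data frame
wide_df <- as.data.frame(wide_tbl)
class(wide_df)
```

Either kind of object appears in the Environment pane once it has been assigned to a name, so the conversion affects how the object prints, not whether it is visible.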