Is it possible to convert a data frame object to a tribble constructor?

I have data like this:

library(tidyverse)
df <- tibble(
    x = c(0, 179, 342, 467, 705, 878, 1080, 1209, 1458, 1639, 1805, 2000, 2121, 2339, 2462, 2676, 
      2857, 3049, 3227, 3403, 3583, 3651, 4009, 4034, 4151, 4194, 4512, 4523, 4679, 5789), 
    y = c(4.7005, 4.8598, 5.0876, 5.0938, 5.3891, 5.6095, 5.8777, 6.0064, 6.3063, 6.4723, 6.6053, 
          6.8145, 6.9078, 7.1701, 7.2633, 7.3865, 7.5766, 7.644, 7.8018, 7.9505, 8.0974, 8.1937, 
          8.2391, 8.294, 8.3143, 8.3452, 8.5092, 8.5172, 8.5993, 9.0275))

Is it possible to convert my data frame/tibble object into a tribble "constructor"?

I am looking for something like dput, but more lightweight and specifically for data frames.
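For comparison, here is a minimal sketch of what dput prints for a small data frame. The name df_small is a made-up three-row stand-in for the data above, not part of the original question, and only base R is used.

```r
# Hypothetical three-row stand-in for the data above (base R only)
df_small <- data.frame(
  x = c(0, 179, 342),
  y = c(4.7005, 4.8598, 5.0876)
)

# dput() emits a structure(list(...)) call: values are laid out column by
# column, followed by the class and row.names attributes
dput_text <- capture.output(dput(df_small))
cat(dput_text, sep = "\n")
```

The tribble form asked for here would instead list the values row by row under a `~x, ~y` header, without the attribute bookkeeping.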

After getting flamed on the previous version of this question, I spent some time trying to hack together what I am looking for:

The "mribble" function (as in "make tribble"):

mribble <- function(df) {

    names <- colnames(df)
    names <- sapply("~", paste, names, sep = "")
    names <- as.character(names)
    names <- paste(names, collapse = ", ")
    names <- paste(names, ",\n", sep = "")

    rows <- NULL
    for (i in seq_len(nrow(df))) {
        r <- as.character(df[i,])
        r <- paste(r, collapse = ", ")
        r <- paste(r, ",\n", sep = "")
        rows <- c(rows, r)
    }

    last <- rows[length(rows)]
    rows <- rows[-length(rows)]
    # Drop only the trailing ",\n" (two characters) so the last value keeps all its digits
    last <- substr(last, 1, nchar(last) - 2)
    rows <- c(rows, last)

    meat <- c(names, rows)
    meat <- paste(meat, collapse = "")

    bun <- paste("df <- tribble(\n", meat, ")", sep = "")

    cat(bun)
}

mribble(df)

This prints to the console:

df <- tribble(
    ~x, ~y,
    0, 4.7005,
    179, 4.8598,
    342, 5.0876,
    467, 5.0938,
    705, 5.3891,
    878, 5.6095,
    1080, 5.8777,
    1209, 6.0064,
    1458, 6.3063,
    1639, 6.4723,
    1805, 6.6053,
    2000, 6.8145,
    2121, 6.9078,
    2339, 7.1701,
    2462, 7.2633,
    2676, 7.3865,
    2857, 7.5766,
    3049, 7.644,
    3227, 7.8018,
    3403, 7.9505,
    3583, 8.0974,
    3651, 8.1937,
    4009, 8.2391,
    4034, 8.294,
    4151, 8.3143,
    4194, 8.3452,
    4512, 8.5092,
    4523, 8.5172,
    4679, 8.5993,
    5789, 9.0275)

My solution is pretty terrible and does not work with characters. Feedback would be much appreciated.
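One possible way around the character limitation, offered as a sketch rather than a finished solution, is to let base R's deparse() turn each cell into an R literal, so that strings come out quoted and integers keep their L suffix. The names as_literal and to_tribble_text are hypothetical, and the sketch assumes atomic columns only (no list columns); factors are written out as plain strings.

```r
# Hypothetical helper: render every value of one atomic column as an R literal
as_literal <- function(col) {
  if (is.factor(col)) col <- as.character(col)  # factors become quoted strings
  vapply(col, function(v) deparse(v), character(1), USE.NAMES = FALSE)
}

# Hypothetical helper: build the text of a tribble() call for a data frame
to_tribble_text <- function(df) {
  header <- paste0("~", names(df), collapse = ", ")
  cells  <- unname(lapply(df, as_literal))     # one character vector per column
  rows   <- do.call(paste, c(cells, sep = ", "))  # glue the columns row by row
  paste0("tribble(\n  ", paste(c(header, rows), collapse = ",\n  "), "\n)")
}

demo <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)
txt <- to_tribble_text(demo)
cat(txt, "\n")
```

Because deparse() handles the quoting and escaping, the function itself does not need to know the column types.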

I think mc_tribble is a better name, and it looks like you can condense it to:

mc_tribble <- function(indf, indents = 4, mdformat = TRUE) {
  name <- as.character(substitute(indf))
  name <- name[length(name)]

  meat <- capture.output(write.csv(indf, quote = TRUE, row.names = FALSE))
  meat <- paste0(
    paste(rep(" ", indents), collapse = ""),
    c(paste(sprintf("~%s", names(indf)), collapse = ", "),
      meat[-1]))

  if (mdformat) meat <- paste0("    ", meat)
  obj <- paste(name, " <- tribble(\n", paste(meat, collapse = ",\n"), ")", sep = "")
  if (mdformat) cat(paste0("    ", obj)) else cat(obj)
}

Try it out:

short_iris <- head(iris)

mc_tribble(short_iris)

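Improvements:

  • Shorter code
  • Captures the name of the "tibble"
  • Has an argument for indentation
  • Has an argument that conveniently adds the 4 leading spaces needed for pasting on Stack Overflow
  • Sounds tastier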

I have added this to my "SOfun" package. You can install it with:

source("http://news.mrdwab.com/install_github.R")
install_github("mrdwab/overflow-mrdwab") # for writeClip -- plus it's awesome
install_github("mrdwab/SOfun")

Usage is simple:

library(SOfun)
mc_tribble(short_iris)

Advantages:

  • Now copies the output to your clipboard (if you have "overflow" installed)
  • Better value than before!

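Inspired by @A5C1D2H2I1M1N2O1R2T1, I created a more extensive solution. It handles most standard column types, including character, factor, integer, numeric, logical and list. It also adopts dput's syntax for the second argument, so it should be able to write to a file, a connection or (by default) the console. Finally, it adopts dput's standard return value, which is its input, returned invisibly.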

dput_to_var <- function(x) {
  con <- textConnection("out", "w", local = TRUE)
  dput(x, con)
  close(con)
  paste(out, collapse = "")
}

dput_tribble <- function(indf, file = "") {
  stopifnot(is.data.frame(indf))
  cols <- lapply(indf, function(col) {
    switch(class(col),
           factor =, character = paste0("\"", col, "\""),
           logical =, numeric =, integer = col,
           list = lapply(col, dput_to_var)
           )
  })
  meat <- c(paste(sprintf("~%s", names(indf)), collapse = ", "),
            do.call(paste, c(cols, sep = ", ")))
  out <- paste0("tribble(\n", paste(meat, collapse = ",\n"), ")")
  if (is.character(file)) {
    if (nzchar(file)) {
      file <- file(file, "wt")
      on.exit(close(file))
    } else {
      file <- stdout()
    }
  }
  writeLines(out, file)
  invisible(indf)
}
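The dput conventions mentioned above (a file argument that accepts a file name or a connection, and an invisible return of the input) can be seen in base R itself. This is a small sketch using only base functions; the object x and the variable names are made up for illustration.

```r
x <- data.frame(a = 1:2, b = c("u", "v"), stringsAsFactors = FALSE)

# 1. A file name as the second argument: the deparsed text goes to that file
tmp <- tempfile(fileext = ".R")
res <- withVisible(dput(x, file = tmp))

# 2. The return value is the input itself, flagged as invisible
identical(res$value, x)
res$visible

# 3. An open connection works as well (the default, file = "", prints to the console)
con <- textConnection("captured", "w", local = TRUE)
dput(x, file = con)
close(con)

# dget() reads the file back and evaluates it
round_trip <- dget(tmp)
all.equal(round_trip, x)
```

Returning the input invisibly is what lets such a function sit in the middle of a pipeline without printing the data a second time.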

At @alistaire's suggestion, I forked the tibble package and added this. I have made a pull request.

datapasta::dpasta() should fit the bill. Output for your example:

dpasta(df)
tibble::tribble(
    ~x,     ~y,
     0, 4.7005,
   179, 4.8598,
   342, 5.0876,
   467, 5.0938,
   705, 5.3891,
   878, 5.6095,
  1080, 5.8777,
  1209, 6.0064,
  1458, 6.3063,
  1639, 6.4723,
  1805, 6.6053,
  2000, 6.8145,
  2121, 6.9078,
  2339, 7.1701,
  2462, 7.2633,
  2676, 7.3865,
  2857, 7.5766,
  3049,  7.644,
  3227, 7.8018,
  3403, 7.9505,
  3583, 8.0974,
  3651, 8.1937,
  4009, 8.2391,
  4034,  8.294,
  4151, 8.3143,
  4194, 8.3452,
  4512, 8.5092,
  4523, 8.5172,
  4679, 8.5993,
  5789, 9.0275
  )

https://cran.r-project.org/web/packages/datapasta/index.html
https://github.com/MilesMcBain/datapasta
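Whichever generator is used, its output can be sanity-checked by evaluating the generated text and comparing the result with the original data. The following is a minimal sketch that assumes the tibble package is installed; generated and expected are made-up stand-ins, not output from any of the functions in this thread.

```r
# Hypothetical generated text, standing in for the output of a tribble generator
generated <- 'tribble(
  ~a, ~b,  ~c,
  1L, 0.1, "a",
  2L, 0.2, "b"
)'

# The data the text is supposed to reproduce
expected <- data.frame(a = 1:2, b = c(0.1, 0.2), c = c("a", "b"),
                       stringsAsFactors = FALSE)

# Parsing alone already catches unbalanced quotes or parentheses
expr <- parse(text = generated)[[1]]

# Evaluate only if tibble is available (an assumption of this sketch)
if (requireNamespace("tibble", quietly = TRUE)) {
  tribble <- tibble::tribble
  rebuilt <- eval(expr)
  print(isTRUE(all.equal(as.data.frame(rebuilt), expected)))
}
```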

According to Kirill's comment here, this is now part of the deparse package (https://github.com/krlmlr/deparse), in the function deparsec:

library(deparse)
#> 
#> Attaching package: 'deparse'
#> The following object is masked from 'package:base':
#> 
#>     deparse
library(tibble)

# dataframe
df <- tribble(
  ~a, ~b, ~c,
  1L, 0.1, "a"
)

# tribble script
deparsec(df, as_tribble = TRUE)
#> tribble(
#>   ~a, ~b,  ~c, 
#>   1L, 0.1, "a"
#> )

Created on 2018-08-09 by the reprex package (v0.2.0).