R: How to batch-create multiple copies of an .R script file with different variable names?
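This is my first post here, and I am trying to follow the guidelines as best I can, so please bear with me.
I want to create a large number of similar .R script files that differ only in the variable names they use and in the strings that mention those variables. Of course this could be done with search and replace, but I am wondering whether there is a more convenient way to create a batch of them quickly.
Take this made-up script as an example (the actual data is irrelevant here):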
prefix.AnExemplaryRandomVariable <- rnorm(n = 100, mean = 0, sd = 1)
AnotherRandomVariable.suffix <- rnorm(n = 100, mean = 10, sd = 3)
plot(prefix.AnExemplaryRandomVariable, AnotherRandomVariable.suffix,
type = "p", pch = "*", xlab = "An Exemplary Random Variable",
ylab = "Another Random Variable", main = "A plot of An Exemplary
Random Variable and Another Random Variable")
My idea is to define two vectors, each holding k new names.
newNamesVar1 <- c("prefix.FirstVariable", "prefix.SomeData")
newNamesVar2 <- c("SecondVariable.suffix", "CannotThinkOfMoreNames.suffix")
The result I am looking for is k new .R files, like this:
prefix.FirstVariable <- rnorm(n = 100, mean = 0, sd = 1)
SecondVariable.suffix <- rnorm(n = 100, mean = 10, sd = 3)
plot(prefix.FirstVariable, SecondVariable.suffix, type = "p",
pch = "*", xlab = "First Variable", ylab = "Second Variable",
main = "A plot of First Variable and Second Variable")
and
prefix.SomeData <- rnorm(n = 100, mean = 0, sd = 1)
CannotThinkOfMoreNames.suffix <- rnorm(n = 100, mean = 10, sd = 3)
plot(prefix.SomeData, CannotThinkOfMoreNames.suffix, type = "p",
pch = "*", xlab = "Some Data", ylab = "Cannot Think Of More Names",
main = "A plot of Some Data and Cannot Think Of More Names")
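I see the following two challenges:
- Replacing the original variable names with the corresponding vector entries
- Checking whether any strings resemble the original variable names and replacing them too, while keeping syntax and formatting (capitalization, spacing, ...) intact.
This is my first attempt at using R for anything other than actual data analysis, so I cannot even offer much of a code draft. I was able to get the variable names with ls(), but I have no idea what to do next, mainly because the changes should be applied not to the currently active file but to an entirely new one.
I appreciate any solutions, tricks, hints, or suggestions!
Thanks!
Here is one approach.
Setup for this answer: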
writeLines('
prefix.AnExemplaryRandomVariable <- rnorm(n = 100, mean = 0, sd = 1)
AnotherRandomVariable.suffix <- rnorm(n = 100, mean = 10, sd = 3)
plot(prefix.AnExemplaryRandomVariable, AnotherRandomVariable.suffix,
type = "p", pch = "*", xlab = "An Exemplary Random Variable",
ylab = "Another Random Variable",
main = "A plot of An Exemplary Random Variable and Another Random Variable")
', "template.R")
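A table of the replacement values to use, where each column name is a template string and the column values are the replacement texts.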
replacements <- data.frame(
"An Exemplary Random Variable" = c("First Variable", "Some Data"),
"Another Random Variable" = c("Second Variable", "Cannot Think Of More Names"),
check.names = FALSE
)
replacements
# An Exemplary Random Variable Another Random Variable
# 1 First Variable Second Variable
# 2 Some Data Cannot Think Of More Names
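The check.names = FALSE argument matters here because the column names double as search patterns. As a small illustration (the variable names default_names and kept_names are made up for this sketch), compare what data.frame() does to a name containing spaces with and without it:

```r
# By default data.frame() passes column names through make.names(),
# which turns the spaces into dots and would break the later pattern matching.
default_names <- names(data.frame("An Exemplary Random Variable" = "x"))
default_names
# [1] "An.Exemplary.Random.Variable"

# With check.names = FALSE the name is kept exactly as written.
kept_names <- names(data.frame("An Exemplary Random Variable" = "x",
                               check.names = FALSE))
kept_names
# [1] "An Exemplary Random Variable"
```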
Apply the replacements for each template string in template.R, storing each result in a new file.
code <- readLines("template.R")
for (row in seq_len(nrow(replacements))) {
  newcode <- code
  for (col in seq_along(replacements)) {
    if (!is.na(replacements[row, col])) {
      ptn1 <- colnames(replacements)[col]  # original, with spaces
      ptn2 <- gsub(" +", "", ptn1)         # "Title Case Sentence" to "TitleCaseSentence"
      repl1 <- replacements[row, col]
      repl2 <- gsub(" +", "", repl1)
      # "\\b" is the regex word boundary; a single "\b" in an R string is a backspace
      newcode <- gsub(paste0("\\b", ptn1, "\\b"), repl1,
                      gsub(paste0("\\b", ptn2, "\\b"), repl2, newcode))
    }
  }
  writeLines(newcode, sprintf("code_%s.R", row))
}
If a replacement string (the value in a particular cell of replacements) is NA, no replacement is attempted for that pattern.
Output:
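To make a single pass of the inner loop easier to follow, here is a minimal sketch that applies one pattern/replacement pair to one line of code. The helper name replace_pair and the sample line are hypothetical and not part of the answer above; the logic is meant to mirror the loop body, including the NA check.

```r
# Hypothetical helper: applies one pattern/replacement pair in both its
# space-free (variable name) and spaced (label string) forms.
replace_pair <- function(code, pattern, replacement) {
  if (is.na(replacement)) return(code)  # NA means "leave this pattern alone"
  pattern_nospace     <- gsub(" +", "", pattern)
  replacement_nospace <- gsub(" +", "", replacement)
  code <- gsub(paste0("\\b", pattern_nospace, "\\b"), replacement_nospace, code)
  gsub(paste0("\\b", pattern, "\\b"), replacement, code)
}

line <- 'plot(prefix.AnExemplaryRandomVariable, xlab = "An Exemplary Random Variable")'

replaced <- replace_pair(line, "An Exemplary Random Variable", "Some Data")
replaced
# [1] "plot(prefix.SomeData, xlab = \"Some Data\")"

skipped <- replace_pair(line, "An Exemplary Random Variable", NA)
identical(skipped, line)
# [1] TRUE
```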
code_1.R
prefix.FirstVariable <- rnorm(n = 100, mean = 0, sd = 1)
SecondVariable.suffix <- rnorm(n = 100, mean = 10, sd = 3)
plot(prefix.FirstVariable, SecondVariable.suffix,
type = "p", pch = "*", xlab = "First Variable",
ylab = "Second Variable",
main = "A plot of First Variable and Second Variable")
code_2.R
prefix.SomeData <- rnorm(n = 100, mean = 0, sd = 1)
CannotThinkOfMoreNames.suffix <- rnorm(n = 100, mean = 10, sd = 3)
plot(prefix.SomeData, CannotThinkOfMoreNames.suffix,
type = "p", pch = "*", xlab = "Some Data",
ylab = "Cannot Think Of More Names",
main = "A plot of Some Data and Cannot Think Of More Names")
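Limitations:
- A pattern string must be contiguous on a single line; note that I changed the template's main= string so that it does not span two lines.
- A pattern string cannot be directly preceded or followed by a letter; the use of \b (the regex word boundary, written "\\b" inside an R string) allows some characters (such as a literal .), but this does not try to be any fancier than that.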
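One possible way around the first limitation, offered only as a sketch and not as part of the answer above, is to treat the file as a single string and let every space in the pattern match any run of whitespace, including a line break. The fragment below is a made-up stand-in for the wrapped main= string from the question; with a real file the text could be built with paste(readLines("template.R"), collapse = "\n"). Note that a matched line break is replaced along with the rest of the pattern, so the layout of that string changes.

```r
# Hypothetical fragment whose title string wraps across two lines.
template_text <- 'main = "A plot of An Exemplary\nRandom Variable and Another Random Variable")'

pattern <- "An Exemplary Random Variable"
# Let each run of spaces in the pattern match any whitespace, newlines included.
flexible <- gsub(" +", "[[:space:]]+", pattern)

result <- gsub(paste0("\\b", flexible, "\\b"), "First Variable", template_text)
result
# [1] "main = \"A plot of First Variable and Another Random Variable\")"
```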
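Edited: after finishing, I realized it is probably easier to define the pattern and replacement strings with spaces, and then strip the spaces for the second (TitleCase) pattern. This avoids some of the ambiguity and trickery of splitting strings on title case. It also allows your patterns or replacements to not be title case.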