Trying to Access the Name of the Parameter & Not its Value in R
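I am trying to write a function that creates a word cloud and takes a data frame and one of its columns as parameters. However, there is a bug in the first statement. I want "DataFrame$Column" to be passed as the argument to VectorSource. How can I best achieve this?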
createsWordcloud <- function(df, col) {
  # An Object of Class VectorSource which extends the Class Source representing a vector where each entry is interpreted as a document.
  # Every Element of the Corpus is stored as a Document...
  # The Bug is right here!..
  corpus <- Corpus(VectorSource(paste(df, "$", col, sep="")))
  # Convert the Corpus to Plain Text Document
  corpus <- tm_map(corpus, PlainTextDocument)
  # Remove Punctuation & STOPWORDS...
  # STOPWORDs are commonly used words in the English Language... i.e. I, me, my
  # To view the full list of STOPWORDS, type stopwords('english') in the Console...
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeWords, stopwords('english'))
  # Next we perform STEMMING... All the words will be converted to their stem
  # i.e. learning -> learn, walked -> walk
  # These Words will be Plotted Only Once!
  corpus <- tm_map(corpus, stemDocument)
  wordcloud(corpus, max.words=100, random.order=FALSE)
  # These parameters are used to limit the number of words plotted.
  # max.words will plot the specified number of words and discard least frequent terms,
  # whereas, min.freq will discard all terms whose frequency is below the specified value.
}
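There are two ways to do this; one of them uses non-standard evaluation. As an aside, for your particular task it may be enough to pass in df$col directly and have the function take a vector rather than a data frame, since the code shown only ever uses that one column.
If you really do need to pass in the column name, the standard way is to pass it as a string and extract the column with the subset operator ([.data.frame):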
readcol <- function(df, col) {
  df[, col]
}
Then:
> readcol(data.frame(x=1:10), "x")
[1] 1 2 3 4 5 6 7 8 9 10
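To connect this back to the question, here is a minimal sketch of why the paste() call does not yield the column and how a string column name can be used instead. The data frame `reviews`, its `text` column, and the helper `columnText` are hypothetical names introduced only for illustration; the block uses base R only, and the tm call is left as a comment.

```r
# Hypothetical data frame, used only for illustration.
reviews <- data.frame(
  text = c("learning to walk", "walked and learned"),
  stringsAsFactors = FALSE
)

# What the question's code builds: paste() coerces the data frame to
# character and glues "$text" onto it, so the result is a string,
# not the contents of the column.
pasted <- paste(reviews, "$", "text", sep = "")

# Hypothetical helper: look the column up by its name (a string) and
# return it as a character vector, one element per document.
columnText <- function(df, col) {
  stopifnot(is.data.frame(df), is.character(col), length(col) == 1)
  if (!col %in% names(df)) {
    stop("no column named '", col, "' in the data frame")
  }
  as.character(df[[col]])
}

docs <- columnText(reviews, "text")
print(docs)

# Inside the word cloud function, this vector would take the place of
# the paste() call, for example:
# corpus <- Corpus(VectorSource(columnText(df, col)))
```

The explicit check on the column name is a design choice rather than a requirement; it turns a misspelled name into an immediate error instead of an empty corpus further down.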
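Non-standard evaluation
If you really don't want to quote the column name, you need to capture the col argument unevaluated and then evaluate it inside the data frame to extract the column: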
readcol.nse <- function(df, col) {
  eval(substitute(col), df, parent.frame())
}
Then:
> readcol.nse(data.frame(x=1:10), x)
[1] 1 2 3 4 5 6 7 8 9 10
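The standard caveats apply here: be very careful with non-standard evaluation. It is harder to use programmatically (because passing the column name on to another function is tricky) and can have unintuitive side effects with more complex expressions. The string version is a little clunkier, but it composes much more easily.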
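The composition problem can be seen with a small sketch. The wrappers `firstValue` and `firstValue.nse`, the data frame `reviews`, and its `review_text` column are hypothetical and exist only for this illustration; `readcol` and `readcol.nse` are repeated so the block runs on its own.

```r
readcol <- function(df, col) df[, col]
readcol.nse <- function(df, col) eval(substitute(col), df, parent.frame())

# Hypothetical wrappers that forward their column argument unchanged.
firstValue <- function(df, col) readcol(df, col)[1]
firstValue.nse <- function(df, col) readcol.nse(df, col)[1]

reviews <- data.frame(review_text = c("good", "bad"),
                      stringsAsFactors = FALSE)

# The string version forwards cleanly through the wrapper.
string_result <- firstValue(reviews, "review_text")
print(string_result)

# In the NSE version, substitute(col) inside readcol.nse captures the
# wrapper's own symbol `col`, not `review_text`. The data frame has no
# column called `col`, so R falls back to the wrapper's argument and
# looks for a variable named review_text in the caller's environment.
# That fails unless such a variable happens to exist there.
nse_result <- tryCatch(
  firstValue.nse(reviews, review_text),
  error = function(e) conditionMessage(e)
)
print(nse_result)
```

If the data frame did contain a column literally named `col`, the NSE wrapper would silently return that column instead, which is one of the unintuitive effects mentioned above.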