How to resolve R Error using text2vec glove function: unused argument (grain_size = 100000)?
I am trying to create word embeddings for some tweets by following the text2vec documentation and vignette:
head(twtdf$Tweet.content)
[1] "$NFLX $GS $INTC $YHOO $LVS\n$MSFT $HOG $QCOM $LUV $UAL\n$MLNX $UA $BIIB $GOOGL $GM $V\n$SKX $GE $CAT $MCD $AAL $SBUX"
[2] "Good news frequent fliers. @AmericanAir says lower fares will be here for awhile"
[3] "Wall St. closing out the week with more earnings. What to watch:\n▶︎ $MCD\n▶︎ $AAL\n▶︎ $CAT\n"
[4] "Barrons loves $AAL at low multiple bc it's \"insanely profitable\". Someone tell them how cycles+ multiples work."
[5] "These airlines are now offering in-flight Wi-Fi $DAL $AAL"
I am basically following the given guide:
library(text2vec)
require(text2vec)
twtdf <- read.csv("tweets.csv",header=T, stringsAsFactors = F)
twtdf$ID <- seq.int(nrow(twtdf))
tokens = twtdf$Tweet.content %>% tolower %>% word_tokenizer
length(tokens)
it = itoken(tokens)
# create vocabulary
v = create_vocabulary(it) %>%
prune_vocabulary(term_count_min = 5)
# create co-occurrence vectorizer
vectorizer = vocab_vectorizer(v, grow_dtm = F, skip_grams_window = 5L)
#dtm <- create_dtm(it, vectorizer, grow_dtm = R)
it = itoken(tokens)
tcm = create_tcm(it, vectorizer)
glove_model = glove(tcm, word_vectors_size = 50, vocabulary = v, x_max = 10, learning_rate = .2)
fit(tcm, glove_model, n_iter = 15)
#when this was executed, R couldn't find the function
#fit <- GloVe(tcm = tcm, word_vectors_size = 50, x_max = 10, learning_rate = 0.2, num_iters = 15)
However, whenever I execute the glove_model line, I get the following error:
Error in .subset2(public_bind_env, "initialize")(...) :
unused argument (grain_size = 100000)
In addition: Warning message:
'glove' is deprecated.
Use 'GloVe' instead.
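*I did try using GloVe instead, but I got an error saying that R could not find the function, even after reinstalling the text2vec package and loading it with require.
To rule out a formatting problem in my data, I ran the same code on the movie_review data and hit the same error. For thoroughness, I also tried specifying the grain_size argument explicitly, but got the same error. I checked the issues on the Git repository but found nothing there, on this site, or through internet searches.
Has anyone else run into this? Or is this a newbie problem?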
Just use the correct constructor for the model:
glove = GlobalVectors$new(word_vectors_size = 50, vocabulary = vocab, x_max = 10)
glove() is a legacy function from a very old version of the package.
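Because the accepted constructor arguments appear to differ between text2vec releases, it may help to check what the installed version actually exposes before calling it. The following is a minimal diagnostic sketch, not part of the original answer; the variable names (installed_version, has_glove_alias, ctor_args) are illustrative, and it assumes GlobalVectors is an R6 class generator in the installed version.

```r
library(text2vec)

# Which version is installed? Constructor arguments differ between releases.
installed_version <- packageVersion("text2vec")
print(installed_version)

# Is the GloVe alias exported by this version? If not, calling GloVe() fails
# with a "could not find function" error.
has_glove_alias <- "GloVe" %in% getNamespaceExports("text2vec")
print(has_glove_alias)

# Arguments accepted by the constructor of the installed version
# (assumes GlobalVectors is an R6 class generator).
ctor_args <- names(formals(GlobalVectors$public_methods$initialize))
print(ctor_args)
```

If ctor_args lists word_vectors_size and vocabulary, the call shown above should match; if it lists rank instead, the installed release expects the newer form of the constructor.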
Apparently the GlobalVectors constructor has changed again and now takes the vocabulary information directly from the TCM?
glove = GlobalVectors$new(rank = 50, x_max = 10)
wv_main = glove$fit_transform(tcm, n_iter = 10, convergence_tol = 0.01, n_threads = 8)
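For context, here is a sketch of the whole pipeline built around the two lines above, using the movie_review data mentioned in the question in place of the tweets. It assumes a text2vec release whose constructor accepts rank and where the window size is passed to create_tcm() rather than to vocab_vectorizer(); on older releases the argument names will differ. The thread count of 2 and the final step of summing main and context vectors are illustrative choices, not something the comment above prescribes.

```r
library(text2vec)

# movie_review ships with text2vec; it stands in for the tweets here.
data("movie_review")
tokens <- word_tokenizer(tolower(movie_review$review))
it <- itoken(tokens, progressbar = FALSE)

# Build the vocabulary and drop terms seen fewer than 5 times.
vocab <- prune_vocabulary(create_vocabulary(it), term_count_min = 5L)

# Assumption: in this release the skip-gram window is an argument of create_tcm().
vectorizer <- vocab_vectorizer(vocab)
tcm <- create_tcm(it, vectorizer, skip_grams_window = 5L)

# Constructor form from the comment above: no vocabulary argument.
glove <- GlobalVectors$new(rank = 50, x_max = 10)
wv_main <- glove$fit_transform(tcm, n_iter = 10, convergence_tol = 0.01, n_threads = 2)

# Context vectors are stored as rank x terms; summing both sets is a common choice.
wv_context <- glove$components
word_vectors <- wv_main + t(wv_context)
dim(word_vectors)
```

Each row of word_vectors corresponds to one term of the pruned vocabulary, so individual embeddings can be looked up by row name.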