RserveException: eval failed Syntax error
I have an R function that strips all HTML markup from an HTML page.
It works when I run it in R,
but when I run it through Rserve it produces this error:
Exception in thread "main" org.rosuda.REngine.Rserve.RserveException:
eval failed, request status: R parser: syntax error
at org.rosuda.REngine.Rserve.RConnection.eval(RConnection.java:234)
at CereScope_Data.main(CereScope_Data.java:80)
The Java eval call where I get the error:
REXP lstrRemoveHtml = cobjConn.eval("RemoveHtml('" + lstrRawData + "')");
My R function (rawdata is an HTML page):
RemoveHtml <- function(rawdata) {
library("tm")
## Converting Data To UTF-8 Format
## Creating Corpus
Encoding(rawdata) <- "latin1"
docs <- Corpus(VectorSource(iconv(rawdata, from = "latin1", to = "UTF-8", sub = "")))
toSpace <- content_transformer(function(x , pattern) gsub(pattern, " ", x))
docs <- gsub("[^\b]*(<style).*?(</style>)", " ", docs)
docs <- Corpus(VectorSource(gsub("[^\b]*(<script).*?(</script>)", " ", docs)))
docs <- tm_map(docs, toSpace, "<.*?>")
docs <- tm_map(docs, toSpace, "(//).*?[^\n]*")
docs <- tm_map(docs, toSpace, "/")
docs <- tm_map(docs, toSpace, "\\t")
docs <- tm_map(docs, toSpace, "\\n")
docs <- tm_map(docs, toSpace, "\\\\")
docs <- tm_map(docs, toSpace, "@")
docs <- tm_map(docs, toSpace, "\\|")
docs <- tm_map(docs, toSpace, "\"")
docs <- tm_map(docs, toSpace, ",")
RemoveHtmlDocs <- tm_map(docs, stripWhitespace)
return(as.character(RemoveHtmlDocs)[1])
}
Update - Things I have already tried
- Escaping characters that may cause problems, such as single quotes, double quotes, and backslashes
- Assigning the whole data to an R variable through eval and then running the function
New update - Problem solved
- Escape characters such as single quotes, double quotes, and backslashes were causing the problem
- Another line that was no longer necessary was also causing the problem, because I had not commented it out or removed it.
Thanks, everyone! :)
See my answer for a description! :)
The error is in
REXP lstrRemoveHtml = cobjConn.eval("RemoveHtml('" + lstrRawData + "')");
In Java, \ is the escape character, so writing \" puts a literal double quote into the Java string.
That quote then becomes part of the R expression.
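To see why the R parser rejects the call, it can help to print the exact string that reaches R. The following is only a minimal sketch: the class name, the helper `naiveCall`, and the sample input are hypothetical, and it assumes the page contains an apostrophe.

```java
class BrokenExpressionDemo {
    // Builds the expression the same way as the failing eval call,
    // with no escaping of the raw data.
    static String naiveCall(String rawData) {
        return "RemoveHtml('" + rawData + "')";
    }

    public static void main(String[] args) {
        // Hypothetical sample input; any page containing ' behaves the same way.
        String raw = "<p>it's here</p>";
        // The apostrophe inside the data closes the R string early,
        // so R is left with stray tokens after it and reports a syntax error.
        System.out.println(naiveCall(raw));
    }
}
```

In the printed expression the R string ends right after `it`, which is consistent with the parser error shown in the question.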
Solution: build the expression around lstrRawData
before passing it to the eval function,
like this:
String exp = "RemoveHtml(\"" + lstrRawData + "\")";
REXP lstrRemoveHtml = cobjConn.eval(exp);
Escape characters were the problem. To fix it, I escaped the backslashes and the quotes.
I created this method to make it simpler:
public static String Regexer(String Data) {
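// String.replace matches literally, so no regex escaping is needed.
// Backslashes are doubled first so the ones added for quotes are not doubled again.
String RegexedData = Data.replace("\\", "\\\\").replace("'", "\\'").replace("\"", "\\\"");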
return (RegexedData);
}
In the function above I escape the escape characters once more, so that they are still escaped when R parses the expression.
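As a rough illustration of how the escaping and the expression fit together, here is a self-contained sketch. The class name and the helpers `escapeForR` and `buildCall` are hypothetical names introduced for this example, and it assumes the data is wrapped in double quotes on the R side.

```java
class RExpressionDemo {
    // Hypothetical helper: escapes backslashes and both quote types
    // so the data can sit inside an R string literal.
    static String escapeForR(String data) {
        return data.replace("\\", "\\\\")   // double the backslashes first
                   .replace("'", "\\'")     // then escape single quotes
                   .replace("\"", "\\\"");  // and double quotes
    }

    // Hypothetical helper: wraps the escaped data in a call to RemoveHtml.
    static String buildCall(String rawData) {
        return "RemoveHtml(\"" + escapeForR(rawData) + "\")";
    }

    public static void main(String[] args) {
        // Hypothetical sample input with a double quote, an apostrophe, and a backslash.
        String raw = "<p class=\"x\">it's a C:\\path</p>";
        // Prints the expression that would be handed to eval.
        System.out.println(buildCall(raw));
    }
}
```

The order of the replacements matters here: if the quotes were escaped before the backslashes, the backslashes added for the quotes would be doubled as well.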
Tip: don't forget to convert the REXP to a Java variable. :)