Running SparkR through RStudio

I am using the following link to learn how to run SparkR through RStudio:

http://blog.danielemaasit.com/2015/07/26/installing-and-starting-sparkr-locally-on-windows-8-1-and-rstudio/

I am running into a problem at section 4.5.

if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
  Sys.setenv(SPARK_HOME = "C:/Apache/spark-2.0.0")
}
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sparkR.session(master = "local[*]", sparkConfig = list(spark.driver.memory = "1g"))

library(SparkR)
sc <- sparkR.session(master = "local")
sqlContext <- sparkRSQL.init(sc)

DF <- createDataFrame(sqlContext, faithful)

The error comes up when I run the line that creates DF:

Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
  java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
    at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
    at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
    at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
    at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
    at org.a
In addition: Warning message:
'createDataFrame(sqlContext...)' is deprecated.
Use 'createDataFrame(data, schema = NULL, samplingRatio = 1.0)' instead.
See help("Deprecated") 

I really don't know what this error means, and any help would be greatly appreciated.

Thanks!

Try this:

Sys.setenv(SPARK_HOME = "C://Apache/spark-2.0.0")

You need to use "//" as shown above.
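For completeness, a minimal end-to-end sketch with that path. The spark-2.0.0 location and the memory setting are simply the ones from the question; adjust them to your own install:

# Point SparkR at the local Spark install (path from above, with "//")
Sys.setenv(SPARK_HOME = "C://Apache/spark-2.0.0")
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))

# Start a local session (in Spark 2.0 this replaces sparkRSQL.init)
sparkR.session(master = "local[*]", sparkConfig = list(spark.driver.memory = "1g"))

# Create a SparkDataFrame without passing a sqlContext, which avoids the deprecation warning
DF <- createDataFrame(faithful)
head(DF)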