Exception while executing in Scala-Spark - java.lang.NumberFormatException: For input string: "volume"
I am getting an exception while executing the following code snippet. The dataset I am using is "stocks.csv", which has the columns date, symbol, volume, open, close, high, low and adjclose.
val stock = sc.textFile("C:/Users/kondr/Desktop/stocks/stocks.csv")
val splits = stock.map(record => record.split(","))
val symvol = splits.map(arr => (arr(1), arr(2).toInt))
val maxvol = symvol.reduceByKey((vol1, vol2) => Math.max(vol1, vol2), 1)
maxvol.collect().foreach(println)
Error message

21/05/05 14:09:31 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.NumberFormatException: For input string: "volume"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
Here is how to skip the first line (the header row whose "volume" field is what fails the toInt call). Note the split must use "," since the file is comma-separated:

stock.zipWithIndex().filter(_._2 != 0)
  .map(_._1)
  .map(record => record.split(","))
  .map(arr => (arr(1), arr(2).toInt))
  .reduceByKey((vol1, vol2) => Math.max(vol1, vol2), 1)
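The root cause is that the header line is parsed like any data row, so arr(2) is the literal string "volume" when toInt runs. A minimal pure-Scala sketch of the same skip-header-then-max-per-symbol logic, using hypothetical in-memory sample rows instead of the Spark RDD so it can run without a cluster:

```scala
// Hypothetical sample rows mirroring stocks.csv (header + data rows).
val lines = Seq(
  "date,symbol,volume,open,close,high,low,adjclose",
  "18-04-2019,A,2874100,75.73,76.17,76.54,75.31,76.17",
  "17-04-2019,A,4472000,78.15,75.43,78.32,74.46,75.43",
  "17-04-2019,B,100,1,2,3,4,5"
)

// Drop the header, split each row on commas, pick (symbol, volume),
// then reduce to the maximum volume per symbol -- the same shape as
// the zipWithIndex/reduceByKey pipeline above.
val maxVol = lines.drop(1)
  .map(_.split(","))
  .map(arr => (arr(1), arr(2).toInt))
  .groupBy(_._1)
  .map { case (sym, pairs) => (sym, pairs.map(_._2).max) }

// maxVol: Map(A -> 4472000, B -> 100)
```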
Alternatively, you can read the file directly into a DataFrame, letting the header option consume the first line:

val csvDF = spark.read
  .option("header", true)
  .option("delimiter", ",")
  .csv("C:/Users/kondr/Desktop/stocks/stocks.csv")

csvDF.show(false)
Output:
+----------+------+-------+-----------+-----------+-----------+-----------+-----------+
|date |symbol|volume |open |close |high |low |adjclose |
+----------+------+-------+-----------+-----------+-----------+-----------+-----------+
|18-04-2019|A |2874100|75.73000336|76.16999817|76.54000092|75.30999756|76.16999817|
|17-04-2019|A |4472000|78.15000153|75.43000031|78.31999969|74.45999908|75.43000031|
|16-04-2019|A |3441500|80.81999969|77.55000305|80.95999908|77.19000244|77.55000305|
|15-04-2019|A |1627300|81 |80.40000153|81.12999725|79.91000366|80.40000153|
|12-04-2019|A |1249300|81.43000031|80.98000336|82.05999756|80.90000153|80.98000336|
+----------+------+-------+-----------+-----------+-----------+-----------+-----------+
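Since the original goal was the maximum volume per symbol, the DataFrame route can finish the job with a groupBy/max aggregation. A sketch, assuming an active SparkSession named spark and the csvDF read above (volume is a string column after csv(), so it is cast first):

```scala
import org.apache.spark.sql.functions.{col, max}

// Cast volume to an integer column, then take the per-symbol maximum --
// the DataFrame equivalent of the reduceByKey pipeline.
val maxVolDF = csvDF
  .withColumn("volume", col("volume").cast("int"))
  .groupBy("symbol")
  .agg(max("volume").as("maxvolume"))

maxVolDF.show(false)
```

With the sample rows shown above, symbol A would aggregate to 4472000.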