How to use ScalaMock to verify that a function was called with a specific Spark DataFrame argument and get useful output on failure
I have been reading:
- http://scalamock.org/user-guide/advanced_topics/
- https://scalamock.org/user-guide/matching/
- https://scalamock.org/quick-start/
I have not quite gotten the result I am after, but I do have this basic test:
scenario("myFunction reads parquet and writes to db") {
  val mockUtil: UtilitiesService = stub[UtilitiesService]
  val service = new myService(mockUtil)
  val expectedParquetDf = Seq(
    (999, "testData")
  ).toDF("number", "word")
  // Stub the read so the service receives the known DataFrame.
  (mockUtil.getDataFrameFromParquet _).when("myParquetPath") returns Right(expectedParquetDf)
  service.publishToDatabase()
  // Verify that the write received the same DataFrame and table name.
  (mockUtil.insertDataFrameIntoDb _).verify(expectedParquetDf, "myTable").once()
}
But if the test fails because the DataFrames do not match, the output is not ideal. It simply says:
[info] Expected:
[info] inAnyOrder {
[info] <stub-4> UtilitiesService.getDataFrameFromParquet(path) any number of times (called once)
[info] <stub-4> UtilitiesService.insertDataFrameIntoPostgres[number: int, word: string] once (never called - UNSATISFIED)
[info] }
[info]
[info] Actual:
[info] <stub-4> UtilitiesService.getDataFrameFromParquet(oath)
[info] <stub-4> UtilitiesService.insertDataFrameIntoPostgres([number: int, word: string], "myTable" (myFile.scala:28)
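The string part is fine, but the DataFrame part only shows the schema. That helps when a column has been dropped, and much less when there is a bad row, for example. Is there a good way to improve this?
My rabbit hole has currently led me to the code below, which still does not work. The "assert" functions that return true just so the "&&" part works make me feel there must be a better way. Can I override some comparator function in the standard verify?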
def assertStringsAreEqual(expectedPath: String, actualPath: String): Boolean = {
  assert(actualPath == expectedPath)
  true // returned only so the call can be chained with && inside `where`
}
def assertDataFramesAreEqual(expected: DataFrame, actual: DataFrame): Boolean = {
  AssertHelpers.assertDataEqual(expected, actual) // verbose info, asserts on each row etc.
  true
}
scenario("myFunction reads parquet and writes to db") {
  val mockUtil: UtilitiesService = stub[UtilitiesService]
  val service = new myService(mockUtil)
  val expectedParquetDf = Seq(
    (999, "testData"),
    (898, "wrongData"),
    (999, "extraRow")
  ).toDF("number", "word")
  val incorrectExample = Seq(
    (999, "testData"),
    (999, "testData")
  ).toDF("number", "word")
  (mockUtil.getDataFrameFromParquet _).when("myParquetPath") returns Right(incorrectExample) // forced to incorrect for now
  (mockUtil.insertDataFrameIntoPostgres _)
    .expects(where { (actualDf: DataFrame, path: String) =>
      assertDataFramesAreEqual(expectedParquetDf, actualDf) && assertStringsAreEqual("ExpectedTable", path)
    })
    .once()
  service.publishToDb()
}
For reference, my goal is to have something like this show up somewhere:
Expected:
Dataframe:
[number, word]
[999, "testData"]
[898, "wrongData"]
[999, "extraRow"]
Actual:
Dataframe
[number, word]
[999, "testData"]
[999, "testData"]
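A minimal sketch of how that layout could be produced, written against plain Scala collections so it runs without Spark. The object `FrameRenderer` and its methods are hypothetical names, not part of ScalaMock or Spark. With a real DataFrame, I am assuming the columns would come from `df.columns.toSeq` and the rows from `df.collect().map(_.toSeq).toSeq`, which is only reasonable for small test frames.

```scala
// Hypothetical helper: renders a small table in the "Expected / Actual" layout.
object FrameRenderer {
  // Strings are quoted, everything else uses toString.
  private def cell(value: Any): String = value match {
    case null      => "null"
    case s: String => "\"" + s + "\""
    case other     => other.toString
  }

  // Produces: label line, "Dataframe:" line, header row, then one line per row.
  def render(label: String, columns: Seq[String], rows: Seq[Seq[Any]]): String = {
    val header = columns.mkString("[", ", ", "]")
    val body   = rows.map(_.map(cell).mkString("[", ", ", "]"))
    (Seq(s"$label:", "Dataframe:", header) ++ body).mkString("\n")
  }

  // Returns None when the rows match, otherwise a readable two-part message.
  def describeMismatch(
      columns: Seq[String],
      expected: Seq[Seq[Any]],
      actual: Seq[Seq[Any]]
  ): Option[String] =
    if (expected == actual) None
    else Some(render("Expected", columns, expected) + "\n" + render("Actual", columns, actual))
}
```

The `Option[String]` result could then be turned into a test failure, for example with `describeMismatch(...).foreach(msg => fail(msg))` in a ScalaTest suite.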
This is still not ideal, but by using "expects" with "onCall" I can get the output I want:
scenario("myFunction reads parquet and writes to db") {
  val mockUtil: UtilitiesService = mock[UtilitiesService]
  val service = new myService(mockUtil)
  val expectedParquetDf = Seq(
    (999, "testData"),
    (898, "wrongData"),
    (999, "extraRow")
  ).toDF("number", "word")
  val incorrectExample = Seq(
    (999, "testData"),
    (999, "testData")
  ).toDF("number", "word")
  // Set up expectations: accept any DataFrame, then assert on it inside onCall
  // so the failure message comes from AssertHelpers rather than ScalaMock.
  (mockUtil.insertDataFrameIntoPostgres _).expects(*, "ExpectedTable").onCall { (df: DataFrame, path: String) =>
    AssertHelpers.assertDataEqual(expectedParquetDf, df)
    Right(sxDbData)
  }
  // A mock (unlike a stub) takes expects/returning instead of when/returns.
  (mockUtil.getDataFrameFromParquet _).expects("myParquetPath").returning(Right(incorrectExample)) // forced to incorrect for now
  service.publishToDb()
}
I am hoping someone has a cleaner solution for this.
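One alternative I can imagine, sketched here without ScalaMock or Spark so that it runs on its own: a hand-written fake that records the arguments of each insert call, so the comparison happens after the call with whatever assertion helper gives the best message. Everything in this block is hypothetical: `Frame` stands in for DataFrame, the trait only mirrors the two methods used in the tests above, and the `Either` result types are guesses.

```scala
object RecordingFakeSketch {
  // Stand-in for DataFrame so the sketch has no Spark dependency.
  type Frame = Seq[(Int, String)]

  // Assumed shape of the service being mocked in the question.
  trait UtilitiesService {
    def getDataFrameFromParquet(path: String): Either[String, Frame]
    def insertDataFrameIntoPostgres(df: Frame, table: String): Either[String, Unit]
  }

  // Hand-written fake: returns canned data and records every insert call.
  final class RecordingUtilities(canned: Frame) extends UtilitiesService {
    private val recorded = scala.collection.mutable.ListBuffer.empty[(Frame, String)]

    def inserts: List[(Frame, String)] = recorded.toList

    def getDataFrameFromParquet(path: String): Either[String, Frame] = Right(canned)

    def insertDataFrameIntoPostgres(df: Frame, table: String): Either[String, Unit] = {
      recorded += ((df, table))
      Right(())
    }
  }

  // Minimal service under test, mirroring publishToDb from the question.
  final class MyService(util: UtilitiesService) {
    def publishToDb(): Unit =
      util.getDataFrameFromParquet("myParquetPath").foreach { df =>
        util.insertDataFrameIntoPostgres(df, "ExpectedTable")
      }
  }
}
```

In the real suite, the recorded frame would be handed to AssertHelpers.assertDataEqual after service.publishToDb() returns, so a failure shows that helper's row-level output rather than the schema-only text from the mock framework.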