ScalaCheck: choose an integer with custom probability distribution

I would like to create a generator in ScalaCheck that produces numbers between 1 and 100, with a bell-like bias towards numbers closer to 1.

Gen.choose() distributes numbers uniformly at random between the min and max values:

scala> (1 to 10).flatMap(_ => Gen.choose(1,100).sample).toList.sorted
res14: List[Int] = List(7, 21, 30, 46, 52, 64, 66, 68, 86, 86)

And Gen.chooseNum() adds a bias towards the upper and lower bounds:

scala> (1 to 10).flatMap(_ => Gen.chooseNum(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 61, 85, 86, 91, 92, 100, 100)

I'd like a choose() function that gives me a result that looks something like this:

scala> (1 to 10).flatMap(_ => choose(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 2, 5, 11, 18, 35, 49, 100)

I see that choose() and chooseNum() take an implicit Choose trait as an argument. Should I use that?

You could use Gen.frequency():

 val frequencies = List(
   (50000, Gen.choose(1, 9)),   // starts at 1 so the result stays within 1..100
   (38209, Gen.choose(10, 19)),
   (27425, Gen.choose(20, 29)),
   (18406, Gen.choose(30, 39)),
   (11507, Gen.choose(40, 49)),
   ( 6681, Gen.choose(50, 59)),
   ( 3593, Gen.choose(60, 69)),
   ( 1786, Gen.choose(70, 79)),
   (  820, Gen.choose(80, 89)),
   (  347, Gen.choose(90, 100))
 )

 (1 to 10).flatMap(_ => Gen.frequency(frequencies:_*).sample).toList
 res209: List[Int] = List(27, 21, 31, 1, 21, 18, 9, 29, 69, 29)

I took the frequencies from https://en.wikipedia.org/wiki/Standard_normal_table#Complementary_cumulative. The code uses only a sample of the table (every third entry, i.e. mod 3), but I think you get the idea.
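If copying table values by hand feels brittle, the weights can also be computed. A minimal sketch, assuming weights proportional to the standard normal density exp(-z^2/2) at the start of each bucket (not the complementary cumulative values from the table); bucketWeights and halfBell are hypothetical helper names, not part of ScalaCheck:

```scala
import org.scalacheck.Gen

// Hypothetical helper: integer weights that fall off like a half bell.
// Bucket i gets weight scale * exp(-z^2 / 2) with z = maxZ * i / buckets.
def bucketWeights(buckets: Int, maxZ: Double, scale: Int = 100000): List[Int] =
  List.tabulate(buckets) { i =>
    val z = maxZ * i / buckets
    // keep every weight at least 1 so that no bucket is silently dropped
    math.max(1, math.round(scale * math.exp(-z * z / 2)).toInt)
  }

// Hypothetical generator: splits [min, max] into equal-width buckets and
// picks a bucket with Gen.frequency, then a value uniformly inside it.
def halfBell(min: Int, max: Int, buckets: Int = 10): Gen[Int] = {
  val width = (max - min + buckets) / buckets // ceil((max - min + 1) / buckets)
  val weighted = bucketWeights(buckets, maxZ = 3.0).zipWithIndex.collect {
    case (w, i) if min + i * width <= max =>  // skip buckets past the end
      val lo = min + i * width
      val hi = math.min(max, lo + width - 1)
      (w, Gen.choose(lo, hi))
  }
  Gen.frequency(weighted: _*)
}
```

With this, (1 to 10).flatMap(_ => halfBell(1, 100).sample).toList.sorted should lean towards the low end. More buckets give a smoother curve, since values inside one bucket are still uniform.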

I can't take much credit for this, and will point you to this excellent page: http://www.javamex.com/tutorials/random_numbers/gaussian_distribution_2.shtml

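A lot of this depends on what you mean by "bell-like". Your example doesn't show any negative numbers, but the number "1" can't be in the middle of the bell without producing negative numbers, unless it's a very, very tiny bell!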

Forgive the mutable loop, but I sometimes use them when I have to reject values while building a collection:

object Test_Stack extends App {

  val r = new java.util.Random()

  val maxBellAttempt = 102
  val stdv = maxBellAttempt / 3  // values within 3 * stdv of the mean occur about 99.7% of the time

  val collectSize = 100000

  val l = scala.collection.mutable.Buffer[Int]()

  // see the article above: "What are the minimum and maximum values with nextGaussian()?"

  while (l.size < collectSize) {

    // the +1 is the mean (average) offset; it can be whatever you like
    // the abs folds the curve in half; you could remove it, but then you'd need to move the +1 over more
    val temp = (r.nextGaussian() * stdv + 1).abs.round.toInt

    // reject values beyond the cutoff (note that 0 can still occur)
    if (temp <= maxBellAttempt) l += temp

  }

  val res = l.toList
  //println(res.mkString("\n"))
}

Here's the distribution I got by pasting the output into Excel and doing a "countif" to show the frequency of each value:
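To move this out of a plain loop and into a ScalaCheck generator, one option is to derive the Gaussian from ScalaCheck's own uniform doubles, so that samples follow the test seed. A minimal sketch, assuming a Box-Muller transform in place of java.util.Random.nextGaussian(), and clamping the tail at max instead of rejecting values; halfNormalInt and chooseNearMin are hypothetical names, not ScalaCheck APIs:

```scala
import org.scalacheck.Gen

// Hypothetical helper: maps two uniform draws in [0, 1] to an integer in
// [min, max] whose distribution is a half bell peaking at min.
def halfNormalInt(u1: Double, u2: Double, min: Int, max: Int): Int = {
  val sigma = (max - min) / 3.0 // assumption: 3 sigma spans the whole range
  // Box-Muller transform: two uniforms -> one standard normal deviate.
  val z = math.sqrt(-2.0 * math.log(1.0 - u1)) * math.cos(2.0 * math.Pi * u2)
  // Fold the negative half over (abs) and clamp the tail while still a Double,
  // so infinite or oversized values cannot overflow the Int conversion.
  val offset = math.min(math.abs(z) * sigma, (max - min).toDouble)
  min + math.round(offset).toInt
}

// Hypothetical generator built only from Gen.choose and a for-comprehension.
def chooseNearMin(min: Int, max: Int): Gen[Int] =
  for {
    u1 <- Gen.choose(0.0, 1.0)
    u2 <- Gen.choose(0.0, 1.0)
  } yield halfNormalInt(u1, u2, min, max)
```

Something like (1 to 10).flatMap(_ => chooseNearMin(1, 100).sample).toList.sorted should then give mostly small values with an occasional large one. Clamping puts a small extra bump on max (roughly the 0.3% of samples beyond 3 sigma); use rejection as in the loop above if that matters.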