How to calculate relative frequency of array elements using Ramda.js?

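I have an array with repeated values, and I want to get the relative frequency (i.e., the proportion) of each value.

It seems natural to me to treat this as a two-step process:

  1. count how many times each value occurs; and
  2. divide that count by the length of the original array.

To accomplish the first step, we can use R.countBy() from ramda.js: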

const R = require("ramda")

const myLetters = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]
const counted = R.countBy(R.identity)(myLetters)
counted // => gives {"a": 4, "b": 2, "c": 4}

Now the second step is to divide counted by the length of myLetters:

counted / R.length(myLetters) // obviously this doesn't work because it's not mapped
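
To make concrete why that line cannot work: dividing a plain object by a number does not throw in JavaScript. The object is coerced to a number first, which gives NaN. A minimal plain-JavaScript check (no Ramda needed; the variable names here are only for illustration):

```javascript
// The counts object from the first step, written out literally.
const countsObject = { a: 4, b: 2, c: 4 }

// Dividing an object by a number coerces the object to NaN first,
// so the result is NaN rather than an object of ratios.
const attempt = countsObject / 10

console.log(attempt) // NaN
```

So the division has to be applied to each value inside the object, which is what mapping does.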

I'm a bit lost on how to map this properly. My current clunky solution, which I don't like:

// 1. manually calculate the length and store to a variable
const nvals = R.length(myLetters)

// 2. create a custom division function
const divide_by_length = (x) => R.divide(x, nvals)

// 3. map custom function to `counted`
R.map(divide_by_length, counted) // gives {"a": 0.4, "b": 0.2, "c": 0.4}

While this works, there must be a more straightforward way with ramda to get from counted to {"a": 0.4, "b": 0.2, "c": 0.4}.

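You need to combine the result of counting the items in the array with the length of the array.

You can use R.ap as the S combinator by providing it with 2 functions. The S combinator signature is S = (f, g) => x => f(x)(g(x)), where f and g are functions.

In your case:

f - creates a map curried with divide-by-length

g - creates the counts object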

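const { ap, pipe, length, divide, flip, map, countBy, identity } = R

const fn = ap(
  // curry a map by divide by length; flip(divide)(len) is x => x / len
  // (divide(__) on its own is just divide, which would give len / x)
  pipe(length, flip(divide), map),
  countBy(identity), // create the counts
)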

const myLetters = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]

const counted = fn(myLetters)

console.log(counted)
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.28.0/ramda.min.js" integrity="sha512-t0vPcE8ynwIFovsylwUuLPIbdhDj6fav2prN9fEu/VYBupsmrmk9x43Hvnt+Mgn2h5YPSJOk7PMo9zIeGedD1A==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
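
To see what the S combinator is doing here without any library, the same shape can be sketched in plain JavaScript. This is only an illustration: S, countItems and divideByLengthOf are hypothetical names written by hand, not Ramda functions.

```javascript
// Hypothetical hand-written S combinator: S(f, g)(x) = f(x)(g(x)).
const S = (f, g) => (x) => f(x)(g(x))

// g: count occurrences of each item (stand-in for countBy(identity)).
const countItems = (items) =>
  items.reduce((acc, item) => ({ ...acc, [item]: (acc[item] || 0) + 1 }), {})

// f: given the array, build a function that divides every count by its length.
const divideByLengthOf = (items) => (counts) =>
  Object.fromEntries(
    Object.entries(counts).map(([key, n]) => [key, n / items.length])
  )

// Both f and g receive the original array; f's result is applied to g's result.
const relativeFrequencies = S(divideByLengthOf, countItems)

console.log(relativeFrequencies(["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]))
// { a: 0.4, b: 0.2, c: 0.4 }
```

The point is that the array is needed twice, once for its length and once for its counts, and S is the combinator that feeds the same input to both.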

I really like the approach from Ori Drori, taking advantage of the fact that ap (f, g) //~> (x) => f (x) (g (x)). (chain for functions is related: chain (f, g) //~> (x) => f (g (x)) (x).)
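
The difference between those two shapes is only the order in which the first function sees its arguments. A small plain-JavaScript sketch, with apFn and chainFn as hypothetical hand-written stand-ins for how ap and chain behave on functions:

```javascript
// Hypothetical hand-written versions, specialised to functions.
const apFn = (f, g) => (x) => f(x)(g(x))
const chainFn = (f, g) => (x) => f(g(x))(x)

const minus = (a) => (b) => a - b
const double = (n) => n * 2

// apFn: f sees x first, then g(x)    -> minus(5)(10)
console.log(apFn(minus, double)(5)) // -5

// chainFn: f sees g(x) first, then x -> minus(10)(5)
console.log(chainFn(minus, double)(5)) // 5
```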

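My initial thought was similar, but using lift, which lifts a function operating on values into one operating on containers of those values. When the containers are functions, it acts like lift (f) (g, h) //~> (x) => f (g (x), h (x)), although it is more general: lift (f) is variadic, as are the functions supplied to it, and hence so is the function it generates, e.g. lift (f) (g, h, i, j) //~> (a, b, c) => f (g (a, b, c), h (a, b, c), i (a, b, c), j (a, b, c))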
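
As a rough plain-JavaScript illustration of that behavior for functions only (liftFn is a hypothetical helper written for this sketch; it is not how Ramda implements lift, which works for other containers too):

```javascript
// Hypothetical helper: lift specialised to functions only.
// liftFn(f)(g, h, ...)(...args) calls f(g(...args), h(...args), ...).
const liftFn = (f) => (...fns) => (...args) => f(...fns.map((fn) => fn(...args)))

// Two-function case: (x) => f(g(x), h(x))
const average = liftFn((sum, count) => sum / count)(
  (xs) => xs.reduce((a, b) => a + b, 0), // g: total of the list
  (xs) => xs.length                      // h: number of items
)
console.log(average([1, 2, 3, 4])) // 2.5

// Variadic case: three lifted functions, each receiving both arguments.
const describe = liftFn((sum, product, larger) => [sum, product, larger])(
  (a, b) => a + b,
  (a, b) => a * b,
  (a, b) => Math.max(a, b)
)
console.log(describe(3, 4)) // [ 7, 12, 4 ]
```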

So, very similarly, I wrote:

const frequencies = lift (map) (
  pipe (length, flip (divide)),
  countBy (identity)
) 

const myLetters = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]

console .log (frequencies (myLetters))
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.28.0/ramda.min.js"></script>
<script> const {lift, map, pipe, length, flip, divide, countBy, identity} = R </script>

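Still, it's not clear to me that the point-free approach offers any benefit. I'm not sure which I prefer, but this non-point-free Ramda version seems to me just about as readable: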

const frequencies = (letters, total = letters.length) => 
  map (n => n / total) (countBy (identity) (letters) )


const myLetters = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]

console .log (frequencies (myLetters))
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.28.0/ramda.min.js"></script>
<script> const {lift, map, pipe, length, flip, divide, countBy, identity} = R </script>

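I think @Scott's answer (the second solution) made the penny drop for me.

If we define two helper functions up front, one for counting and one for dividing: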

const myCount = R.countBy(R.identity);
const myDivideBy = (divisor) => R.map(elem => elem / divisor); // I was missing this part

Then we can do this:

const calcFreq = (arr) => {
    return R.pipe(myCount, myDivideBy(arr.length))(arr)
}

This is exactly the two-step process I envisioned at the beginning: count first, then divide.

calcFreq(myLetters) // gives {"a": 0.4, "b": 0.2, "c": 0.4}
