How to perform group-by aggregations on an array of objects in fp-ts?

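I'm trying to analyze data given as an array of nested objects. I want to use the fp-ts ecosystem, and I'm trying to figure out how to combine group-by computations with any predefined function (for example, computing the mean, median, mode, sum, or standard deviation).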

Example

I have an array of objects, where each object holds data about a different student. There are 3 students here.

const studentsGrades = [
  {
    name: 'john',
    age: 21,
    classes: {
      history: {
        grade: 89,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 95,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 81,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 77,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },

  {
    name: 'amanda',
    age: 20,
    classes: {
      history: {
        grade: 95,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 99,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 89,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 65,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },

  {
    name: 'rachel',
    age: 19,
    classes: {
      history: {
        grade: 80,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 90,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 100,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 88,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },
];

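I want to run different calculations. For example, what is the mean grade in physics? What is the median grade in literature? What is the standard deviation for the humanities category?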


One way I've reasoned about this is to define standalone functions that perform these calculations on an array. For example:
Mean

const calcMean = (arr: number[]): number => {
    return arr.reduce((acc, v, i, a) => acc + v / a.length, 0);
};

Median

const calcMedian = (arr: number[]): number | undefined => {
  if (!arr.length) return undefined;
  const s = [...arr].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 0 ? ((s[mid - 1] + s[mid]) / 2) : s[mid];
};

Standard deviation

const calcStandardDeviation = (arr: number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
};

OK, but what now? How do I apply any function of interest (i.e. calcMean(), calcMedian(), calcStandardDeviation()) to studentsGrades, grouping by the relevant key, to answer my analysis questions?

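If you're using fp-ts, you should return an Option from calcMedian instead of undefined. It's also good practice to type parameters as readonly arrays when the function doesn't modify them: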

import * as O from 'fp-ts/Option';

const calcMean = (arr: readonly number[]): number => {
  return arr.reduce((acc, v) => acc + v, 0) / arr.length;
};

const calcMedian = (arr: readonly number[]): O.Option<number> => {
  if (!arr.length) return O.none;
  const sorted = [...arr].sort((a, b) => a - b);
  const mid = Math.trunc(sorted.length / 2);
  return O.some(
    sorted.length % 2 === 0
      ? (sorted[mid - 1]! + sorted[mid]!) / 2
      : sorted[mid]!
  );
};

const calcStandardDeviation = (arr: readonly number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
};
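
One thing to watch: unlike calcMedian, calcMean and calcStandardDeviation are not guarded against short inputs. Below is a minimal sketch of what happens, repeating the two definitions so it runs on its own (the names emptyMean and singleStdDev are only for illustration):

```typescript
const calcMean = (arr: readonly number[]): number => {
  return arr.reduce((acc, v) => acc + v, 0) / arr.length;
};

const calcStandardDeviation = (arr: readonly number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
};

// Empty input: the sum is 0 and the length is 0, so the mean is 0 / 0.
const emptyMean = calcMean([]);

// Single value: the squared deviations sum to 0 and the divisor (length - 1)
// is also 0, so the sample variance is 0 / 0 as well.
const singleStdDev = calcStandardDeviation([90]);

console.log(emptyMean);    // NaN
console.log(singleStdDev); // NaN
```

If that matters for your data, the same Option treatment used for calcMedian could be applied to these two as well.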

Getting the grades per subject or category:

import * as RA from 'fp-ts/ReadonlyArray';
import * as O from 'fp-ts/Option';
import {pipe} from 'fp-ts/function';

const gradesByClass = (className: string): readonly number[] =>
  pipe(
    studentsGrades,
    RA.filterMap(({classes}) => O.fromNullable(classes[className]?.grade))
  );

const gradesByCategory = (categoryName: string): readonly number[] =>
  pipe(
    studentsGrades,
    RA.chain(({classes}) => Object.values(classes)),
    RA.filterMap(({category, grade}) => category === categoryName ? O.some(grade) : O.none)
  );
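
If you want every group at once rather than one lookup per call, a group-then-aggregate helper is another option. The sketch below is plain TypeScript so it has no dependencies. Row, toRows, aggregateBy, mean and sampleStdDev are hypothetical names, and the data is a trimmed copy of the question's (age and semester dropped). fp-ts also has a groupBy in fp-ts/NonEmptyArray that could do the bucketing step.

```typescript
// Trimmed copy of the question's data: only the fields used for grouping.
type ClassInfo = { grade: number; category: string };
type Student = { name: string; classes: Record<string, ClassInfo> };

const students: readonly Student[] = [
  {
    name: 'john',
    classes: {
      history: { grade: 89, category: 'humanities' },
      math: { grade: 95, category: 'quantitative' },
      physics: { grade: 81, category: 'quantitative' },
      literature: { grade: 77, category: 'humanities' },
    },
  },
  {
    name: 'amanda',
    classes: {
      history: { grade: 95, category: 'humanities' },
      math: { grade: 99, category: 'quantitative' },
      physics: { grade: 89, category: 'quantitative' },
      literature: { grade: 65, category: 'humanities' },
    },
  },
  {
    name: 'rachel',
    classes: {
      history: { grade: 80, category: 'humanities' },
      math: { grade: 90, category: 'quantitative' },
      physics: { grade: 100, category: 'quantitative' },
      literature: { grade: 88, category: 'humanities' },
    },
  },
];

// One flat row per (student, class) pair, so any field can serve as the group key.
type Row = { className: string; category: string; grade: number };

const toRows = (input: readonly Student[]): Row[] => {
  const rows: Row[] = [];
  for (const student of input) {
    for (const className of Object.keys(student.classes)) {
      const info = student.classes[className];
      if (info !== undefined) {
        rows.push({ className, category: info.category, grade: info.grade });
      }
    }
  }
  return rows;
};

// Bucket the grades by key, then run the aggregate once per bucket.
const aggregateBy = <B>(
  rows: readonly Row[],
  key: (row: Row) => string,
  aggregate: (grades: readonly number[]) => B
): Record<string, B> => {
  const groups: Record<string, number[]> = {};
  for (const row of rows) {
    const k = key(row);
    const bucket = groups[k] ?? [];
    bucket.push(row.grade);
    groups[k] = bucket;
  }
  const result: Record<string, B> = {};
  for (const k of Object.keys(groups)) {
    result[k] = aggregate(groups[k] ?? []);
  }
  return result;
};

const mean = (xs: readonly number[]): number =>
  xs.reduce((acc, x) => acc + x, 0) / xs.length;

// Sample standard deviation (divides by n - 1), as in the question.
const sampleStdDev = (xs: readonly number[]): number => {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1));
};

const rows = toRows(students);
const meanByClass = aggregateBy(rows, (r) => r.className, mean);
const stdDevByCategory = aggregateBy(rows, (r) => r.category, sampleStdDev);

console.log(meanByClass['physics']);         // 90
console.log(stdDevByCategory['humanities']); // roughly 10.69
```

Because the aggregate is just a parameter, any function from readonly number[] to a result (including one that returns an Option) can be passed in without changing the grouping code.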

Then you can use these functions like this:

calcMean(gradesByClass('physics'))
calcMedian(gradesByClass('literature'))
calcStandardDeviation(gradesByCategory('humanities'))