在 SML 中查找 int 列表的模式以及它在没有库函数的情况下出现的位置

Question

我正在尝试查找出现频率最高的模式或值。我想要一个像 :

这样的函数

mode:' 'a list -> (''a * int) list

它 return 是模式和它发生的地方，除非有平局然后 return 所有出现都是这样的：

mode([1,1,2,3,5,8]) ===> [(1,2)]
mode([1,3,5,2,3,5]) ===> [(3,2),(5,2)]
mode([true,false,true,true]) ====>[(true,3)]

我正在尝试在没有 SML 库函数的情况下执行此操作。

到目前为止我得到了：

fun mode(L)=
if null L then nil
else if hd L= hd (tl L) then 1+mode(hd(tl L))
else mode(tl L);

我知道这是不对的我想我很好奇你们是如何保持模式发生位置和模式是什么的索引以及return它们作为列表中的元组。

Answer 1

您正在尝试通过多个更简单的练习来解决多个部分的练习。从您目前的进度来看，您是否考虑过解决一些非常相似的练习以达到最终目标？这通常是解决编程问题时的好建议：将您当前的问题简化为更简单的问题并解决它们。

我会先尝试解决这个问题

在列表的元素上构建 histogram : ''a list -> (''a * int) list：
```
fun histogram [] = ...
  | histogram (x::xs) = ...
```
通过将每个 x 及其 count 插入直方图中来实现。
```
fun insert (x, []) = ...
  | insert (x, (y, count) :: hist) = ...
```
并编写一些您可以偶尔执行的测试。

查找列表的mode : ''a list -> ''a:

fun mode xs = ... (histogram xs)

通过在直方图中找到计数最大的元素来执行此操作：

fun findMax [] = ... (* what if the list/histogram is empty? *)
  | findMax [(x, count)] = ...
  | findMax ((x, count) :: (y, count) :: hist) = ...

并最终尝试解决这个问题

当您很好地掌握了递归表示和导航常规直方图时，您可以创建带注释的 histogram : (''a * int * int list) list，它不仅包含每个元素的频率，还包含它们的位置在输入列表中：
```
fun histogram_helper ([], _) = ...
  | histogram_helper (x::xs, i) = ... histogram_helper (xs, i+1) ...
```
通过将每个 x 及其 count 和位置 i 以及先前找到的位置 is 插入直方图中来执行此操作：
```
fun insert (x, i, []) = ...
  | insert (x, i, (y, count, is) :: hist) = ...
```
查找列表的（可能多个）mode : ''a list -> (''a * int list) list：
```
fun mode xs = ... (histogram xs)
```
通过在直方图中找到计数最大的（可能是多个）元素来做到这一点：
```
fun findMax ([],                     countMax, tmpModes) = ...
  | findMax ((x, count, is) :: hist, countMax, tmpModes) = ...
```
其中 countMax : int 是 tmpModes : (''a * int * int list) list 中重复的频率。这里countMax和tmpModes是累加结果参数。通过确定 (x, count, is) 是否应该被丢弃以支持所有 tmpModes，或者它应该被添加到 tmpModes，或者它应该被选择以支持所有 tmpNodes 来做到这一点

I am curious on how you both keep the indices of where the mode occurs and what the mode is and return them as tuples in a list.

是的，这不是微不足道的。使用我建议的子问题划分，回答这个问题取决于我们是在 histogram 函数还是 findMax 函数中：

在 histogram 中，您可以将索引存储为包含元素和频率的元组的一部分。在 findMax 中，由于您可能会收集多个结果，因此您需要跟踪哪个频率最高 (countMax) 以及 temporary 模式选择是 (tmpModes);在以后的递归调用中进行替换或添加。

所以回答你的问题：在一个累积参数中。

以及对您的代码片段的一些反馈

fun mode(L)=
if null L then nil
else if hd L= hd (tl L) then 1+mode(hd(tl L))
else mode(tl L);

使用模式匹配代替 null、hd 和 tl:

fun count_4s [] = 0
  | count_4s (x::xs) = (if x = 4 then 1 else 0) + count_4s xs

fun count_ns ([],    _) = 0
  | count_ns (x::xs, n) = (if x = n then 1 else 0) + count_ns (xs, n)

fun count_12 ([], ones, twos) = (ones, twos)
  | count_12 (x::xs, ones, twos) =
      if x = 1 then count_12 (xs, ones+1, twos) else
      if x = 2 then count_12 (xs, ones, twos+1) else
      count_12 (xs, ones, twos)

fun count_abc ([], result) = result
  | count_abc (x::xs, ((a, ca), (b, cb), (c, cc))) =
      count_abc (xs, if x = a then ((a, ca+1), (b, cb), (c, cc)) else
                     if x = b then ((a, ca), (b, cb+1), (c, cc)) else
                     if x = c then ((a, ca), (b, cb), (c, cc+1)) else
                     ((a, ca), (b, cb), (c, cc)))

构建直方图是对此的一种扩展，而不是像 4 这样的固定值，或者像 ones 和 twos 这样的固定值，你有一个它们的整个列表，你必须动态地寻找你得到的那个，x，并确定它是否需要添加到直方图中或在直方图中递增。

最好的方法是在辅助函数中执行此操作，例如，如果 count_abc 是使用辅助函数制作的，

fun insert_abc (x, ((a, ca), (b, cb), (c, cc))) =
    if x = a then ((a, ca+1), (b, cb), (c, cc)) else
    if x = b then ((a, ca), (b, cb+1), (c, cc)) else
    if x = c then ((a, ca), (b, cb), (c, cc+1)) else
    ((a, ca), (b, cb), (c, cc)))

fun count_abc ([], result) = result
  | count_abc (x::xs, result) =
      count_abc (xs, insert (x, result))

仅代替直方图表示

(''a * int) * (''a * int) * (''a * int)

你想要

(''a * int) list

和insert应该是递归的而不是insert_abc是如何重复的。

在 SML 中查找 int 列表的模式以及它在没有库函数的情况下出现的位置

Finding the mode of an int list in SML and where it occurs without library functions

recursion

ml

sml

smlnj

statistical-mode

我会先尝试解决这个问题

并最终尝试解决这个问题

以及对您的代码片段的一些反馈