2D numpy array where spacing between items is defined by a function

Question

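I need a list or 2D array of integers between a minimum and a maximum value, where the spacing between the integers varies inversely with a distribution function. In other words, the points should be densest where the distribution peaks. In my case, something like a Weibull probability density function with a k parameter of 1.5 would work well. The output would look something like this: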

>>> min = 1
>>> max = 500
>>> peak = 100
>>> n = 18
>>> myfunc(min, max, peak, n)
[1, 50, 75, 88, 94, 97, 98, 99, 100, 102, 106, 112, 135, 176, 230, 290, 360, 500]

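I have tried filling a numpy array using the np.random.weibull() function, but this isn't good enough; with a list of only 20 items, the randomization means the spacing is unsatisfactory. I would prefer to avoid drawing random numbers from a distribution and instead do what I describe above, controlling the spacing directly. Thank you.

Edit: I mention the Weibull distribution because it is asymmetric, but of course any similar distribution function that gives a comparable result would also work, and might be more appropriate.

Edit2: So I want a non-linear numpy space!

Edit3: As I answered in one of the comments, I want to avoid generating random numbers so that the function output is identical every time it is run with the same input parameters.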

Answer 1

If I have understood your question correctly, this function should do what you want:

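import numpy as np

def weibullspaced(min, max, k, arrsize):
    # Random Weibull draws serve as the gaps between consecutive points.
    wb = np.random.weibull(k, arrsize - 1)
    spaced = np.zeros((arrsize,))
    spaced[1:] = np.cumsum(wb)
    # Rescale so the points span exactly [min, max].
    diff = max - min
    spaced *= diff / spaced[-1]
    return min + np.rint(spaced)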

You can of course substitute any distribution you like, but you said you wanted Weibull. Is this the functionality you are looking for?
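Since the function above still draws random numbers, its output changes on every run. A deterministic variant can be sketched by feeding evenly spaced quantiles through the inverse Weibull CDF instead of sampling. This is only a sketch under stated assumptions: the name weibull_spaced and its parameters are made up for illustration, the scale is chosen so that the mode of the distribution falls on peak (which requires k > 1), and the distribution is cut off at the upper bound.

```python
import numpy as np

def weibull_spaced(lo, hi, peak, n, k=1.5):
    """Return n integers from lo to hi, densest near peak (assumes k > 1)."""
    # The mode of a Weibull(k, scale) is scale * ((k - 1) / k) ** (1 / k);
    # choose the scale so that the mode lands on peak (measured from lo).
    scale = (peak - lo) / ((k - 1.0) / k) ** (1.0 / k)
    # CDF value at the upper bound, so the last quantile maps onto hi.
    q_max = 1.0 - np.exp(-((hi - lo) / scale) ** k)
    # Evenly spaced quantiles: no random numbers involved.
    q = np.linspace(0.0, q_max, n)
    # Inverse Weibull CDF: x = scale * (-ln(1 - q)) ** (1 / k).
    x = scale * (-np.log1p(-q)) ** (1.0 / k)
    return lo + np.rint(x).astype(int)

print(weibull_spaced(1, 500, 100, 18))
```

Equal steps in the quantile correspond to small steps in x where the density is high, so the points bunch up around the peak. With a large n, rounding to integers may produce repeated values near the peak, which would need to be removed separately.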

Answer 2

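Here is a rather inelegant but simple solution to my own question. I simplified things by using a triangular distribution function, which is convenient because the minimum and maximum are easy to specify. The function named "spacing()" returns the step size at a given x value according to the specified mathematical function. After incrementing through the while loop, I append the maximum value to the list so that the full range is covered. I then convert to integers when casting to a numpy array.

The drawback of this approach is that I have to specify the minimum and maximum step sizes by hand. I would rather specify the length of the returned array!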

import numpy as np
import math

Min = 1.0
Max = 500.0
peak = 100.0

minstep = 1.0
maxstep = 50.0


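def spacing(x):
    # Triangle distribution: the step size shrinks towards the peak and grows after it.
    if x < peak:
        # Since we are calculating gradients I keep everything as floats for now.
        grad = (minstep - maxstep)/(peak - Min)
        return grad*x + maxstep
    elif x == peak:
        return minstep
    else:
        grad = (maxstep-minstep)/(Max-peak)
        # Note: the ramp is taken from x = 0, not from x = peak, so the step
        # just past the peak is about 13 here rather than minstep.
        return grad*x + minstep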

def myfunc(Min, Max, peak, minstep, maxstep):
    x = Min
    chosen = []
    while x < Max:
        space = spacing(x)
        chosen.append(x)
        x += space
    chosen.append(Max)

    # I cheat with the integers by casting the list to ints right at the end: 
    chosen = np.array(chosen, dtype = 'int')
    return chosen

print(myfunc(1.0, 500.0, 100.0, 1.0, 50.0))

Output:

[  1  50  75  88  94  97  99 100 113 128 145 163 184 208 235 264 298 335  378 425 478 500]
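One possible way to specify the length of the returned array instead of the step sizes is to integrate a density numerically and map evenly spaced quantiles back through the inverse of that cumulative curve. The following is a minimal sketch, not a tested replacement for the code above: the names density_spaced, triangle and grid_size are hypothetical, and the density is assumed to be positive everywhere strictly between the two bounds.

```python
import numpy as np

def density_spaced(lo, hi, n, density, grid_size=10001):
    """Return n integers from lo to hi, packed more tightly where density(x) is larger."""
    x = np.linspace(lo, hi, grid_size)
    pdf = density(x)
    # Cumulative integral of the density (trapezoid rule), normalized to [0, 1].
    steps = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(x)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    # Evenly spaced quantiles mapped back through the inverse CDF by interpolation.
    q = np.linspace(0.0, 1.0, n)
    return np.rint(np.interp(q, cdf, x)).astype(int)

def triangle(x, lo=1.0, hi=500.0, peak=100.0):
    """Unnormalized triangular density: 0 at lo and hi, 1 at peak."""
    x = np.asarray(x, dtype=float)
    rising = (x - lo) / (peak - lo)
    falling = (hi - x) / (hi - peak)
    return np.minimum(rising, falling)

print(density_spaced(1, 500, 18, triangle))
```

Because the density is passed in as a function, the triangle could be swapped for a Weibull-like curve or any other shape without changing density_spaced itself. No random numbers are involved, so the same inputs always give the same array.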
