What is the Octave equivalent of MATLAB's discretize?

x=rand(1,10); bins=discretize(x,0:0.25:1);

A sample run of the line above in MATLAB R2020b produces the following output for x and bins.

x = 0.1576, 0.9706, 0.9572, 0.4854, 0.8003, 0.1419, 0.4218, 0.9157, 0.7922, 0.9595

bins = 1, 4, 4, 2, 4, 1, 2, 4, 4, 4

The built-in function discretize is not yet implemented in Octave. How can I obtain the same bin values in Octave? Can anyone enlighten me? I am using Octave 6.2.0.

You can use interp1 with the 'previous' option:

edges = 0:0.25:1;
x = [0.1576, 0.9706, 0.9572, 0.4854, 0.8003, 0.1419, 0.4218, 0.9157, 0.7922, 0.9595];
bins = interp1 (edges, 1:numel(edges), x, 'previous')

Another function that can be used is lookup:

bins = lookup(edges, x);
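
As a quick sanity check, the two approaches above can be compared against the bins MATLAB reported in the question. This is only a sketch: the variable names expected, bins_interp and bins_lookup are mine, and the check covers just the ten in-range sample values, so it says nothing about values outside the edges.

```octave
% Sketch: compare interp1/'previous' and lookup with the MATLAB output from the question.
edges = 0:0.25:1;
x = [0.1576, 0.9706, 0.9572, 0.4854, 0.8003, 0.1419, 0.4218, 0.9157, 0.7922, 0.9595];
expected = [1, 4, 4, 2, 4, 1, 2, 4, 4, 4];   % bins quoted from MATLAB R2020b

bins_interp = interp1 (edges, 1:numel (edges), x, 'previous');
bins_lookup = lookup (edges, x);

% isequal compares values only, so the class of the index output does not matter.
disp (isequal (bins_interp, expected))
disp (isequal (bins_lookup, expected))
```

If both approaches agree with MATLAB on this sample, each comparison prints 1.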

Here I compared the performance of interp1, lookup, and histc, as has been recommended:

edges = 0:0.025:1;
x = sort (rand(1,1000000));

disp ("-----INTERP1-------")
tic;bins = interp1 (edges, 1:numel(edges), x, 'previous');toc

disp ("-----HISTC-------")
tic;[ ~, bins ] = histc (x, edges);toc

disp ("-----LOOKUP-------")
tic; bins = lookup (edges, x);toc

Result:

-----INTERP1-------
Elapsed time is 0.0593688 seconds.
-----HISTC-------
Elapsed time is 0.0224149 seconds.
-----LOOKUP-------
Elapsed time is 0.0114679 seconds.

tl;dr:

x = [ 0.1576, 0.9706, 0.9572, 0.4854, 0.8003, 0.1419, 0.4218, 0.9157, 0.7922, 0.9595 ]
[ ~, Bins ] = histc( x, 0: 0.25: 1 )
% Bins = 1   4   4   2   4   1   2   4   4   4

Explanation:

According to the MATLAB manual:

Earlier versions of MATLAB® use the hist and histc functions as the primary way to create histograms and calculate histogram bin counts [...] The use of hist and histc in new code is discouraged [...] histogram, histcounts, and discretize are the recommended histogram creation and computation functions for new code.

The behavior of discretize is similar to that of the histcounts function. Use histcounts to find the number of elements in each bin. On the other hand, use discretize to find which bin each element belongs to (without counting).

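Octave has not yet implemented discretize, but it still supports histc which, as the above implies, does essentially the same thing through a different interface.

According to the Octave documentation for histc:
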
-- [N, IDX] = histc ( X, EDGES )
 Compute histogram counts.

 [...]

 When a second output argument is requested an index matrix is also
 returned.  The IDX matrix has the same size as X.  Each element of
 IDX contains the index of the histogram bin in which the
 corresponding element of X was counted.
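
To make the two outputs concrete, here is a small sketch on made-up data (the values in x are purely illustrative and not from the question). N holds the per-bin counts, while IDX says which bin each element landed in.

```octave
% Sketch: both outputs of histc on a tiny, hypothetical data set.
x = [0.1, 0.3, 0.3, 0.9];
edges = 0:0.25:1;

[n, idx] = histc (x, edges);

% n has one entry per edge: counts for [0,0.25), [0.25,0.5), [0.5,0.75),
% [0.75,1), plus a final entry for values exactly equal to the last edge.
disp (n)
% idx has the same size as x and gives the bin index of each element.
disp (idx)
```

Here idx is the discretize-style result, and n is what you would get by counting how often each index occurs.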

So the answer to your question is

[ ~, Bins ] = histc( x, 0:0.25:1 )

Using your example:

x = [ 0.1576, 0.9706, 0.9572, 0.4854, 0.8003, 0.1419, 0.4218, 0.9157, 0.7922, 0.9595 ]
[ ~, Bins ] = histc( x, 0: 0.25: 1 )
% Bins = 1   4   4   2   4   1   2   4   4   4

PS. If you prefer the interface provided by discretize, you can easily create this function yourself by wrapping histc appropriately:

discretize = @(X, EDGES) nthargout( 2, @histc, X, EDGES )

You can now use this discretize function directly, as in your example.
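
One caveat worth sketching: histc and discretize disagree at the boundaries. As far as I know, MATLAB's discretize puts a value equal to the last edge into the last bin and returns NaN for values outside the edges, whereas histc gives a value equal to the last edge its own index, numel(edges), and returns 0 for out-of-range values. The wrapper below is a minimal sketch that patches those two cases. The name discretize_sketch is hypothetical, and it only covers the two-argument, numeric-edges form.

```octave
1;  % marks this file as a script so the function can be defined inline

% Sketch of a discretize-like wrapper around histc (hypothetical name).
function bins = discretize_sketch (x, edges)
  [~, bins] = histc (x, edges);
  % histc gives x == edges(end) its own index; fold it into the last bin.
  bins(bins == numel (edges)) = numel (edges) - 1;
  % histc reports 0 for values outside the edges; use NaN instead.
  bins(bins == 0) = NaN;
end

edges = 0:0.25:1;
bins = discretize_sketch ([0, 0.25, 1, 1.5, -0.1], edges);
disp (bins)
```

For data strictly inside the edges, such as the sample in the question, this gives the same result as the plain histc call.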