Matlab: Confusion regarding unit of entropy to use in an example

Question

Figure 1. Hypothetical plot. y-axis: mean entropy. x-axis: bits.

This question is a continuation of a previous question.

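I want to compute the entropy of a random variable that is a discretized (0/1) version of a continuous random variable x. The random variable represents the state of a nonlinear dynamical system called the Tent Map. Iterating the Tent Map produces a time series of length N.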

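The code should exit once the entropy of the discretized time series equals the entropy of the dynamical system. The entropy of the system is known theoretically: H = log_e(2) = ln(2), approximately 0.69. The objective of the code is to find the number of iterations, j, needed to produce an entropy equal to the system entropy H.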

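Question 1: When I compute the entropy of the binary time series, which is the information message, should I compute it in the same base as H? Or should I convert the value of H to bits, because the information message is in 0/1? The two give different results, i.e. different values of j.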

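Question 2: The probability of 0 or of 1 may be zero, so the corresponding entropy can become infinite. To prevent this, I thought of adding an if-else check. However, the check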

if entropy(:,j)==NaN
     entropy(:,j)=0;
 end

does not seem to work. I would be grateful for ideas and help in solving this. Thank you.

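Update: I implemented the suggestions and answers to correct the code. However, my earlier logic for solving the problem was incorrect. In the revised code, I want to compute the entropy for time series with lengths of 2, 8, 16, and 32 bits. The entropy is computed for each code length, and the computation is repeated N times per code length, each time starting from a different initial condition of the dynamical system. The aim is to check at which code length the entropy becomes 1. The entropy-versus-bits plot should start at zero, increase gradually toward 1, and then saturate, staying constant for all remaining bits. I am unable to get this curve (Figure 1). Thank you for helping me find where I went wrong.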

clear all

 H = 1  %in bits
 Bits = [2,8,16,32,64];
threshold = 0.5;
N=100;  %Number of runs of the experiment


for r = 1:length(Bits)


t = Bits(r)

for Runs = 1:N
    x(1)            = rand;

    for j = 2:t


        % Iterating over the Tent Map


        if x(j - 1) < 0.5
            x(j) = 2 * x(j - 1);
        else
            x(j) = 2 * (1 - x(j - 1));
        end % if
    end
    %Binarizing the output of the Tent Map
    s  = (x >=threshold);
    p1 = sum(s == 1 ) / length(s);  %calculating probability of a 1
    p0 = 1 - p1;  % calculating probability of a 0

    entropy(t) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); %calculating entropy in bits

    if isnan(entropy(t))
        entropy(t) = 0;
    end



    %disp(abs(lambda-H))



end


  Entropy_Run(Runs) =  entropy(t)
end
Entropy_Bits(r) = mean(Entropy_Run)
plot(Bits,Entropy_Bits)

Answer 1

First, you just need this function:

function [mean_entropy, bits] = compute_entropy(bits, blocks, threshold, replicate)

    if replicate
        disp('Replication is ON');
    else
        disp('Replication is OFF');
    end

    %%
    % Populate random vector
    if replicate
        seed = 849;
        rng(seed);
    else
        rng('shuffle'); % seed from the clock; rng('default') would reset to the same fixed seed on every call
    end

    rs = rand(blocks, 1); % one random initial condition per run (rand(blocks) returns a blocks-by-blocks matrix)


    %%
    % Get random
    trial_entropy = zeros(length(bits), length(rs)); % rows: bit lengths, columns: runs

    for r = 1:length(rs)

        bit_entropy = zeros(length(bits), 1); % H

        % Traverse bit trials
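        for b = 1:length(bits) % b indexes the entries of bits

            n = bits(b); % N: series length for this trial (bits(b), not the index b)
            tent_map = zeros(n, 1); %Preallocate for memory management

            %Initialize
            tent_map(1) = rs(r);

            for j = 2:n % j is the iterator, n is the current number of bits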

                if tent_map(j - 1) < threshold
                    tent_map(j) = 2 * tent_map(j - 1);
                else
                    tent_map(j) = 2 * (1 - tent_map(j - 1));
                end % if
            end

            %Binarize the output of the Tent Map
            s  = (tent_map >= threshold); % logical vector: 1 where the state is at or above the threshold
            p1 = sum(s) / length(s);  %calculate probability of a 1
            %p0 = 1 - p1;  % calculate probability of a 0

            bit_entropy(b) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); %calculate entropy in bits

            if isnan(bit_entropy(b))
                bit_entropy(b) = 0;
            end

            %disp(abs(lambda-h))

        end

        trial_entropy(:, r) = bit_entropy;

        disp('Trial Statistics')
        data = get_summary(bit_entropy);
        disp('Mean')
        disp(data.mean);
        disp('SD')
        disp(data.sd);

    end

    % Mean entropy for each bit length, averaged across trials
    % (rows of trial_entropy are bit lengths, columns are trials)
    mean_entropy = mean(trial_entropy, 2);

    disp('Overall Statistics')
    disp('Mean')
    disp(mean_entropy);
    disp('SD')
    disp(std(trial_entropy, 0, 2));

    function summary = get_summary(entropy)
        summary = struct('mean', mean(entropy), 'sd', std(entropy));
    end
end

Then you just need this script:

% Entropy Script
clear all

%% Settings
replicate = false; % = false % Use true for debugging only.
%H = 1;  %in bits
Bits = 2.^(1:6);
Threshold = 0.5;
%Tolerance = 0.001;
Blocks = 100;  %Number of runs of the experiment

%% Run
[mean_entropy, bits] = compute_entropy(Bits, Blocks, Threshold, replicate);

%What we want
%plot(bits, mean_entropy);

%What we have
plot(1:length(mean_entropy), mean_entropy);
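
For comparison, the experiment described in the question's update (mean entropy per series length, averaged over random initial conditions) can be sketched in a single script. This is a minimal illustration and not the answer author's code. The names lengths, numRuns and meanEntropy are chosen here, and the series lengths are the ones listed in the question text.

```matlab
% Sketch: mean binary entropy of the thresholded Tent Map versus series length.
lengths   = [2 8 16 32];      % series lengths taken from the question text
numRuns   = 100;              % random initial conditions per length (assumed)
threshold = 0.5;

meanEntropy = zeros(size(lengths));
for k = 1:numel(lengths)
    n = lengths(k);
    h = zeros(numRuns, 1);                    % entropy of each run, default 0
    for r = 1:numRuns
        x = zeros(n, 1);
        x(1) = rand;                          % random initial condition
        for j = 2:n
            if x(j - 1) < 0.5
                x(j) = 2 * x(j - 1);
            else
                x(j) = 2 * (1 - x(j - 1));
            end
        end
        p1 = sum(x >= threshold) / n;         % fraction of 1s in the binarized series
        if p1 > 0 && p1 < 1
            h(r) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1);   % entropy in bits
        end                                   % p1 of 0 or 1 leaves h(r) at 0
    end
    meanEntropy(k) = mean(h);                 % average over runs for this length
end

plot(lengths, meanEntropy, '-o');
xlabel('Bits'); ylabel('Mean entropy (bits)');
```

Each entry of meanEntropy corresponds to one entry of lengths, so the plot uses the bit lengths themselves on the x-axis rather than an index.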

Answer 2

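For question 1, H and entropy can be in units of either nats or bits, as long as both are computed in the same units. In other words, you should use log for both or log2 for both. In the code sample you provided, H and entropy are correctly computed in consistent units of nats. If you prefer to work in bits, converting H should give you H = log(2)/log(2) = 1 (or, using the conversion factor 1/log(2) ~ 1.443, H ~ 0.69 * 1.443 ~ 1).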
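
The conversion can be checked numerically. The following is a minimal sketch rather than part of the original code. The variable names and the example probability p1 = 0.5 are chosen here for illustration.

```matlab
% Binary entropy of a fair 0/1 source, computed in both units.
p1 = 0.5;                                            % illustrative probability of a 1
H_nats = -p1 * log(p1)  - (1 - p1) * log(1 - p1);    % natural log gives nats
H_bits = -p1 * log2(p1) - (1 - p1) * log2(1 - p1);   % log base 2 gives bits

% The two units differ only by the constant factor log(2).
H_bits_from_nats = H_nats / log(2);

fprintf('H = %.4f nats = %.4f bits\n', H_nats, H_bits);
```

Comparing a time-series entropy in nats against ln(2) is therefore the same test as comparing it in bits against 1, provided both sides use the same logarithm.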

For question 2, as @noumenal already pointed out, you can check for NaN using isnan. Alternatively, you could check whether p1 lies within (0,1), excluding 0 and 1:

if (p1 > 0 && p1 < 1)
    entropy(:,j) = -p1 * log(p1) - (1 - p1) * log(1 - p1); %calculating entropy  in natural base e
else
    entropy(:, j) = 0;
end
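
The == NaN comparison in the question never triggers because, under IEEE 754 floating-point rules, NaN compares unequal to everything, including itself. A short sketch of the degenerate case, with illustrative variable names:

```matlab
p1 = 0;                                          % degenerate case: no 1s in the series
h  = -p1 * log2(p1) - (1 - p1) * log2(1 - p1);   % 0 * (-Inf) yields NaN

disp(h == NaN)     % always false, even though h is NaN
disp(isnan(h))     % true: isnan is the reliable test

if isnan(h)
    h = 0;         % convention: 0 * log(0) is taken as 0
end
```

Note that the value produced here is NaN rather than Inf, which is why isnan (or the range check on p1) is the appropriate guard.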

matlab

signal-processing

entropy