Generating and evaluating 200 Normals and reducing them

I am trying to estimate a normal density using a quadratic approximation in TensorFlow (code 4.14 from McElreath's Statistical Rethinking).

The code I have so far is:

import pandas as pd
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd

_BASE_URL = "https://raw.githubusercontent.com/rmcelreath/rethinking/Experimental/data"

HOWELL_DATASET_PATH = f"{_BASE_URL}/Howell1.csv"

# Load the Howell1 data and keep adults only
df = pd.read_csv(HOWELL_DATASET_PATH, sep=';')
df = df[df['age'] >= 18]

# 200 candidate values each for the mean and the standard deviation
mu = tf.linspace(start=140.0, stop=160.0, num=200)
sigma = tf.linspace(start=4.0, stop=9.0, num=200)

tf.reduce_sum(tfd.Normal(loc=mu, scale=sigma).log_prob(df.height))

This fails because df.height has shape (352,), whereas I am creating (200,) points at which to evaluate my Normal distribution.

However,

tf.reduce_sum(tfd.Normal(loc=mu, scale=sigma).log_prob(2))

tf.reduce_sum(tfd.Normal(loc=mu[0], scale=sigma[0]).log_prob(df.height))

both work.
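The difference between the failing call and the two working ones comes down to broadcasting: shapes are aligned from their trailing dimension, and two sizes are compatible only if they are equal or one of them is 1. The pure-Python helper below is a minimal sketch of that rule; `broadcast_shape` is a hypothetical name used for illustration, not a TensorFlow function.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the broadcast shape of the given shapes, or raise ValueError."""
    result = []
    # Align shapes on their trailing dimensions; missing leading dims count as 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


# 200 distributions against a scalar observation: works.
print(broadcast_shape((200,), ()))        # (200,)
# One scalar distribution against 352 observations: works.
print(broadcast_shape((), (352,)))        # (352,)
# 200 distributions against 352 observations: fails.
try:
    broadcast_shape((200,), (352,))
except ValueError as err:
    print("error:", err)
# A trailing axis of length 1 makes the shapes compatible.
print(broadcast_shape((200, 1), (352,)))  # (200, 352)
```

The last call suggests the fix: give the distribution parameters an extra axis of length 1 so that the observations broadcast along it.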

I need to create a (200, 352) tensor over my grid, with one Normal for each mu, sigma pair, and then evaluate it on my sample data, df. My question is: how do I do that?

So, I found that one way is to create a (200, 200, 352) grid and then reshape it; the rest of the computation follows directly.

import pandas as pd
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd

_BASE_URL = "https://raw.githubusercontent.com/rmcelreath/rethinking/Experimental/data"

HOWELL_DATASET_PATH = f"{_BASE_URL}/Howell1.csv"

df = pd.read_csv(HOWELL_DATASET_PATH, sep=';')
df = df[df['age'] >= 18]

mu = tf.linspace(start=140.0, stop=160.0, num=200)
sigma = tf.linspace(start=7.0, stop=9.0, num=200)

# Build a (200, 200, 352) grid; the third input only supplies the observation axis.
# Note: `variances` actually holds standard deviations (it is passed as `scale`).
means, variances, _ = tf.meshgrid(mu, sigma, np.zeros((352,)).astype(np.float32))
# Flatten the grid: each row holds one (mu, sigma) pair repeated 352 times
means = tf.reshape(means, [40000, 352])
variances = tf.reshape(variances, [40000, 352])

normal = tfd.Normal(loc=means, scale=variances)

# Sum over the 352 observations: one log-likelihood per grid point
log_lik = tf.reduce_sum(normal.log_prob(df.height), axis=1)

# Log-priors, evaluated on the same grid
logprob_mu = tfd.Normal(178.0, 20.0).log_prob(means)
logprob_sigma = tfd.Uniform(low=0.0, high=50.0).log_prob(variances)

# Every column of a row is identical, so column 0 gives the prior per grid point
log_joint_prod = log_lik + logprob_mu[:, 0] + logprob_sigma[:, 0]
# Subtract the maximum before exponentiating to avoid underflow
joint_prob_tf = tf.exp(log_joint_prod - tf.reduce_max(log_joint_prod))
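The subtraction of `tf.reduce_max(log_joint_prod)` in the last line is there to avoid underflow: a sum of 352 log densities is a large negative number, and exponentiating it directly rounds to zero. A small pure-Python sketch of the same step, with made-up log values chosen only for illustration:

```python
import math

# Hypothetical log joint values, roughly the size a sum of 352 log densities can reach.
log_joint = [-1230.0, -1228.5, -1235.2]

# Exponentiating directly underflows: every entry becomes exactly 0.0.
naive = [math.exp(v) for v in log_joint]

# Subtracting the maximum first rescales the largest entry to exp(0) = 1.
shift = max(log_joint)
relative = [math.exp(v - shift) for v in log_joint]

# Dividing by the sum gives normalized weights over the grid points.
total = sum(relative)
weights = [r / total for r in relative]

print(naive)
print(relative)
print(weights)
```

The shift cancels when the values are normalized, so the relative probabilities are unchanged; only the numerical range is.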

I think TFP's joint distributions express this nicely:

import matplotlib.pyplot as plt

mu = tf.linspace(start=140.0, stop=160.0, num=200)
sigma = tf.linspace(start=7.0, stop=9.0, num=200)

def mk_joint(nobs):
  # Priors on mu and sigma, plus nobs i.i.d. Normal heights given (mu, sigma)
  return tfd.JointDistributionNamed(dict(
      mu=tfd.Normal(178., 20.),
      sigma=tfd.Uniform(0., 50.),
      height=lambda mu, sigma: tfd.Sample(tfd.Normal(loc=mu, scale=sigma), nobs)
  ))

joint = mk_joint(len(df))
joint.sample()  # draw once to check that the model is well formed
print(f'joint event shape: {joint.event_shape}')

# mu has shape (200, 1) and sigma has shape (200,), so lp has shape (200, 200):
# rows index mu, columns index sigma
lp = joint.log_prob(dict(mu=mu[:, tf.newaxis], sigma=sigma, height=df.height))

plt.imshow(lp)
plt.xlabel('sigma')
plt.xticks(np.arange(len(sigma))[::10], sigma[::10].numpy().round(2), rotation=90)
plt.ylabel('mu')
plt.yticks(np.arange(len(mu))[::10], mu[::10].numpy().round(2))
plt.show()

=> joint event shape: {'sigma': TensorShape([]), 'mu': TensorShape([]), 'height': TensorShape([352])}