Python 求数值体积积分的函数?

Python function to find the numeric volume integral?

目标

我想计算数值标量场的 3D 体积积分。

代码

对于这个post,我将使用一个可以精确计算积分的例子。因此,我选择了以下功能:

在Python中,我定义了函数和一组3D点,然后在这些点上生成离散值:

import numpy as np


# Make data.
def function(x, y, z):
    return x**y**z

N = 5
grid = np.meshgrid(
    np.linspace(0, 1, N),
    np.linspace(0, 1, N),
    np.linspace(0, 1, N)
)

points = np.vstack(list(map(np.ravel, grid))).T

x = points[:, 0]
y = points[:, 1]
z = points[:, 2]

values = [function(points[i, 0], points[i, 1], points[i, 2])
          for i in range(len(points))]

问题

如果我不知道底层函数,即如果我只有坐标 (x, y, z) 和 values,我如何找到积分?

实现您想要的目标的最简单方法可能是 scipy 的 integration function。这是你的例子:

from scipy import integrate

# Make data.
def func(x,y,z):
    return x**y**z

ranges = [[0,1], [0,1], [0,1]]

result, error = integrate.nquad(func, ranges)

您是否知道您创建的函数与您在图像中显示的函数不同。您创建的是指数 (x^y^z),而您显示的只是乘法。如果要表示图像中的函数,请使用

def func(x,y,z):
    return x*y*z

希望这能回答您的问题,否则请发表评论!

编辑:

误读了您的post。如果您只有结果,并且它们的间隔不规则,则您将不得不找出某种形式的插值(即线性)和查找 -table。如果您不知道如何创建它,请告诉我。如果您将 func 定义为 return 来自原始数据

的插值,则仍然可以使用其余的陈述答案

解决此问题的一个好方法是使用 scipytplquad 集成。然而,要使用它,我们需要一个函数而不是一个浊点。

一个简单的解决方法是使用插值器,以获得近似于我们的浊点的函数 - 例如,如果数据位于规则网格上,我们可以使用 scipy 的 RegularGridInterpolator :

import numpy as np
from scipy import integrate
from scipy.interpolate import RegularGridInterpolator

# Make data.
def function(x,y,z):
    return x*y*z

N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)

values = function(*np.meshgrid(x,y,z, indexing='ij'))

# Interpolate:
function_interpolated = RegularGridInterpolator((x, y, z), values)

# tplquad integrates func(z,y,x)
f = lambda z,y,x : my_interpolating_function([z,y,x])

result, error = integrate.tplquad(f, xmin, xmax, lambda _: ymin, lambda _:ymax,lambda *_: zmin, lambda *_: zmax)

在上面的例子中,我们得到 result = 0.12499999999999999 - 足够接近了!

第一个答案很好地解释了处理这个问题的主要方法。只是想通过展示 sklearn 包和机器学习回归的强大功能来说明另一种方法。

在 3D 中进行 meshgrid 会得到一个非常大的 numpy 数组,

import numpy as np

N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1

x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)

grid = np.array(np.meshgrid(x,y,z, indexing='ij'))
grid.shape = (3, 5, 5, 5) # 2*5*5*5 = 250 numbers

250 个数字在视觉上不是很直观。具有不同的可能索引('ij' 或 'xy')。使用回归,我们可以用很少的输入点 (15-20) 得到相同的结果。

# building random combinations from (x,y,z)

X = np.random.choice(x, 20)[:,None]
Y = np.random.choice(y, 20)[:,None]
Z = np.random.choice(z, 20)[:,None]

xyz = np.concatenate((X,Y,Z), axis = 1)
data = np.multiply.reduce(xyz, axis = 1)

所以输入(网格)只是一个 2D numpy 数组,

xyz.shape
(20, 3)

有了相应的数据,

data.shape = (20,)

现在回归函数和积分,

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from scipy import integrate

pipe=Pipeline([('polynomial',PolynomialFeatures(degree=3)),('modal',LinearRegression())])
pipe.fit(xyz, data)

def func(x,y,z):
    return pipe.predict([[x, y, z]])

ranges = [[0,1], [0,1], [0,1]]

result, error = integrate.nquad(func, ranges)
print(result)
0.1257

这种方法在点数有限的情况下很有用。

根据您的要求,听起来最合适的技术是 Monte Carlo 集成:

# Step 0 start with some empirical data
observed_points = np.random.uniform(0,1,size=(10000,3))

unknown_fn = lambda x: np.prod(x) # just used to generate fake values

observed_values = np.apply_along_axis(unknown_fn, 1, observed_points) 

K = 1000000

# Step 1 - assume that f(x,y,z) can be approximated by an interpolation
# of the data we have (you could get really fancy with the 
# selection of interpolation method - we'll stick with straight lines here)

from scipy.interpolate import LinearNDInterpolator
f_interpolate = LinearNDInterpolator(observed_points, observed_values)

# Step 2 randomly sample from within convex hull of observed data
# Step 2a - Uniformly sample from bounding 3D-box of data
lower_bounds = observed_points.min(axis=0)
upper_bounds = observed_points.max(axis=0)

sampled_points = np.random.uniform(lower_bounds, upper_bounds,size=(K, 3))
# Step 2b - Reject points outside of convex hull...
# Luckily, we get a np.nan from LinearNDInterpolator in this case

sampled_values = f_interpolate(sampled_points)
rejected_idxs = np.argwhere(np.isnan(sampled_values))

# Step 2c - Remember accepted values of estimated f(x_i, y_i, z_i)
final_sampled_values = np.delete(sampled_values, rejected_idxs, axis=0)

# Step 3 - Calculate estimate of volume of observed data domain
#  Since we sampled uniformly from the convex hull of data domain,
#  each point was selected with P(x,y,z)= 1 / Volume of convex hull
volume = scipy.spatial.ConvexHull(observed_points).volume

# Step 4 - Multiply estimated volume of domain by average sampled value
I_hat = volume * final_sampled_values.mean()
print(I_hat)

要了解为什么会这样,请参阅:https://cs.dartmouth.edu/wjarosz/publications/dissertation/appendixA.pdf