Linear Combination: Determine scalars for four series (spectra) to fit known spectrum

Question

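I have four "principal" spectra and I want to find the coefficients/scalars that best fit my data. The goal is to find out how much of principal x is present in the data. I am trying to get the "percent composition" of each principal spectrum relative to the whole spectrum (e.g. 50% a1, 25% a2, 20% a3, 5% a4).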

    # spec = spectrum; a1, a2, a3, a4 = principal components (all n x 1 arrays)
    c = 0  # some scalar
    d = 0  # some scalar
    e = 0  # some scalar
    g = 0  # some scalar

    def f(spec, c, d, e, g):
        y = spec - (a1.multiply(c) - a2.multiply(d) - a3.multiply(e) - a4.multiply(g))
        return np.dot(y, y)

    res = optimize.minimize(f, spec, args=(c, d, e, g), method='COBYLA', options={'rhobeg': 1.0, 'maxiter': 1000, 'disp': False, 'catol': 0.0002})  # z[0], z[1], z[2], z[3]
    best = res['x']

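The problem I am running into is that it does not seem to give me the scalar values (c, d, e, g) but another n x 1 array instead. Any help is much appreciated. I am also open to other minimize/fit techniques.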

Answer 1

After some work, I found two methods that give similar results for this problem.

    import numpy as np
    import pandas as pd
    import csv
    import os
    from scipy import optimize

    path = '[insert path]'
    os.chdir(path)
    data = 'data.csv'  # original spectra
    factors = 'factors.csv'  # factor spectra
    nfn = 'weights.csv'  # new filename
    df_data = pd.read_csv(data, header=0)  # read in the spectrum file
    df_factors = pd.read_csv(factors, header=0)
    # this array a is going to be our factors
    a = df_factors[['0', '1', '2', '3']]

The factor spectra need to be separated out of the original DataFrame.

    a1 = pd.Series(a['0'])
    a2 = pd.Series(a['1'])
    a3 = pd.Series(a['2'])
    a4 = pd.Series(a['3'])
    b = df_data[['0.75M']]  # original spectrum!
    b = pd.Series(b['0.75M'])  # needs to be in a Series

x0 is my initial guess for the coefficients.

    x0 = np.array([0., 0., 0., 0.])

    def f(c):
        # residual vector: measured spectrum minus the weighted sum of factors
        return b - ((c[0]*a1) + (c[1]*a2) + (c[2]*a3) + (c[3]*a4))

Optimize first with least_squares from SciPy, then with minimize from the same package. Both work; IMO minimize is slightly better.

    res = optimize.least_squares(f, x0, bounds=(0, np.inf))
    xbest = res.x

    x0 = np.array([0., 0., 0., 0.])

    def f(c):
        # minimize expects a scalar objective: the sum of squared residuals
        y = b - ((c[0]*a1) + (c[1]*a2) + (c[2]*a3) + (c[3]*a4))
        return np.dot(y, y)

    res = optimize.minimize(f, x0, bounds=((0, np.inf), (0, np.inf), (0, np.inf), (0, np.inf)))
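Because the model is linear in the coefficients and the bounds only require non-negativity, the fit can also be cross-checked with `scipy.optimize.nnls`. This is a different technique from the two above (a dedicated non-negative least-squares solver), not part of the original answer. The sketch below uses made-up factor spectra and weights (`A`, `centers`, and `true_c` are illustrative assumptions) and then expresses the coefficients as fractions of their sum. That is one possible reading of the "percent composition" asked about in the question, and it is only meaningful if the factor spectra are on comparable scales.

```python
import numpy as np
from scipy import optimize

# Hypothetical factor spectra as columns of A (illustrative only)
x = np.linspace(0.0, 10.0, 200)
centers = [2.0, 4.0, 6.0, 8.0]
A = np.column_stack([np.exp(-0.5 * (x - mu) ** 2) for mu in centers])

true_c = np.array([2.0, 1.0, 0.8, 0.2])  # assumed weights
b = A @ true_c                           # noise-free synthetic spectrum

# nnls solves min ||A c - b||_2 subject to c >= 0
coeffs, residual_norm = optimize.nnls(A, b)

# Share of the total weight carried by each factor, in percent
percent = 100.0 * coeffs / coeffs.sum()
print(percent)
```

With real, noisy spectra the three solvers should agree closely but not exactly, so comparing their outputs is a reasonable consistency check.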

Tags: arrays, minimize, series, python-3.x, scipy-optimize