Linear combination: determine scalars for four series (spectra) to fit a known spectrum
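I have four "principal" spectra, and I want to find the coefficients/scalars that best fit my data. The goal is to find out how much of principal x is in the data. I am trying to get the "percent composition" of each principal spectrum relative to the whole spectrum (e.g. 50% a1, 25% a2, 20% a3, 5% a4).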
#spec = spectrum, a1,a2,a3,a4 = principal components these are all nx1 dimensional arrays
c = 0 #some scalar
d = 0 #some scalar
e = 0 #some scalar
g = 0 #some scalar
def f(spec, c, d, e, g):
    y = spec - (a1.multiply(c) - a2.multiply(d) - a3.multiply(e)- a4.multiply(g))
    return np.dot(y, y)
res = optimize.minimize(f, spec, args=(c,d,e,g), method='COBYLA', options={'rhobeg': 1.0, 'maxiter': 1000, 'disp': False, 'catol': 0.0002}) #z[0], z[1], z[2], z[3]
best = res['x']
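The problem I am having is that it doesn't seem to give me the scalar values (c, d, e, g), but rather another nx1-dimensional array. Any help is greatly appreciated. I am also open to other minimize/fit techniques.
After some work, I found two methods that give similar results for this problem.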
import numpy as np
import pandas as pd
import csv
import os
from scipy import optimize
path = '[insert path]'
os.chdir(path)
data = 'data.csv' #original spectra
factors = 'factors.csv' #factor spectra
nfn = 'weights.csv' #new filename
df_data = pd.read_csv(data, header = 0) # read in the spectrum file
df_factors = pd.read_csv(factors, header = 0)
# this array a is going to be our factors
a = df_factors[['0','1','2','3']]
# the factor spectra need to be separated from the original data frame
a1 = pd.Series(a['0'])
a2 = pd.Series(a['1'])
a3 = pd.Series(a['2'])
a4 = pd.Series(a['3'])
b = df_data[['0.75M']] # original spectrum!
b = pd.Series(b['0.75M']) # needs to be in a series
# x0 is my initial guess for the coefficients
x0 = np.array([0., 0., 0.,0.])
def f(c):
    return b -((c[0]*a1)+(c[1]*a2)+(c[2]*a3)+(c[3]*a4))
# least-squares optimization using least_squares from SciPy,
# then minimize from the same package; both work, IMO minimize is slightly better
res = optimize.least_squares(f, x0, bounds = (0, np.inf))
xbest = res.x
x0 = np.array([0., 0., 0., 0.])
def f(c):
    y = b -((c[0]*a1)+(c[1]*a2)+(c[2]*a3)+(c[3]*a4))
    return np.dot(y,y)
res = optimize.minimize(f, x0, bounds = ((0,np.inf),(0,np.inf),(0,np.inf),(0,np.inf)))
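Since the model is linear in the coefficients and the only constraint is non-negativity, a third option that may be worth trying is `scipy.optimize.nnls`, which solves the non-negative least-squares problem directly and needs no initial guess. This is a sketch under assumptions, not something I ran on the data above: the synthetic Gaussian factors and the names `A`, `true_c`, `weights`, and `rnorm` are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical factor spectra as columns of A (illustration only).
x = np.linspace(0.0, 1.0, 200)
centers = [0.2, 0.4, 0.6, 0.8]
A = np.column_stack([np.exp(-((x - c) / 0.1) ** 2) for c in centers])

# Noise-free spectrum built from known coefficients.
true_c = np.array([0.5, 0.25, 0.2, 0.05])
b = A @ true_c

# nnls minimizes ||A @ w - b|| subject to w >= 0 and returns
# the solution together with the residual norm.
weights, rnorm = nnls(A, b)
print(weights, rnorm)
```

With the real files, `A` would presumably be `df_factors[['0','1','2','3']].to_numpy()` and `b` would be `df_data['0.75M'].to_numpy()`, assuming both have the same number of rows and no missing values.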