How to select N rows from a dataset, arrange them as a matrix, and find its determinant?

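I want to arrange the first N rows of my dataset as a matrix (with N >= the number of columns, which is 6 in this case) and find the determinant |Matrix.T*Matrix|, so the final matrix product will be a 6x6 matrix.

Column 1, 'Serial_no', is set as the index.

Edited question: I want to find the 7 rows of my complete dataset that, taken as a matrix, give the maximum determinant of the product |Matrix.T*Matrix|. I also want the index values of that optimal set.

Dataset: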

Serial_no,A,B,C,D,E,F
1,0.379,-0.588,-1.69,-0.0135,0.083,-0.0297
2,-0.144,0.278,0.354,-0.000672,-0.0228,0.014
3,0.295,-0.157,-1.63,-0.00451,0.0778,-0.00969
4,0.371,-0.623,-4.98,-0.000253,0.0872,-0.0109
5,0.369,-3.11,-8.3,-0.0000105,0.0871,-0.0327
6,0.369,-0.899,-7.19,-0.0000177,0.0872,-0.0109
7,0.383,-1.04,-2.76,-0.00418,0.089,-0.033
8,0.369,-1.04,-8.3,-0.00000263,0.0871,-0.0109
9,-0.124,0.421,0.679,0.00246,-0.0216,0.0133
10,0.37,2.15,-17.1,0.000244,0.0871,0.0109
11,0.369,5.61,-14.9,0.0000352,0.0872,0.0327
12,0.369,1.45,-11.6,-0.000000963,0.0872,0.0109
13,0.369,3.53,-9.41,-0.00000186,0.0872,0.0327
14,0.369,6.44,-17.2,0.000513,0.0872,0.0327
15,-0.11,-2.57,4.11,-0.000127,-0.0209,-0.0131
16,-0.11,-2.76,4.43,-0.000606,-0.0211,-0.0132
17,0.37,0.761,-6.09,0.0000571,0.0871,0.0109
18,0.3678,1.45,-3.88,0.00209,0.0865,0.0325
19,0.381,-2.46,-19.4,-0.00274,0.0874,-0.0111
20,0.369,4.36,-11.6,-0.000003,0.0872,0.0327
21,-0.111,-1.74,2.79,0.000000903,-0.0209,-0.0131
22,-0.111,-1.91,3.05,-0.000000953,-0.0209,-0.0131
23,0.368,2.28,-6.09,0.000164,0.0871,0.0327
24,-0.11,-0.913,1.46,-0.0000412,-0.0209,-0.0131
25,-0.111,-1.08,1.73,-0.0000101,-0.0209,-0.0131
26,-0.144,-0.278,0.354,0.000672,-0.0228,-0.014
27,0.344,-0.344,-2.76,-0.00202,0.0877,-0.0107
28,0.369,3.11,-8.3,0.0000105,0.0871,0.0327
29,0.383,1.04,-2.76,0.00418,0.089,0.033
30,-0.124,-0.421,0.679,-0.00246,-0.0216,-0.0133

import pandas as pd
import numpy as np

# import the dataset with pandas
dataset = pd.read_csv('Dataset.csv')
dataset = dataset.set_index('Serial_no')
X=dataset.iloc[:,:]

len_of_col = len(dataset.columns)

N = int(input("Enter total no. rows : "))

You can do the following:

import pandas as pd
import numpy as np

# import the dataset with pandas
dataset = pd.read_csv('Dataset.csv')
dataset = dataset.set_index('Serial_no')
X=dataset.iloc[:,:]

len_of_col = len(dataset.columns)

N = int(input("Enter total no. of strain gauges >= No. of Loads : "))
# Note: np.linalg.det needs a square matrix, so N has to be equal to the number of cols, not greater

datamatrix = X[:N]
det = np.linalg.det(datamatrix)
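
On the restriction noted in the comment above: `np.linalg.det` only accepts square matrices, which is why `N` must equal the column count in this snippet. The product `M.T @ M` is always square, so its determinant is defined for a taller matrix too, and when `M` is itself square it equals `det(M)**2`. A minimal sketch on random stand-in data (the arrays `M` and `tall` and the seed are illustrative assumptions, not the dataset from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Square case: 6 rows x 6 columns, so det() works on M directly.
M = rng.normal(size=(6, 6))
det_M = np.linalg.det(M)
det_gram = np.linalg.det(M.T @ M)  # matches det(M)**2 for square M

# Tall case: 7 rows x 6 columns, det() on the matrix itself is rejected.
tall = rng.normal(size=(7, 6))
try:
    np.linalg.det(tall)
    tall_is_rejected = False
except np.linalg.LinAlgError:
    tall_is_rejected = True

# The product tall.T @ tall is 6x6, so its determinant is defined.
gram_tall = tall.T @ tall
det_gram_tall = np.linalg.det(gram_tall)
```

This is why the question's |Matrix.T*Matrix| form can handle N = 7 while a direct `det` on the 7x6 slice cannot.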

Here you go:

N = 7

# first N rows (df is the dataset loaded with 'Serial_no' as the index)
mat = df.iloc[:N]

np.linalg.det(mat.T @ mat)
# 3.91198281101018e-11
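
Because the result is on the order of 1e-11, it may be easier to compare candidate row sets on a log scale. The sketch below uses `np.linalg.slogdet` on the first 7 rows of the dataset, typed inline so it runs without `Dataset.csv`; the helper name `log_gram_det` is hypothetical.

```python
import io

import numpy as np
import pandas as pd

# First 7 rows of the dataset from the question, inlined for a self-contained run.
csv_text = """Serial_no,A,B,C,D,E,F
1,0.379,-0.588,-1.69,-0.0135,0.083,-0.0297
2,-0.144,0.278,0.354,-0.000672,-0.0228,0.014
3,0.295,-0.157,-1.63,-0.00451,0.0778,-0.00969
4,0.371,-0.623,-4.98,-0.000253,0.0872,-0.0109
5,0.369,-3.11,-8.3,-0.0000105,0.0871,-0.0327
6,0.369,-0.899,-7.19,-0.0000177,0.0872,-0.0109
7,0.383,-1.04,-2.76,-0.00418,0.089,-0.033
"""
df = pd.read_csv(io.StringIO(csv_text), index_col="Serial_no")


def log_gram_det(frame):
    """Return (sign, log|det|) of frame.T @ frame (hypothetical helper)."""
    mat = frame.to_numpy()
    return np.linalg.slogdet(mat.T @ mat)


sign, logdet = log_gram_det(df.iloc[:7])

# Plain determinant of the same product, for comparison with sign * exp(logdet).
mat = df.iloc[:7].to_numpy()
plain_det = np.linalg.det(mat.T @ mat)
```

Comparing `logdet` values avoids working with numbers that sit close to zero, though for a dataset of this size the plain determinant is likely adequate.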

Update: if your data isn't too long, a for loop can compute the determinant for every block of N consecutive rows:

N = 7

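def my_det(df, i):
    # determinant of mat.T @ mat for the N consecutive rows starting at position i
    mat = df.iloc[i:i+N]
    return np.linalg.det(mat.T @ mat)

# note: range(len(df) - N) stops one window short of the end;
# use len(df) - N + 1 to include the last N rows as well
all_det = [my_det(df, i) for i in range(len(df) - N)]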

print(np.argmax(all_det))
# 7  (0-based start position of the best window, i.e. Serial_no 8 to 14)

print(np.max(all_det))
# 6.453644515027227e-11
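
The loop above only scores windows of consecutive rows, whereas the edited question asks for the best set of any 7 rows. One possible approach is a brute-force search with `itertools.combinations`. The sketch below assumes the row count is small enough to enumerate (30 choose 7 is about 2 million subsets), runs on random stand-in data rather than the real dataset, and the function name `best_subset` is hypothetical.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def best_subset(df, n_rows):
    """Return (index labels, determinant) of the n_rows-row subset
    that maximizes det(M.T @ M), by exhaustive search."""
    values = df.to_numpy()
    best_det, best_pos = -np.inf, None
    for pos in combinations(range(len(df)), n_rows):
        mat = values[list(pos)]
        d = np.linalg.det(mat.T @ mat)
        if d > best_det:
            best_det, best_pos = d, pos
    return list(df.index[list(best_pos)]), best_det


# Stand-in data (assumption): 10 rows x 3 columns, indexed like Serial_no.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(size=(10, 3)),
    columns=list("ABC"),
    index=pd.RangeIndex(1, 11, name="Serial_no"),
)

labels, det_value = best_subset(demo, 4)
```

For the real data the call would be `best_subset(df, 7)`; the returned labels are the `Serial_no` values of the chosen rows. Evaluating a couple of million small determinants in a Python loop can take a while, so a greedy or exchange-style heuristic may be worth considering for larger datasets.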