Add a 2-D tensor as a column of a DataFrame
My DataFrame looks like this:
INCIDENT_NUMBER
0 INC000030884498
1 INC000029956111
2 INC000029555353
3 INC000029555338
For these four incidents I also have a 2-D tensor:
sample_concatenated_embedding=
tensor(
[[ 0.6993, -0.1427, -0.1532, ..., 0.8386, 0.5151, 0.8906],
[ 0.7382, -0.8497, 0.1363, ..., 0.8054, 0.5432, 0.9082],
[ 0.0835, -0.2431, -0.0815, ..., 0.8025, 0.5217, 0.9041],
[-0.0346, -0.2396, -0.5831, ..., 0.7591, 0.6138, 0.9649]],
grad_fn=<ViewBackward>)
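The size of the embedding is [4, 161280].
I want to insert the tensor into the four consecutive rows of my DataFrame, one tensor row per DataFrame row.
The final DataFrame should look like this: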
INCIDENT_NUMBER embedding
0 INC000030884498 [ 0.6993, -0.1427, -0.1532, ..., 0.8386, 0.5151, 0.8906]
1 INC000029956111 [ 0.7382, -0.8497, 0.1363, ..., 0.8054, 0.5432, 0.9082]
2 INC000029555353 [ 0.0835, -0.2431, -0.0815, ..., 0.8025, 0.5217, 0.9041]
3 INC000029555338 [-0.0346, -0.2396, -0.5831, ..., 0.7591, 0.6138, 0.9649]
If the tensor were a Series, I could simply use the command below:
my_dataframe['embedding'] = sample_concatenated_embedding
I can use a for loop and insert the rows into a DataFrame easily, like this:
empty_dataframe = pd.DataFrame(columns=['incident','embedding'])
for item in range(0,4):
    INCIDENT_NUMBER = my_dataframe['INCIDENT_NUMBER'].iloc[item]
    # One-row frame: the outer list holds rows, the inner list holds the two cell values
    temp_df = pd.DataFrame([[INCIDENT_NUMBER, sample_concatenated_embedding[item]]], columns=['incident','embedding'])
    frames = [empty_dataframe, temp_df]
    empty_dataframe = pd.concat(frames)
But the for loop is inefficient. Is there a shorter way to reach the final result?
If the row order of INCIDENT_NUMBER matches the row order of sample_concatenated_embedding, you can convert sample_concatenated_embedding to a list and then assign it to a new column, like this:
import pandas as pd
df = pd.DataFrame({'INCIDENT_NUMBER': ['INC000030884498', 'INC000029956111', 'INC000029555353', 'INC000029555338']})
data = [[ 0.6993, -0.1427, -0.1532, 0.8386, 0.5151, 0.8906],
[ 0.7382, -0.8497, 0.1363, 0.8054, 0.5432, 0.9082],
[ 0.0835, -0.2431, -0.0815, 0.8025, 0.5217, 0.9041],
[-0.0346, -0.2396, -0.5831, 0.7591, 0.6138, 0.9649]]
df['embedding'] = data
df.rename(columns={'INCIDENT_NUMBER': 'incident'}, inplace=True)
print(df)
incident embedding
0 INC000030884498 [0.6993, -0.1427, -0.1532, 0.8386, 0.5151, 0.8906]
1 INC000029956111 [0.7382, -0.8497, 0.1363, 0.8054, 0.5432, 0.9082]
2 INC000029555353 [0.0835, -0.2431, -0.0815, 0.8025, 0.5217, 0.9041]
3 INC000029555338 [-0.0346, -0.2396, -0.5831, 0.7591, 0.6138, 0.9649]
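The example above starts from a hand-written list, so here is a rough sketch of the same idea starting from an actual tensor. It assumes sample_concatenated_embedding is a PyTorch tensor (the grad_fn in the question suggests so) and uses a small random 4 x 6 tensor as a stand-in for the real [4, 161280] one; the variable name rows is only for illustration.

```python
import pandas as pd
import torch

df = pd.DataFrame({'INCIDENT_NUMBER': ['INC000030884498', 'INC000029956111',
                                       'INC000029555353', 'INC000029555338']})

# Stand-in for the real embedding (assumption): 4 rows, 6 columns instead of 161280.
sample_concatenated_embedding = torch.randn(4, 6, requires_grad=True)

# detach() drops the autograd graph, cpu() moves the data off any GPU,
# and tolist() returns a list of 4 plain Python lists, one per tensor row.
rows = sample_concatenated_embedding.detach().cpu().tolist()

# A list with one entry per DataFrame row is assigned cell by cell,
# so each cell of the new column holds one embedding row.
df['embedding'] = rows

print(df.shape)
```

With 161280 values per row, converting to Python lists may use noticeably more memory than the tensor itself. If that matters, keeping each row as a NumPy array, for example list(sample_concatenated_embedding.detach().cpu().numpy()), is one alternative worth trying.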