How can I encode a numpy array of object elements into ASCII?
Suppose I have four lists of different data types. I also have a 2D matrix. I want to merge them column-wise.
For example, in the source code below:
train_x_111 == ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
train_y_111 == ['abcd', 'bcde', 'cdef', 'defg', 'efgh', 'fghi', 'ghij', 'hijk', 'ijkl', 'jklm']
train_z_111 == [[0.0, 0.1, 0.2, 0.3],
[0.1, 0.2, 0.3, 0.4],
[0.2, 0.3, 0.4, 0.5],
[0.3, 0.4, 0.5, 0.6],
[0.4, 0.5, 0.6, 0.7],
[0.5, 0.6, 0.7, 0.8],
[0.6, 0.7, 0.8, 0.9],
[ 0.7, 0.8, 0.9, 1.0],
[0.8, 0.9, 1.0, 1.1],
[0.9, 1.0, 1.1, 1.2]]
I want to write the following to a text file:
1 a abcd 0.0 0.1 0.2 0.3
2 b bcde 0.1 0.2 0.3 0.4
3 c cdef 0.2 0.3 0.4 0.5
4 d defg 0.3 0.4 0.5 0.6
5 e efgh 0.4 0.5 0.6 0.7
6 f fghi 0.5 0.6 0.7 0.8
7 g ghij 0.6 0.7 0.8 0.9
8 h hijk 0.7 0.8 0.9 1.0
9 i ijkl 0.8 0.9 1.0 1.1
0 j jklm 0.9 1.0 1.1 1.2
source_code.py
if __name__ == "__main__":
    train_x_111, train_y_111, train_z_111 = load_data()  # load_data() returns three TF tensors
    features_data_int_2d = np.array(train_x_111, dtype=int)
    sum_int_1d = np.sum(features_data_int_2d, axis=1)
    sum_int_1d = sum_int_1d.reshape(-1, 1)
    sum_data_1d_obj = sum_int_1d.astype(np.object_)
    features_data_2d_obj = np.array(train_x_111, dtype=np.object_)
    classes_data_1d_obj = np.array(train_y_111, dtype=np.object_)
    classes_data_1d_obj = classes_data_1d_obj.reshape(10, 1)
    classes_string_1d_obj = np.array(train_z_111, dtype=np.object_)
    classes_string_1d_obj = classes_string_1d_obj.reshape(10, 1)
    sum_matrix = np.concatenate((sum_data_1d_obj, classes_data_1d_obj), axis=-1)
    sum_matrix = np.concatenate((sum_matrix, classes_string_1d_obj), axis=-1)
    sum_matrix = np.concatenate((sum_matrix, features_data_int_2d), axis=-1)
    sum_matrix = sum_matrix.encode('ascii')
    print(sum_matrix)
    np.savetxt("my_file.txt", sum_matrix, fmt='%s', delimiter='\t')
Error output
C:\ProgramData\Miniconda3\python.exe C:/Users/pc/source/repos/my_project/data_hashing.py
Traceback (most recent call last):
File "C:\Users\pc\source\repos\my_project\data_hashing.py", line 151, in <module>
sum_matrix = sum_matrix.encode('ascii')
AttributeError: 'numpy.ndarray' object has no attribute 'encode'
Process finished with exit code 1
.encode('ascii') only works on strings, not on a numpy.ndarray. You should replace the offending line with:
newArray = []
for i in range(len(sum_matrix)):
    newLine = []
    for j in range(len(sum_matrix[0])):
        newLine.append(str(sum_matrix[i][j]).encode('ascii'))
    newArray.append(newLine)
sum_matrix = np.array(newArray)
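This iterates over your array, converts each element to a string and encodes it, then puts everything back into an array. There may be a way to vectorize the encoding function, but I don't know how to use it.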
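One possible way to vectorize the encoding, offered as a minimal sketch rather than a definitive fix: convert the object array to a string array first, then apply np.char.encode, which calls str.encode on every element. The small sum_matrix below is made-up sample data standing in for the array from the question.

```python
import numpy as np

# Hypothetical stand-in for the question's sum_matrix (mixed ints, strings, floats).
sum_matrix = np.array(
    [[1, 'a', 'abcd', 0.0],
     [2, 'b', 'bcde', 0.1]],
    dtype=np.object_,
)

# ndarray has no .encode(), so first convert every element to str...
as_text = sum_matrix.astype(str)

# ...then encode all elements at once; the result is a bytes ('S') array.
encoded = np.char.encode(as_text, 'ascii')

print(encoded.dtype.kind)
print(encoded[0, 2])
```

Keep in mind that a bytes array is probably not what you want to pass to np.savetxt with fmt='%s', since bytes elements would likely be written with a b'...' prefix. For writing a text file, the string array (as_text here) is likely the more convenient form.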
I solved this problem by using the following line
sum_matrix = sum_matrix.astype('U')
instead of
sum_matrix = sum_matrix.encode('ascii')
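To show how the astype('U') approach fits together end to end, here is a rough sketch under some assumptions: the data is a shortened, hand-typed version of the sample in the question (load_data() is not available here), the names ids, letters, words and values are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import numpy as np

# Shortened sample data (assumed shapes: three (n, 1) columns and one (n, 4) matrix).
ids = np.arange(1, 4).reshape(-1, 1).astype(np.object_)
letters = np.array(['a', 'b', 'c'], dtype=np.object_).reshape(-1, 1)
words = np.array(['abcd', 'bcde', 'cdef'], dtype=np.object_).reshape(-1, 1)
values = np.array([[0.0, 0.1, 0.2, 0.3],
                   [0.1, 0.2, 0.3, 0.4],
                   [0.2, 0.3, 0.4, 0.5]]).astype(np.object_)

# Merge column-wise, then convert the object array to a unicode string array.
sum_matrix = np.concatenate((ids, letters, words, values), axis=-1)
sum_matrix = sum_matrix.astype('U')

# Write tab-separated rows; '%s' formats each string element as-is.
out_path = os.path.join(tempfile.mkdtemp(), "my_file.txt")
np.savetxt(out_path, sum_matrix, fmt='%s', delimiter='\t')

# Read the file back to inspect the result.
with open(out_path) as fh:
    lines = fh.read().splitlines()
print(lines[0])
```

No explicit ASCII encoding step is needed in this version: np.savetxt handles writing the text to the file, and the array only has to contain strings that '%s' can format.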