Array Manipulation to DataFrame
I have the following array:
(array([[5.8205872e+07, 2.0200601e+07, 1.6700000e+02, 2.1500000e+02,
5.0000000e+01, 5.0000000e+00],
[5.7929117e+07, 2.0200601e+07, 1.6700000e+02, 1.5000000e+02,
5.0000000e+01, 5.0000000e+00],
[5.8178782e+07, 2.0200601e+07, 1.6700000e+02, 1.5750000e+02,
5.0000000e+01, 5.0000000e+00],
[5.7936230e+07, 2.0210228e+07, 1.6700000e+02, 1.8000000e+02,
4.0000000e+01, 5.0000000e+00],
[5.8213574e+07, 2.0210228e+07, 1.6700000e+02, 6.9500000e+02,
4.0000000e+01, 5.0000000e+00],
[2.5693916e+07, 2.0210228e+07, 1.6700000e+02, 4.8518000e+02,
4.0000000e+01, 5.0000000e+00]]),
array([[ 0.46666667, 7.16666667],
[ 0.51724138, 5.17241379],
[ 0.73333333, 5.25 ],
[ 0.34285714, 5.14285714],
[ 1.18918919, 18.78378378],
[ 1.26315789, 12.76789474]]))
I want to convert it into a data frame with a total of 8 columns and 6 rows.
I tried this: pd.DataFrame(my_array)
But the result is just two rows like this:
0 [[58205872.0, 20200601.0, 167.0, 30.0, 1.0, 10...
1 [[0.4666666666666667, 7.166666666666667], [0.5...
How can I achieve the above?
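It looks like you want to concatenate the two arrays (you do in fact have two arrays assigned to my_array) and then convert the result to a data frame. How about using numpy.hstack first: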
>>> import numpy as np
>>> import pandas as pd
>>> your_two_arrays = (..., ...)
>>> a = np.hstack(your_two_arrays)
>>> a.shape
(6, 8)
>>> pd.DataFrame(data=a)
0 1 2 3 4 5 6 7
0 58205872.0 20200601.0 167.0 215.00 50.0 5.0 0.466667 7.166667
1 57929117.0 20200601.0 167.0 150.00 50.0 5.0 0.517241 5.172414
2 58178782.0 20200601.0 167.0 157.50 50.0 5.0 0.733333 5.250000
3 57936230.0 20210228.0 167.0 180.00 40.0 5.0 0.342857 5.142857
4 58213574.0 20210228.0 167.0 695.00 40.0 5.0 1.189189 18.783784
5 25693916.0 20210228.0 167.0 485.18 40.0 5.0 1.263158 12.767895
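For reference, here is a self-contained sketch of the same approach using the values from the question. The variable names `left`, `right`, and `combined`, and the `col_0` ... `col_7` column labels, are placeholders chosen for illustration; the question does not say what the columns represent.

```python
import numpy as np
import pandas as pd

# The two arrays from the question, held together in a tuple.
left = np.array([
    [5.8205872e+07, 2.0200601e+07, 167.0, 215.0, 50.0, 5.0],
    [5.7929117e+07, 2.0200601e+07, 167.0, 150.0, 50.0, 5.0],
    [5.8178782e+07, 2.0200601e+07, 167.0, 157.5, 50.0, 5.0],
    [5.7936230e+07, 2.0210228e+07, 167.0, 180.0, 40.0, 5.0],
    [5.8213574e+07, 2.0210228e+07, 167.0, 695.0, 40.0, 5.0],
    [2.5693916e+07, 2.0210228e+07, 167.0, 485.18, 40.0, 5.0],
])
right = np.array([
    [0.46666667, 7.16666667],
    [0.51724138, 5.17241379],
    [0.73333333, 5.25],
    [0.34285714, 5.14285714],
    [1.18918919, 18.78378378],
    [1.26315789, 12.76789474],
])
my_array = (left, right)

# Stack column-wise: a (6, 6) array next to a (6, 2) array gives (6, 8).
combined = np.hstack(my_array)

# Hypothetical column labels; replace them with meaningful names.
columns = [f"col_{i}" for i in range(combined.shape[1])]
df = pd.DataFrame(combined, columns=columns)

print(df.shape)
```

Passing `columns=` is optional; without it pandas falls back to the integer labels 0 to 7 shown in the output above.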
[...] the result is just two rows like this: [...]
The data you pass to pd.DataFrame when you run pd.DataFrame(my_array) is a tuple of two objects. Hence you get two rows (and one column), i.e. one for each array.
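A quick way to confirm this before building the frame is to inspect the object itself. The snippet below is a minimal sketch that uses zero- and one-filled stand-ins with the same shapes as the arrays in the question; the values are placeholders, and `same_rows` is a name introduced here for illustration.

```python
import numpy as np

# Stand-ins with the same shapes as the question's arrays (placeholder values).
my_array = (np.zeros((6, 6)), np.ones((6, 2)))

print(type(my_array).__name__)       # the container is a tuple, not an ndarray
print(len(my_array))                 # number of arrays it holds
print([a.shape for a in my_array])   # shape of each array

# Horizontal stacking only works when every array has the same number of rows.
same_rows = len({a.shape[0] for a in my_array}) == 1
print(same_rows)
```

If the row counts differ, np.hstack raises an error, so this check is worth running first when the arrays come from separate sources.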