4D array numpy into pandas
I want to convert my numpy array (shape=(27, 77, 77)):
[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
...,
[1., 1., 1., ..., 2., 2., 2.],
[1., 1., 1., ..., 2., 2., 2.],
[1., 1., 1., ..., 1., 2., 2.]],
...,
[[1., 1., 1., ..., 1., 1., 0.],
[1., 1., 1., ..., 1., 1., 0.],
[1., 1., 1., ..., 1., 1., 0.],
...,
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]])
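into a pandas DataFrame where column 'x' is index 2 (to the right), 'y' is index 1 (downward), 'z' is index 0 (the 27 "different" arrays), and 'v' holds the values. df.columns=['x','y','z','v']
I'm fairly new to Python. Any idea how I should code this?
Thanks!
Here is a basic way to do it.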
import numpy as np
import pandas as pd
data = np.ones( (27,77,77) )
rows = []
# i walks axis 0 (z), j walks axis 1 (y), k walks axis 2 (x)
for i,plane in enumerate(data):
    for j,row in enumerate(plane):
        for k,col in enumerate(row):
            # one output row per element: x, y, z, value
            rows.append( [k,j,i,col] )
df = pd.DataFrame( rows, columns=['x','y','z','val'])
print(df)
Output:
C:\tmp>python x.py
         x   y   z  val
0        0   0   0  1.0
1        1   0   0  1.0
2        2   0   0  1.0
3        3   0   0  1.0
4        4   0   0  1.0
...     ..  ..  ..  ...
160078  72  76  26  1.0
160079  73  76  26  1.0
160080  74  76  26  1.0
160081  75  76  26  1.0
160082  76  76  26  1.0
[160083 rows x 4 columns]
C:\tmp>
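The nested loops above can also be written without Python-level iteration. This is a minimal sketch, assuming `np.indices` is used to build one index grid per axis. The helper name `array_to_long_df` is made up for illustration, and the value column is named 'v' as in the question rather than 'val'.

```python
import numpy as np
import pandas as pd


def array_to_long_df(data):
    """Flatten a 3D array into long-format rows (x, y, z, v).

    Hypothetical helper: x is axis 2, y is axis 1, z is axis 0,
    following the column layout requested in the question.
    """
    # np.indices returns one integer grid per axis, each shaped like `data`.
    z, y, x = np.indices(data.shape)
    # ravel() walks every grid and the data in the same C (row-major) order,
    # so position n in each column refers to the same element.
    return pd.DataFrame({
        "x": x.ravel(),
        "y": y.ravel(),
        "z": z.ravel(),
        "v": data.ravel(),
    })


# Small stand-in array so the result is easy to inspect by eye.
data = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
df = array_to_long_df(data)
print(df.head())
```

For the real array, `array_to_long_df(your_array)` should give the same rows as the loop version, in the same order.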
As a "simple" one-liner for an arbitrary number of dimensions:
>>> import itertools as it; import numpy as np; import pandas as pd
# analogous test data
>>> arr = np.random.rand(27, 77, 77)
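# np.nditer(arr) + v.item() uses no additional memory
# arr.flatten() is slightly faster but makes a copy of the array
# note: the columns are named in axis order here, so 'x' is axis 0 and 'z' is axis 2;
# use (*axes[::-1], v.item()) to get x = axis 2 as in the question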
>>> df = pd.DataFrame(data=[(*axes, v.item()) for axes, v in zip(it.product(*[range(i) for i in arr.shape]), np.nditer(arr))], columns=tuple('xyzv'))
>>> df
         x   y   z         v
0        0   0   0  0.375027
1        0   0   1  0.511405
2        0   0   2  0.645937
3        0   0   3  0.229538
4        0   0   4  0.274867
...     ..  ..  ..       ...
160078  26  76  72  0.404251
160079  26  76  73  0.010852
160080  26  76  74  0.048079
160081  26  76  75  0.426528
160082  26  76  76  0.723565
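The same any-dimension idea can be sketched in vectorized form. This assumes `np.indices` plus a reshape. The helper name `ndarray_to_long_df` and its parameters are hypothetical, not part of numpy or pandas. One caveat, as far as I understand the defaults: `np.nditer` iterates in memory order, so zipping it with `itertools.product` lines up only for C-contiguous arrays, while `ravel()` always follows logical C order.

```python
import numpy as np
import pandas as pd


def ndarray_to_long_df(arr, index_names, value_name="v"):
    """Turn an N-dimensional array into one row per element.

    Hypothetical helper: index_names[i] labels axis i of `arr`.
    """
    if len(index_names) != arr.ndim:
        raise ValueError("need exactly one index name per array axis")
    # Shape (ndim, arr.size): row i holds the axis-i index of every element,
    # listed in C (row-major) order.
    idx = np.indices(arr.shape).reshape(arr.ndim, -1)
    df = pd.DataFrame(dict(zip(index_names, idx)))
    # ravel() uses logical C order regardless of the memory layout.
    df[value_name] = arr.ravel()
    return df


arr = np.random.rand(27, 77, 77)
# Axis 0 is 'z', axis 1 is 'y', axis 2 is 'x', as the question asks.
df = ndarray_to_long_df(arr, index_names=["z", "y", "x"])
# Reorder the columns to the layout requested in the question.
df = df[["x", "y", "z", "v"]]
print(df)
```

Naming the axes explicitly keeps the axis-to-column mapping visible, which is where the one-liner and the question disagree.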