Lost information when converting from fits rec to ndarray
I loaded a FITS file and converted the fitsrec data to a numpy ndarray:
import pyfits
import os, numpy as np
dataPath ='irac1_dataset.fits'
hduTab=pyfits.open(dataPath)
data_rec = np.array(hduTab[1].data)
data=data_rec.view(np.float64).reshape(data_rec.shape + (-1,))
I found that data contains some nan values that are not present in the rec:
data_rec[3664]
(2.52953742092, 3.636058484, -3.0, 1.16584000133, 0.13033115092, 0.0545114121049, 0.0977915267677, 0.0861630982921, 0.0935291710016)
data[3664]
array([ 8.01676073e+230, -1.68253090e-183, 1.10670705e-320,
-5.38247269e-235, nan, 3.19504591e+186,
-6.19704421e+125, -1.40287783e+079, 1.94744862e+094])
As you can see, the other values have also changed drastically. How is this possible?
About hduTab[1].data:
data_rec = hduTab[1].data
>>> data_rec.dtype
dtype((numpy.record, [('entr_35_1', '>f8'), ('kurt_5_1', '>f8'), ('skew_23_1', '>f8'), ('skew_35_1', '>f8'), ('mean_23_2', '>f8'), ('mean_35_2', '>f8'), ('stdDev_23_1', '>f8'), ('stdDev_35_1', '>f8'), ('pixVal', '>f8')]))
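It is a numpy record array.
It's the '>f8' that is messing you up. FITS stores doubles big-endian ('>f8'), while np.float64 on most machines is little-endian ('<f8'), so view(np.float64) reinterprets the same bytes in the wrong order instead of converting them.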
In [380]: dt = [('entr_35_1', '>f8'), ('kurt_5_1', '>f8'), ('skew_23_1', '>f8'),
     ...:       ('skew_35_1', '>f8'), ('mean_23_2', '>f8'), ('mean_35_2', '>f8'),
     ...:       ('stdDev_23_1', '>f8'), ('stdDev_35_1', '>f8'), ('pixVal', '>f8')]
In [382]: np.dtype(dt)
Out[382]: dtype([('entr_35_1', '>f8'),....('pixVal', '>f8')])
In [383]: np.array([(2.52953742092, 3.636058484, -3.0, 1.16584000133,
     ...:            0.13033115092, 0.0545114121049, 0.0977915267677,
     ...:            0.0861630982921, 0.0935291710016)], dtype=dt)
Out[383]:
array([ ( 2.52953742, 3.63605848, -3., 1.16584, 0.13033115, 0.05451141, 0.09779153, 0.0861631, 0.09352917)],
dtype=[('entr_35_1', '>f8'), ('kurt_5_1', '>f8'), ('skew_23_1', '>f8'), ('skew_35_1', '>f8'), ('mean_23_2', '>f8'), ('mean_35_2', '>f8'), ('stdDev_23_1', '>f8'), ('stdDev_35_1', '>f8'), ('pixVal', '>f8')])
In [384]: x=_
The float view has nan and unrecognizable values:
In [385]: x.view(float)
Out[385]:
array([ 8.01676073e+230, -1.68253090e-183, 1.10670705e-320,
-5.38247269e-235, nan, 3.19504591e+186,
-6.19704421e+125, -1.40287783e+079, 1.94744862e+094])
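To see where those numbers come from, here is a minimal sketch (not part of the original session) that does the same thing to a single value. The explicit '<f8' stands in for the native float of a little-endian machine, which is an assumption about the hardware in the question.

```python
import numpy as np

# -3.0 stored big-endian, the way the FITS table stores it ('>f8').
big = np.array([-3.0], dtype='>f8')
print(big.tobytes().hex())   # c008000000000000

# Relabel the same 8 bytes as little-endian: nothing is converted,
# the bytes are simply read in the opposite order.
wrong = big.view('<f8')
print(wrong[0])              # a tiny subnormal, roughly 1.1067e-320

# Reading the bytes in the order they were written recovers the value.
right = big.view('>f8')
print(right[0])              # -3.0
```

The subnormal printed for `wrong` is the same 1.10670705e-320 that shows up in the third position of the garbled output above, where the record holds -3.0.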
But the '>f8' view matches the input:
In [386]: x.view('>f8')
Out[386]:
array([ 2.52953742, 3.63605848, -3. , 1.16584 , 0.13033115,
0.05451141, 0.09779153, 0.0861631 , 0.09352917])
Then I can use astype to convert to float (which is evidently '<f8' on this machine):
In [387]: _.astype(float)
Out[387]:
array([ 2.52953742, 3.63605848, -3. , 1.16584 , 0.13033115,
0.05451141, 0.09779153, 0.0861631 , 0.09352917])
In [389]: np.dtype('<f8')
Out[389]: dtype('float64')
In [390]: np.dtype('>f8')
Out[390]: dtype('>f8')
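The two reprs differ because NumPy hides the byte order of a dtype when it is the machine's native one. A small sketch for checking which order is native on a given machine, using only standard dtype attributes (the variable names are mine):

```python
import sys
import numpy as np

big = np.dtype('>f8')
little = np.dtype('<f8')
native = np.dtype(np.float64)

print(sys.byteorder)                    # 'little' on x86 and most ARM systems
print(big.isnative, little.isnative)    # exactly one of these is True

# newbyteorder('=') gives the same type in the machine's native byte order.
print(big.newbyteorder('=') == native)  # True
```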
Using astype can be tricky, but it seems I can use it directly as long as I keep the field layout unchanged. So I can use it to change '>f8' to '<f8':
In [407]: dt1 = [('entr_35_1', '<f8'), ('kurt_5_1', '<f8'), ('skew_23_1', '<f8'),
     ...:        ('skew_35_1', '<f8'), ('mean_23_2', '<f8'), ('mean_35_2', '<f8'),
     ...:        ('stdDev_23_1', '<f8'), ('stdDev_35_1', '<f8'), ('pixVal', '<f8')]
In [408]: x.astype(dt1)
Out[408]:
array([ ( 2.52953742, 3.63605848, -3., 1.16584, 0.13033115, 0.05451141, 0.09779153, 0.0861631, 0.09352917)],
dtype=[('entr_35_1', '<f8'), ('kurt_5_1', '<f8'), ('skew_23_1', '<f8'), ('skew_35_1', '<f8'), ('mean_23_2', '<f8'), ('mean_35_2', '<f8'), ('stdDev_23_1', '<f8'), ('stdDev_35_1', '<f8'), ('pixVal', '<f8')])
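Typing out the native dtype by hand gets tedious for wide tables. One possible shortcut, sketched here with made-up field names rather than the real table, is to let `dtype.newbyteorder` build the native-order twin of the structured dtype and hand that to astype:

```python
import numpy as np

# Small stand-in for the record; field names are illustrative only.
dt = [('a', '>f8'), ('b', '>f8'), ('c', '>f8')]
x = np.array([(2.5, -3.0, 0.125)], dtype=dt)

# newbyteorder is applied to every field of a structured dtype,
# so the field names and layout stay the same.
native_dt = x.dtype.newbyteorder('=')
y = x.astype(native_dt)

print(y.dtype)
print(y.view(np.float64))   # now safe: the bytes really are native floats
```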
I still need view to collapse the fields into a plain float array:
In [409]: x.astype(dt1).view(float)
Out[409]:
array([ 2.52953742, 3.63605848, -3. , 1.16584 , 0.13033115,
0.05451141, 0.09779153, 0.0861631 , 0.09352917])
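Applied to the whole table from the question, the fix could look like the sketch below. `rec_to_float2d` is a hypothetical helper name, the two-row array stands in for hduTab[1].data, and the helper assumes every field is '>f8', as in the dtype shown in the question.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

def rec_to_float2d(rec):
    """Return a (rows, fields) native float64 array from a record array
    whose fields are all big-endian doubles ('>f8')."""
    rec = np.asarray(rec)
    # Read the bytes with the byte order they were stored in ...
    flat = rec.view('>f8').reshape(rec.shape + (-1,))
    # ... then convert (not reinterpret) to the native float64.
    return flat.astype(np.float64)

# Stand-in for hduTab[1].data: two rows, three made-up fields.
dt = [('a', '>f8'), ('b', '>f8'), ('c', '>f8')]
data_rec = np.array([(2.5, -3.0, 0.125), (1.0, 2.0, 3.0)], dtype=dt)

data = rec_to_float2d(data_rec)
print(data.shape, data.dtype)

# Alternative that casts field by field, so it does not rely on all
# fields sharing one dtype (available in NumPy 1.16 and later).
data2 = rfn.structured_to_unstructured(data_rec, dtype=np.float64)
print(np.array_equal(data, data2))
```

The only change from the original one-liner is viewing as '>f8' instead of np.float64 before reshaping; the astype at the end is what produces native floats.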