Can I read and write files in a tar.gz without decompressing it?
I have many tar.gz files with names like GF1_PMS1_E72.0_N33.6_20160507_L1A0001568810.tar.gz.
Each tar.gz file contains files such as:
GF1_PMS1_E72.6_N33.6_20160511_L1A0001576267-MSS1.tiff
GF1_PMS1_E72.6_N33.6_20160511_L1A0001576267-MSS1.xml
GF1_PMS1_E72.6_N33.6_20160511_L1A0001576267-MSS1.rpb
GF1_PMS1_E72.6_N33.6_20160511_L1A0001576267-MSS1.jpg
I want to read the tiff into a numpy array without extracting the archive, so I need the full path to the tiff, but my attempt with the tarfile package failed.
Here is the code I tried:
import os
import tarfile

inpath = 'H:\\alongKKH IMAGES1\\'

def ReadTars(inpath):
    tar_files = os.listdir(inpath)
    for tar in tar_files:
        if tar.split('_')[1] == 'PMS1':
            print(tar)
            tarname = tar
            tar = tarfile.open(os.path.join(inpath, tar), "r:gz")
            for file_name in tar.getnames():
                if file_name[-4:] == 'tiff':
                    print(file_name)
                    # Append the member name to the archive name to form one path
                    rasterpath = os.path.join(inpath, tarname + '\\' + file_name)
                    array = raster2array(rasterpath)
                    break
        else:
            tar = tarfile.open(os.path.join(inpath, tar), "r:gz")
            for file_name in tar.getnames():
                if file_name[-4:] == 'tiff':
                    #array = raster2array(os.path.join(inpath, tar, file_name))
                    break
raster2array is the function that reads an image into a numpy array:
from osgeo import gdal

def raster2array(rasterfn):
    # Open the raster with GDAL and return all bands as a numpy array
    raster = gdal.Open(rasterfn)
    array = raster.ReadAsArray()
    return array
It then throws the following error:
ERROR 4: `H:\alongKKH IMAGES1\GF1_PMS1_E72.0_N33.6_20160507_L1A0001568810.tar.gz\GF1_PMS1_E72.0_N33.6_20160507_L1A0001568810-MSS1.tiff' does not exist in the file system,
and is not recognized as a supported dataset name.
Can anyone help me? Any help is much appreciated, thanks. I am using Python on Windows.
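(inpath, tarname + '\\' + file_name) is only a path string, not a real file on disk. Does raster2array support tar archives? If it does not, you get "does not exist in the file system".
A tarfile.TarFile object has no read() method, but zipfile.ZipFile does, so with a zip archive you can write: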
import zipfile

# Read every member of the zip archive into memory without extracting it
archive = zipfile.ZipFile(inpath + 'GF1_PMS1_E72.zip', "r")
for name in archive.namelist():
    data = archive.read(name)
    print(name, len(data), repr(data[:10]))
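For a tar.gz archive, the closest counterpart to ZipFile.read() that I know of is TarFile.extractfile(), which returns a file-like object for a member without writing anything to disk. The following is a minimal sketch under that assumption; the helper names read_member_bytes and make_demo_archive, and the demo member names, are made up for illustration and are not from the question. Note that this yields the raw bytes of the tiff, not a numpy array, so an image library is still needed to decode them.

```python
import io
import os
import tarfile
import tempfile


def read_member_bytes(archive_path, suffix):
    """Return (name, bytes) for the first regular file whose name ends with suffix."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(suffix):
                # extractfile() gives a file-like object; nothing is written to disk
                return member.name, archive.extractfile(member).read()
    return None


def make_demo_archive(directory):
    """Build a tiny tar.gz with two fake members, purely for demonstration."""
    archive_path = os.path.join(directory, "demo.tar.gz")
    members = [("scene-MSS1.xml", b"<xml/>"), ("scene-MSS1.tiff", b"II*\x00fake")]
    with tarfile.open(archive_path, "w:gz") as archive:
        for name, payload in members:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return archive_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(read_member_bytes(make_demo_archive(tmp), ".tiff"))
```

Because a gzip-compressed tar has to be decompressed sequentially to reach a member, this avoids extracting to disk but not the decompression work itself.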
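If you look up the tarfile counterpart of read(), you can use it in the same way as above: TarFile.extractfile() returns a file-like object that has read().
"rasterfn" is not a physical file, which is why the error occurs.
GDALOpen() can open files inside .tar/.tar.gz/.tgz archives for drivers that support the VSI virtual file API (see VSIInstallTarFileHandler()). The dataset name then has to start with the /vsitar/ prefix, followed by the path to the archive and the path of the file inside it.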
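As a rough sketch of the GDAL route, the path passed to gdal.Open() can be built by putting /vsitar/ in front of the archive path and appending the member name. The helper names vsitar_path and read_tiff_from_tar are hypothetical, and converting backslashes to forward slashes is my own assumption to keep the Windows path unambiguous. GDAL is imported inside the function so that the path helper can be tried without GDAL installed.

```python
def vsitar_path(archive_path, member_name):
    """Build a GDAL /vsitar/ dataset name for a file inside a tar archive."""
    # Assumption: forward slashes are used throughout, also on Windows
    archive = archive_path.replace("\\", "/")
    return "/vsitar/" + archive + "/" + member_name


def read_tiff_from_tar(archive_path, member_name):
    """Read a raster stored inside a .tar/.tar.gz/.tgz into a numpy array."""
    from osgeo import gdal  # lazy import: only needed when actually reading

    dataset = gdal.Open(vsitar_path(archive_path, member_name))
    if dataset is None:
        # gdal.Open() returns None on failure unless gdal.UseExceptions() is set
        raise IOError("GDAL could not open " + vsitar_path(archive_path, member_name))
    return dataset.ReadAsArray()


if __name__ == "__main__":
    print(vsitar_path(
        "H:\\alongKKH IMAGES1\\GF1_PMS1_E72.0_N33.6_20160507_L1A0001568810.tar.gz",
        "GF1_PMS1_E72.0_N33.6_20160507_L1A0001568810-MSS1.tiff",
    ))
```

In the question's loop, the member names returned by tar.getnames() could be passed as member_name. As far as I know, /vsitar/ only supports reading, so writing back into a tar.gz would still mean extracting and repacking the archive.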