Rasterio: "does not exist in the file system, and is not recognized as a supported dataset name."
I am following this tutorial: https://www.usgs.gov/media/files/landsat-cloud-direct-access-requester-pays-tutorial
```python
import boto3
import rasterio as rio
from matplotlib.pyplot import imshow
from rasterio.session import AWSSession

s3 = boto3.client('s3', aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)
resources = boto3.resource('s3', aws_access_key_id=AWS_KEY_ID,
                           aws_secret_access_key=AWS_SECRET)

aws_session = AWSSession(boto3.Session())
cog = 's3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF'
with rio.Env(aws_session):
    with rio.open(cog) as src:
        profile = src.profile
        arr = src.read(1)
        imshow(arr)
```
I get the following error:
rasterio.errors.RasterioIOError: '/vsis3/usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF' does not exist in the file system, and is not recognized as a supported dataset name.
In AWS CloudShell, if I run:
```
aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/
```
I get:
An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
I ran the same command from an EC2 instance and got the same error.
As the documentation says, I need to declare that I am the requester. This works:
```
aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/ --request-payer requester
```
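It still does not work with boto3.
The user running boto3 has admin permissions. In CloudShell I get the same error both as the boto user and as root. I have previously used the access key and secret key to download from the "landsat-pds" bucket (L8 imagery only) and the "sentinel-s2-l1c" bucket without problems. The issue seems to be specific to the "usgs-landsat" bucket (https://registry.opendata.aws/usgs-landsat/).
I also tried to access the usgs-landsat bucket with s3.list_objects: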
```python
landsat = resources.Bucket("usgs-landsat")
all_objects = s3.list_objects(Bucket='usgs-landsat')
```
I get a similar error:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
After looking at other solutions, I saw that some users found that the following:
```python
import os

os.environ["AWS_REQUEST_PAYER"] = "requester"
os.environ["CURL_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"
```
solved their problem, but it does not work for me.
This worked for me:
```python
import boto3

s3sr = boto3.resource('s3')
bucket = 'usgs-landsat'
prefix = 'collection02/'
keys_list = []

# Page through the listing; RequestPayer is what the requester-pays bucket needs.
paginator = s3sr.meta.client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/', RequestPayer='requester'):
    # 'Contents' is missing from a page that holds no objects, so default to [].
    keys = [content['Key'] for content in page.get('Contents', [])]
    keys_list.extend(keys)

len(keys_list)
# keys_list
['collection02/catalog.json',
 'collection02/landsat-c2l1.json',
 'collection02/landsat-c2l2-sr.json',
 'collection02/landsat-c2l2-st.json',
 'collection02/landsat-c2l2alb-bt.json',
 'collection02/landsat-c2l2alb-sr.json',
 'collection02/landsat-c2l2alb-st.json',
 'collection02/landsat-c2l2alb-ta.json']

# getting the catalog.json
response = boto3.client('s3').get_object(Bucket=bucket, Key='collection02/catalog.json', RequestPayer='requester')
jsondata = response['Body'].read().decode()
```
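To fetch the GeoTIFF from the question the same way, the `s3://` URL first has to be split into a bucket and a key. Here is a minimal sketch of that step, using only the standard library. `requester_pays_kwargs` is a hypothetical helper name of mine, not part of boto3.

```python
from urllib.parse import urlparse


def requester_pays_kwargs(s3_url):
    """Split an s3:// URL into keyword arguments for a requester-pays S3 call.

    Hypothetical helper for illustration; it is not a boto3 function.
    """
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {s3_url!r}")
    return {
        "Bucket": parsed.netloc,          # e.g. 'usgs-landsat'
        "Key": parsed.path.lstrip("/"),   # object key without the leading slash
        "RequestPayer": "requester",      # same flag as in the calls above
    }


cog = ("s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/"
       "LC08_L2SP_026027_20200827_20200906_02_T1/"
       "LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF")
kwargs = requester_pays_kwargs(cog)
```

The resulting dictionary should be usable as `boto3.client('s3').get_object(**kwargs)`, mirroring the `catalog.json` call above. I have only sketched the argument handling here, not the download itself.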
As you correctly pointed out, the `usgs-landsat` S3 bucket is requester pays, so you need to configure `rasterio` accordingly to handle that.
`rasterio.session.AWSSession` has a `requester_pays` argument that you can set to `True` to do this.
I would also point out that the following lines:
```python
s3 = boto3.client('s3', aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)
resources = boto3.resource('s3', aws_access_key_id=AWS_KEY_ID,
                           aws_secret_access_key=AWS_SECRET)
```
are not needed in your snippet, since you do not reuse the `s3` and `resources` variables afterwards.
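In fact, if your credentials are properly stored in your `~/.aws/` folder (which can be done by running `aws configure`, the command-line utility provided by the `awscli` Python package; see its documentation), you do not need to import `boto3` at all: `rasterio` will do it for you.
So your snippet can be modified to: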
```python
import rasterio as rio
from matplotlib.pyplot import imshow
from rasterio.session import AWSSession

# requester_pays=True declares that the caller pays for the requests.
aws_session = AWSSession(requester_pays=True)
cog = 's3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF'
with rio.Env(aws_session):
    with rio.open(cog) as src:
        profile = src.profile
        arr = src.read(1)
        imshow(arr)
```
This works fine on my machine.