Can't connect to S3 in PrestoDB: Unable to load credentials from service endpoint
I am connecting S3 buckets to Apache Hive so that I can query the Parquet files in S3 directly through PrestoDB. I am using the HDP VM for Teradata's PrestoDB.
To do this, I configured hive-site.xml and added my AWS access key and secret key to /etc/hive/conf/hive-site.xml, like this:
<property>
<name>hive.s3.aws-access-key</name>
<value>something</value>
</property>
<property>
<name>hive.s3.aws-secret-key</name>
<value>some-other-thing</value>
</property>
Now, the URL path of the S3 bucket where my Parquet files are stored looks like this:
https://s3.console.aws.amazon.com/s3/buckets/sb.mycompany.com/someFolder/anotherFolder/?region=us-east-2&tab=overview
When creating the external table, I specified the S3 location in the query as:
CREATE TABLE hive.project.data (... schema ...)
WITH ( format = 'PARQUET',
external_location = 's3://sb.mycompany.com/someFolder/anotherFolder/?region=us-east-2&tab=overview')
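One thing worth noting about the location above: the `?region=us-east-2&tab=overview` part comes from the AWS console URL and is not part of the bucket or key prefix. The sketch below shows one way to derive a location from such a console URL. The function name `console_url_to_s3_location` is hypothetical, and it assumes the console URL has the `/s3/buckets/<bucket>/<prefix>/` shape shown in the question.

```python
from urllib.parse import urlsplit


def console_url_to_s3_location(console_url: str, scheme: str = "s3") -> str:
    """Turn an S3 console bucket URL into a scheme://bucket/prefix/ location."""
    parts = urlsplit(console_url)
    marker = "/s3/buckets/"
    if not parts.path.startswith(marker):
        raise ValueError("not an S3 console bucket URL")
    # Everything after the marker is "<bucket>/<key prefix>". The query string
    # (?region=...&tab=...) is held separately by urlsplit, so it is dropped.
    bucket_and_prefix = parts.path[len(marker):]
    return f"{scheme}://{bucket_and_prefix}"


console_url = (
    "https://s3.console.aws.amazon.com/s3/buckets/"
    "sb.mycompany.com/someFolder/anotherFolder/?region=us-east-2&tab=overview"
)
print(console_url_to_s3_location(console_url))
# prints: s3://sb.mycompany.com/someFolder/anotherFolder/
```

Passing `scheme="s3a"` gives the `s3a://` form of the same location.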
Apache Hive is unable to connect to the S3 bucket and gives this error with the --debug flag:
Query 20180316_112407_00005_aj9x6 failed: Unable to load credentials from service endpoint
========= TECHNICAL DETAILS =========
[ Error message ]
Unable to load credentials from service endpoint
[ Session information ]
ClientSession{server=http://localhost:8080, user=presto, clientInfo=null, catalog=null, schema=null, timeZone=Zulu, locale=en_US, properties={}, transactionId=null, debug=true, quiet=false}
[ Stack trace ]
com.amazonaws.AmazonClientException: Unable to load credentials from service endpoint
at com.amazonaws.auth.EC2CredentialsFetcher.handleError(EC2CredentialsFetcher.java:180)
at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:159)
at com.amazonaws.auth.EC2CredentialsFetcher.getCredentials(EC2CredentialsFetcher.java:82)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:104)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4016)
at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:4478)
at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:4452)
at com.amazonaws.services.s3.AmazonS3Client.resolveServiceEndpoint(AmazonS3Client.java:4426)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1167)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1152)
at com.facebook.presto.hive.PrestoS3FileSystem.lambda$getS3ObjectMetadata(PrestoS3FileSystem.java:552)
at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138)
at com.facebook.presto.hive.PrestoS3FileSystem.getS3ObjectMetadata(PrestoS3FileSystem.java:549)
at com.facebook.presto.hive.PrestoS3FileSystem.getFileStatus(PrestoS3FileSystem.java:305)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1439)
at com.facebook.presto.hive.HiveMetadata.getExternalPath(HiveMetadata.java:719)
at com.facebook.presto.hive.HiveMetadata.createTable(HiveMetadata.java:690)
at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorMetadata.createTable(ClassLoaderSafeConnectorMetadata.java:218)
at com.facebook.presto.metadata.MetadataManager.createTable(MetadataManager.java:505)
at com.facebook.presto.execution.CreateTableTask.execute(CreateTableTask.java:148)
at com.facebook.presto.execution.CreateTableTask.execute(CreateTableTask.java:57)
at com.facebook.presto.execution.DataDefinitionExecution.start(DataDefinitionExecution.java:111)
at com.facebook.presto.execution.QueuedExecution.lambda$start(QueuedExecution.java:63)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Network is unreachable
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:47)
at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:106)
at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:77)
at com.amazonaws.auth.InstanceProfileCredentialsProvider$InstanceMetadataCredentialsEndpointProvider.getCredentialsEndpoint(InstanceProfileCredentialsProvider.java:117)
at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:121)
... 24 more
========= TECHNICAL DETAILS END =========
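In a trace this long, the most useful line is often the innermost `Caused by:` entry. Here it is `java.net.ConnectException: Network is unreachable`, raised while `InstanceProfileCredentialsProvider` was trying to reach the credentials endpoint. The following is a minimal sketch, not part of Presto, that pulls the exception chain out of a Java stack trace; `exception_chain` and the shortened `sample` are illustrative names.

```python
import re
from typing import List


def exception_chain(trace: str) -> List[str]:
    """Return the exception headers of a Java stack trace, outermost first."""
    chain = []
    for raw in trace.splitlines():
        line = raw.strip()
        if line.startswith("at ") or line.startswith("..."):
            continue  # skip stack frames and "... N more" markers
        if line.startswith("Caused by: "):
            chain.append(line[len("Caused by: "):])
        elif re.match(r"[\w.$]+(Exception|Error)\b", line):
            chain.append(line)  # the top-level exception header
    return chain


# A shortened copy of the trace from the question.
sample = """\
com.amazonaws.AmazonClientException: Unable to load credentials from service endpoint
at com.amazonaws.auth.EC2CredentialsFetcher.handleError(EC2CredentialsFetcher.java:180)
Caused by: java.net.ConnectException: Network is unreachable
at java.net.PlainSocketImpl.socketConnect(Native Method)
... 24 more
"""

chain = exception_chain(sample)
print(chain[-1])
# prints: java.net.ConnectException: Network is unreachable
```

The last element of the chain is the root cause, which in this case points at networking rather than at the credentials themselves.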
I even restarted my PrestoDB server after adding the keys. Next, I tried adding my properties to /home/presto/.prestoadmin/catalog/hive.properties:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
hive.allow-drop-table=true
hive.allow-rename-table=true
hive.time-zone=UTC
hive.metastore-cache-ttl=0s
hive.s3.use-instance-credentials=false
hive.s3.aws-access-key=something
hive.s3.aws-secret-key=some-other-thing
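To sanity-check a catalog file like the one above, a small script can report which credential source the settings appear to select. This is only a sketch under stated assumptions: that static keys take precedence when both are set, and that `hive.s3.use-instance-credentials` defaults to `true` when absent, which is my understanding of the Presto Hive connector documentation and should be verified against the version in use. The names `parse_properties` and `s3_credential_mode` are hypothetical.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, ignoring blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def s3_credential_mode(props: dict) -> str:
    """Guess the credential source (assumed precedence: static keys first)."""
    has_access = bool(props.get("hive.s3.aws-access-key"))
    has_secret = bool(props.get("hive.s3.aws-secret-key"))
    if has_access and has_secret:
        return "static-keys"
    # Assumption: instance credentials are used by default when not disabled.
    if props.get("hive.s3.use-instance-credentials", "true").lower() == "true":
        return "instance-credentials"
    return "none"


catalog_text = """\
connector.name=hive-hadoop2
hive.s3.use-instance-credentials=false
hive.s3.aws-access-key=something
hive.s3.aws-secret-key=some-other-thing
"""
print(s3_credential_mode(parse_properties(catalog_text)))
# prints: static-keys
```

A result of `instance-credentials` would match the `InstanceProfileCredentialsProvider` frames in the stack trace, so it may be worth confirming that the edited file is the one the running server actually loads.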
I restarted the PrestoDB server again, but the problem persisted.
Then I changed the S3 location in the query to use only the bucket name:
external_location = 's3://sb.mycompany.com'
And also with the s3a scheme:
external_location = 's3a://sb.mycompany.com'
But the same problem still occurs. What am I doing wrong?
This is embarrassing. The network adapter on the VM I was using had a problem, so the VM could not connect to the Internet. I fixed the adapter and it works now.
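A quick way to rule out this kind of problem before touching any Presto or Hive configuration is to test outbound connectivity from the VM itself. The sketch below uses only the Python standard library; `can_connect` is a hypothetical helper name, and which host and port to probe is left to the reader.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures,
        # and "Network is unreachable".
        return False
```

For example, running `can_connect("s3.us-east-2.amazonaws.com", 443)` on the VM should return `True` once the adapter is working. A `False` result suggests the failure is at the network level, independent of the configured credentials.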