Spark: reading OpenStreetMap data and selecting entries

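I have OpenStreetMap (OSM) data for a country in .orc format, loaded into the variable nlorc, and I am trying to read the data for a specific city. As far as I know, city entities are defined as 'relation' in OSM. For my data, nlorc.printSchema() returns the following: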

root
 |-- id: long (nullable = true)
 |-- type: string (nullable = true)
 |-- tags: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)
 |-- lat: decimal(9,7) (nullable = true)
 |-- lon: decimal(10,7) (nullable = true)
 |-- nds: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- ref: long (nullable = true)
 |-- members: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- type: string (nullable = true)
 |    |    |-- ref: long (nullable = true)
 |    |    |-- role: string (nullable = true)
 |-- changeset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- uid: long (nullable = true)
 |-- user: string (nullable = true)
 |-- version: long (nullable = true)
 |-- visible: boolean (nullable = true)

For example, https://www.openstreetmap.org/relation/47798#map=13/51.4373/4.8888 shows that the city name is part of the "tags". How can I access the tags and select the keys of a specific city?

You can use getItem to access the elements of a map column:

val df = ...  // the DataFrame read from the ORC file (nlorc in the question)
df.filter(df("tags").getItem("name") === "Baarle-Nassau").show()
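
Since the question mentions that cities are stored as 'relation' entries, the tag lookup can be combined with a filter on the type column. The following is only a minimal sketch under stated assumptions: the object name CityByTag, the helper selectCity, and the three sample rows (including the "boundary" tag) are hypothetical stand-ins for the real ORC data, which would normally be loaded with spark.read.orc.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object CityByTag {
  // Hypothetical helper: keep only relations whose "name" tag equals cityName.
  def selectCity(df: DataFrame, cityName: String): DataFrame =
    df.filter((col("type") === "relation") && (col("tags").getItem("name") === cityName))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("osm-tags")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the ORC data (assumption);
    // only the columns used here are modelled.
    val df = Seq(
      (47798L, "relation", Map("name" -> "Baarle-Nassau", "boundary" -> "administrative")),
      (1L, "node", Map("name" -> "Baarle-Nassau")),
      (2L, "relation", Map("name" -> "Somewhere else"))
    ).toDF("id", "type", "tags")

    // Pull individual tag values out of the map as ordinary columns.
    selectCity(df, "Baarle-Nassau")
      .select(
        col("id"),
        col("tags").getItem("name").as("name"),
        col("tags").getItem("boundary").as("boundary"))
      .show(false)

    spark.stop()
  }
}
```

Any other tag key can be read the same way with getItem, so the select list can be adjusted to whichever keys are of interest.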