Spark: reading OpenStreetMap data and selecting entries
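I have the OpenStreetMap (OSM) data for one country stored as an .orc file and loaded into the variable nlorc. I am trying to read the data for a specific city. As far as I know, a city is represented in OSM as an entity of type 'relation'.
Calling nlorc.printSchema() on my data returns the following: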
root
 |-- id: long (nullable = true)
 |-- type: string (nullable = true)
 |-- tags: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)
 |-- lat: decimal(9,7) (nullable = true)
 |-- lon: decimal(10,7) (nullable = true)
 |-- nds: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- ref: long (nullable = true)
 |-- members: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- type: string (nullable = true)
 |    |    |-- ref: long (nullable = true)
 |    |    |-- role: string (nullable = true)
 |-- changeset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- uid: long (nullable = true)
 |-- user: string (nullable = true)
 |-- version: long (nullable = true)
 |-- visible: boolean (nullable = true)
For example, https://www.openstreetmap.org/relation/47798#map=13/51.4373/4.8888 shows that the city name is part of the 'tags'. How can I access the tags and select the entries for a specific city by key?
You can use getItem to access an element of a map column by its key:
val df = ...
// Keep only the rows whose "name" tag equals the city name.
df.filter(df("tags").getItem("name") === "Baarle-Nassau").show()
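Since the question notes that a city is stored as a 'relation', the tag lookup can be combined with a filter on the type column. The following is only a minimal sketch under stated assumptions: the object name SelectCity, the helper selectCity, and the three sample rows are hypothetical and stand in for the real ORC data, which would instead be loaded with something like spark.read.orc(...).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectCity {
  // Hypothetical helper: keep only relations whose "name" tag equals cityName.
  // getItem on a map column returns null for a missing key, so rows without
  // a "name" tag are dropped by the filter.
  def selectCity(osm: DataFrame, cityName: String): DataFrame =
    osm
      .filter(col("type") === "relation")
      .filter(col("tags").getItem("name") === cityName)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("osm-select-city")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows (ids and tags are illustrative, not real OSM data).
    // Only the columns needed for the filter are included.
    val osm = Seq(
      (1L, "relation", Map("name" -> "Baarle-Nassau", "boundary" -> "administrative")),
      (2L, "node", Map("name" -> "Baarle-Nassau")),
      (3L, "relation", Map("name" -> "Some Other Place"))
    ).toDF("id", "type", "tags")

    // Show the matching relation together with its extracted name tag.
    selectCity(osm, "Baarle-Nassau")
      .select(col("id"), col("tags").getItem("name").as("name"))
      .show(false)

    spark.stop()
  }
}
```

Filtering on type first means that nodes or ways that happen to carry the same name tag are not returned alongside the city relation.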