GTFS Database - SQL Queries for "Revenue Mileage" and "Revenue Hours"
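I am trying to find the revenue miles/kilometers for a "Route" by day, month, and year, by querying a GTFS database with the structure described here:
https://developers.google.com/transit/gtfs/reference
A very clear sketch of that structure is available here:
http://blog.openplans.org/2012/08/the-openplans-guide-to-gtfs-data/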
"Revenue distance traveled" definition:
("Available for passengers to use" distance)
The number of miles/kilometers traveled from the first actual bus stop
where a passenger can board, to the last drop-off at the last bus
stop, for that particular route and bus run. (then aggregated together
for all service runs taken by all buses for that particular route)
-
"Revenue hours" definition:
("Available for passengers to use" time span)
The number of hours from the moment the vehicle arrives at the first
bus stop, until the moment it drops off its last passenger at the last
bus stop. (then aggregated together for all service runs taken by all
buses for that particular route)
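I am using SQL Server (MSSQL), although SQLite, MySQL, or any other SQL example would be fine.
Basically, I need to be able to SELECT a route and then correlate the data in the routes, calendar_dates, calendar, stop_times, stops, and trips tables to find how many miles/kilometers were covered from the first stop to the last (stop_times and stops tables) and how many hours elapsed. I then need to find this for a particular service_id (in the trips and calendar tables), and also for all service_ids on a particular route, and be able to get all of it for a particular date (in the calendar_dates table) or for a date span (day, month, 3-month period, year, etc.).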
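If this takes several different queries, that is fine. The revenue distance traveled per route and the revenue hours per route can be queried separately.
Has anyone done this before and is willing to share their query structure, or has anyone worked this out? Are there any examples of how to write this query? I have been looking all over the web for weeks.
Here is an image of a diagram of the database I created, showing all the relationships in detail: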
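I have done this for scheduled kilometres by:
- Loading the GTFS feed into a database using the GTFS SQL importer and PostGIS
- Making the shapes table spatial
- Calculating the distance for each shape
- Summing as follows (see the note about service IDs).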
select t.route_id as id, r.route_short_name as route, sum(l.shape_dist/1000) as sched_kms
from gtfs_shape_lengths l
inner join gtfs_trips t on t.shape_id = l.shape_id
inner join gtfs_routes r on r.route_id = t.route_id
inner join gtfs_calendar c on t.service_id = c.service_id
where c.service_id ilike '%sat%'
group by t.route_id, r.route_short_name
union all
select 'total' as id, 'total_' as name,
sum(l.shape_dist/1000) as sched_kms
from gtfs_shape_lengths l
inner join gtfs_trips t on t.shape_id = l.shape_id
inner join gtfs_calendar c on t.service_id = c.service_id
where c.service_id ilike '%sat%'
order by sched_kms desc
The original post is here:
http://transitdata.net/using-gtfs-and-postgis-to-calculate-levels-of-scheduled-service/
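For readers without PostGIS, the "calculate distance for each shape" step can be approximated outside the database. The following is only a rough sketch in plain Python that swaps the PostGIS length calculation for a haversine sum over consecutive shape points. The function names (haversine_km, shape_lengths_km, scheduled_km_by_route) are made up for illustration. The input rows are assumed to carry the shape_id, shape_pt_sequence, shape_pt_lat and shape_pt_lon columns of shapes.txt.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def shape_lengths_km(shape_rows):
    """Sum segment lengths per shape.

    shape_rows: iterable of (shape_id, shape_pt_sequence, lat, lon),
    in any order (hypothetical input layout mirroring shapes.txt).
    """
    points = defaultdict(list)
    for shape_id, seq, lat, lon in shape_rows:
        points[shape_id].append((seq, lat, lon))

    lengths = {}
    for shape_id, pts in points.items():
        pts.sort()  # order the points by shape_pt_sequence
        lengths[shape_id] = sum(
            haversine_km(a[1], a[2], b[1], b[2])
            for a, b in zip(pts, pts[1:])
        )
    return lengths


def scheduled_km_by_route(trips, lengths):
    """trips: iterable of (route_id, shape_id), one entry per scheduled trip."""
    totals = defaultdict(float)
    for route_id, shape_id in trips:
        totals[route_id] += lengths[shape_id]
    return dict(totals)
```

As in the SQL above, every trip contributes the full length of its shape, so the result is scheduled distance. If a shape includes deadhead segments before the first stop or after the last one, it will overstate revenue distance.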
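OK, I came up with the following way to get the service hours. In my example, the arrival_time and departure_time columns of the stop_times table are integer data types, and the numbers stored in them represent "minutes since midnight" (for example, "29 hours and 45 minutes past midnight" would be "1785 minutes"). Midnight here is measured as noon of the service day minus 12 hours, as the specification requires; this is also the best way to do it. Also note: I added a trip_date column to the trips table, because I use this GTFS database for operational/internal federal reporting purposes rather than only for publishing a service feed to the public, so it is necessary to know individual trip dates (I did not want to enter every single day in calendar_dates for this purpose, as some agencies do). This example is for MSSQL / SQL Server: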
-- FIRST/LAST TRIPS OF THE DAY AND SPAN OF SERVICE
SELECT
joinedTables.service_id AS 'Service Number',
joinedTables.trip_date AS 'Date',
joinedTables.route_id AS 'Route',
MIN ( joinedTables.starting_departure ) AS 'First Departure in Minutes',
MAX ( joinedTables.ending_arrival ) AS 'Last Arrival in Minutes',
-- Decimal hours of minutes integers.
CAST (
(
(
MAX (ending_arrival) - MIN (starting_departure)
) / 60.00
) AS DECIMAL (9, 2)
) AS 'Service Hours'
FROM
(
SELECT
SelectedTripsColumns.service_id,
SelectedTripsColumns.trip_id,
SelectedTripsColumns.route_id,
SelectedTripsColumns.trip_date,
MIN (departure_time) AS starting_departure,
MAX (arrival_time) AS ending_arrival
FROM
stop_times AS stopTimesTable
JOIN (
SELECT
service_id,
trip_id,
route_id,
trip_date
FROM
trips
) AS SelectedTripsColumns
ON stopTimesTable.trip_id = SelectedTripsColumns.trip_id
JOIN routes
ON SelectedTripsColumns.route_id = routes.route_id
GROUP BY
SelectedTripsColumns.service_id,
SelectedTripsColumns.trip_id,
SelectedTripsColumns.route_id,
SelectedTripsColumns.trip_date
) AS joinedTables
-- WHERE trip_date = '2015-07-27'
GROUP BY
service_id,
route_id,
trip_date
ORDER BY
service_id,
route_id,
trip_date;
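The query above reports the span of service (last arrival minus first departure across the whole day). Revenue hours as defined in the question would instead add up the duration of each individual trip. The following is a minimal sketch of that variant, run through Python's built-in sqlite3 module so it can be tried without a server. It assumes the same layout as above: integer minutes since midnight in stop_times, and a trip_date column on trips stored as ISO 'YYYY-MM-DD' text. The names to_minutes, build_demo_db and revenue_hours, and the demo rows, are made up for illustration.

```python
import sqlite3


def to_minutes(gtfs_time):
    """Convert a GTFS 'HH:MM:SS' string (hours may exceed 24) to whole minutes
    since midnight; seconds are dropped because the schema stores integers."""
    hours, minutes, _seconds = (int(part) for part in gtfs_time.split(":"))
    return hours * 60 + minutes


REVENUE_HOURS_SQL = """
SELECT t.route_id,
       t.trip_date,
       ROUND(SUM(span.last_arrival - span.first_departure) / 60.0, 2)
           AS revenue_hours
FROM (
    -- one row per trip: first departure and last arrival, in minutes
    SELECT trip_id,
           MIN(departure_time) AS first_departure,
           MAX(arrival_time)   AS last_arrival
    FROM stop_times
    GROUP BY trip_id
) AS span
JOIN trips AS t ON t.trip_id = span.trip_id
WHERE t.trip_date BETWEEN ? AND ?
GROUP BY t.route_id, t.trip_date
ORDER BY t.route_id, t.trip_date
"""


def build_demo_db():
    """In-memory database with two made-up trips on one route and date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE trips (trip_id TEXT PRIMARY KEY, route_id TEXT,
                            service_id TEXT, trip_date TEXT);
        CREATE TABLE stop_times (trip_id TEXT, stop_sequence INTEGER,
                                 arrival_time INTEGER, departure_time INTEGER);
    """)
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?, ?)", [
        ("T1", "R1", "WKDY", "2015-07-27"),
        ("T2", "R1", "WKDY", "2015-07-27"),
    ])
    stops = [
        ("T1", 1, "06:00:00", "06:00:00"), ("T1", 2, "07:30:00", "07:30:00"),
        ("T2", 1, "23:30:00", "23:30:00"), ("T2", 2, "24:45:00", "24:45:00"),
    ]
    conn.executemany(
        "INSERT INTO stop_times VALUES (?, ?, ?, ?)",
        [(t, s, to_minutes(a), to_minutes(d)) for t, s, a, d in stops],
    )
    return conn


def revenue_hours(conn, start_date, end_date):
    """Rows of (route_id, trip_date, revenue_hours) for an inclusive date range."""
    return conn.execute(REVENUE_HOURS_SQL, (start_date, end_date)).fetchall()
```

With the demo rows, the two trips last 90 and 75 minutes, so revenue_hours(build_demo_db(), "2015-07-27", "2015-07-27") reports 2.75 hours for route R1, whereas the span of service for the same day is 18.75 hours. Grouping by substr(t.trip_date, 1, 7) instead of t.trip_date should give monthly totals under the same ISO date assumption.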