Data analysis using GPS coordinates
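I have data like this, containing timestamp, longitude, latitude and tripId.
Can I find the waiting time at intersections from this data alone, or do I need other data?
What information can I get from this kind of data?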
"timestamp","tripId","longitude","latitude"
"2021-07-05 10:35:04","1866491","8.167035","53.160473"
"2021-07-05 10:35:03","1866491","8.167023","53.160469"
"2021-07-05 10:35:02","1866491","8.167007","53.160459"
"2021-07-05 10:35:01","1866491","8.166987","53.160455"
"2021-07-05 10:35:00","1866491","8.166956","53.160448"
"2021-07-05 10:34:20","1866491","8.167286","53.15919"
"2021-07-05 10:34:19","1866491","8.167328","53.15918"
"2021-07-05 10:34:18","1866491","8.16735","53.159165"
"2021-07-05 10:34:17","1866491","8.167371","53.159148"
"2021-07-05 10:34:16","1866491","8.167388","53.159124"
"2021-07-05 10:34:15","1866491","8.167399","53.159105"
"2021-06-30 20:25:30","1862861","8.211288","53.150848"
"2021-06-30 20:25:29","1862861","8.211264","53.150851"
"2021-06-30 20:25:28","1862861","8.211269","53.150842"
"2021-06-30 20:25:27","1862861","8.211273","53.150836"
"2021-06-30 20:25:26","1862861","8.211279","53.150836"
"2021-06-30 20:25:25","1862861","8.211259","53.150848"
"2021-06-30 20:25:24","1862861","8.211263","53.15085"
"2021-06-30 20:25:21","1862861","8.211455","53.150782"
"2021-06-30 20:25:20","1862861","8.211453","53.150786"
"2021-06-30 20:25:19","1862861","8.211449","53.150792"
Answer to this question:
"Which information can I get from this kind of data?"
You have a timestamp, a tripId and a coordinate (longitude and latitude).
So you know that the person or vehicle was at 8.166987 / 53.160455 at 2021-07-05 10:35:01 during trip 1866491. You can also calculate the duration of each trip.
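You can also create lines by connecting all coordinates of a trip in timestamp order. Then you can find out at which locations the trips cross each other.
From the line features you can calculate their length (the distance of the trip). Together with the trip duration, this also gives you the average walking or driving speed.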
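As a rough illustration of the duration, length and speed calculation, here is a minimal sketch that uses only the Python standard library. It assumes the CSV layout from the question. The names `haversine_m`, `trip_stats`, `EARTH_RADIUS_M` and `SAMPLE` are made up for this example, and the haversine formula with a mean Earth radius is an approximation swapped in for the length of a line geometry.

```python
import csv
import io
import math
from collections import defaultdict
from datetime import datetime

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def trip_stats(csv_text):
    """Return {tripId: (duration_s, length_m, mean_speed_m_per_s)}."""
    trips = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        t = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        trips[row["tripId"]].append((t, float(row["longitude"]), float(row["latitude"])))

    stats = {}
    for trip_id, fixes in trips.items():
        fixes.sort()  # tuples sort by timestamp first
        duration = (fixes[-1][0] - fixes[0][0]).total_seconds()
        # Sum the distances between consecutive fixes
        length = sum(
            haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(fixes, fixes[1:])
        )
        speed = length / duration if duration > 0 else float("nan")
        stats[trip_id] = (duration, length, speed)
    return stats


# A few rows of trip 1866491 taken from the question's sample data
SAMPLE = '''\
"timestamp","tripId","longitude","latitude"
"2021-07-05 10:35:04","1866491","8.167035","53.160473"
"2021-07-05 10:35:00","1866491","8.166956","53.160448"
"2021-07-05 10:34:20","1866491","8.167286","53.15919"
"2021-07-05 10:34:15","1866491","8.167399","53.159105"
'''

stats = trip_stats(SAMPLE)
for trip_id, (duration, length, speed) in stats.items():
    print(f"trip {trip_id}: {duration:.0f} s, {length:.1f} m, {speed:.2f} m/s")
```

Note that the sample has a 40-second gap between 10:34:20 and 10:35:00. The straight segment that bridges such a gap probably underestimates the real distance, so the average speed should be treated as a rough estimate.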
(Your sample data does not contain any trips that cross each other, though. I am also not sure what you mean by waiting time.)
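One possible reading of waiting time is the time a trip spends without really moving. Under that assumption (it is a guess, not something stated in the question), a simple stay-point check could look like the sketch below. The names `distance_m` and `find_stops` and the thresholds `radius_m` and `min_duration_s` are hypothetical and would need tuning against the GPS noise in the real data. The coordinates alone also do not say whether a stop happened at an intersection, so that would probably need extra data such as a road network.

```python
import math
from datetime import datetime


def distance_m(lon1, lat1, lon2, lat2):
    """Approximate distance in metres (equirectangular, adequate for short ranges)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6_371_000 * math.hypot(dx, dy)


def find_stops(fixes, radius_m=10.0, min_duration_s=5.0):
    """Return (start, end, duration_s) for each run of fixes that stays within
    radius_m of the run's first fix for at least min_duration_s.

    fixes: list of (datetime, longitude, latitude), sorted by time.
    """
    stops = []
    i = 0
    while i < len(fixes):
        j = i + 1
        # Extend the run while the position stays close to where the run began
        while j < len(fixes) and distance_m(
            fixes[i][1], fixes[i][2], fixes[j][1], fixes[j][2]
        ) <= radius_m:
            j += 1
        duration = (fixes[j - 1][0] - fixes[i][0]).total_seconds()
        if duration >= min_duration_s:
            stops.append((fixes[i][0], fixes[j - 1][0], duration))
            i = j  # continue after the stop
        else:
            i += 1
    return stops


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


# Trip 1862861 from the question's sample data, in timestamp order
fixes = [
    (parse("2021-06-30 20:25:19"), 8.211449, 53.150792),
    (parse("2021-06-30 20:25:20"), 8.211453, 53.150786),
    (parse("2021-06-30 20:25:21"), 8.211455, 53.150782),
    (parse("2021-06-30 20:25:24"), 8.211263, 53.15085),
    (parse("2021-06-30 20:25:25"), 8.211259, 53.150848),
    (parse("2021-06-30 20:25:26"), 8.211279, 53.150836),
    (parse("2021-06-30 20:25:27"), 8.211273, 53.150836),
    (parse("2021-06-30 20:25:28"), 8.211269, 53.150842),
    (parse("2021-06-30 20:25:29"), 8.211264, 53.150851),
    (parse("2021-06-30 20:25:30"), 8.211288, 53.150848),
]

stops = find_stops(fixes)
for start, end, duration in stops:
    print(f"stationary from {start} to {end} ({duration:.0f} s)")
```

A distance-based check is used here instead of a speed threshold because second-by-second GPS jitter can look like movement of a metre or two per second even when the device is standing still.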
Here is an example script that shows how to convert the points to lines and find the points where trips cross each other's paths:
from itertools import combinations
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString
# Read CSV File
df = pd.read_csv("trips.csv")
# Create Points
points = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df.longitude, df.latitude),
    crs="EPSG:4326"
)
points = points.drop(columns=["latitude", "longitude"])
# Make sure Points are ordered (important)
points = points.sort_values(["tripId", "timestamp"])
# Create Lines
tolist = lambda x: LineString(x.tolist())
lines = points.groupby(["tripId"], as_index=False)["geometry"].apply(tolist)
lines = gpd.GeoDataFrame(lines, geometry="geometry", crs="EPSG:4326")
Find the points where trips cross each other:
# Get Intersection Points
rows = []
for i, j in combinations(lines.index, 2):
    geom_a = lines.loc[i, "geometry"]
    geom_b = lines.loc[j, "geometry"]
    point = geom_a.intersection(geom_b)
    if not point.is_empty:  # skip pairs of trips that never cross
        rows.append({
            "tripA": lines.loc[i, "tripId"],
            "tripB": lines.loc[j, "tripId"],
            "geometry": point,
        })
# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
intersection_points = gpd.GeoDataFrame(
    rows, columns=["tripA", "tripB", "geometry"], geometry="geometry", crs="EPSG:4326"
)
Write intersection_points to a CSV file:
intersection_points["longitude"] = intersection_points.geometry.x
intersection_points["latitude"] = intersection_points.geometry.y
columns = ["tripA", "tripB", "latitude", "longitude"]
intersection_points.to_csv("intersections.csv", columns=columns)
Write the created lines and intersection points to shapefiles:
# Write Shape Files
lines.to_file("trips.shp")
intersection_points.to_file("intersections.shp")
Sample data in which two trips cross each other (based on the sample data from the question):
"timestamp","tripId","longitude","latitude"
"2021-07-05 10:35:04","1866491","8.167035","53.160473"
"2021-07-05 10:35:03","1866491","8.167023","53.160469"
"2021-07-05 10:35:02","1866491","8.167007","53.160459"
"2021-07-05 10:35:01","1866491","8.166987","53.160455"
"2021-07-05 10:35:00","1866491","8.166956","53.160448"
"2021-07-05 10:34:20","1866491","8.167286","53.15919"
"2021-07-05 10:34:19","1866491","8.167328","53.15918"
"2021-07-05 10:34:18","1866491","8.16735","53.159165"
"2021-07-05 10:34:17","1866491","8.167371","53.159148"
"2021-07-05 10:34:16","1866491","8.167388","53.159124"
"2021-07-05 10:34:15","1866491","8.167399","53.159105"
"2021-06-30 20:25:30","1862861","8.211288","53.150848"
"2021-06-30 20:25:29","1862861","8.211264","53.150851"
"2021-06-30 20:25:28","1862861","8.211269","53.150842"
"2021-06-30 20:25:27","1862861","8.211273","53.150836"
"2021-06-30 20:25:26","1862861","8.211279","53.150836"
"2021-06-30 20:25:25","1862861","8.211259","53.150848"
"2021-06-30 20:25:24","1862861","8.211263","53.15085"
"2021-06-30 20:25:21","1862861","8.211455","53.150782"
"2021-06-30 20:25:19","1862861","8.211449","53.150792"
"2021-06-30 20:25:20","1862861","8.211453","53.150786"
"2021-06-30 20:25:18","1862861","8.166607","53.159654"