Reduce GPS data set by distance

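I have a set of GPS coordinates created by a GPS sensor and a Raspberry Pi. I poll the sensor at 10 Hz and log the data to a SQL DB on the Pi. The system sits on top of my car (it is also part of a house-scanning tool for the construction industry). The problem is that I drive at varying speeds. In some cases I have to stop to let other vehicles pass, and GPS positions keep being logged at 10 Hz while I wait.

After recording, I want to post-process the GPS data and output a reduced list of coordinates so that my positions are roughly 1 metre apart.

I know I could probably use Pandas for this, but I don't know where to start.

Here is a sample data set: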

51.80359349246259,-4.741180850463812
51.80361005410784,-4.740873766196046
51.80351890237921,-4.7415190658979895
51.803152371942325,-4.74057836870229
51.80352232936482,-4.740392650792621
51.80361261925252,-4.740896906964529
51.803487420307796,-4.7402764541541265
51.80353017387817,-4.74136689657748
51.80287372471039,-4.741218904144232
51.80326530703784,-4.740193742088211

Any help is much appreciated.

library(data.table)
library(hutils)
library(magrittr)  # provides the %>% pipe used below
setDT(gpsdata)
setDT(busdata.data)

gps_orig <- copy(gpsdata)
busdata.orig <- copy(busdata.data)

setkey(gpsdata, lat)

# Just to take note of the originals
gpsdata[, gps_lat := lat + 0]
gpsdata[, gps_lon := lon + 0]

busdata.data[, lat := latitude_bustops + 0]
busdata.data[, lon := longitude_bustops + 0]


setkey(busdata.data, lat)

gpsID_by_lat <- 
  gpsdata[, .(id), keyby = "lat"]


By_latitude <- 
  busdata.data[gpsdata, 
               on = "lat",

               # within 0.5 degrees of latitude
               roll = 0.5, 
               # +/-
               rollends = c(TRUE, TRUE),

               # and remove those beyond 0.5 degrees
               nomatch=0L] %>%
  .[, .(id_lat = id,
        name_lat = name,
        bus_lat = latitude_bustops,
        bus_lon = longitude_bustops,
        gps_lat,
        gps_lon),
    keyby = .(lon = gps_lon)]

setkey(busdata.data, lon)

By_latlon <-
  busdata.data[By_latitude,
               on = c("name==name_lat", "lon"),

               # within 0.5 degrees of longitude
               roll = 0.5, 
               # +/-
               rollends = c(TRUE, TRUE),
               # and remove those beyond 0.5 degrees
               nomatch=0L]

By_latlon[, distance := haversine_distance(lat1 = gps_lat, 
                                           lon1 = gps_lon,
                                           lat2 = bus_lat,
                                           lon2 = bus_lon)]

# haversine_distance() returns kilometres, so this keeps pairs within 200 m
By_latlon[distance < 0.2]

How about using geohash to collapse points that fall in the same location?

http://en.wikipedia.org/wiki/Geohash

About precision: https://gis.stackexchange.com/questions/115280/what-is-the-precision-of-geohash

Geohash length   Maximum X-axis error (km)
1   ± 2500
2   ± 630
3   ± 78
4   ± 20
5   ± 2.4
6   ± 0.61
7   ± 0.076
8   ± 0.019
9   ± 0.0024
10  ± 0.00060
11  ± 0.000074
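
The figures in this table can be reproduced from how a geohash is built: each character carries 5 bits, and the bits alternate between longitude and latitude, starting with longitude. The sketch below is only a rough check, assuming about 111.32 km per degree at the equator; the function name `geohash_max_error_km` is made up for illustration and is not part of any library.

```python
KM_PER_DEGREE = 111.32  # assumption: approximate length of one degree at the equator


def geohash_max_error_km(precision):
    """Return (x_error_km, y_error_km): half the cell size for a geohash length."""
    bits = 5 * precision
    lon_bits = (bits + 1) // 2  # longitude takes the first bit, so it gets the extra one
    lat_bits = bits // 2
    lon_err_deg = 360.0 / 2 ** lon_bits / 2
    lat_err_deg = 180.0 / 2 ** lat_bits / 2
    return lon_err_deg * KM_PER_DEGREE, lat_err_deg * KM_PER_DEGREE


for p in range(1, 12):
    x_err, y_err = geohash_max_error_km(p)
    print(p, x_err, y_err)
```

By this estimate, length 7 corresponds to cells roughly 150 m wide, and length 10 is the first whose worst-case X error drops below 1 m, so the length has to match the spacing you are after. Away from the equator the X error shrinks with the cosine of the latitude.
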

# !pip install pygeodesy
from pygeodesy import geohash
def df_add_geohash(df, precision=7, col_lat='lat', col_lng='lon', geo_col='geo'):
    df_to_convert = df.copy()
    cond = df_to_convert[col_lat].notnull()
    df_to_convert.loc[cond, geo_col] = (df_to_convert[cond].apply(lambda x: geohash.encode(
                        x[col_lat], x[col_lng], precision=precision) 
                       ,axis=1))
    return df_to_convert


# apply the function (df is the sample DataFrame with 'lat' and 'lon' columns)
dfn = df_add_geohash(df, 7, 'lat', 'lon')
# drop rows whose geohash equals that of the previous row
cond = dfn['geo'] == dfn['geo'].shift(1)
print(dfn[~cond])

#          lat       lon      geo
# 0  51.803593 -4.741181  gchwsne
# 3  51.803152 -4.740578  gchwsnk
# 4  51.803522 -4.740393  gchwsns
# 5  51.803613 -4.740897  gchwsne
# 6  51.803487 -4.740276  gchwsns
# 7  51.803530 -4.741367  gchwsne
# 8  51.802874 -4.741219  gchwsn7
# 9  51.803265 -4.740194  gchwsnk

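If you want a more precise result, you can compute the distance between neighbouring records and filter out those closer than 1 m.

import pandas as pd

df = pd.DataFrame(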
    [{'lat': 51.803593492462596, 'lon': -4.741180850463811},
     {'lat': 51.80361005410785, 'lon': -4.740873766196046},
     {'lat': 51.80351890237921, 'lon': -4.7415190658979895},
     {'lat': 51.80315237194233, 'lon': -4.74057836870229},
     {'lat': 51.803522329364824, 'lon': -4.7403926507926215},
     {'lat': 51.80361261925252, 'lon': -4.740896906964529},
     {'lat': 51.803487420307796, 'lon': -4.740276454154127},
     {'lat': 51.80353017387817, 'lon': -4.74136689657748},
     {'lat': 51.80287372471039, 'lon': -4.741218904144231},
     {'lat': 51.80326530703784, 'lon': -4.740193742088211}]
)

df['lat_pre'] =  df['lat'].shift(1)
df['lon_pre'] =  df['lon'].shift(1)

# !pip install geopy
# https://geopy.readthedocs.io/en/stable/#installation
from geopy.distance import geodesic
cond = df['lat_pre'].notnull()
df.loc[cond, 'distance'] = df[cond].apply(lambda row: geodesic((row.lat, row.lon),
                                                               (row.lat_pre, row.lon_pre)).m
                                             , axis=1)

cond = df['distance'] < 1
print(df[~cond])

    #          lat       lon    lat_pre   lon_pre   distance
    # 0  51.803593 -4.741181        NaN       NaN        NaN
    # 1  51.803610 -4.740874  51.803593 -4.741181  21.262108
    # 2  51.803519 -4.741519  51.803610 -4.740874  45.652403
    # 3  51.803152 -4.740578  51.803519 -4.741519  76.639257
    # 4  51.803522 -4.740393  51.803152 -4.740578  43.110166
    # 5  51.803613 -4.740897  51.803522 -4.740393  36.204379
    # 6  51.803487 -4.740276  51.803613 -4.740897  45.007709
    # 7  51.803530 -4.741367  51.803487 -4.740276  75.367133
    # 8  51.802874 -4.741219  51.803530 -4.741367  73.748842
    # 9  51.803265 -4.740194  51.802874 -4.741219  83.059036
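
One thing to watch with a row-to-previous-row comparison: if the car creeps forward by less than 1 m per sample, every row is dropped, even though the car eventually covers many metres. A possible variation is to compare each point with the last point that was kept instead. The following is a minimal sketch in plain Python, assuming a spherical Earth with a mean radius of 6371008.8 m; `haversine_m` and `thin_by_distance` are illustrative names, not functions from geopy or pandas.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumption: mean Earth radius, spherical model


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def thin_by_distance(points, min_dist_m=1.0):
    """Keep a point only if it is at least min_dist_m from the last kept point."""
    kept = []
    for lat, lon in points:
        if not kept or haversine_m(kept[-1][0], kept[-1][1], lat, lon) >= min_dist_m:
            kept.append((lat, lon))
    return kept


# Hypothetical track: 10 fixes about 0.44 m apart along a meridian
track = [(51.8 + i * 4e-6, -4.74) for i in range(10)]
print(thin_by_distance(track, 1.0))
```

With a DataFrame, something like `thin_by_distance(zip(df['lat'], df['lon']))` should work. Fixes logged while parked stay within GPS noise of the last kept point, so they are dropped as long as that noise is below the threshold.
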

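I worked out a solution based on the distance approach @Ferris suggested. The 'mpu.haversine_distance' function returns the distance between two lat/lng pairs in kilometres. I multiply by 1000 to get metres. I then add up these distances, and once the running total exceeds 1 metre I report the lat/lng. This can be adjusted to 3 metres and so on.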

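import mpu

def processTheSet(batch):
    # mydb is an open database connection created elsewhere
    mycursorll = mydb.cursor()
    sqlll = "SELECT latt, longg FROM interPol WHERE batchID = %s ORDER BY `fileTime`"
    batchI = (batch,)
    mycursorll.execute(sqlll, batchI)
    firstResult = mycursorll.fetchone()
    if firstResult is None:
        return  # no rows for this batch
    # The first row is only the starting reference; it is not printed
    firstLat = float(firstResult[0])
    firstLng = float(firstResult[1])
    myresultll = mycursorll.fetchall()
    count = 0    # number of points reported
    counter = 0  # distance travelled since the last reported point, in metres
    for x in myresultll:
        thisLat = float(x[0])
        thisLong = float(x[1])
        # haversine_distance returns km; convert to metres
        dist = mpu.haversine_distance((firstLat, firstLng), (thisLat, thisLong)) * 1000
        firstLat = thisLat
        firstLng = thisLong
        counter = counter + dist
        if counter > 1:  # change to 3 for 3 m spacing, etc.
            count = count + 1
            counter = 0
            print(thisLong, ",", thisLat)  # note: printed as lng, lat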
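
For anyone who wants to try the running-total idea without the database, here is a rough standalone sketch of the same logic as a generator. It is not the code above, just an illustration: `haversine_m`, `thin_by_path_length` and the `noise_floor_m` parameter are names made up for this example, and the Earth radius is an assumed mean value. The optional noise floor is an addition of mine: when the car is parked, GPS jitter still adds to the running total, so ignoring very small steps may help.

```python
import math


def haversine_m(a, b, radius_m=6371008.8):
    """Great-circle distance in metres between two (lat, lon) tuples in degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))


def thin_by_path_length(points, spacing_m=1.0, noise_floor_m=0.0):
    """Yield a point each time the distance travelled since the last yield exceeds spacing_m."""
    prev = None
    travelled = 0.0
    for pt in points:
        if prev is None:
            prev = pt  # first point is only the starting reference, as in the code above
            continue
        step = haversine_m(prev, pt)
        if step < noise_floor_m:
            continue  # treat as jitter: keep the old reference point
        travelled += step
        prev = pt
        if travelled > spacing_m:
            yield pt
            travelled = 0.0


# Hypothetical track: 10 fixes about 0.44 m apart along a meridian
track = [(51.8 + i * 4e-6, -4.74) for i in range(10)]
for lat, lon in thin_by_path_length(track, spacing_m=1.0):
    print(lon, ",", lat)
```

Because this measures distance along the path rather than in a straight line, the reported points can end up closer together than `spacing_m` on a winding route.
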