Nearest timestamp price - ready data structure in Python?

Question

Price interpolation: is there a Python data structure for efficient near-miss searches?

I have price data:

[1427837961000.0, 243.586], [1427962162000.0, 245.674], [1428072262000.0, 254.372], [1428181762000.0, 253.366], ...

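The first dimension is the timestamp and the second is the price.

Now I want to know the price closest to a given timestamp, e.g. to 1427854534654.

What is the best Python container, data structure, or algorithm for solving this hundreds or thousands of times per second? It is a standard problem that comes up in many applications, so there should be a ready-made, optimized solution.

I have Googled and found only bits and pieces I could build on, but since the problem is so common, I would expect a complete data structure to be available as a module?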

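Edit: Solved.

I used [...] with my dates.

Performance is great:

3000000 calls took 12.82 seconds, so 0.00000427 seconds per call (data length = 1143).

Thanks a lot! Stack Overflow is great, and you helpers are the best!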

Answer 1

Try this to get the closest value:

>>> l = [[1427837961000.0, 243.586], [1427962162000.0, 245.674], [1428072262000.0, 254.372], [1428181762000.0, 253.366]]
>>> check_value = 1427854534654
>>> min(l, key=lambda x: abs(x[0] - check_value))[0]  # [0] is the closest timestamp; use [1] for its price
1427837961000.0
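
Since the question asks for the price rather than the timestamp, a minimal variant of the same linear scan can return the whole pair. The names `prices` and `closest_entry` are illustrative and not part of the answer. Note that this inspects every entry on each call, so a query costs O(m) for m entries.

```python
# Sample data from the question: [timestamp_ms, price] pairs.
prices = [
    [1427837961000.0, 243.586],
    [1427962162000.0, 245.674],
    [1428072262000.0, 254.372],
    [1428181762000.0, 253.366],
]


def closest_entry(entries, t):
    """Return the [timestamp, price] pair whose timestamp is nearest to t.

    Linear scan: every entry is inspected on each call.
    """
    return min(entries, key=lambda entry: abs(entry[0] - t))


timestamp, price = closest_entry(prices, 1427854534654)
print(timestamp, price)  # 1427837961000.0 243.586
```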

Answer 2

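The common approach to this problem is to sort the data by timestamp and then do a binary search for each query. The binary search can be performed with the bisect module: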

data = [
    [1427837961000.0, 243.586], 
    [1427962162000.0, 245.674], 
    [1428072262000.0, 254.372], 
    [1428181762000.0, 253.366]
]


data.sort(key=lambda l: l[0]) # Sort by timestamp
timestamps = [l[0] for l in data] # Extract timestamps

import bisect

def find_closest(t):
    idx = bisect.bisect_left(timestamps, t) # Find insertion point

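    # Step back to idx - 1 if t lies past the last timestamp (idx would be out
    # of range) or if the timestamp at idx - 1 is closer than the one at idx
    if idx == len(timestamps) or (
            idx > 0 and abs(timestamps[idx] - t) > abs(timestamps[idx - 1] - t)):
        idx -= 1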

    return data[idx][1] # Return price

We can test it like this:

>>> find_closest(1427854534654)
243.586

If we have n queries and m timestamp values, each query takes O(log m) time, so the total time needed is O(n * log m).
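
The sort-once, search-many pattern can also be wrapped in a small class, so that the one-time sort (O(m log m)) is paid in the constructor and each lookup stays at O(log m). This is only a sketch: `NearestPriceLookup` and `price_at` are hypothetical names, not a standard-library type.

```python
import bisect


class NearestPriceLookup:
    """Hypothetical helper: sort once, then answer nearest-timestamp queries."""

    def __init__(self, rows):
        # rows: iterable of [timestamp, price] pairs, in any order.
        ordered = sorted(rows, key=lambda row: row[0])
        if not ordered:
            raise ValueError("at least one [timestamp, price] row is required")
        self._timestamps = [row[0] for row in ordered]
        self._prices = [row[1] for row in ordered]

    def price_at(self, t):
        """Return the price whose timestamp is closest to t (O(log m))."""
        ts = self._timestamps
        idx = bisect.bisect_left(ts, t)
        if idx == len(ts):
            # t lies after the last timestamp.
            return self._prices[-1]
        if idx > 0 and t - ts[idx - 1] < ts[idx] - t:
            # The left neighbour is strictly closer.
            idx -= 1
        return self._prices[idx]


lookup = NearestPriceLookup([
    [1428072262000.0, 254.372],
    [1427837961000.0, 243.586],
    [1428181762000.0, 253.366],
    [1427962162000.0, 245.674],
])
print(lookup.price_at(1427854534654))  # 243.586
print(lookup.price_at(1500000000000))  # 253.366 (past the last timestamp)
```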

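In the algorithm above, we search between two indices. If we instead use only the midpoints of the timestamp intervals, we can simplify further and make the search even faster: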

midpoints = [(a + b) / 2 for a, b in zip(timestamps, timestamps[1:])]
def find_closest_through_midpoints(t):
    return data[bisect.bisect_left(midpoints, t)][1]
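
The midpoint version works because a query below the midpoint of two neighbouring timestamps is closer to the left one, so the insertion index into `midpoints` is already the index of the nearest entry, and queries outside the data range need no special case. The self-contained sketch below compares it with a brute-force scan; `find_closest_brute_force` and the sample queries are assumptions made for illustration.

```python
import bisect

# Sample data from the question, already sorted by timestamp.
data = [
    [1427837961000.0, 243.586],
    [1427962162000.0, 245.674],
    [1428072262000.0, 254.372],
    [1428181762000.0, 253.366],
]
timestamps = [row[0] for row in data]

# One midpoint between each pair of neighbouring timestamps (len(data) - 1 values).
midpoints = [(a + b) / 2 for a, b in zip(timestamps, timestamps[1:])]


def find_closest_through_midpoints(t):
    # The insertion index ranges from 0 to len(data) - 1, so it is always valid.
    return data[bisect.bisect_left(midpoints, t)][1]


def find_closest_brute_force(t):
    # Reference implementation: linear scan over all entries.
    return min(data, key=lambda row: abs(row[0] - t))[1]


# Queries before, inside, exactly on, and after the known timestamps.
queries = [0, 1427854534654, 1427962162000.0, 1428100000000, 1500000000000]
for q in queries:
    assert find_closest_through_midpoints(q) == find_closest_brute_force(q)

print(find_closest_through_midpoints(1427854534654))  # 243.586
```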

Answer 3

Solved!

I used [...] with my dates.

Performance is great:

3000000 calls took 12.82 seconds, so 0.00000427 seconds per call (data length = 1143).
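
The post does not say how this figure was measured. As a rough sketch, a similar measurement could be taken with the standard `timeit` module. Everything here is an assumption for illustration: the bisect-based `find_closest`, the synthetic random data of length 1143, and the reduced call count. Timings will differ from machine to machine.

```python
import bisect
import random
import timeit

# Synthetic stand-in data (assumption): 1143 sorted timestamps, matching the
# reported data length, with random prices.
random.seed(0)
timestamps = sorted(random.uniform(1.4278e12, 1.4282e12) for _ in range(1143))
prices = [random.uniform(240.0, 260.0) for _ in timestamps]


def find_closest(t):
    """Return the price whose timestamp is nearest to t, using binary search."""
    idx = bisect.bisect_left(timestamps, t)
    if idx == len(timestamps):
        # t lies after the last timestamp.
        return prices[-1]
    if idx > 0 and t - timestamps[idx - 1] < timestamps[idx] - t:
        # The left neighbour is strictly closer.
        idx -= 1
    return prices[idx]


calls = 100_000  # far fewer than 3000000, to keep the run short
total = timeit.timeit(lambda: find_closest(1427854534654), number=calls)
print(f"{calls} calls took {total:.3f} s, {total / calls:.8f} s per call")
```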

Thanks a lot! Stack Overflow is great, and you helpers are the best!
