Python generator expressions for two data sets
from itertools import groupby
# Find values that are in range
in_range = [lo_lim <= v <= hi_lim for v in values]
#Find runs of in-range values
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]
#Estimate total time spent in-range
total_time = sum(v if v > 1 else (Buffer_Value*sample_rate) for v in runs)
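I'm trying to extend this code to take two sets of values and two pairs of hi/lo limits,
and to calculate the total time spent within those limits, where the combined 'in limit' condition holds only when both are true at the same point. That is,
if there are 100 data points (both data sets have the same length), check every point:
if values_1[45] and values_2[45] are in their respective limits
then count that point as in range.
Essentially, I want to turn this if into a generator expression:
if lo_lim_1<=Data_Points_1[i]<=hi_lim_1 and lo_lim_2<=Data_Points_2[i]<=hi_lim_2:
Then count the runs: if a run is one data point long, apply the buffer; otherwise apply the sample rate conversion.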
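If I understand your question correctly, this should work. The basic idea is to zip the two sequences
into pairs of corresponding values, then use the and
operator to find the points where both values are within their respective ranges: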
from itertools import groupby
# Find values that are in range
in_range = [lo_lim1 <= v1 <= hi_lim1 and lo_lim2 <= v2 <= hi_lim2 for v1, v2 in zip(values1, values2)]
# code is unchanged from here
#Find runs of in-range values
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v] # this is the same as yours
#Estimate total time spent in-range
total_time = sum(v if v > 1 else (Buffer_Value*sample_rate) for v in runs)
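To make the steps above concrete, here is a small worked sketch. The sample data, the limits, and the Buffer_Value and sample_rate numbers are all made up for illustration; only the three expressions come from the answer.

```python
from itertools import groupby

# Hypothetical sample data and limits, chosen only for illustration.
values1 = [1, 5, 6, 7, 20, 5, 30, 6, 6]
values2 = [2, 3, 3, 9, 3, 3, 3, 3, 3]
lo_lim1, hi_lim1 = 4, 10
lo_lim2, hi_lim2 = 1, 4
Buffer_Value = 0.5   # assumed value
sample_rate = 2      # assumed value

# True only where BOTH values are inside their own limits at the same index.
in_range = [lo_lim1 <= v1 <= hi_lim1 and lo_lim2 <= v2 <= hi_lim2
            for v1, v2 in zip(values1, values2)]

# Length of each consecutive run of True values.
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]

# Runs longer than one point count their length; single points use the buffer.
total_time = sum(v if v > 1 else (Buffer_Value * sample_rate) for v in runs)

print(in_range)    # [False, True, True, False, False, True, False, True, True]
print(runs)        # [2, 1, 2]
print(total_time)  # 5.0
```

Index 3 is the interesting case: values1[3] is in range but values2[3] is not, so the combined condition is False and the first run ends there.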
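In your case, if you are using Python 2.x, you can use itertools.izip
instead of zip
to save some memory (in Python 3.x, zip is already lazy), and on both Python 2.x and 3.x you can use generator expressions to save even more: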
#Find values that are in range
in_range = (lo_lim1 <= v1 <= hi_lim1 and lo_lim2 <= v2 <= hi_lim2 for v1, v2 in zip(values1, values2))
#Find runs of in-range values
runs = (sum(1 for _ in group) for v, group in groupby(in_range) if v) # this is the same as yours
#Estimate total time spent in-range
defval = Buffer_Value*sample_rate
total_time = sum(v if v > 1 else defval for v in runs)
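If it helps, the lazy version can be wrapped in a function. This is only a sketch for Python 3: the name time_in_range and its parameters are hypothetical, and the length check is an addition of mine, there because zip silently stops at the shorter input.

```python
from itertools import groupby

def time_in_range(values1, limits1, values2, limits2, single_point_time):
    """Hypothetical helper: total in-range time, computed lazily.

    limits1 and limits2 are (lo, hi) pairs; single_point_time plays the
    role of Buffer_Value*sample_rate for runs of exactly one point.
    """
    # Assumes both inputs are sequences (support len()); zip would
    # otherwise truncate to the shorter one without any warning.
    if len(values1) != len(values2):
        raise ValueError("both data sets must have the same length")
    lo1, hi1 = limits1
    lo2, hi2 = limits2
    # Generator expressions: no intermediate list is ever built.
    in_range = (lo1 <= v1 <= hi1 and lo2 <= v2 <= hi2
                for v1, v2 in zip(values1, values2))
    # Each group is fully consumed before groupby moves to the next one.
    runs = (sum(1 for _ in group) for key, group in groupby(in_range) if key)
    return sum(n if n > 1 else single_point_time for n in runs)

result = time_in_range([5, 6, 20, 5], (4, 10), [3, 3, 3, 3], (1, 4), 1.0)
print(result)  # 3.0
```

Here the combined flags are True, True, False, True, so there is one run of length 2 and one single point, which gives 2 + 1.0.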