Can I use groupby to solve this case in Python?
I have a Redis database that receives data from an Arduino every ten seconds.
Now I want to roll six 10-second records up into one 60-second record, i.e. compute the avg, max, and min over each group of six records, as follows:
import json
a = [u'{"id":"proximity_sensor1","tstamp":1440643570238,"avg":15.0,"coefVariation":0.0,"anom":0,"max":15.0,"min":15.0,"sample_size":10}',
     u'{"id":"proximity_sensor1","tstamp":1440643580307,"avg":15.0,"coefVariation":0.0,"anom":0,"max":15.0,"min":15.0,"sample_size":9}',
     u'{"id":"proximity_sensor1","tstamp":1440643590242,"avg":15.0,"coefVariation":0.0,"anom":0,"max":15.0,"min":15.0,"sample_size":9}',
     u'{"id":"proximity_sensor1","tstamp":1440643590242,"avg":15.0,"coefVariation":0.0,"anom":0,"max":15.0,"min":15.0,"sample_size":8}',
     u'{"id":"proximity_sensor1","tstamp":1440643590242,"avg":15.0,"coefVariation":0.0,"anom":0,"max":15.0,"min":15.0,"sample_size":9}',
     u'{"id":"proximity_sensor1","tstamp":1440643590242,"avg":15.0,"coefVariation":0.0,"anom":0,"max":15.0,"min":15.0,"sample_size":9}']
a = map(lambda x: json.loads(x), a)
#print a
def abc(aaa):
    for index in range(0, len(aaa), 6):
        # process one slice of 6 records at a time
        abc = aaa[index:(index + 6)]
        avg = sum(map(lambda x: x['avg'], abc)) / 6
        min_ = min(map(lambda x: x['min'], abc))
        max_ = max(map(lambda x: x['max'], abc))
        yield [avg, max_, min_]
print list(abc(a))
I'm wondering whether there is a better way to solve this. Could I solve it faster with itertools.groupby? Or does anyone have a good idea for simplifying the computation?
Typically itertools.groupby is used with some condition to group elements, but since in your case you don't have any such condition and instead simply want to group every 6 elements together, I don't think using itertools.groupby would bring any benefit.
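For comparison, here is what a groupby version might look like, keying on an index-based group number (the `chunks_of_six` helper name is mine, shown only to illustrate that this adds machinery without adding clarity over plain slicing):

```python
from itertools import groupby

def chunks_of_six(records):
    # Pair each record with its position, then key on position // 6
    # so every 6 consecutive records share the same group key.
    grouped = groupby(enumerate(records), key=lambda pair: pair[0] // 6)
    for _, group in grouped:
        yield [record for _, record in group]

# 12 dummy values -> two chunks of 6
chunks = list(chunks_of_six(range(12)))
```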
That being said, I can suggest some other improvements:
You can use the 'key' parameter of the max/min functions instead of the current map/lambda approach. Example:
max_ = max(abc, key=lambda x: x['max'])['max']
And similarly for the min() function.
Also, I think a list comprehension inside sum() would be more readable than map/lambda. Example:
avg = sum([x['avg'] for x in abc])/6
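Putting both suggestions together, a minimal Python 3 sketch (the `summarize` name and the `size` parameter are mine; dividing by len(chunk) rather than a fixed 6 also keeps a short final chunk correct):

```python
import json

def summarize(records, size=6):
    """Yield [avg, max, min] for each consecutive group of `size` records."""
    for index in range(0, len(records), size):
        chunk = records[index:index + size]
        avg = sum([x['avg'] for x in chunk]) / len(chunk)  # len() handles a short final chunk
        max_ = max(chunk, key=lambda x: x['max'])['max']
        min_ = min(chunk, key=lambda x: x['min'])['min']
        yield [avg, max_, min_]

raw = ['{"avg": 15.0, "max": 16.0, "min": 14.0}'] * 6
records = [json.loads(s) for s in raw]
result = list(summarize(records))
```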