将普罗米修斯时间序列 JSON 指标转换为 pandas 数据帧
convert prometheus time series JSON metrics to pandas dataframes
我正在尝试将普罗米修斯时间序列 JSON 指标转换为 pandas 数据帧。
仅供参考...
#!/usr/bin/env python3
import requests
import copy
import pandas as pd
URL = "http://x.x.x.x:30000/api/v1/query?query=node_memory_MemFree_bytes{instance=\"10.56.0.4:9100\",job=\"node-exporter\"}[1d]"
r = requests.get(url = URL)
data = r.json()
data_dict={}
metric_list = []
# print(data['data']['result']['values'])
# exit()
for i in data['data']['result']:
for j in i['values']:
data_dict = copy.deepcopy(i['metric'])
data_dict['time'] = j[0]
data_dict['value'] = j[1]
metric_list.append(data_dict)
df_metric = pd.DataFrame(metric_list)
df1 = df_metric[['time', 'value']]
df1['time'] = pd.to_datetime(df1['time'],unit='s')
df1 = df1.set_index('time')
df1["value"] = df1.values.astype(float)
### before resampling
print(df1)
### after resampling to 15mins
df1 = df1.resample('15min').mean()
print(df1)
print(type(df1))
但我得到的输出是单列输出,其值为第 header 列。相反,我希望输出包含两个 cloumns,例如 date 和 value.
当前输出:
value
time
2021-08-12 16:15:00 1.309109e+08
2021-08-12 16:30:00 1.376348e+08
2021-08-12 16:45:00 1.376068e+08
2021-08-12 17:00:00 1.354043e+08
2021-08-12 17:15:00 1.355775e+08
预期输出:
ds y
2021-08-12 16:15:00 1.309109e+08
2021-08-12 16:30:00 1.376348e+08
2021-08-12 16:45:00 1.376068e+08
2021-08-12 17:00:00 1.354043e+08
2021-08-12 17:15:00 1.355775e+08
所以现在我可以应用不同的时间序列预测模型,例如 Facebook Propher
目前你已经设置了time
作为索引,所以如果你想要它是一个列,你可以使用reset_index()
.
df1 = df1.reset_index()
然后,您可以通过以下方式重命名列标签:
df1.columns = ['ds', 'y']
或通过:
df1 = df1.rename({'time': 'ds', 'value': 'y'}, axis=1)
结果:
print(df1)
ds y
0 2021-08-12 16:15:00 130910900.0
1 2021-08-12 16:30:00 137634800.0
2 2021-08-12 16:45:00 137606800.0
3 2021-08-12 17:00:00 135404300.0
4 2021-08-12 17:15:00 135577500.0
我正在尝试将普罗米修斯时间序列 JSON 指标转换为 pandas 数据帧。
仅供参考...
#!/usr/bin/env python3
import requests
import copy
import pandas as pd
URL = "http://x.x.x.x:30000/api/v1/query?query=node_memory_MemFree_bytes{instance=\"10.56.0.4:9100\",job=\"node-exporter\"}[1d]"
r = requests.get(url = URL)
data = r.json()
data_dict={}
metric_list = []
# print(data['data']['result']['values'])
# exit()
for i in data['data']['result']:
for j in i['values']:
data_dict = copy.deepcopy(i['metric'])
data_dict['time'] = j[0]
data_dict['value'] = j[1]
metric_list.append(data_dict)
df_metric = pd.DataFrame(metric_list)
df1 = df_metric[['time', 'value']]
df1['time'] = pd.to_datetime(df1['time'],unit='s')
df1 = df1.set_index('time')
df1["value"] = df1.values.astype(float)
### before resampling
print(df1)
### after resampling to 15mins
df1 = df1.resample('15min').mean()
print(df1)
print(type(df1))
但我得到的输出是单列输出,其值为第 header 列。相反,我希望输出包含两个 cloumns,例如 date 和 value.
当前输出:
value
time
2021-08-12 16:15:00 1.309109e+08
2021-08-12 16:30:00 1.376348e+08
2021-08-12 16:45:00 1.376068e+08
2021-08-12 17:00:00 1.354043e+08
2021-08-12 17:15:00 1.355775e+08
预期输出:
ds y
2021-08-12 16:15:00 1.309109e+08
2021-08-12 16:30:00 1.376348e+08
2021-08-12 16:45:00 1.376068e+08
2021-08-12 17:00:00 1.354043e+08
2021-08-12 17:15:00 1.355775e+08
所以现在我可以应用不同的时间序列预测模型,例如 Facebook Propher
目前你已经设置了time
作为索引,所以如果你想要它是一个列,你可以使用reset_index()
.
df1 = df1.reset_index()
然后,您可以通过以下方式重命名列标签:
df1.columns = ['ds', 'y']
或通过:
df1 = df1.rename({'time': 'ds', 'value': 'y'}, axis=1)
结果:
print(df1)
ds y
0 2021-08-12 16:15:00 130910900.0
1 2021-08-12 16:30:00 137634800.0
2 2021-08-12 16:45:00 137606800.0
3 2021-08-12 17:00:00 135404300.0
4 2021-08-12 17:15:00 135577500.0