Convert opensensors JSON file to data frame (or table)
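A low-cost air quality sensor (the AQE) sends its data to the opensensors.io server. Every x seconds it sends a string of information (timestamp, pollutant concentration, etc.). The data can be retrieved as a JSON file. Opensensors terminology uses devices, topics, organizations, and payloads. I have figured out how to set up a curl handle and use the curl package to download the data to a file (it is named .csv here, but the content is JSON). Here is the code: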
curl_download(url = myURL2, destfile = "curlDownloadTest.csv", mode = "w", handle = myCurlHandle)
An example of the downloaded data is at https://github.com/GeraldCNelson/AQEAnalysis/commit/c6ee29545d07835c5a920bf2b37625adb78462aa
I use fromJSON from the jsonlite package to convert it:
temp <- fromJSON("curlDownloadTest.csv", simplifyDataFrame = FALSE)
The output (temp) is a large list with 2 elements, messages and next. messages contains all the data; next is a link for fetching the next batch of data (not everything is downloaded at once).
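The messages list is made up of a set of lists (one for each data upload); each of them has five elements: device, owner, topic, date, and payload. payload is a list of 3: encoding (always chr "utf-8"), content-type (always chr "application/json"), and text. The text element looks like it is in JSON format (here is a fragment of the string: "{\"serial-number\":\"egg00802aaa019b0111\",\"converted-value\":69.52,\"converted-units\" :\"degF\").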
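I would like to restructure this data into a data frame (or data table), with the date information as one column and the measurement information from the payload as the remaining columns (serial-number, converted-value, etc.).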
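I can't figure out how to convert the text element of the payload list from its current (JSON?) structure into something I can bind into a data frame.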
Thankfully, everything is pretty uniform:
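library(jsonlite)
library(dplyr)

# Parse the download; by default fromJSON simplifies messages into a data frame
df <- fromJSON("curlDownloadTest.csv")

# Keep the message metadata and parse the payload text (one JSON object per element)
bind_cols(
  select(df$messages, device, owner, topic, date),
  stream_in(textConnection(df$messages$payload$text), flatten=TRUE)
) -> df

glimpse(df)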
## Observations: 742
## Variables: 14
## $ device <chr> "egg00802aaa019b0111", "egg00802aaa019b0111", "egg00802aaa019b0111", "...
## $ owner <chr> "wickeddevice", "wickeddevice", "wickeddevice", "wickeddevice", "wicke...
## $ topic <chr> "/orgs/wd/aqe/temperature/egg00802aaa019b0111", "/orgs/wd/aqe/humidity...
## $ date <chr> "2016-10-10T17:02:09.507Z", "2016-10-10T17:02:09.811Z", "2016-10-10T17...
## $ serial-number <chr> "egg00802aaa019b0111", "egg00802aaa019b0111", "egg00802aaa019b0111", "...
## $ converted-value <dbl> 63.20, 43.31, 0.52, -25.20, 63.70, 42.85, 0.53, -13.32, 64.01, 42.58, ...
## $ converted-units <chr> "degF", "percent", "ppb", "ppb", "degF", "percent", "ppb", "ppb", "deg...
## $ raw-value <dbl> 63.200000, 43.310000, 0.221252, -0.827832, 63.700000, 42.850000, 0.221...
## $ raw-instant-value <dbl> 63.48000, 43.07000, 0.22149, -0.82785, 63.91000, 42.66000, 0.22073, -0...
## $ raw-units <chr> "degF", "percent", "volt", "volt", "degF", "percent", "volt", "volt", ...
## $ sensor-part-number <chr> "SHT25", "SHT25", "NO2-B4-ISB", "3SP-O3-20-PCB", "SHT25", "SHT25", "NO...
## $ raw-value2 <dbl> NA, NA, 0.222732, NA, NA, NA, 0.222797, NA, NA, NA, 0.222460, NA, NA, ...
## $ raw-instant-value2 <dbl> NA, NA, 0.22330, NA, NA, NA, 0.22273, NA, NA, NA, 0.22341, NA, NA, NA,...
## $ compensated-value <dbl> NA, NA, 0.62, -25.25, NA, NA, 0.63, -13.37, NA, NA, 0.02, -18.08, NA, ...
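If it helps to see why stream_in works here, the following is a minimal, self-contained sketch on made-up data. The two payload strings and the toy messages data frame are assumptions for illustration only, not values from the real download. Each payload text is one complete JSON object, so the character vector can be read as newline-delimited JSON, and fields missing from a record come back as NA. The last step, turning the date strings into POSIXct, is an optional extra and assumes the timestamps always follow the ISO 8601 pattern shown in the output above.

```r
library(jsonlite)

# Toy stand-in for the metadata columns of df$messages (illustrative values)
messages <- data.frame(
  date = c("2016-10-10T17:02:09.507Z", "2016-10-10T17:02:09.811Z"),
  stringsAsFactors = FALSE
)

# Toy stand-in for df$messages$payload$text: one JSON object per element.
# The second record carries an extra field the first one lacks.
payload_text <- c(
  '{"serial-number":"egg00802aaa019b0111","converted-value":63.2,"converted-units":"degF"}',
  '{"serial-number":"egg00802aaa019b0111","converted-value":43.31,"converted-units":"percent","raw-value2":0.22}'
)

# Read the vector as newline-delimited JSON; each element becomes one row
con <- textConnection(payload_text)
payload <- stream_in(con, verbose = FALSE)
close(con)

# Bind metadata and parsed payload side by side (column names are kept as-is)
result <- cbind(messages, payload)

# Optional: parse the ISO 8601 timestamps, including fractional seconds, as UTC
result$date <- as.POSIXct(result$date, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

str(result)
```

Column names that contain a hyphen, such as converted-value, have to be quoted with backticks or accessed with [[ ]], for example result[["converted-value"]].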