How to extract data from an .ini file in R?
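I have daily temperature data stored in ini files, and I need to read it into R to build a time series. I used the ini package to read the file, but the result is a nested list. I only need the temperature values between the * characters.
I tried unlist() and data.frame(), but could not get a data frame containing the temperature data.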
library(ini)
filename="C:/Research/Time_series/WT/2018/01/RAWWT20180101.ini"
# https://www.dropbox.com/s/vyinidvs947mw9g/RAWWT20180101.ini?dl=0
data <- read.ini(filename, encoding = getOption("encoding"))
str(data)
List of 1
$ Historical Data:List of 25
..$ 00H: chr "* 29.7 29.8 29.8 29.8 29.8 29.8 29.7 29.8 29.8 29.7 29.7 29.7 29.7 29.7 29.7 29.7 29.7 29.7 29.7 29.7 29.7 "| __truncated__
..$ 01H: chr "* 29.6 29.6 29.5 29.5 29.5 29.6 29.6 29.5 29.5 29.5 29.5 29.5 29.5 29.5 29.6 29.5 29.5 29.5 29.5 29.5 29.5 "| __truncated__
..$ 02H: chr "* 29.4 29.4 29.4 29.3 29.3 29.4 29.3 29.3 29.3 29.4 29.3 29.3 29.3 29.3 29.3 29.4 29.3 29.3 29.3 29.3 29.3 "| __truncated__
..$ 03H: chr "* 29.2 29.2 29.3 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.2 29.1 29.2 29.1 "| __truncated__
..$ 04H: chr "* 29.2 29.2 29.1 29.2 29.1 29.2 29.2 29.1 29.1 29.1 29.1 29.2 29.2 29.1 29.2 29.2 29.1 29.2 29.2 29.1 29.1 "| __truncated__
..$ 05H: chr "* 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 "| __truncated__
..$ 06H: chr "* 29.1 29.0 29.0 29.0 29.1 29.0 29.1 29.1 29.1 29.0 29.1 29.1 29.1 29.1 29.1 29.0 29.1 29.1 29.0 29.0 29.0 "| __truncated__
..$ 07H: chr "* 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 29.0 28.9 29.0 29.0 29.0 29.0 29.0 "| __truncated__
..$ 08H: chr "* 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 "| __truncated__
..$ 09H: chr "* 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 29.1 "| __truncated__
..$ 10H: chr "* 29.3 29.3 29.3 29.3 29.3 29.3 29.3 29.3 29.3 29.3 29.3 29.3 29.4 29.4 29.4 29.4 29.4 29.4 29.4 29.4 29.4 "| __truncated__
..$ 11H: chr "* 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.5 29.5 29.5 29.5 29.6 29.6 29.6 29.5 29.5 29.5 29.5 29.5 29.5 29.5 "| __truncated__
..$ 12H: chr "* 29.6 29.6 29.4 29.3 29.3 29.4 29.4 29.4 29.4 29.4 29.4 29.4 29.4 29.3 29.3 29.3 29.4 29.4 29.4 29.3 29.3 "| __truncated__
..$ 13H: chr "* 29.5 29.4 29.6 29.6 29.6 29.6 29.6 29.6 29.4 29.4 29.4 29.4 29.6 29.6 29.6 29.6 29.7 29.8 29.8 29.8 29.8 "| __truncated__
..$ 14H: chr "* 29.9 29.9 29.9 30.0 30.0 30.0 29.9 30.0 30.0 30.0 30.0 30.0 30.0 30.1 30.1 30.1 30.1 30.0 30.1 30.0 30.1 "| __truncated__
..$ 15H: chr "* 30.1 30.1 30.1 30.1 30.1 30.1 30.2 30.2 30.2 30.1 30.1 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 "| __truncated__
..$ 16H: chr "* 30.3 30.2 30.2 30.2 30.3 30.3 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 30.2 "| __truncated__
..$ 17H: chr "* 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 30.1 "| __truncated__
..$ 18H: chr "* 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 29.9 "| __truncated__
..$ 19H: chr "* 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 "| __truncated__
..$ 20H: chr "* 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 29.8 "| __truncated__
..$ 21H: chr "* 29.8 29.8 29.8 29.8 29.8 29.8 29.7 29.8 29.8 29.8 29.8 29.8 29.8 29.7 29.8 29.7 29.8 29.8 29.8 29.8 29.8 "| __truncated__
..$ 22H: chr "* 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 29.6 "| __truncated__
..$ 23H: chr "* 29.6 29.5 29.6 29.6 29.5 29.5 29.6 29.6 29.6 29.6 29.6 29.5 29.5 29.6 29.5 29.6 29.6 29.6 29.6 29.6 29.6 "| __truncated__
..$ 24H: chr "* 30.3 1528 28.9 0641*"
s <-unlist(data)
ss <-data.frame(s,,stringsAsFactors=FALSE)
I need numeric output like the following:
29.7 29.8 29.8 29.8 29.8 29.8 29.7 29.8 29.8 29.7
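Maybe this will work?
library(ini)
library(stringr)  # provides str_match_all()
filename <- choose.files()  # Windows only; file.choose() works elsewhere
data <- read.ini(filename, encoding = getOption("encoding"))
str(data)
s <- unlist(data)
ss <- data.frame(s, stringsAsFactors = FALSE)
# Backslashes must be doubled inside R string literals
newdata <- apply(ss, 1, function(x) {
  as.numeric(unlist(str_match_all(x, "\\b\\-*\\d+\\.*\\d*\\b")))
})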
> str(newdata)
List of 25
$ Historical Data.00H: num [1:60] 29.7 29.8 29.8 29.8 29.8 29.8 29.7 29.8 29.8 29.7 ...
$ Historical Data.01H: num [1:60] 29.6 29.6 29.5 29.5 29.5 29.6 29.6 29.5 29.5 29.5 ...
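Since the goal is a time series, here is a minimal base R sketch of how the parsed values could be turned into a data frame with timestamps. It is only an illustration under several assumptions: `parse_readings`, `hist_data`, and `ts_df` are hypothetical names, `hist_data` is a shortened stand-in for `data[["Historical Data"]]`, the date is taken from the filename `RAWWT20180101.ini`, the time zone is set to UTC, readings within an hour are treated as evenly spaced, and the `24H` entry is assumed to be a daily summary rather than hourly readings.

```r
# Parse one "* 29.7 29.8 ... *" string into a numeric vector (base R only).
parse_readings <- function(x) {
  x <- gsub("*", "", x, fixed = TRUE)           # drop the * delimiters
  as.numeric(strsplit(trimws(x), "\\s+")[[1]])  # split on whitespace
}

# Shortened stand-in for data[["Historical Data"]] (values abbreviated).
hist_data <- list(
  "00H" = "* 29.7 29.8 29.8 *",
  "01H" = "* 29.6 29.6 29.5 *",
  "24H" = "* 30.3 1528 28.9 0641*"
)

# Assumption: 24H is a daily summary, so it is left out of the series.
hourly <- hist_data[names(hist_data) != "24H"]
readings <- lapply(hourly, parse_readings)

# Assumption: date from the filename, readings evenly spaced within each hour.
day_start <- as.POSIXct("2018-01-01 00:00:00", tz = "UTC")
ts_df <- do.call(rbind, lapply(names(readings), function(h) {
  vals <- readings[[h]]
  hour <- as.integer(substr(h, 1, 2))   # "01H" -> 1
  step <- 3600 / length(vals)           # seconds between readings
  data.frame(time = day_start + hour * 3600 + (seq_along(vals) - 1) * step,
             temp = vals)
}))
```

With the real file, each hourly string holds 60 values, so `step` works out to 60 seconds, that is, one reading per minute.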
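A snippet for reading an ini file using only base R:
# Backslashes must be doubled inside R string literals
blank = "^\\s*$"
header = "^\\[(.*)\\]$"
key_value = "^.*=.*$"
# Return the first capture group of regexp in x
extract = function(regexp, x) regmatches(x, regexec(regexp, x))[[1]][2]
read_ini = function(fn) {
  lines = readLines(fn)
  ini = list()
  for (l in lines) {
    if (grepl(blank, l)) next
    if (grepl(header, l)) {
      # Start a new section named after the header
      section = extract(header, l)
      ini[[section]] = list()
      next
    }
    if (grepl(key_value, l)) {
      # Split "key = value" and store it under the current section
      kv = strsplit(l, "\\s*=\\s*")[[1]]
      ini[[section]][[kv[1]]] = kv[2]
    }
  }
  ini
}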
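This reads the file into a list of lists, with the section headers as the names of the outer list. The regex functions in stringr would be cleaner, but for a short script I think the extra effort of staying in base R is worth it.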
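As a possible variation on the same idea, the parsing can also be done without an explicit loop. The sketch below is not the code above, only an alternative under stated assumptions: the ini contents written to the temporary file are made up, the file is assumed to start with a `[section]` header, and values are taken as everything after the first `=` on a line. The names `tmp`, `is_header`, `section`, `keys`, and `values` are illustrative.

```r
# Write a tiny ini file so the example is reproducible (contents are made up).
tmp <- tempfile(fileext = ".ini")
writeLines(c("[Historical Data]",
             "00H=* 29.7 29.8 29.8 *",
             "",
             "01H = * 29.6 29.6 29.5 *"), tmp)

lines <- trimws(readLines(tmp))
lines <- lines[nzchar(lines)]                 # drop blank lines

is_header <- grepl("^\\[.*\\]$", lines)
# Each line belongs to the most recent header above it
# (assumes the first non-blank line is a header).
section <- sub("^\\[(.*)\\]$", "\\1", lines[is_header])[cumsum(is_header)]

kv <- lines[!is_header]
keys <- trimws(sub("=.*$", "", kv))           # text before the first "="
values <- trimws(sub("^[^=]*=", "", kv))      # text after the first "="

# Group the key/value pairs by section into a list of lists.
ini <- lapply(split(seq_along(kv), section[!is_header]), function(i) {
  setNames(as.list(values[i]), keys[i])
})

unlink(tmp)
```

The result has the same shape as before, so `ini[["Historical Data"]][["00H"]]` returns the raw string for the first hour.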