Convert string to table in R
I am trying to convert old official central bank statements into a table format. I have the following scraper:
library(rvest)
library(dplyr)
url <- "http://nationalbank.kz/?docid=105&cmomdate=2009-05-15&switch=english"
p <- url %>%
  read_html() %>%
  html_nodes(xpath = '//table[1]') %>%
  html_table(fill = TRUE)
gh = p[[11]]
str(gh)
txt = gh[, 1]
which produces:
[1] "GOVERNMENT SECURITIES PLACEMENT RESULT 15.05.2009\r\n
GOVERNMENT SECURITIES PLACEMENT RESULT\r\n\r\nThe
National Bank of the Republic of Kazakhstan announces the placement result
on the following parameters:\r\n\r\nType of security\tNotes
NBK\r\nNIN\tKZW1KD281882\r\nMaturity\t28 days\r\nType of
placement\tAuction\r\nDate of placement\t15.05.2009\r\nSettlement
date\t15.05.2009\r\nRedemption date\t12.06.2009\r\nActual amount of
placement\t24 999 999 991.30 tenge\r\n\t251 003 524
(quantity)\r\nDemand\t127 493 096 130.40 tenge\r\n\t1 280 053 174
(quantity)\r\nWeighted-averaged price\t99.60 tenge\r\nCut price\t99.59
tenge\r\nYield (coupon)\t5.24 %"
I am looking for help converting this string into the following table format:
Type of security NIN Maturity Type of placement Date of placement Settlement date Redemption date Actual amount of placement Demand Weighted-averaged price Cut price Yield (coupon)
Notes NBK KZW1KD281882 28 days Auction 15.05.2009 15.05.2009 12.06.2009 24 999 999 991.30 tenge 1 280 053 174 (quantity) 127 493 096 130.40 tenge 1 280 053 174 (quantity) 99.60 tenge 99.59 tenge 5.24%
I have tried a few approaches using gsub(), but could not get close to the desired output.
Would the following be sufficient?
ans <- lapply(strsplit("GOVERNMENT SECURITIES PLACEMENT RESULT 15.05.2009\r\n
GOVERNMENT SECURITIES PLACEMENT RESULT\r\n\r\nThe
National Bank of the Republic of Kazakhstan announces the placement result
on the following parameters:\r\n\r\nType of security\tNotes
NBK\r\nNIN\tKZW1KD281882\r\nMaturity\t28 days\r\nType of
placement\tAuction\r\nDate of placement\t15.05.2009\r\nSettlement
date\t15.05.2009\r\nRedemption date\t12.06.2009\r\nActual amount of
placement\t24 999 999 991.30 tenge\r\n\t251 003 524
(quantity)\r\nDemand\t127 493 096 130.40 tenge\r\n\t1 280 053 174
(quantity)\r\nWeighted-averaged price\t99.60 tenge\r\nCut price\t99.59
tenge\r\nYield (coupon)\t5.24 %", "\r\n", fixed=TRUE),
function(x) strsplit(x, split="\t", fixed=TRUE))
# Keep only the lines that split into exactly two fields (label, value)
do.call(rbind, lapply(ans[[1]], function(x) {
  if (length(x) == 2) {
    return(x)
  }
  return(NULL)
}))
# [,1] [,2]
# [1,] "Type of security" "Notes\nNBK"
# [2,] "NIN" "KZW1KD281882"
# [3,] "Maturity" "28 days"
# [4,] "Type of\nplacement" "Auction"
# [5,] "Date of placement" "15.05.2009"
# [6,] "Settlement\ndate" "15.05.2009"
# [7,] "Redemption date" "12.06.2009"
# [8,] "Actual amount of\nplacement" "24 999 999 991.30 tenge"
# [9,] "" "251 003 524\n(quantity)"
# [10,] "Demand" "127 493 096 130.40 tenge"
# [11,] "" "1 280 053 174\n(quantity)"
# [12,] "Weighted-averaged price" "99.60 tenge"
# [13,] "Cut price" "99.59\ntenge"
# [14,] "Yield (coupon)" "5.24 %"
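To get from this two-column matrix to the one-row layout asked for in the question, one possible continuation is sketched below. It is not part of the original answer. It assumes that a row with an empty label (such as the "(quantity)" rows) continues the field above it, and that the embedded "\n" characters are only line wraps. The sample string is a shortened stand-in for the scraped text, and the names pairs, grp, fields and wide are made up for illustration.

```r
# Shortened stand-in for the scraped string (same separators as the original).
txt <- paste0(
  "GOVERNMENT SECURITIES PLACEMENT RESULT\r\n\r\n",
  "Type of security\tNotes\nNBK\r\n",
  "NIN\tKZW1KD281882\r\n",
  "Actual amount of\nplacement\t24 999 999 991.30 tenge\r\n",
  "\t251 003 524\n(quantity)\r\n",
  "Yield (coupon)\t5.24 %"
)

# Split into lines on "\r\n", then split each line on the tab.
parts <- strsplit(strsplit(txt, "\r\n", fixed = TRUE)[[1]], "\t", fixed = TRUE)

# Keep only label/value pairs and stack them into a two-column matrix.
pairs <- do.call(rbind, Filter(function(x) length(x) == 2, parts))

# Collapse embedded line wraps to single spaces and trim the ends.
pairs[] <- trimws(gsub("\\s+", " ", pairs))

keys <- pairs[, 1]
vals <- pairs[, 2]

# Assumption: a row with an empty label belongs to the field above it,
# so it gets the same group id as the last non-empty label.
grp <- cumsum(keys != "")
merged <- vapply(split(vals, grp), paste, character(1), collapse = " ")

# One named value per field, then a one-row data frame with labels as columns.
fields <- setNames(unname(merged), keys[keys != ""])
wide <- data.frame(t(fields), check.names = FALSE, stringsAsFactors = FALSE)

print(wide)
```

check.names = FALSE keeps column names such as "Type of security" as they are instead of converting them to syntactic names. Running the same steps on the full txt from the scraper should give one column per label, with each amount and its quantity in the same cell.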