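R: iterate over a dictionary into a function for multiple values
I'm trying to create a loop (or whatever approach is most efficient) that iterates over a series of calendars in R (or Python!) and outputs all holidays. Ideally it would output all business days, but I can probably handle that in two parts, since I also want to flag weekends. The goal is a data frame that looks like this: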
Country | ISO Code (if available) | Dates
United States of America| US| 12.24.2020
United States of America| US| 12.25.2020
United States of America| US| 01.01.2021
United Kingdom| UK| 12.24.2020
United Kingdom| UK| 12.25.2020
United Kingdom| UK| 01.01.2021
What I currently have:
require("lattice")
require("reticulate")
require("RcppQuantuccia")
require("tidyverse")
require("tidytable")
fun_Holidays <- function(cal) {
  setCalendar(cal)
  getHolidays(as.Date("2019-01-01"), as.Date("2030-12-31"))
}
cal_dic <- data.table(calendar=calendars)
as.list(cal_dic)
cal_dic is a list of all calendars available in RcppQuantuccia, but if I run:
fun_Holidays(cal_dic)
all I get is an error (because the call does not iterate over the list):
ERROR: Error in setCalendar(cal): Expecting a single string value: [type=list; extent=1].
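I also tried this in Python with the holidays package and got further, but the ISO codes are not attached correctly:
import holidays
import numpy as np
import pandas as pd

all_holidays = []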
country_list = ['Angola','Argentina','Aruba','Australia','Austria','Bangladesh','Belarus','Belgium','Botswana','Brazil',
'Bulgaria','Burundi','Canada','Chile','China','Colombia','Croatia','Curacao','Czechia','Denmark','Djibouti','DominicanRepublic',
'Egypt','England','Estonia','Finland','France','Georgia','Germany','Greece','Honduras','HongKong','Hungary','Iceland','India','Ireland','IsleOfMan',
'Israel','Italy','Jamaica','Japan','Kenya','Korea','Latvia','Lesotho','Lithuania','Luxembourg','Malaysia','Malawi','Mexico','Morocco','Mozambique','Netherlands',
'Namibia','NewZealand','Nicaragua','Nigeria','NorthernIreland','Norway','Paraguay','Peru','Poland','Portugal','PortugalExt','Romania','Russia','SaudiArabia','Scotland',
'Serbia','Singapore','Slovakia','Slovenia','SouthAfrica','Spain','Swaziland','Sweden','Switzerland','Turkey','Ukraine','UnitedArabEmirates','UnitedKingdom',
'UnitedStates','Venezuela','Vietnam','Wales','Zambia','Zimbabwe']
for country in country_list:
    for holiday in holidays.CountryHoliday(country, years = np.arange(2018,2030,1)).items():
        # `code` is never assigned inside this loop, so every row gets the same leftover value
        all_holidays.append({'date' : holiday[0], 'holiday' : holiday[1], 'country': country, 'code': code})
all_holidays = pd.DataFrame(all_holidays)
all_holidays
date holiday country code
0 2018-09-17 Dia do Herói Nacional Angola NZ
1 2018-01-01 Ano novo Angola NZ
2 2018-03-30 Sexta-feira Santa Angola NZ
3 2018-02-13 Carnaval Angola NZ
4 2018-02-04 Dia do Início da Luta Armada Angola NZ
... ... ... ... ...
14386 2029-08-15 Zimbabwe Heroes' Day Zimbabwe NZ
14387 2029-08-13 Defense Forces Day Zimbabwe NZ
14388 2029-12-22 Unity Day Zimbabwe NZ
14389 2029-12-25 Christmas Day Zimbabwe NZ
14390 2029-12-26 Boxing Day Zimbabwe NZ
14391 rows × 4 columns
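I find it odd that there is no master list of holidays by date and country, as a CSV or similar, to help with time series work. But maybe that's just me! :)
Thanks!
Edit: I have also been looking at https://workalendar.github.io/workalendar/
because it has the largest list of countries, but it is harder to work with than holidays. If anyone has a way to get a "master calendar" out of workalendar, that would be amazing!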
Use lapply to get a list of dates for each value in calendars.
library(RcppQuantuccia)
fun_Holidays <- function(cal) {
  setCalendar(cal)
  getHolidays(as.Date("2019-01-01"), as.Date("2030-12-31"))
}
lapply(calendars, fun_Holidays)
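To create a single data frame with the country name and dates, you can use:
do.call(rbind, lapply(calendars, function(x) {
  dates <- fun_Holidays(x)
  # calendars with no holidays in the range return NULL and are dropped by rbind
  if(length(dates))
    data.frame(country = x, dates)
})) -> result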
head(result)
# country dates
#1 TARGET 2019-01-01
#2 TARGET 2019-04-19
#3 TARGET 2019-04-22
#4 TARGET 2019-05-01
#5 TARGET 2019-12-25
#6 TARGET 2019-12-26
Or with purrr:
purrr::map_df(calendars, function(x) {
  dates <- fun_Holidays(x)
  if(length(dates))
    data.frame(country = x, dates)
}) -> result
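For the Python attempt in the question, the same value in every row of the code column suggests that `code` was set once outside the loop. Below is a minimal sketch of one way to attach the code per country and flag weekends at the same time. The `COUNTRY_CODES` mapping, the `collect_holidays` helper and the `fake_provider` stand-in are all hypothetical names introduced here for illustration. The stand-in only exists so the sketch runs without third-party packages.

```python
from datetime import date

# Hypothetical name -> ISO 3166-1 alpha-2 mapping; extend it to cover the full country list.
COUNTRY_CODES = {"Angola": "AO", "Zimbabwe": "ZW"}


def collect_holidays(country_codes, years, provider):
    """Return one row per holiday, taking the code from the same loop as the country."""
    rows = []
    for country, code in country_codes.items():
        # provider(code, years) is assumed to return a mapping of date -> holiday name.
        for day, name in sorted(provider(code, years).items()):
            rows.append({
                "date": day,
                "holiday": name,
                "country": country,
                "code": code,
                "is_weekend": day.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
            })
    return rows


def fake_provider(code, years):
    """Stand-in data source (not real holiday data): New Year's Day for each year."""
    return {date(year, 1, 1): "New Year's Day" for year in years}


rows = collect_holidays(COUNTRY_CODES, range(2018, 2020), fake_provider)
```

To use real data, the provider could wrap the holidays package from the question, for example `lambda code, years: holidays.CountryHoliday(code, years=list(years))`, assuming the package accepts the ISO code for each country in the mapping. The resulting list can then be passed to `pd.DataFrame(rows)` as before.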