Differing number of rows in R

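I am trying to combine four data frames into a single data frame, but it fails because the row counts do not match. Each data frame contains stock price information from 2021-02-28 to 2022-04-01.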

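# Loading stock data
install.packages("quantmod")
library(quantmod)
library(dplyr)      # %>%, mutate(), across(), rename()
library(stringr)    # str_remove()
library(lubridate)  # ymd()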


# Getting stock data for: Toyota (TYO), Renault (RNO.PA), Honda (HMC), Hyundai (HYMTF)

getSymbols(c("HMC", "HYMTF", "RNO.PA", "TYO"), na.rm = TRUE)


#Creating individual data frames. 
 
# Stock Honda. 
stock_honda <- expand.grid("HMC" = HMC$HMC.Close) %>%
  mutate(Date = row.names(as.data.frame(HMC))) %>%
  mutate(across(Date, ~ . %>% str_remove("^X") %>% ymd())) %>% 
  subset(Date >"2021-02-28" & Date < "2022-04-01") %>%
  rename(Close = HMC)
  
head(stock_honda)


#Stock Toyota
stock_tyo <- expand.grid("Toyota" = TYO$TYO.Close) %>%
  mutate(Date = row.names(as.data.frame(TYO))) %>%
  mutate(across(Date, ~ . %>% str_remove("^X") %>% ymd())) %>%
  subset(Date > "2021-02-28" & Date < "2022-04-01") %>%
  rename(Close = Toyota)


head(stock_tyo)


# Stock Renault
stock_rno <- expand.grid("Renault" = RNO.PA$RNO.PA.Close) %>%
  mutate(Date = row.names(as.data.frame(RNO.PA))) %>%
  mutate(across(Date, ~. %>% str_remove("^X") %>% ymd()))%>%
  subset(Date > "2021-02-28" & Date < "2022-04-01") 

head(stock_rno)


#Stock Hyundai
stock_hyundai <- expand.grid("HYMTF" = HYMTF$HYMTF.Close) %>%
  mutate(Date = row.names(as.data.frame(HYMTF))) %>%
  mutate(across(Date, ~ . %>% str_remove("^X") %>% ymd())) %>% 
  subset(Date >"2021-02-28" & Date < "2022-04-01") %>%
  rename(Close = HYMTF)

head(stock_hyundai)


#Merging stocks data into one data frame
stocks <- data.frame("Honda" = stock_honda, "Hyundai" = stock_hyundai, 
                     "Renault" = stock_rno, "Toyota" = stock_tyo)

head(stocks)

This gives me the following error:

Error in data.frame(Honda = stock_honda, Hyundai = stock_hyundai, Renault = stock_rno,  : 
  arguments imply differing number of rows: 276, 282

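The numbers of rows differ. It is better to join by the 'Date' column, because data.frame requires all columns to have the same length. It could also be done by padding NA at the end, but that can give wrong results, because it assumes that all datasets share the same sequence of dates without any gaps. A join instead ensures that we get the rows that correspond to the same 'Date', filled with NA where a date is absent from one of the datasets.
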
library(zoo)
library(dplyr)
library(purrr)
library(stringr)
library(quantmod)
# keep the datasets in a list
out <- list(HMC, HYMTF, RNO.PA, TYO) %>%
  # loop over the list with `map`
  # convert each of the zoo objects to data.frame with `fortify.zoo`
   map(~ fortify.zoo(.x) %>% 
          # select the Index and Close columns
          select(Date = Index, ends_with('Close')) %>% 
          # remove the suffix Close if needed
          rename_with(~ str_remove(.x, "\\.Close"), ends_with("Close")) %>%
          # filter the rows based on the Date column
          filter(between(Date, as.Date("2021-02-28"), 
                               as.Date("2022-04-01")))) %>% 
  # finally reduce the list of data.frames to a single data.frame by joining
  reduce(full_join, by = 'Date')

-output

> str(out)
'data.frame':   284 obs. of  5 variables:
 $ Date  : Date, format: "2021-03-01" "2021-03-02" "2021-03-03" "2021-03-04" ...
 $ HMC   : num  28.5 28.4 28.9 28.6 29.4 ...
 $ HYMTF : num  49.3 48 48.9 46.7 45.7 ...
 $ RNO.PA: num  37.6 37.4 39.4 39.2 38.5 ...
 $ TYO   : num  8.94 8.8 9 9.1 9.22 9.29 9.15 9.1 9.07 9.26 ...
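
To see why the join is safer than padding, here is a minimal base R sketch with made-up prices. The tickers A and B and all values are hypothetical, not taken from the question; it only assumes that one series is missing a date that the other has.

```r
# Made-up prices for two hypothetical tickers; B has no row for 2021-03-02
a <- data.frame(Date = as.Date(c("2021-03-01", "2021-03-02", "2021-03-03")),
                A = c(10, 11, 12))
b <- data.frame(Date = as.Date(c("2021-03-01", "2021-03-03")),
                B = c(20, 22))

# data.frame() refuses inputs with different numbers of rows
res <- try(data.frame(a, b), silent = TRUE)
inherits(res, "try-error")

# Padding B at the end makes the lengths match but misaligns the dates:
# B's value for 2021-03-03 ends up on the 2021-03-02 row
padded <- data.frame(a, B = c(b$B, NA))

# A full join on Date keeps every date and puts NA where B has no value
joined <- merge(a, b, by = "Date", all = TRUE)
joined
```

merge(..., all = TRUE) is the base R counterpart of the full_join used above, so the same reasoning applies to the reduce(full_join, by = 'Date') step.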

A join is the most direct option here, but casting to wide format can also solve the problem.

A data.table approach

library(data.table)
library(tibble)
# put everything into a list; tibble::lst() uses the names
#  of the objects added to the list as its names... this comes
#  in handy when we rowbind the list two code lines down...
L <- tibble::lst(stock_honda, stock_hyundai, stock_rno, stock_tyo)
# convert to data.tables
L <- lapply(L, as.data.table)
# rowbind together
DT <- data.table::rbindlist(L, use.names = FALSE, idcol = "stock")
# cast to wide
final <- dcast(DT, Date ~ stock, value.var = "Close")

#          Date stock_honda stock_hyundai stock_rno stock_tyo
# 1: 2021-03-01       28.48         49.26    37.635      8.94
# 2: 2021-03-02       28.40         48.03    37.420      8.80
# 3: 2021-03-03       28.88         48.89    39.370      9.00
# 4: 2021-03-04       28.60         46.70    39.175      9.10
# 5: 2021-03-05       29.39         45.74    38.550      9.22
# 6: 2021-03-08       29.48         45.50    40.075      9.29
# ...
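
A small sketch of the same two steps on made-up data may help show what happens to dates that are missing from one table, and why use.names = FALSE matters when a price column is not called Close (as with stock_rno in the question). The list names a and b and all values are hypothetical.

```r
library(data.table)

# Made-up inputs; "a" and "b" are hypothetical stock names.
# b is missing 2021-03-02 and its price column is not named Close.
L <- list(
  a = data.table(Close = c(10, 11, 12),
                 Date  = as.Date(c("2021-03-01", "2021-03-02", "2021-03-03"))),
  b = data.table(B     = c(20, 22),
                 Date  = as.Date(c("2021-03-01", "2021-03-03")))
)

# use.names = FALSE binds by column position and keeps the first table's
# names, so b's price column lands in Close; idcol records the list name
DT <- rbindlist(L, use.names = FALSE, idcol = "stock")

# one row per Date, one column per stock; missing combinations become NA
wide <- dcast(DT, Date ~ stock, value.var = "Close")
wide
```

Binding by position only works if every table has its columns in the same order (price first, then Date), which is the case for the data frames built in the question.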

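After running getSymbols (which returns a character vector of the tickers and places the data in objects of those names in the global environment), run mget to put the ticker data into a list and extract the closing prices from it (you may want the adjusted close rather than the close, in which case use Ad instead of Cl), then merge them and set the column names. stocks will be an xts object with one column per ticker. Either leave it as is or use fortify.zoo(stocks) if you need a data frame.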

library(quantmod)
tickers <- getSymbols(c("HMC", "HYMTF", "RNO.PA", "TYO" ))

stocks <- tickers |>
  mget() |>
  Map(f = Cl) |>
  do.call(what = "merge") |>
  setNames(tickers)
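
The same steps can be tried without downloading anything. This is only a sketch on made-up xts objects: the tickers AAA and BBB, the prices, and the names toy, toy_window and toy_df are all hypothetical. It uses Ad instead of Cl, restricts the result to a date range, and converts it to a data frame.

```r
library(quantmod)  # also attaches xts and zoo

# Made-up price objects for two hypothetical tickers; BBB lacks 2021-03-02
d1 <- as.Date(c("2021-03-01", "2021-03-02", "2021-03-03"))
d2 <- as.Date(c("2021-03-01", "2021-03-03"))
AAA <- xts(cbind(AAA.Close = c(10, 11, 12), AAA.Adjusted = c(9, 10, 11)),
           order.by = d1)
BBB <- xts(cbind(BBB.Close = c(20, 22), BBB.Adjusted = c(19, 21)),
           order.by = d2)

tickers <- c("AAA", "BBB")

# extract the adjusted close from each object, merge by date, name columns
toy <- list(AAA, BBB) |>
  Map(f = Ad) |>
  do.call(what = "merge") |>
  setNames(tickers)

# xts accepts an ISO 8601 "from/to" string for date-range subsetting
toy_window <- toy["2021-03-02/2021-03-03"]

# data frame with the dates in a column named Index
toy_df <- fortify.zoo(toy_window)
toy_df
```

merge on xts objects is an outer join by default, so dates missing from one ticker show up as NA rather than causing a row-count error.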