Remove extra 0 in front of numeric month

Question

I have a df column in which dates are stored as character strings, and I want to extract the month from it. To do that, I use the following:

mutate(
  Date = as.Date(str_remove(Timestamp, "_.*")),
  Month = month(Date, label = FALSE)
)

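However, October, November, and December are stored with an extra zero in front of the month, and lubridate does not recognize them. How can I adjust the code above to handle this? Here is my Timestamp column: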

c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m", 
"2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
)

Answer 1

One approach is to use strsplit and extract the second element:

month.abb[readr::parse_number(sapply(strsplit(x, split = '-'), "[[", 2))]

This returns:

#"Oct" "Oct" "Oct" "Oct" "Oct" "Oct"

Data:

c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m", 
  "2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
) -> x
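As a rough check that the split-and-parse idea also copes with months that do not carry the extra zero, here is a minimal base R sketch. It swaps readr::parse_number for as.integer, which is a substitution made for this example and not part of the answer. The vector x_mixed is hypothetical: only its first element comes from the question's data, and the other two timestamps are made up for illustration.

```r
# Hypothetical input: the first value is from the question, the other two are invented.
x_mixed <- c("2021-010-01_00h39m", "2021-09-15_01h53m", "2021-012-31_10h32m")

# Split on "-" and keep the second piece, i.e. the month field ("010", "09", "012").
month_field <- sapply(strsplit(x_mixed, split = "-"), "[[", 2)

# as.integer() ignores leading zeros, so "010" becomes 10 and "09" becomes 9.
month_num <- as.integer(month_field)

# Look up the abbreviated month names from the built-in month.abb vector.
month_name <- month.abb[month_num]
month_name
```

Keeping month_num around gives the numeric month the question asks for, while month_name matches the output format of the answer above.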

Answer 2

First convert the values to dates, then use format to get the month from them.

format(as.Date(x, '%Y-0%m-%d'), '%b')
#[1] "Oct" "Oct" "Oct" "Oct" "Oct" "Oct"

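%b gives the abbreviated month name; you can also use %B (full month name) or %m (two-digit month number), depending on what you need.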

format(as.Date(x, '%Y-0%m-%d'), '%B')
#[1] "October" "October" "October" "October" "October" "October"

format(as.Date(x, '%Y-0%m-%d'), '%m')
#[1] "10" "10" "10" "10" "10" "10"
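To connect this back to the question's pipeline, here is a minimal sketch of how the same format string might be dropped into the original mutate() call. It assumes dplyr and lubridate are installed; the data frame name df and the intermediate name result are placeholders chosen for this example. The format string consumes the extra zero as a literal character, and as.Date ignores the trailing "_00h39m" part, so the str_remove step should no longer be needed.

```r
library(dplyr)
library(lubridate)

# Placeholder data frame built from the Timestamp values shown in the question.
df <- data.frame(
  Timestamp = c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m",
                "2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m")
)

result <- df %>%
  mutate(
    # The literal "0" in the format string absorbs the extra zero before the month.
    Date = as.Date(Timestamp, format = "%Y-0%m-%d"),
    # Numeric month, as in the original code.
    Month = month(Date, label = FALSE)
  )

result
```

This sketch has only been reasoned through for the October values shown in the question; if the column also contains months stored without the extra zero, it is worth checking a few of those rows before relying on it.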

