Finding and converting all numbers into their corresponding names in R
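I have a single-column data frame in which each row is a statement. The statements are mostly alphabetic characters, but some contain numeric characters. I am trying to find all the numeric characters and replace them with the corresponding words.
Basically, I want to go from this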
"I looked at the watermelons around 12 today"
"There is a dog on the bench"
"the year is 2017"
"I am not hungry"
"He turned 1 today"
to this (or something similar)
"I looked at the watermelons around twelve today"
"There is a dog on the bench"
"the year is two thousand seventeen"
"I am not hungry"
"He turned one today"
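I am familiar with functions that convert numbers to words, such as numbers_to_words from the xfun package, but I don't know how to apply this systematically across the whole data frame.
I don't actually know of a simple function for this, but I have a possibly somewhat clumsy solution for you: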
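library(xfun)
a <- "I looked at the watermelons around 12 today"
y <- numeric(nchar(a))
# Convert each character to a number; non-digit characters become NA
for (i in seq_len(nchar(a))) {
  y[i] <- suppressWarnings(as.numeric(substr(a, i, i)))
}
# Join the digits that were found and convert the resulting number to words
x <- n2w(as.numeric(paste(na.omit(y), collapse = "")))
# Positions of the digit characters in the string
z <- which(!is.na(y))
# Rebuild the string: text before the number, the words, text after the number
paste(c(substr(a, 1, z[1] - 1), x, substr(a, z[length(z)] + 1, nchar(a))), collapse = "")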
For now this only works when the string contains a single number.
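One way to lift the single-number limitation is to let a regular expression find every run of digits and replace each match in place. The following is only a sketch using base R's gregexpr and regmatches together with xfun's n2w; the helper name digits_to_words is made up for illustration and is not part of any package.

```r
library(xfun)

# Hypothetical helper: replace every run of digits in each string with words
digits_to_words <- function(strings) {
  # Locate all digit runs in every string
  m <- gregexpr("[0-9]+", strings)
  # Convert each matched number to words; strings without a match yield character(0)
  words <- lapply(regmatches(strings, m), function(nums) {
    vapply(nums, function(n) n2w(as.numeric(n)), character(1), USE.NAMES = FALSE)
  })
  # Write the words back at the matched positions
  regmatches(strings, m) <- words
  strings
}

digits_to_words(c("I saw 2 dogs and 12 cats", "I am not hungry"))
```

Because the function takes and returns a character vector, it can be applied directly to a data frame column.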
Here is an approach using the stringr and english packages.
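library(stringr)
library(english)
data <- c("I looked at the watermelons around 12 today", "There is a dog on the bench", "the year is 2017", "I am not hungry", "He turned 1 today")
# For each string, extract the runs of digits and convert them to English words
Replacement <- lapply(str_extract_all(data, "[0-9]+"), function(x) {
  as.character(as.english(as.numeric(x)))
})
# Replace the digits only in strings that contain any; other strings pass through unchanged.
# Note: this assumes at most one number per string.
sapply(seq_along(data), function(i) {
  if (grepl("[0-9]+", data[i])) {
    str_replace_all(data[i], "[0-9]+", Replacement[[i]])
  } else {
    data[i]
  }
})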
[1] "I looked at the watermelons around twelve today" "There is a dog on the bench"
[3] "the year is two thousand seventeen" "I am not hungry"
[5] "He turned one today"
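Since the question is about a data frame column, here is a sketch of how the same idea might be applied to one directly. It relies on str_replace_all accepting a function as the replacement, which also copes with several numbers in one row. The data frame df and its column name statement are assumptions made up for this example.

```r
library(stringr)
library(english)

# Hypothetical single-column data frame; the column name is an assumption
df <- data.frame(
  statement = c("I saw 2 dogs and 12 cats", "I am not hungry", "the year is 2017"),
  stringsAsFactors = FALSE
)

# The replacement function receives the matched digit strings and
# returns the corresponding English words
df$statement <- str_replace_all(
  df$statement,
  "[0-9]+",
  function(x) as.character(as.english(as.numeric(x)))
)

df
```

Rows without any digits are returned unchanged, so no separate grepl check is needed.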