R: How to uppercase the first letter of each semicolon-separated entry in a data frame column?
Suppose I have a data frame df.
> df <- data.frame(Disease = c('Disease Entry1; disease Entry2', 'disease Entry4','disease Entry5; disease entry6'), ID = c(1,2,3))
> df
                         Disease ID
1 Disease Entry1; disease Entry2  1
2                 disease Entry4  2
3 disease Entry5; disease entry6  3
How can I manipulate it so that each disease entry is all lowercase except for its first letter? That is:
> df
                         Disease ID
1 Disease entry1; Disease entry2  1
2                 Disease entry4  2
3 Disease entry5; Disease entry6  3
I assume I would use the tolower function somehow, but how do I account for the semicolons?
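You can convert everything to lowercase first, then use gsub to match the letter at the beginning of the string or right after "; ", capture it, and uppercase it with \U applied to the backreference \1. Inside an R string literal the backslashes must be doubled, and \U only works with perl = TRUE:
df$Disease <- gsub("(?<=^|; )([a-z])", "\\U\\1", tolower(df$Disease), perl = TRUE)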
df
#                         Disease ID
#1 Disease entry1; Disease entry2  1
#2                 Disease entry4  2
#3 Disease entry5; Disease entry6  3
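If the regular expression feels opaque, the same result can probably be reached by splitting on the separator, capitalizing each piece, and joining the pieces back together. The following is only a sketch in base R: the helper name capitalize_entries and its sep argument are made up for illustration, and it assumes entries are always separated by exactly "; " and that the column has no NA values.

```r
# Hypothetical helper (not part of base R): lowercase the input, then
# capitalize the first character of every sep-delimited entry.
capitalize_entries <- function(x, sep = "; ") {
  # tolower() coerces factors to character, so this also works on factor columns
  pieces <- strsplit(tolower(x), sep, fixed = TRUE)
  vapply(pieces, function(p) {
    # uppercase the first character of each piece, keep the rest, re-join
    paste0(toupper(substring(p, 1, 1)), substring(p, 2), collapse = sep)
  }, character(1))
}

df <- data.frame(
  Disease = c("Disease Entry1; disease Entry2", "disease Entry4",
              "disease Entry5; disease entry6"),
  ID = c(1, 2, 3)
)
df$Disease <- capitalize_entries(df$Disease)
df
```

This is longer than the gsub one-liner, but each step can be inspected on its own, which may help if the separator or the capitalization rule changes later.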