Input from keyboard does not return the correct Unicode character
My values, read from a UTF-8 CSV file, contain the Unicode character U+0103 (ă). This and the other Vietnamese UTF-8 characters display correctly in the data frame.
ID Subject
1 Ngữ văn
2 Toán
3 Địa lí
However, when I filter the data frame, this works:
df %>% filter(Subject == "Toán")
# A tibble: 1 x 2
ID Subject
<dbl> <chr>
1 Toán
But this does not:
df %>% filter(Subject == "Ngữ văn")
# A tibble: 0 x 2
# ... with 2 variables: ID <dbl>, Subject <chr>
I compared the typed string "Ngữ văn" with a string in which ă is specified manually:
> "Ngữ văn"
[1] "Ngữ van"
> paste("Ngữ v","\u0103", "n", sep = "")
[1] "Ngữ văn"
> paste("Ngữ v","\u0103", "n", sep = "") == "Ngữ văn"
[1] FALSE
Why does typing the letter ă return a, and how can I fix it?
My session info:
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
This works fine for me:
library(dplyr)
df %>%
filter(Subject == "Ngữ văn" )
# ID Subject
#1 1 Ngữ văn
Data:
df <- structure(list(ID = 1:3, Subject = c("Ngữ văn", "Toán", "Địa lí"
)), class = "data.frame", row.names = c(NA, -3L))
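A possible explanation for the difference, given the session info in the question: the locale uses code page 1252, which has no ă, so a typed ă may be converted to a plain a before R compares the strings. The sketch below is one way to sidestep that, assuming this is indeed the cause: it writes the non-ASCII characters as `\u` escapes so the source stays pure ASCII. The variable names `target` and `res` are illustrative only, and base subsetting is used here to keep the example dependency-free; the same `target` string can be passed to `filter()`.

```r
# Same data as above, written with \u escapes so the result does not
# depend on how the console or editor encodes typed characters.
df <- data.frame(
  ID = 1:3,
  Subject = c("Ng\u1eef v\u0103n", "To\u00e1n", "\u0110\u1ecba l\u00ed"),
  stringsAsFactors = FALSE
)

# Build the search string the same way (U+1EEF is ữ, U+0103 is ă).
target <- "Ng\u1eef v\u0103n"

# Base-R equivalent of df %>% filter(Subject == target)
res <- df[df$Subject == target, ]
print(res)

# Inspect the code points to confirm which character a string really holds:
# 259 is U+0103 (ă); a plain "a" would show up as 97 instead.
print(utf8ToInt("v\u0103n"))
```

Checking a typed literal with `utf8ToInt()` in the same way shows whether the ă survived input. On Windows, R 4.2.0 and later use UTF-8 as the native encoding on recent Windows 10 builds, so upgrading may also remove the need for escapes.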