How to import an Urdu dataset
How do I import a dataset from a CSV file that contains Urdu text? I tried it the following way, but I get errors. Am I doing something wrong?
Code:
library(rio)
Sys.setlocale("LC_ALL","Urdu")
fil <- read.csv("D:/PycharmProjects/shiny-examples-master/shiny-examples-master/Data_set.csv",encoding='UTF-8')
Data_set.csv:
Reg No. address
13 Nazim ud Din Road, F-11, ICT, دارالحکومت اسلام آباد, 44000, پاکستان
45 Street 34, F-7/1, F-7, ICT, دارالحکومت اسلام آباد, 44000, پاکستان
5564 Lane 11, DHA Phase II, ICT, دارالحکومت اسلام آباد, 44000, پاکستان
Error:
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 2 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 3 appears to contain embedded nulls
4: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 4 appears to contain embedded nulls
5: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 5 appears to contain embedded nulls
6: In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'D:/PycharmProjects/shiny-examples-master/shiny-examples-master/12000.csv'
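Try the skipNul flag and see whether that works. With skipNul = TRUE, read.csv skips the embedded nulls that the warnings complain about.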
read.csv("Data_set.csv", header = TRUE, sep="\t", encoding="UTF-8", skipNul = TRUE)
Reg.No......address
1 13 Nazim ud Din Road, F-11, ICT, دارالحکومت اسلام آباد, 44000, پاکستان
2 45 Street 34, F-7/1, F-7, ICT, دارالحکومت اسلام آباد, 44000, پاکستان
3 5564 Lane 11, DHA Phase II, ICT, دارالحکومت اسلام آباد, 44000, پاکستان
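The "embedded nulls" warnings are often a sign that the file is stored as UTF-16 rather than UTF-8, since UTF-16 pads ASCII characters with a zero byte. One way to check is to look at the first bytes of the file for a byte order mark. The following is a minimal sketch: guess_bom_encoding is a hypothetical helper name, not part of base R or rio, and it only recognises three common marks.

```r
# Hypothetical helper: guess a text file's encoding from its byte order mark (BOM).
# Only three common BOMs are checked; a UTF-32LE file would be reported as UTF-16LE.
guess_bom_encoding <- function(path) {
  bytes <- readBin(path, what = "raw", n = 4L)

  # TRUE if the file begins with the given byte signature
  starts_with <- function(sig) {
    length(bytes) >= length(sig) &&
      all(bytes[seq_along(sig)] == as.raw(sig))
  }

  if (starts_with(c(0xEF, 0xBB, 0xBF))) return("UTF-8-BOM")
  if (starts_with(c(0xFF, 0xFE))) return("UTF-16LE")
  if (starts_with(c(0xFE, 0xFF))) return("UTF-16BE")
  NA_character_  # no BOM found; the encoding has to be determined another way
}
```

Calling guess_bom_encoding("Data_set.csv") on the file from the question would show whether it starts with a UTF-16 mark. An NA result only means that no mark was found; it does not prove the file is UTF-8.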
Alternatively, set the encoding to UTF-16E, although the UTF-8 result above looks correct and this one comes out garbled.
read.csv("test.csv", header = TRUE, sep="\t", encoding="UTF-16E", skipNul = TRUE)
Reg.No......address
1 13 Nazim ud Din Road, F-11, ICT, ط¯ط§ط±ط§ظ„طع©ظˆظ…طھ ط§ط³ظ„ط§ظ… ط¢ط¨ط§ط¯, 44000, â€ڈظ¾ط§ع©ط³طھط§ظ†â€ژ
2 45 Street 34, F-7/1, F-7, ICT, ط¯ط§ط±ط§ظ„طع©ظˆظ…طھ ط§ط³ظ„ط§ظ… ط¢ط¨ط§ط¯, 44000, â€ڈظ¾ط§ع©ط³طھط§ظ†â€ژ
3 5564 Lane 11, DHA Phase II, ICT, ط¯ط§ط±ط§ظ„طع©ظˆظ…طھ ط§ط³ظ„ط§ظ… ط¢ط¨ط§ط¯, 44000, â€ڈظ¾ط§ع©ط³طھط§ظ†
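Note that the encoding argument of read.csv only marks the strings it reads as Latin-1 or UTF-8 and does not convert the file; the fileEncoding argument is the one that re-encodes the input. Also, UTF-16E is not a standard encoding name (the usual ones are UTF-16LE and UTF-16BE). If the file really is UTF-16, a sketch along the following lines may work better than skipNul. It assumes a little-endian file (UTF-16LE), a tab separator, and an R session running in a UTF-8 locale; the sample row is made up for illustration and written to a temporary file so the example is self-contained.

```r
# Made-up sample modelled on the question's data, tab-separated.
# The \u escapes spell "Islamabad" in Urdu script.
txt <- enc2utf8(paste0(
  "Reg No.\taddress\n",
  "13\tF-11, ICT, \u0627\u0633\u0644\u0627\u0645 \u0622\u0628\u0627\u062f\n"
))

# Convert the text to UTF-16LE bytes and write them to a temporary file
bytes <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
tmp <- tempfile(fileext = ".csv")
writeBin(bytes, tmp)

# fileEncoding re-encodes the file while reading, so no nulls reach the parser
df <- read.csv(tmp, header = TRUE, sep = "\t",
               fileEncoding = "UTF-16LE", stringsAsFactors = FALSE)
print(df)
```

For the file in the question, the same read.csv call would take the real path in place of tmp. Whether UTF-16LE is the right value depends on how the CSV was exported, so it is worth checking the file's first bytes before relying on it.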