R: unexpected natural sort order from gtools mixedsort
I just noticed some unexpected behavior in gtools::mixedsort while checking how my output is naturally sorted.
Here is an example:
aa=c("CD57","CD58","CD158","CD158b","CD158e","CD158e1","CD319","CD335")
gtools::mixedsort(aa)
My expected result is:
[1] "CD57" "CD58" "CD158" "CD158b" "CD158e" "CD158e1" "CD319"
[8] "CD335"
However, I get this:
[1] "CD57" "CD58" "CD158" "CD158b" "CD158e" "CD319" "CD335"
[8] "CD158e1"
Is this correct? What is the reason for it?
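CD158e1 is treated as 1580 here, because 158e1 is read as a number in scientific (exponential) notation, i.e. 158 × 10^1. For example, at a Python prompt:
>>> 158e1
1580.0
The e here is not Euler's number; it marks a power-of-ten exponent. So e1 appends one zero, e2 would append two zeros (158e2 is 15800), and so on.
Since 1580 is larger than 319 and 335, CD158e1 is sorted to the end of the list.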
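The same reading can be reproduced in base R. The sketch below is only an illustration: `num_pattern` is a hypothetical, simplified pattern chosen for this example and is not the regular expression gtools uses internally. It shows the numeric key each label would get if a trailing `e<digits>` is taken as an exponent.

```r
# Labels from the question
aa <- c("CD57", "CD58", "CD158", "CD158b", "CD158e", "CD158e1", "CD319", "CD335")

# Base R also reads "158e1" as scientific notation: 158 * 10^1
print(as.numeric("158e1"))

# Hypothetical, simplified pattern: digits optionally followed by an exponent.
# This is NOT the pattern gtools uses; it only mimics the effect.
num_pattern <- "[0-9]+([eE][0-9]+)?"
tokens <- regmatches(aa, regexpr(num_pattern, aa))
keys <- as.numeric(tokens)

# Show each label next to the token and numeric key it would sort by
print(data.frame(label = aa, token = tokens, key = keys))

# The label with the largest numeric key is the one that ends up last
print(aa[which.max(keys)])
```

Under this reading, "CD158e" keeps the key 158 because the e is not followed by digits, while "CD158e1" gets 1580.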
As the documentation of mixedsort states:
These functions sort or order character strings containing embedded
numbers so that the numbers are numerically sorted rather than sorted
by character value. I.e. "Aspirin 50mg" will come before "Aspirin
100mg". In addition, case of character strings is ignored so that "a",
will come before "B" and "C".
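If the exponent interpretation is not wanted, one possible workaround is sketched below. It assumes the installed gtools version exposes a `scientific` argument on `mixedsort`; the code checks for that argument before using it, and `?gtools::mixedsort` is the place to confirm what your version supports. No expected output is shown, since the result depends on the installed version.

```r
# Labels from the question
aa <- c("CD57", "CD58", "CD158", "CD158b", "CD158e", "CD158e1", "CD319", "CD335")

# Assumption: gtools is installed and its mixedsort() has a 'scientific' argument.
has_scientific <- requireNamespace("gtools", quietly = TRUE) &&
  "scientific" %in% names(formals(gtools::mixedsort))

if (has_scientific) {
  # Ask mixedsort not to treat tokens such as "158e1" as exponential notation
  sorted <- gtools::mixedsort(aa, scientific = FALSE)
  print(sorted)
} else {
  message("mixedsort() with a 'scientific' argument is not available; see ?gtools::mixedsort")
  sorted <- NULL
}
```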