Ordering a character matrix by numerical column

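I am working with a matrix read in from a csv that contains both numbers and characters. This is a smaller matrix, but it is essentially the one I am working with: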

[,1] [,2] [,3]         [,4]    [,5]    [,6]    [,7]    [,8]    [,9]
V2  "A"  "1"  "Sample X1"  "34712" "39390" "38858" "38574" "38660" 
V3  "A"  "2"  "Sample X2"  "35333" "39940" "40533" "39936" "40669" 
V4  "A"  "3"  "Sample X3"  "33612" "39601" "38658" "39220" "39465" 
V5  "A"  "4"  "Sample X4"  "34309" "39200" "38597" "39820" "40081" 
V6  "A"  "5"  "Sample X5"  "33637" "39404" "40497" "39388" "40033" 
V7  "A"  "6"  "Sample X6"  "35314" "39522" "40345" "38624" "40306" 
V8  "A"  "7"  "Sample X7"  "35548" "39000" "41408" "38310" "39849" 
V9  "A"  "8"  "Sample X8"  "33972" "39930" "39777" "39582" "39570" 
V10 "A"  "9"  "Sample X9"  "34808" "39857" "39252" "39248" "38465" 
V11 "A"  "10" "Sample X10" "34316" "39798" "39776" "39516" "38812" 
V12 "A"  "11" "Sample X11" "34476" "38581" "39672" "38997" "38794" 
V13 "A"  "12" "Sample X12" "36246" "38809" "37872" "38100" "36925" 
V14 "B"  "1"  "Sample X13" "33642" "40201" "40202" "39320" "40426" 
V15 "B"  "2"  "Sample X14" "33381" "40624" "40349" "41350" "40490" 
V16 "B"  "3"  "Sample X15" "34465" "42096" "41194" "40613" "40416" 
V17 "B"  "4"  "Sample X16" "33957" "41905" "42273" "40710" "40681" 
V18 "B"  "5"  "Sample X17" "33877" "42040" "42226" "40788" "41261" 
V19 "B"  "6"  "Sample X18" "33970" "41860" "41149" "41093" "40877" 
V20 "B"  "7"  "Sample X19" "34745" "42040" "40186" "40862" "41044" 
V21 "B"  "8"  "Sample X20" "34140" "41274" "39880" "40356" "40496" 
V22 "B"  "9"  "Sample X21" "33929" "40652" "41410" "40760" "40718" 
V23 "B"  "10" "Sample X22" "33684" "39220" "40478" "41500" "40094"
V24 "B"  "11" "Sample X23" "33141" "41446" "41121" "40726" "41020"
V25 "B"  "12" "Sample X24" "33405" "38481" "37716" "38562" "38218" 
V26 "C"  "1"  "Sample X25" "71560" "86402" "85614" "84273" "83264" 
V27 "C"  "2"  "Sample X26" "72144" "86266" "88082" "87672" "87356" 
V28 "C"  "3"  "Sample X27" "71946" "90201" "89156" "88386" "88006" 
V29 "C"  "4"  "Sample X28" "71758" "89108" "88225" "86006" "88654" 
V30 "C"  "5"  "Sample X29" "71144" "86558" "88614" "87028" "88809" 
V31 "C"  "6"  "Sample X30" "70504" "89230" "88869" "86653" "86356" 
V32 "C"  "7"  "Sample X31" "67874" "88405" "84878" "84914" "85425" 
V33 "C"  "8"  "Sample X32" "70273" "87865" "87529" "87945" "86172" 

I would like to order the matrix by the second column (there are no headers), so that it looks like this:

A 1 . . .
B 1
C 1
A 2
B 2
C 2
A 3
. 
.
.
A 12
B 12
C 12 . . .

I looked around and found that you can use the command:

data <- data[order(data[,2]),]

But the result is this:

A 1 . . .
B 1
C 1
A 10
B 10
C 10
A 11
B 11
C 11
A 12
B 12
C 12
A 2
B 2
C 2
.
.
.
A 9
B 9
C 9 . . .

Is this because the matrix is a character matrix? How can I make just the second column numeric so that I can order by it?

Thanks

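Putting your data in a matrix is a bad idea when you want to mix classes (e.g. numeric and character) across columns, because every element of a matrix must have the same type. Instead, you should use a data frame.

Ideally, read the data into a data frame using read.csv or read.table. Otherwise, use as.data.frame.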
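As a minimal sketch of the first route, assuming the file has no header row and a layout like the matrix in the question. The inline text is a stand-in for the real csv file, whose name is not given, and only a few rows and columns from the question are reproduced:

```r
# Hypothetical stand-in for the csv file (a few rows taken from the question).
csv_text <- "A,1,Sample X1,34712
A,2,Sample X2,35333
A,10,Sample X10,34316
B,1,Sample X13,33642
B,2,Sample X14,33381
B,10,Sample X22,33684"

# header = FALSE because the data have no header row;
# stringsAsFactors = FALSE keeps the text columns as character.
d <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# read.csv picks a type per column, so V2 is integer here rather than character.
str(d)

# Order by the numeric column; tied rows keep their original relative order.
d_sorted <- d[order(d$V2), ]
print(d_sorted)
```

Because the columns are typed when the file is read, no later conversion is needed before sorting.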

Coercing a matrix to a data frame

Given a matrix m (data in your case):

d <- as.data.frame(m, stringsAsFactors=FALSE)
d[, 2] <- as.numeric(d[, 2]) # coerce the relevant column (column 2 in the question's data) to numeric
d[order(d[, 2]), ]

Note that you can sort the matrix as required with m[order(as.numeric(m[, 2])), ], but every column of the result will still be character.
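To illustrate sorting the matrix directly, here is a small sketch built from a few rows of the question's data, with the number in column 2 as in the question. The names m_chr, m_num and m_num2 are made up for this example:

```r
# A few rows from the question, built directly as a character matrix.
m <- matrix(c("A", "1",  "Sample X1",
              "A", "2",  "Sample X2",
              "A", "10", "Sample X10",
              "B", "1",  "Sample X13",
              "B", "2",  "Sample X14",
              "B", "10", "Sample X22"),
            ncol = 3, byrow = TRUE)

# Character ordering of column 2: "10" lands before "2".
m_chr <- m[order(m[, 2]), ]

# Numeric ordering of the same column; only the sort key is converted.
m_num <- m[order(as.numeric(m[, 2])), ]

# Optional: break ties explicitly on column 1 instead of relying on row order.
m_num2 <- m[order(as.numeric(m[, 2]), m[, 1]), ]

print(m_num)
typeof(m_num)  # still "character"
```

The conversion happens only inside order(), so the matrix contents are untouched. That is convenient for display, but any later arithmetic on the number columns would still need as.numeric.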

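Note: the sort behavior you are seeing happens because character vectors are compared character by character, so anything starting with 1 (e.g. 10) comes before 2.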
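A short sketch of that difference, using a made-up vector x of number strings:

```r
x <- c("1", "2", "9", "10", "11", "12")

# Character comparison goes left to right, so every string starting with "1" comes first.
chr_sorted <- sort(x)
print(chr_sorted)  # "1" "10" "11" "12" "2" "9"

# Numeric comparison gives the order the question asks for.
num_sorted <- sort(as.numeric(x))
print(num_sorted)  # 1 2 9 10 11 12
```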