Create contingency table from csv in R

I am using the ca package to perform correspondence analysis. I ran the analysis on the author data, and it worked well.

library(ca)
head(author[,1:5])
                               a   b   c   d    e
three daughters (buck)       550 116 147 374 1015
drifters (michener)          515 109 172 311  827
lost world (clark)           590 112 181 265  940
east wind (buck)             557 129 128 343  996
farewell to arms (hemingway) 589  72 129 339  866
sound and fury 7 (faulkner)  541 109 136 228  763

str(author)
 num [1:12, 1:26] 550 515 590 557 589 541 517 592 576 557 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:12] "three daughters (buck)" "drifters (michener)" "lost world (clark)" "east wind (buck)" ...
  ..$ : chr [1:26] "a" "b" "c" "d" ...

ca(author[,1:5])

 Principal inertias (eigenvalues):
           1        2        3        4       
Value      0.008122 0.001307 0.001072 0.000596
Percentage 73.19%   11.78%   9.66%    5.37%   

...

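Then I tried writing the author data to a csv, reading the csv back in, and running the analysis again. This time ca fails. The str of the data read from the csv is different: it is a data frame, not a contingency-table-like matrix. As a result, the ca function throws an error.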

author1 <- read.csv("author.csv")
colnames(author1)[1] <- ""
head(author1[,1:5])
                                 a   b   c   d
1       three daughters (buck) 550 116 147 374
2          drifters (michener) 515 109 172 311
3           lost world (clark) 590 112 181 265
4             east wind (buck) 557 129 128 343
5 farewell to arms (hemingway) 589  72 129 339
6  sound and fury 7 (faulkner) 541 109 136 228

str(author1[,1:5])
'data.frame':   12 obs. of  5 variables:
 $  : Factor w/ 12 levels "asia (michener)",..: 12 2 6 3 4 11 10 9 5 8 ...
 $ a: int  550 515 590 557 589 541 517 592 576 557 ...
 $ b: int  116 109 112 129 72 109 96 151 120 97 ...
 $ c: int  147 172 181 128 129 136 127 251 136 145 ...
 $ d: int  374 311 265 343 339 228 356 238 404 354 ...

ca(author1[,1:5])
Error in sum(N) : invalid 'type' (character) of argument

Is there a simple fix to convert author1 back into the form of the original author?

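The first column of the csv actually holds the row names of author, so reading the csv and renaming that first column to "" is the problem: the names stay in the data as an ordinary column instead of becoming row names.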

This works:

library(dplyr)  # provides %>% and select()
library(ca)     # provides ca() and the author data

head(author[,1:5])

write.csv(author, file="author.csv")
author2 <- read.csv("author.csv")

head(author2[,1:5]) # here the row names are numbers
                             X   a   b   c   d
1       three daughters (buck) 550 116 147 374
2          drifters (michener) 515 109 172 311
3           lost world (clark) 590 112 181 265
4             east wind (buck) 557 129 128 343
5 farewell to arms (hemingway) 589  72 129 339
6  sound and fury 7 (faulkner) 541 109 136 228

# set row names to be first column of the csv
rownames(author2) <- author2$X

# remove the first column
author2 <- author2 %>% select(-X)

head(author2[,1:5]) # notice the row names have changed

                               a   b   c   d    e
three daughters (buck)       550 116 147 374 1015
drifters (michener)          515 109 172 311  827
lost world (clark)           590 112 181 265  940
east wind (buck)             557 129 128 343  996
farewell to arms (hemingway) 589  72 129 339  866
sound and fury 7 (faulkner)  541 109 136 228  763
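
As a possible shortcut, base R's read.csv can take the row names from the first column directly via row.names = 1, which avoids the separate rename-and-drop steps. The sketch below is only an illustration: it uses a small stand-in matrix built from the first three rows shown above rather than the full author data, and the temporary file path is hypothetical.

```r
# Small stand-in for the author matrix (first three rows shown above).
m <- matrix(
  c(550, 116, 147, 374, 1015,
    515, 109, 172, 311,  827,
    590, 112, 181, 265,  940),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("three daughters (buck)", "drifters (michener)", "lost world (clark)"),
    c("a", "b", "c", "d", "e")
  )
)

# Hypothetical temporary file, used only for this illustration.
path <- tempfile(fileext = ".csv")
write.csv(m, file = path)

# row.names = 1 tells read.csv to use the first column as row names.
df <- read.csv(path, row.names = 1)

# An all-numeric data frame converts to a numeric matrix with the same dimnames.
m2 <- as.matrix(df)
str(m2)
```

With the real data, the same pattern would be read.csv("author.csv", row.names = 1) followed by as.matrix(), and the result can then be passed to ca().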