Why do identical data frames become different when their row names are set to the same values?

Question

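I ran into some odd behaviour while playing with data frames: when I create two identical data frames, a and b, and then assign the row names of a to b, they are no longer identical: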

rm(list=ls())

a <- data.frame(a=c(1,2,3),b=c(2,3,4))
b <- a
identical(a,b)
#TRUE

identical(rownames(a),rownames(b))
#TRUE

rownames(b) <- rownames(a)

identical(a,b)
#FALSE

Can anyone reproduce this and explain why it happens?

Answer 1

Admittedly, this is a bit confusing. Starting with ?data.frame, we see:

If row.names was supplied as NULL or no suitable component was found the row names are the integer sequence starting at one (and such row names are considered to be ‘automatic’, and not preserved by as.matrix).

So initially a and b each have an integer attribute named row.names:

> str(attributes(a))
List of 3
 $ names    : chr [1:2] "a" "b"
 $ row.names: int [1:3] 1 2 3
 $ class    : chr "data.frame"

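But rownames() returns a character vector (as does dimnames(), which is called under the hood and actually returns a list of character vectors). So identical(rownames(a), rownames(b)) is TRUE even though the stored attributes differ in type once you reassign the row names, after which you end up with: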

> str(attributes(b))
List of 3
 $ names    : chr [1:2] "a" "b"
 $ row.names: chr [1:3] "1" "2" "3"
 $ class    : chr "data.frame"
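
The following is a minimal sketch, not part of the original answer, showing how the difference can be checked directly and one way it might be undone. It assumes that resetting the row names to NULL is acceptable, i.e. that b is meant to have the default automatic row names; b_reset is a name introduced here purely for illustration.

```r
# Recreate the situation from the question.
a <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4))
b <- a
rownames(b) <- rownames(a)   # stores character row names on b

# rownames() hides the difference; the raw attribute does not.
class(rownames(a))           # "character"
class(attr(a, "row.names"))  # "integer"
class(attr(b, "row.names"))  # "character"
identical(a, b)              # FALSE

# Assumed fix: reset b to automatic (integer) row names.
b_reset <- b
rownames(b_reset) <- NULL
class(attr(b_reset, "row.names"))  # "integer"
identical(a, b_reset)
```

After the reset, the row.names attribute of b_reset is an integer again, so the final identical() call should no longer fail on the attribute type.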
