Why isn't base scale working with Tibble?
I have a dataset called GSMA that I imported from Excel using readxl. Checking the class of the object returns:
class(GSMA)
[1] "tbl_df" "tbl" "data.frame"
I want to standardise columns 2 through 4 using base scale(). I tried running:
GSMA[2:4] <- scale(GSMA[2:4])
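This results in an incorrectly scaled data frame, with each row having the same values for all columns.
A potential clue to the problem: when I try to sort the incorrectly scaled data frame, this error is returned: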
Error in xj[i, , drop = FALSE] : subscript out of bounds
When I re-import the same dataset and then run:
GSMA <- as.data.frame(GSMA)
GSMA[2:4] <- scale(GSMA[2:4])
the data frame columns are scaled correctly.
What is going on here? Why doesn't base scale() work in the first instance?
dput(head(GSMA))
structure(list(Country = c("GBR", "CHE", "DEU", "ROU", "LUX",
"KAZ"), entry = c(98.4974384307861, 95.6549962361654, 91.4044539133708,
90.8518393834432, 90.4088099797567, 88.0471547444662), medium = c(86.0081672668457,
93.0372142791748, 91.2993144989014, 100, 96.7348480224609, 100
), high = c(74.6774760159579, 84.1793060302734, 79.542350769043,
99.6931856328791, 97.031680020419, 92.5396745855158)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
Strangely, this is correct:
> scale(head(GSMA[2:4]))
entry medium high
[1,] 1.5644225 -1.5528676 -1.3233285
[2,] 0.8257534 -0.2694974 -0.3755223
[3,] -0.2788406 -0.5868048 -0.8380579
[4,] -0.4224492 1.0017748 1.1719851
[5,] -0.5375798 0.4056202 0.9065003
[6,] -1.1513063 1.0017748 0.4584233
attr(,"scaled:center")
entry medium high
92.47745 94.51326 87.94395
attr(,"scaled:scale")
entry medium high
3.848059 5.477022 10.025077
But this is not:
> GSMA[2:4] <- scale(GSMA[2:4])
> head(GSMA)
# A tibble: 6 x 4
Country entry[,"entry"] [,"medium"] [,"high"] medium[,"entry"] [,"medium"] [,"high"]
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GBR 2.13 1.25 0.870 2.13 1.25 0.870
2 CHE 2.00 1.52 1.27 2.00 1.52 1.27
3 DEU 1.80 1.46 1.07 1.80 1.46 1.07
4 ROU 1.78 1.80 1.92 1.78 1.80 1.92
5 LUX 1.76 1.67 1.81 1.76 1.67 1.81
6 KAZ 1.65 1.80 1.62 1.65 1.80 1.62
# ... with 3 more variables: high[,"entry"] <dbl>, [,"medium"] <dbl>, [,"high"] <dbl>
This is a known issue with tibble 3.0.0. Reverting to 2.1.3 restores the old behaviour.
Or:
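library(tibble)
iris <- as_tibble(iris)
# scale() returns a matrix, not a data frame
scaled <- scale(iris[1:3])
class(scaled)
#> [1] "matrix"
# convert the matrix back to a data frame before assigning it into the tibble
iris[1:3] <- as.data.frame(scaled)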
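Another way to sidestep the matrix return value, sketched here on the six rows from the question's dput() output, is to scale each column on its own and strip the matrix attributes before assigning. This is only a minimal sketch: the helper name `scale_vec` is made up for illustration, and it assumes that assigning a plain list of numeric vectors into a tibble behaves the same across tibble versions.

```r
library(tibble)

# The six rows from the question's dput() output
GSMA <- tibble(
  Country = c("GBR", "CHE", "DEU", "ROU", "LUX", "KAZ"),
  entry   = c(98.4974384307861, 95.6549962361654, 91.4044539133708,
              90.8518393834432, 90.4088099797567, 88.0471547444662),
  medium  = c(86.0081672668457, 93.0372142791748, 91.2993144989014,
              100, 96.7348480224609, 100),
  high    = c(74.6774760159579, 84.1793060302734, 79.542350769043,
              99.6931856328791, 97.031680020419, 92.5396745855158)
)

# Hypothetical helper: scale() on a vector returns a one-column matrix,
# so as.numeric() drops the dim and "scaled:*" attributes again
scale_vec <- function(x) as.numeric(scale(x))

# lapply() returns a list of plain numeric vectors, one per column,
# so nothing matrix-shaped is ever assigned into the tibble
GSMA[2:4] <- lapply(GSMA[2:4], scale_vec)

# Each scaled column should now be an ordinary double vector
sapply(GSMA[2:4], class)
```

Because every column stays a plain vector, sorting and row subsetting should work as usual afterwards.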