Different results from base R cor() function than similarity() function in recommenderlab package?
Can anyone explain why these two correlation matrices return different results?
library(recommenderlab)
data(MovieLense)
cor_mat <- as( similarity(MovieLense, method = "pearson", which = "items"), "matrix" )
cor_mat_base <- suppressWarnings( cor(as(MovieLense, "matrix"), use = "pairwise.complete.obs") )
print( cor_mat[1:5, 1:5] )
print( cor_mat_base[1:5, 1:5] )
Expressed in terms of the base R function, dissimilarity() = 1 - pmax(cor(), 0). It is also important to specify method explicitly
so that both calls use the same one:
library("recommenderlab")
data(MovieLense)
cor_mat <- as( dissimilarity(MovieLense, method = "pearson",
which = "items"), "matrix" )
cor_mat_base <- suppressWarnings( cor(as(MovieLense, "matrix"), method = "pearson"
, use = "pairwise.complete.obs") )
print( cor_mat[1:5, 1:5] )
print(1- cor_mat_base[1:5, 1:5] )
> print( cor_mat[1:5, 1:5] )
Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995) Copycat (1995)
Toy Story (1995) 0.0000000 0.7782159 0.8242057 0.8968647 0.6135248
GoldenEye (1995) 0.7782159 0.0000000 0.7694644 0.7554443 0.7824406
Four Rooms (1995) 0.8242057 0.7694644 0.0000000 1.0000000 0.8153877
Get Shorty (1995) 0.8968647 0.7554443 1.0000000 0.0000000 1.0000000
Copycat (1995) 0.6135248 0.7824406 0.8153877 1.0000000 0.0000000
> print(1- cor_mat_base[1:5, 1:5] )
Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995) Copycat (1995)
Toy Story (1995) 0.0000000 0.7782159 0.8242057 0.8968647 0.6135248
GoldenEye (1995) 0.7782159 0.0000000 0.7694644 0.7554443 0.7824406
Four Rooms (1995) 0.8242057 0.7694644 0.0000000 1.2019687 0.8153877
Get Shorty (1995) 0.8968647 0.7554443 1.2019687 0.0000000 1.2373503
Copycat (1995) 0.6135248 0.7824406 0.8153877 1.2373503 0.0000000
For a better understanding, check the documentation of both packages :).
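The formula quoted above can be tried on a small made-up matrix using base R only. This is a minimal sketch, not recommenderlab's actual source: pearson_dissim and the ratings data are hypothetical names and values chosen for illustration.

```r
# Hypothetical helper: Pearson dissimilarity with negative correlations
# floored at 0, following the formula 1 - pmax(cor(), 0) quoted in the answer.
pearson_dissim <- function(x) {
  r <- cor(x, method = "pearson", use = "pairwise.complete.obs")
  1 - pmax(r, 0)
}

# Toy ratings matrix (made-up data): rows are users, columns are items.
ratings <- cbind(
  item_a = c(5, 4, NA, 2, 1),
  item_b = c(4, 5, 3, NA, 1),
  item_c = c(1, 2, 4, 5, NA)
)

d_floor <- pearson_dissim(ratings)
d_raw   <- 1 - cor(ratings, method = "pearson", use = "pairwise.complete.obs")

print(round(d_floor, 3))  # floored version, never above 1
print(round(d_raw, 3))    # plain 1 - cor(), above 1 where cor() is negative
```

The two matrices should agree wherever the correlation is non-negative and differ only for negatively correlated item pairs, which is the pattern seen in the MovieLense output.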
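OP/Edit:
It is worth pointing out that, even then, some values differ slightly between dissimilarity() and 1 - cor(): where they differ, 1 - cor() is greater than 1 (for example 1.2019687 against 1.0000000). This is because dissimilarity() floors the correlation at 0 (i.e., it never uses a negative correlation), whereas cor() can return negative values, which push 1 - cor() above 1.
As for the range of cor() itself, the documentation at https://www.rdocumentation.org/packages/stats/versions/3.6.0/topics/cor only specifies:
For r <- cor(*, use = "all.obs"), it is now guaranteed that all(abs(r) <= 1).
Whether this also holds for use = "pairwise.complete.obs" should be checked.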
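One way to carry out that check is to compute the range of cor() directly and list the entries with a negative correlation. The sketch below uses randomly generated ratings rather than MovieLense; the seed, matrix size, and variable names are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed so the toy example is reproducible

# Made-up ratings: 40 users by 5 items, values 1 to 5 with some NAs.
x <- matrix(sample(c(1:5, NA), 200, replace = TRUE), nrow = 40)
r <- cor(x, method = "pearson", use = "pairwise.complete.obs")

# Are all defined correlations within [-1, 1], up to floating-point tolerance?
in_range <- all(abs(r) <= 1 + 1e-12, na.rm = TRUE)
print(in_range)

# Positions with a negative correlation: these are the entries where
# 1 - cor() exceeds 1 and a floor at 0 would change the result.
neg <- which(r < 0, arr.ind = TRUE)
print(head(neg))
```

Running the same two checks on cor_mat_base from the code above would show whether the mismatching MovieLense entries are exactly the negatively correlated pairs.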