Compute the Euclidean distance using word counts
Consider the following two sentences.
Sentence 1: The quick brown fox jumps over the lazy dog.
Sentence 2: A quick brown dog outpaces a quick fox.
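Compute the Euclidean distance using word counts.
You can use the tm package to get the word counts and then compute the Euclidean distance: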
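> library(tm)
> s1 <- "The quick brown fox jumps over the lazy dog"
> s2 <- "A quick brown dog outpaces a quick fox"
>
> # Build a two-document corpus and count the terms in each sentence.
> # As the output shows, terms are lowercased and the one-letter word "a" is dropped.
> VS <- VectorSource(c(s1, s2))
> corp <- Corpus(VS)
> dtm <- DocumentTermMatrix(corp)
> # t(dtm) puts terms in rows, so this gives the distance between each pair of terms;
> # dist(as.matrix(dtm)) would compare the two sentences instead.
> d <- dist(t(dtm), method = 'euclidean')
> d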
            brown      dog      fox    jumps     lazy outpaces     over    quick
dog      0.000000
fox      0.000000 0.000000
jumps    1.000000 1.000000 1.000000
lazy     1.000000 1.000000 1.000000 0.000000
outpaces 1.000000 1.000000 1.000000 1.414214 1.414214
over     1.000000 1.000000 1.000000 0.000000 0.000000 1.414214
quick    1.000000 1.000000 1.000000 2.000000 2.000000 1.414214 2.000000
the      1.414214 1.414214 1.414214 1.000000 1.000000 2.236068 1.000000 2.236068
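The matrix above holds distances between terms, because the code transposes the document-term matrix before calling dist(). If what you want is a single distance between the two sentences, a minimal base-R sketch might look like the following. It assumes the same preprocessing that the tm output suggests (lowercasing, and dropping words shorter than three characters); tokenize and min_chars are illustrative names, not part of tm.

```r
s1 <- "The quick brown fox jumps over the lazy dog"
s2 <- "A quick brown dog outpaces a quick fox"

# Hypothetical helper (not part of tm): lowercase, split on non-letters,
# and keep words of at least `min_chars` characters.
tokenize <- function(sentence, min_chars = 3) {
  words <- strsplit(tolower(sentence), "[^a-z]+")[[1]]
  words[nchar(words) >= min_chars]
}

tokens <- lapply(list(s1, s2), tokenize)
vocab  <- sort(unique(unlist(tokens)))

# One row per sentence, one column per term in the shared vocabulary.
counts <- t(vapply(
  tokens,
  function(w) tabulate(match(w, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
dimnames(counts) <- list(c("s1", "s2"), vocab)

# Rows are sentences here, so dist() returns the sentence-to-sentence distance.
d_sentences <- dist(counts, method = "euclidean")
d_sentences
```

Under these assumptions the counts differ by 1 for jumps, lazy, outpaces, over and quick, and by 2 for the, so the distance is sqrt(5 * 1 + 4) = 3. If the word "a" were kept (min_chars = 1), it would add another squared difference of 4 and the result would change.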