How to find the second-highest value from the output of a loop in R

Question

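R programming newbie here. I'm trying to find the person in a data frame who is most compatible with a single given person. Compatibility is based on an algorithm that assigns points to certain values in the data frame. I have a data frame called kewl.d00dz that looks like this: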

       name  dream.name birth.state birth.month birth.date major
1   stephen       butch          CO         oct         11  ELEC
2     clark     richard          VA         jan         19  BUAD
3   anthony          bo          NJ         mar         26  BUAD
4      jack     kordell          VA         jul         27  BUAD
5      eric      adrian          ND         jun         17  GEOG
6     tyler     anthony          VA         apr         12  CPSC
7    olivia    isabella          VA         may         29  MATH
8      brad      harvey          HI         aug         21  BUAD
9    hannah     charlie          VA         aug         28  PSYC
10     will      ronald          VA         may         11  BUAD
11     noor         ani          CA         apr         14  BUAD
12 victoria   elizabeth          VA         jan         11  MATH
13 morgan c      lauren          FL         jun         15  BUAD
14 morgan w   elizabeth          VA         feb         21  ARTS
15   helena      helena          VA         apr         26  BIOL
16    amber amber leigh          VA         dec          6  PSCI
17     ekta        kate          VA         apr         14  ARTH
18 caroline     georgia          DC         jun         20  BUAD
19     anna        abby          VA         sep         21  BUAD
20     nate       julio          VA         sep          5  ECON
21  jessica    jeanette          VA         oct          7  BUAD
22   shaina      skylar          VA         sep          2  BUAD
23     ruth        lucy          VA         jan          4  CPSC
24   sohyun    caroline      Seoul          nov         16  PSYC
25    aaron         don          VA         sep          1  ECON
26     alex        axel          VA         sep          6  BIOL
       cell num.bills num.states
1      none         5         41
2     apple         8         14
3     apple         4         14
4     apple        19         10
5     apple         6         19
6   samsung         1         10
7     apple         3          8
8     apple         1         18
9     apple         2         16
10    apple         5         20
11    apple         3         19
12    apple         5         17
13    apple         3         15
14    apple         4         24
15  android         0         18
16    apple         1         12
17    apple         1         19
18    apple         0         22
19    apple         0         27
20  samsung         4         32
21  samsung         5         11
22    apple         0         15
23    apple         7         30
24    apple        10         10
25 motorola         8         18
26      htc         3         20

I need to find the person most compatible with whoever I pass into my function:

source("compatibility.R")
find.most.compatible<-function(x){
  a<-which(kewl.d00dz$name==x)
  x<-as.list(kewl.d00dz[a,])
  pts<-list()
  namez<-list()
  for (i in 1:nrow(kewl.d00dz)){
    y<-as.list(kewl.d00dz[i,])
    pts[i]<-compatibility(x,y)
    namez[i]<-kewl.d00dz[i,"name"]
    names(pts)<-namez
  }
  n<-length(pts)
  (which(pts == sort(pts,partial=n-1)[n-1]))
}

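I want it to return the second-highest value, because if it returned the highest, each person would simply be most compatible with themselves. But it gives me this error message: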

> find.most.compatible("stephen")
02727312231332325212224261723292219149302611312321
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

This is the function I call from the function above. I don't want to change the following code:

compatibility<-function(x,y){
  #start point bag
  com.points<-0

  #number of bills compatibility points
  com.points<-com.points +(10-abs(as.integer(x[["num.bills"]] - y[["num.bills"]]))) 

  #different number of states compatibility points
  diff.states<-abs(as.integer(x[["num.states"]]-y[["num.states"]]))
  cat(diff.states)
  if(diff.states<5){
    com.points<-com.points+5
  } else if(diff.states<10){
    com.points<-com.points+3
  } else {
    com.points<-com.points
  }
  #birth month compatibility points 
  if(x[["birth.month"]]== "dec"||x[["birth.month"]]== "jan"||x[["birth.month"]]== "feb"){
    season1<-"winter"
  } else if(x[["birth.month"]]== "mar"|| x[["birth.month"]]== "apr" || x[["birth.month"]]== "may"){
    season1<-"spring"
  } else if(x[["birth.month"]]== "jun"||x[["birth.month"]]== "jul"||x[["birth.month"]]== "aug"){
    season1<-"summer"
  } else {
    season1<-"fall"
  }

  if(y[["birth.month"]]== "dec" || y[["birth.month"]]== "jan" || y[["birth.month"]] == "feb"){
    season2<-"winter"
  } else if(y[["birth.month"]]== "mar"||y[["birth.month"]]== "apr"||y[["birth.month"]]== "may"){
    season2<-"spring"
  } else if(y[["birth.month"]]== "jun"||y[["birth.month"]]== "jul"||y[["birth.month"]]== "aug"){
    season2<-"summer"
  } else {
    season2<-"fall"
  }
   if (x[["birth.month"]] == y[["birth.month"]]){
     com.points<-com.points + 3
   } else if(season1==season2){
     com.points<-com.points + 1
   } else {
     com.points<-com.points
   }
  #birth state compatibility points
  if (x[["birth.state"]]==y[["birth.state"]]){
    com.points<-com.points + 1
    } else {
      com.points<-com.points
    }
  #major compatibility points
  if (x[["major"]]==y[["major"]]){
    com.points<-com.points + 4
  } else {
    com.points<-com.points
  }

  #cellular provider compatibility points
  if(x[["cell"]] == y[["cell"]]){
    com.points<-com.points + 2
  } else {
    com.points<-com.points
  }    
return(com.points)
}

Can someone fix my code without using any special functions (such as apply, subset, etc.)?

Only which.max and the like are allowed.

Answer 1

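I haven't tried your full code, but I can tell you need to modify your loop to something like this; otherwise your function will return on the first iteration.

I commented out the names(pts) line because the names can also be set outside your loop, once all the items are in.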

pts <- list() # if you actually want a list. You could also do c() for a vector

for (i in 1:nrow(kewl.d00dz)) {
  y <- as.list(kewl.d00dz[i,])
  pts[i] <- compatibility(x,y)
  # names(pts) <- sprintf(kewl.d00dz[i,"name"],1:length(pts))
}

return(pts)
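
To connect this to the 'x' must be atomic error: sort() rejects a list, so one option is to collect the scores in a plain numeric vector (or call unlist() on the list) before sorting. Below is a minimal sketch of that idea, not a tested drop-in for your data. The toy data frame, toy.score, and the extra people and score arguments are all made up for illustration; in your case they would be kewl.d00dz and compatibility. It only uses which, which.max, sort and unlist.

```r
# Hypothetical toy data standing in for kewl.d00dz (values are made up).
toy <- data.frame(
  name      = c("ann", "bob", "cat", "dan"),
  num.bills = c(5, 8, 4, 19),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for compatibility(): closer num.bills scores higher.
toy.score <- function(x, y) {
  10 - abs(x[["num.bills"]] - y[["num.bills"]])
}

find.most.compatible <- function(x, people, score) {
  a <- which(people$name == x)
  me <- as.list(people[a, ])
  pts <- numeric(nrow(people))   # atomic numeric vector instead of a list
  for (i in 1:nrow(people)) {
    pts[i] <- score(me, as.list(people[i, ]))
  }
  names(pts) <- people$name      # name the scores once, after the loop
  pts[a] <- -Inf                 # rule out matching the person with themselves
  names(pts)[which.max(pts)]     # first name with the highest remaining score
}

best <- find.most.compatible("ann", toy, toy.score)

# Alternative closer to the original: keep the list, but unlist() it
# before sorting so that sort() receives an atomic vector.
pts.list <- list(ann = 10, bob = 7, cat = 9, dan = -4)
v <- unlist(pts.list)
n <- length(v)
second <- sort(v, partial = n - 1)[n - 1]   # second-highest value
second.who <- names(v)[which(v == second)]  # who holds that value
```

Masking the person's own score with -Inf and then calling which.max is probably the safer of the two. The "second highest" approach can return the person themselves if someone else ties their own score. Note that which.max only returns the first maximum, so ties between other people are not reported.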


for-loop

r

list

nested-function