Monty Hall in R - setdiff unexpected results

Question

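I am writing a program in R to run some simulations of the Monty Hall problem, which is explained here: https://www.youtube.com/watch?v=4Lb-6rxZxx0

Consider this code. sample(setdiff(doors, c(pick, car)), 1) "should" return 3 every time, but it does not.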

doors <- 1:3
pick <- 2
car <- 1
sample(setdiff(doors, c(pick, car)),1)
[1] 3
sample(setdiff(doors, c(pick, car)),1)
[1] 1

Any idea where I am going wrong?

Thanks.

Answer 1

Your problem is that you end up calling sample.int, because

# values from the question, so the result below is reproducible
doors <- 1:3
pick <- 2
car <- 1
class(setdiff(doors, c(pick, car)))
#R [1] "integer"

and

length(setdiff(doors, c(pick, car)))
#R [1] 1

See help("sample.int") or

body(sample)
#R {
#R     if (length(x) == 1L && is.numeric(x) && is.finite(x) && x >=
#R         1) {
#R         if (missing(size))
#R             size <- x
#R         sample.int(x, size, replace, prob)
#R     }
#R     else {
#R         ...

Sampling does not make sense unless there is more than one element in your set.
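To make the length-one behaviour concrete, here is a minimal sketch using the values from the question. The helper name `resample` is borrowed from the examples in ?sample and is not part of base R; the variable names `remaining`, `naive` and `safe` are purely illustrative.

```r
# Helper modelled on the example in ?sample (not a base R function):
# it always indexes into x, so a length-one x is returned unchanged.
resample <- function(x, ...) x[sample.int(length(x), ...)]

doors <- 1:3
pick <- 2
car <- 1
remaining <- setdiff(doors, c(pick, car))  # integer vector of length 1, value 3

# sample() sees the single number 3 and draws from 1:3 instead
naive <- replicate(1000, sample(remaining, 1))

# resample() draws from the vector itself, so every draw is 3
safe <- replicate(1000, resample(remaining, 1))

sort(unique(naive))  # may contain 1, 2 and 3
unique(safe)         # always 3
```

Whether to wrap sample in a helper like this or to guard the call with a length check is a matter of taste; both avoid the single-number branch shown in body(sample).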

Answer 2

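Here is the final code I wrote to solve the problem. I used an if statement so that sample is called only when necessary, that is, when the contestant has picked the door with the car. I think the overhead of the extra conditional is worth it compared with forcing sample to work in a way it was not intended to.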

  doors <- 1:3
  trials <- 1000

  games <- do.call(rbind, lapply(1:trials, function(i){
    pick <- sample(doors, 1)
    car <- sample(doors, 1)
    #open the door the contestant didn't pick and isn't the car
    open_door <- setdiff(doors, c(pick, car))
    #if pick and car are the same, there are two possible doors to open
    #so pick one at random
    #note: sample(x, 1) with a single number x >= 1 draws from 1:x, not from x itself (see ?sample).
    #this is the reason for the if statement: only call sample when more than
    #one door is left to choose from
    if(length(open_door)>1) open_door <- sample(open_door, 1)
    #switch to the door that isn't picked and is closed
    switch_to <- setdiff(doors, c(pick, open_door)) 

    data.frame(pick, car, open_door, switch_to)
  }))

  games$switch_wins <- ifelse(games$switch_to == games$car, 1, 0)
  games$stay_wins <- ifelse(games$pick == games$car, 1, 0)

  cat("Switch wins: ", sum(games$switch_wins)/nrow(games), "Stay wins: ", 
      sum(games$stay_wins)/nrow(games), "\n")

Output:

Switch wins:  0.672 Stay wins:  0.328
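As a cross-check on these proportions, the following sketch relies on one assumption worth stating: if the host always opens a door that is neither the pick nor the car, switching wins exactly when the first pick is not the car. Under that assumption the opened door does not need to be simulated at all, so the single-element sample problem never arises. The seed and the vectorized layout are illustrative and are not taken from the code above.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
trials <- 1000
doors <- 1:3

# doors has three elements, so sample() draws from the vector itself here
pick <- sample(doors, trials, replace = TRUE)
car  <- sample(doors, trials, replace = TRUE)

# staying wins when the first pick is the car; switching wins otherwise
stay_wins   <- mean(pick == car)
switch_wins <- mean(pick != car)

cat("Switch wins: ", switch_wins, "Stay wins: ", stay_wins, "\n")
```

The two proportions should land near 2/3 and 1/3, in line with the output above.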

