R: re-run `sample()` for each new row

library(tidyverse)
fruit %>% 
  as_tibble() %>%
  transmute(fruit = value, fruit.abr = substring(value, 1, sample(3:6, 1)))

#> # A tibble: 80 x 2
#>    fruit        fruit.abr
#>    <chr>        <chr>    
#>  1 apple        app      
#>  2 apricot      apr      
#>  3 avocado      avo      
#>  4 banana       ban      
#>  5 bell pepper  bel      
#>  6 bilberry     bil      
#>  7 blackberry   bla      
#>  8 blackcurrant bla      
#>  9 blood orange blo      
#> 10 blueberry    blu      
#> # ... with 70 more rows

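I want my abbreviated fruit column to hold strings of random length between 3 and 6 characters. Each row should get its own length (somewhere from 3 to 6).

As written, my code draws one value between 3 and 6 and then uses it for every row. How can I "recycle" or "loop" this `sample()` call so that it picks a new value for each row (e.g. 3, 6, 4, 3, 5, and so on)?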

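Try this, which may be closer to what you want. Draw a random length between 3 and 6, then use `sample()` to pick that many characters from the original word in random order. Note that this shuffles the characters instead of keeping the start of the word. The code:

# Data
df <- data.frame(fruit = c('apple', 'orange'), stringsAsFactors = FALSE)
# Build a shuffled ID from 3 to 6 characters of one word
myfunc <- function(x)
{
  # Split the word into single characters
  y <- unlist(strsplit(x, split = ''))
  # Random length from 3 to 6, capped at the word length so sample() cannot fail
  index <- min(sample(3:6, 1), length(y))
  # Draw that many characters without replacement and paste them together
  var <- paste0(sample(y, index), collapse = '')
  # Return
  return(var)
}
# Apply the function to each fruit, one call per element
df$ID <- vapply(df$fruit, myfunc, character(1), USE.NAMES = FALSE)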

Output (random, so yours will differ):

   fruit   ID
1  apple eppa
2 orange egnr

Add `rowwise()`:

fruit %>% 
     as_tibble() %>% 
     rowwise() %>% 
     transmute(fruit = value, fruit.abr = substring(value, 1, sample(3:6, 1)))

# A tibble: 80 x 2
# Rowwise: 
   fruit        fruit.abr
   <chr>        <chr>    
 1 apple        apple    
 2 apricot      apri     
 3 avocado      avocad   
 4 banana       bana     
 5 bell pepper  bell     
 6 bilberry     bil      
 7 blackberry   black    
 8 blackcurrant bla      
 9 blood orange blo      
10 blueberry    blu      
# ... with 70 more rows
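
What `rowwise()` changes is that the expression is evaluated once per row, so `sample(3:6, 1)` runs anew each time. A rough base R analogue, offered only as a sketch: the `words` vector and the helper name `abbreviate_random` are made up for illustration and are not part of the answer above.

```r
# Hypothetical stand-in for the fruit column.
words <- c("apricot", "avocado", "banana", "bilberry")

# Hypothetical helper: draws a fresh length from 3 to 6 on every call
# and keeps that many leading characters.
abbreviate_random <- function(word) {
  substring(word, 1, sample(3:6, 1))
}

# vapply() calls the helper once per element, much as rowwise() does per row.
abbr <- vapply(words, abbreviate_random, character(1), USE.NAMES = FALSE)
print(abbr)
```

Because the helper is called separately for every word, this is easy to read but slower on large inputs than a single vectorised draw.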

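`sample(3:6, 1)` returns a single value, which is then recycled across all rows. Instead, draw a sample as large as the number of rows in one call. Remember to set `replace = TRUE` so that you sample with replacement: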

fruit %>% 
  as_tibble() %>%
  transmute(fruit = value, fruit.abr = substring(value, 1, sample(3:6, n(), TRUE)))

# # A tibble: 10 x 2
#    fruit        fruit.abr
#    <chr>        <chr>    
#  1 apple        "app"    
#  2 apricot      "apr"    
#  3 avocado      "avoca"  
#  4 banana       "banana" 
#  5 bell pepper  "bell "  
#  6 bilberry     "bilbe"  
#  7 blackberry   "blac"   
#  8 blackcurrant "blac"   
#  9 blood orange "blo"    
# 10 blueberry    "blu"
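
To see the recycling side by side with the per-row draw, here is a small base R sketch. The `words` vector and the variable names are assumptions chosen for illustration, and the seed is arbitrary.

```r
# Hypothetical stand-in for the fruit column (every word has at least 6 characters).
words <- c("apricot", "avocado", "banana", "bell pepper", "bilberry")

set.seed(1)  # arbitrary seed, only so that reruns give the same draws

# One draw: a length-1 vector that substring() recycles across all words.
one_len <- sample(3:6, 1)
same_len <- substring(words, 1, one_len)

# One draw per word: a vector of lengths matched to the words element-wise.
many_len <- sample(3:6, length(words), replace = TRUE)
varied_len <- substring(words, 1, many_len)

print(data.frame(words, same_len, varied_len))
```

Since `substring()` is vectorised over its `last` argument, the second form needs no loop and no `rowwise()`.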

Data

fruit <- structure(list(value = c("apple", "apricot", "avocado", "banana", 
"bell pepper", "bilberry", "blackberry", "blackcurrant", "blood orange", 
"blueberry")), class = "data.frame", row.names = c(NA, -10L))