Sample a single value from a list of vectors multiple times

I have the following list of vectors:

list(c(663L, 705L, 680L, 769L, 775L, 327L, 665L, 805L, 808L, 
689L, 774L, 831L, 832L, 217L, 739L, 918L, 354L, 373L, 764L, 691L, 
839L, 372L, 146L, 840L, 727L, 728L, 617L, 647L, 159L, 161L, 581L, 
142L, 618L, 332L, 585L, 134L, 809L, 154L, 158L, 133L, 448L, 736L, 
737L, 815L, 876L, 151L, 750L, 701L, 778L, 861L, 584L, 692L, 427L, 
455L, 601L, 412L, 432L, 449L, 457L, 456L, 620L, 124L, 125L, 679L, 
329L, 667L, 697L, 806L, 807L, 312L, 315L, 733L, 821L, 222L, 583L, 
702L, 631L, 642L, 812L, 850L, 726L, 853L, 129L, 660L, 799L, 410L, 
188L, 798L, 130L, 703L, 341L, 826L, 137L, 253L, 123L, 827L, 844L, 
786L, 655L, 879L, 695L, 749L, 866L, 820L, 890L, 889L, 888L, 694L, 
744L, 746L, 813L, 818L, 868L, 873L, 872L, 869L, 870L, 414L, 738L, 
751L, 208L, 209L, 210L, 899L, 900L, 901L, 903L, 902L, 904L, 913L, 
911L, 912L, 767L, 917L, 777L, 521L, 396L, 397L, 915L, 277L, 529L, 
740L, 509L, 508L, 524L, 224L, 790L, 791L, 698L, 725L, 696L, 817L, 
802L, 897L, 898L, 787L, 788L, 789L, 462L, 356L, 395L, 693L, 745L, 
469L, 519L, 336L, 355L, 792L, 556L, 375L, 398L, 358L, 399L, 720L, 
539L, 558L, 331L, 166L, 167L, 128L, 131L, 214L, 239L, 269L, 276L, 
213L, 337L, 176L, 304L, 503L, 394L, 296L, 298L, 211L, 223L, 238L, 
338L, 487L, 490L, 488L, 489L, 273L, 274L, 892L, 300L, 301L, 816L, 
819L, 275L, 752L, 139L, 206L, 420L, 793L, 215L, 320L, 321L, 676L, 
226L, 699L, 325L, 252L, 319L, 672L, 236L, 306L, 743L, 237L, 439L, 
212L, 675L, 333L, 429L, 476L, 478L, 704L, 768L, 440L, 517L, 518L, 
776L, 810L, 413L, 554L, 555L, 765L, 622L, 626L, 624L, 625L, 231L, 
577L, 335L, 628L, 629L, 511L, 339L, 352L, 353L, 138L, 578L, 349L, 
496L, 611L, 606L, 614L, 612L, 613L, 607L, 609L, 608L, 610L, 328L, 
194L, 195L, 639L, 183L, 632L, 340L, 418L, 308L, 435L, 436L, 437L, 
543L, 905L, 914L, 428L, 374L, 444L, 502L, 825L, 510L, 732L, 557L, 
559L, 730L, 566L, 567L, 506L, 520L, 531L, 534L, 549L, 630L, 174L, 
175L, 140L, 677L, 426L, 377L, 392L, 196L, 186L, 197L, 144L, 141L, 
407L), c(887L, 886L, 884L, 885L), c(528L, 527L, 525L, 526L), 
    c(70L, 71L, 75L, 77L, 72L, 73L, 74L, 76L), c(111L, 109L, 
    110L, 98L, 120L, 112L, 116L, 103L, 106L, 93L, 95L, 94L, 119L, 
    117L, 99L, 118L), c(87L, 88L, 89L, 81L, 82L, 83L, 84L, 85L, 
    86L, 91L, 92L, 949L, 126L, 127L, 90L, 122L), c(530L, 185L, 
    202L, 363L, 729L, 880L, 368L, 401L, 391L, 405L, 906L, 513L, 
    652L, 708L, 552L, 766L, 505L, 382L, 383L, 803L, 565L, 571L, 
    572L, 688L, 460L, 480L, 661L, 153L, 859L, 256L, 268L, 685L, 
    763L, 147L, 865L, 874L, 741L, 754L, 858L, 878L, 220L, 225L, 
    307L, 317L, 313L, 758L, 314L, 848L, 163L, 165L, 387L, 452L, 
    378L, 270L, 271L, 464L, 302L, 280L, 283L, 504L, 712L, 281L, 
    801L), c(595L, 596L, 597L, 908L, 841L, 842L, 493L, 669L, 
    783L, 360L, 507L, 500L, 501L, 823L, 824L, 779L, 891L, 780L, 
    781L, 760L, 379L, 756L, 762L, 857L, 814L, 759L, 854L, 867L, 
    871L, 856L, 855L, 877L, 851L, 852L, 318L, 735L, 811L, 619L, 
    863L, 322L, 326L, 310L, 309L, 323L, 324L, 459L, 700L, 461L, 
    687L, 664L, 668L, 587L, 590L, 562L, 563L, 564L, 574L, 569L, 
    573L, 342L, 547L, 561L, 568L, 575L, 662L, 240L, 316L, 311L, 
    761L, 443L, 445L, 446L, 836L, 755L, 909L, 910L, 830L, 533L, 
    881L, 916L, 716L, 843L, 666L, 690L, 670L, 551L, 173L, 466L, 
    415L, 748L, 718L, 860L, 673L, 747L, 742L, 846L, 875L, 576L, 
    345L, 594L, 604L, 644L, 603L, 602L, 605L, 598L, 441L, 442L, 
    450L, 453L, 616L, 447L, 454L, 419L, 433L, 822L, 431L, 634L, 
    633L, 645L, 586L, 615L, 359L, 421L, 361L, 385L, 386L, 347L, 
    351L, 757L, 834L, 835L, 155L, 481L, 169L, 390L, 170L, 636L, 
    417L, 711L, 160L, 162L, 143L, 156L, 593L, 150L, 657L, 656L, 
    658L, 152L, 648L, 357L, 380L, 434L, 829L, 847L, 580L, 145L, 
    678L, 164L, 430L, 203L, 204L, 198L, 199L, 635L, 637L, 640L, 
    641L, 544L, 179L, 828L, 148L, 254L, 184L, 653L, 650L, 651L, 
    191L, 200L, 201L, 177L, 178L, 181L, 182L, 207L, 495L, 424L, 
    381L, 403L, 282L, 404L, 406L, 710L, 278L, 279L, 494L, 484L, 
    485L, 486L, 425L, 498L, 497L, 334L, 348L, 371L, 463L, 467L, 
    686L, 362L, 402L, 384L, 400L, 230L, 344L, 671L, 684L, 546L, 
    560L, 709L, 479L, 550L, 570L, 388L, 389L, 149L, 190L, 221L, 
    376L), c(1364L, 1373L, 1371L, 1372L, 1148L, 1211L, 1369L, 
    1370L, 1165L, 1377L, 1378L, 1112L, 1140L, 1139L, 1143L, 1019L, 
    1006L, 1247L, 1263L, 1191L, 1208L, 1059L, 1062L, 1115L, 1451L, 
    1448L, 1449L, 1113L, 1144L, 1458L, 1498L, 1499L, 955L, 968L, 
    1093L, 1365L, 1141L, 1265L, 1248L, 1249L, 1040L, 985L, 1119L, 
    1107L, 986L, 1197L, 1317L, 975L, 1155L, 1267L, 1215L, 1266L, 
    1106L, 1111L, 1058L, 1060L, 1457L, 1250L, 1314L, 1234L, 1146L, 
    1315L, 1101L, 1116L, 1310L, 1335L, 1041L, 1114L, 1124L, 954L, 
    1351L, 1358L, 1011L, 1409L, 1049L, 1167L, 1341L, 1278L, 1316L, 
    1392L, 1418L, 1307L, 1342L, 1086L, 1356L, 1432L, 1434L, 1466L, 
    1467L, 1479L, 1501L, 1487L, 1496L, 1495L, 1497L, 1476L, 1505L, 
    1506L, 1508L, 1507L, 1510L, 944L, 950L), c(1069L, 1094L, 
    1200L, 1306L, 981L, 1110L, 1206L, 1308L, 1047L, 1207L, 1312L, 
    1313L, 1109L, 1334L, 1309L, 1332L), c(1237L, 1242L, 1240L, 
    1243L, 1239L, 1238L, 1241L, 1343L, 1181L, 1301L, 1298L, 1300L, 
    1117L, 1133L, 1061L, 1419L, 1416L, 1417L, 1453L, 1311L, 1339L, 
    1333L, 1336L, 1028L, 1079L, 1459L, 1486L, 1192L, 1010L, 1012L, 
    1125L, 1199L, 1142L, 1205L, 1196L, 1198L, 951L, 1137L, 1128L, 
    1435L), c(930L, 942L, 922L, 940L, 941L, 943L, 920L, 921L, 
    923L, 925L, 927L, 928L, 924L, 926L, 931L, 932L, 937L, 938L, 
    939L, 935L, 936L, 929L, 933L, 934L), c(956L, 1051L, 1433L, 
    1468L, 1077L, 973L, 1438L, 1009L, 1158L, 1082L, 1170L, 1195L, 
    1177L, 1212L, 1213L, 1088L, 1153L, 1152L, 1354L, 959L, 1052L, 
    1176L, 1178L, 957L, 1376L, 1374L, 1375L, 1159L, 1223L, 1227L, 
    1268L, 1302L, 1275L, 1285L, 1016L, 1014L, 1126L, 1055L, 1102L, 
    1171L, 1327L, 1183L, 1274L, 1288L, 1296L, 1186L, 1297L, 1426L, 
    1454L, 1515L, 1078L, 989L, 990L, 980L, 1098L, 1150L, 1151L
    ), 78:79, c(1455L, 1475L, 1509L, 1477L, 1478L, 1494L, 1490L, 
    1491L, 1492L, 1427L, 1425L, 1473L, 1471L, 1472L, 1474L, 977L, 
    1179L, 1299L, 1290L, 1292L, 1480L, 1187L, 1295L, 1233L, 1188L, 
    1185L, 1293L, 1184L, 1294L, 1291L, 1175L, 1286L, 1424L, 1469L, 
    1502L, 1503L, 1421L, 1103L, 1488L, 1489L, 1092L, 1452L, 1350L, 
    1046L, 1166L, 1100L, 1305L, 1180L, 1182L, 1190L, 1289L, 979L, 
    961L, 1406L, 1273L, 1303L, 1456L, 1105L, 1331L, 1304L, 1407L, 
    994L, 1022L, 1021L, 1020L, 1025L, 1024L, 1023L, 1026L, 1216L, 
    1163L, 1161L, 1262L, 1156L, 1164L, 1230L, 1228L, 1224L, 80L, 
    953L, 962L, 974L, 992L, 1004L, 1005L, 1017L, 1031L, 1032L, 
    1029L, 1030L, 1057L, 982L, 1003L, 1007L, 1008L, 1042L, 1097L, 
    1089L, 1160L, 963L, 972L, 1070L, 1044L, 1431L, 1194L, 1204L, 
    993L, 1000L, 1001L, 1209L, 1210L, 1470L, 1287L, 1493L, 1075L, 
    1073L, 1074L, 1355L, 1090L, 1154L, 1357L, 1085L, 1087L, 1218L, 
    1504L, 1217L, 1174L, 1269L, 1270L, 1120L, 1272L, 1015L, 1018L, 
    946L, 1145L, 1397L, 971L, 1083L, 1284L, 1045L, 1048L, 1360L, 
    1361L, 1149L, 1282L, 1235L, 1236L, 1172L, 1367L, 1368L, 1345L, 
    964L, 976L, 1189L, 1281L, 1280L, 1279L, 1330L, 1328L, 1329L, 
    1157L, 1271L, 1324L, 1325L, 1081L, 1398L, 1391L, 1393L, 1405L, 
    1420L, 1104L, 1168L, 1201L, 1202L, 1338L, 1340L, 1277L, 1283L, 
    945L, 978L, 1422L, 1054L, 1076L, 960L, 1096L, 1091L, 1080L, 
    1169L, 1276L, 1050L, 1084L, 1035L, 1053L, 1095L, 1173L, 1056L, 
    1099L, 1138L, 997L, 1162L, 958L, 947L, 1344L), c(1222L, 1221L, 
    1219L, 1220L), c(1444L, 1446L, 1447L, 1445L, 1450L, 1132L, 
    1131L, 1130L, 1253L, 1462L, 1129L, 1254L, 965L, 966L, 967L, 
    1463L, 1134L, 1485L, 1483L, 1481L, 1482L, 1513L, 1465L, 1464L, 
    1512L, 1255L, 1258L, 1381L, 1318L, 1257L, 1323L, 1027L, 1251L, 
    1252L, 1214L, 1229L, 1256L, 1225L, 1226L, 1349L, 1352L, 1347L, 
    1348L, 1430L, 1428L, 1429L, 1436L, 1439L, 1440L, 952L, 1399L, 
    1389L, 1410L, 1385L, 1380L, 1401L, 1382L, 1366L, 1404L, 1403L, 
    1402L, 1400L, 1259L, 1415L, 1414L, 1413L, 1411L, 1412L, 1036L, 
    1039L, 1387L, 1386L, 1383L, 1379L, 1396L, 1394L, 1395L), 
    c(1322L, 1321L, 1319L, 1320L), c(998L, 1193L, 1072L, 991L, 
    999L, 1261L, 1326L, 1043L, 1037L, 1038L, 1353L, 1260L, 1390L, 
    1437L, 1346L, 1384L, 1408L, 1127L, 1423L, 1147L, 1135L, 1514L
    ), c(579L, 643L, 189L, 192L, 599L, 600L, 591L, 423L, 458L, 
    422L, 654L, 365L, 772L, 833L, 771L, 770L, 837L, 838L, 227L, 
    416L, 706L, 773L, 849L, 542L, 621L, 364L, 845L, 919L, 346L, 
    707L, 659L, 135L, 721L), c(305L, 255L, 795L, 800L, 719L, 
    734L, 794L, 1108L, 1136L, 1118L, 1071L, 1264L, 1203L, 1337L, 
    108L, 1232L, 1362L), c(674L, 796L, 864L, 235L, 724L, 408L, 
    731L, 723L, 722L, 548L, 168L, 797L, 132L, 205L, 649L, 180L, 
    582L, 330L, 157L, 465L, 499L, 536L, 516L, 883L), c(491L, 
    411L, 171L, 172L, 216L, 681L, 682L, 343L, 862L, 896L, 538L, 
    882L, 907L, 468L, 474L, 473L, 472L, 471L, 470L, 475L, 244L, 
    243L, 242L, 257L, 260L, 263L, 262L, 261L, 259L, 258L, 266L, 
    265L, 264L, 267L, 229L, 483L, 893L, 245L, 241L, 299L, 409L, 
    136L, 638L, 588L, 589L, 234L, 232L, 293L, 294L, 251L, 250L, 
    247L, 246L, 286L, 287L, 292L, 291L, 290L, 272L, 233L, 248L, 
    249L, 297L, 303L, 785L, 717L, 894L, 895L, 366L, 367L, 477L, 
    532L, 350L, 370L), c(289L, 288L, 284L, 285L), c(96L, 101L, 
    104L, 107L, 105L, 114L, 121L, 102L, 113L, 115L, 97L, 100L
    ), c(948L, 970L, 1033L, 969L, 996L, 987L, 988L, 995L, 1002L, 
    1034L, 1067L, 1068L, 1013L, 983L, 984L, 1460L, 1442L, 1500L, 
    1484L, 1246L, 1511L, 1461L, 1123L, 1443L, 1388L, 1063L, 1363L, 
    1064L, 1122L, 1359L, 1121L, 1231L, 1244L, 1245L, 1066L, 1065L, 
    1441L), c(295L, 438L, 753L, 782L, 219L, 228L, 714L, 369L, 
    553L, 393L, 713L, 683L, 784L, 492L, 715L, 482L, 541L, 592L, 
    451L, 627L, 187L, 193L, 804L, 623L, 646L, 514L, 515L, 522L, 
    512L, 523L, 545L, 218L, 535L, 537L, 540L), 16L, c(15L, 18L
    ), 1L, c(7L, 9L), 6L, 14L, 4L, 5L, 3L, 11L, 17L, 8L, 10L)

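On each iteration I want to sample one value from every list entry, to build a large matrix of samples. That means I will have 40 columns (the number of groups) and 5000 rows (the number of draws).

I tried the following: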

# groups - is the list
# repetition - is 5000
as.matrix(sapply(groups, sample, repetition, TRUE))

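This seems to work for small lists, but when I try it on the large list I get elements from other groups that should not be there:

Example using the code above: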
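A minimal reproduction of the problem, using a small made-up list rather than the data above (the names `small_groups` and `m` are illustrative only). The second entry is the single value 10, so `sample` treats it as "draw from 1:10" instead of always returning 10:

```r
# Hypothetical stand-in for the real data: one proper vector and one length-1 entry
small_groups <- list(c(101L, 102L, 103L), 10L)

set.seed(1)
m <- as.matrix(sapply(small_groups, sample, 1000, TRUE))

dim(m)                 # 1000 rows (draws) by 2 columns (groups)
range(m[, 1])          # stays within 101..103
sort(unique(m[, 2]))   # values drawn from 1:10, not just 10
```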

When you have a vector of length 1, sampling takes place from 1:x. From ?sample:

If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x

So when you do

set.seed(123)
sample(10, 1)
#[1] 3

it is selecting 1 number from 1 to 10. To avoid this, you can check the length of the vector inside sapply:

sapply(groups, function(x) if(length(x) == 1) rep(x, repetition) 
                           else sample(x, repetition, replace = TRUE))

So when the vector has length 1, its single value is simply returned repetition times.
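One way to convince yourself that the guarded version behaves as intended is to check that every column only contains values from its own group. This is a sketch on a small made-up list; `small_groups`, `sample_group` and `ok` are hypothetical names, not from the question:

```r
# Hypothetical test data: two proper vectors and one length-1 entry
small_groups <- list(c(101L, 102L, 103L), 10L, 7:9)
repetition <- 1000

# Same length guard as above, wrapped in a named helper
sample_group <- function(x) {
  if (length(x) == 1) rep(x, repetition) else sample(x, repetition, replace = TRUE)
}

set.seed(1)
m <- sapply(small_groups, sample_group)

# For each column i, are all sampled values members of group i?
ok <- vapply(seq_along(small_groups),
             function(i) all(m[, i] %in% small_groups[[i]]),
             logical(1))
ok
```

The same check can be run against the real `groups` list with `repetition <- 5000`.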

We can wrap the single values in sub-lists with list to avoid the 1:x "convenience". Example:

groups <- list(2, 9, 2:9, 22:99)

groups[lengths(groups) == 1] <- lapply(groups[lengths(groups) == 1], list)
str(groups)
# List of 4
# $ :List of 1
#  ..$ : num 2
# $ :List of 1
#  ..$ : num 9
# $ : int [1:8] 2 3 4 5 6 7 8 9
# $ : int [1:78] 22 23 24 25 26 27 28 29 30 31 ...

repetition <- 10
set.seed(42)
r <- t(replicate(repetition, sapply(groups, sample, 1, replace=TRUE)))
r
#     [,1] [,2] [,3] [,4]
#  [1,] 2    9    2    46  
#  [2,] 2    9    3    70  
#  [3,] 2    9    9    92  
#  [4,] 2    9    6    41  
#  [5,] 2    9    8    24  
#  [6,] 2    9    4    57  
#  [7,] 2    9    6    26  
#  [8,] 2    9    5    24  
#  [9,] 2    9    3    45  
# [10,] 2    9    8    43  

Note that the length-1 sub-lists are sampled as lists; sapply uses simplify2array internally, which strips that extra list level again (i.e. it unlists them).
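Because the wrapped entries are sampled as lists, the left-aligned printout above suggests that `r` ends up as a list-mode matrix rather than a plain numeric one. If a numeric matrix is needed, one possible follow-up (a sketch, not part of the original answer; `r_num` is a hypothetical name) is to flatten it while keeping the dimensions:

```r
groups <- list(2, 9, 2:9, 22:99)
groups[lengths(groups) == 1] <- lapply(groups[lengths(groups) == 1], list)

repetition <- 10
set.seed(42)
r <- t(replicate(repetition, sapply(groups, sample, 1, replace = TRUE)))

# unlist() walks the matrix column by column, so refilling a matrix
# with the same number of rows preserves the layout
r_num <- matrix(unlist(r), nrow = nrow(r))
str(r_num)
```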

The manual page of sample gives a solution for the case where 'x' has length 1 and is numeric. It is in the Examples section:

resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(42)
repetition <- 5
as.matrix(sapply(groups, resample, repetition, TRUE))
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40]
#[1,]  778  886  525   77  106   90  153  310 1059  1110  1243   937  1051    79  1489  1220  1394  1320  1408   706  1232   180   907   288    96  1231   482    16    18     1     9     6    14     4     5     3    11    17     8    10
#[2,]  802  887  526   71  106  126  878  857 1250  1094  1205   936   989    78  1478  1222  1253  1321  1127   838  1362   723   216   284    97  1442   545    16    18     1     7     6    14     4     5     3    11    17     8    10
#[3,]  222  885  528   71   95   82  202  145 1370  1109  1196   938  1433    79  1424  1221  1214  1321   999   845  1264   516   350   288   113   983   537    16    15     1     7     6    14     4     5     3    11    17     8    10
#[4,]  237  884  528   74   98   81  280  309 1365  1313  1028   943   980    79  1277  1222  1463  1321  1437   772  1136   408   287   288   113   987   523    16    18     1     9     6    14     4     5     3    11    17     8    10
#[5,]  224  885  527   75  120   88  763  143 1114   981  1336   943  1052    79  1044  1222  1036  1320  1043   227  1071   674   473   285   104  1441   218    16    18     1     9     6    14     4     5     3    11    17     8    10

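Here sample.int takes the number of items to choose from, whereas sample takes either a vector of one or more elements from which to choose, or a positive integer.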
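The difference is easy to see on a single length-1 vector. This small illustration reuses the `resample` helper from above; the value 10 and the vector c(3, 8) are arbitrary:

```r
resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(42)
sample(10, 5, TRUE)    # draws from 1:10, so values other than 10 can appear
resample(10, 5, TRUE)  # indexes into the vector itself, so always 10

# With more than one element, resample still draws only from the vector's own values
all(resample(c(3, 8), 20, TRUE) %in% c(3, 8))
```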