Make triple loop faster

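I am currently working with a data set that has an identification column "ZA.Betriebsnr.".

I want to replace this 10-digit identification number with another identifier (AUI.ID) that takes unique values from 1 to 600. For this I have a second data set that gives the corresponding "AUI.ID" for each "ZA.Betriebsnr." (ZA.Betriebsnr. and AUI.Betriebsnummer are equivalent). The Feldkalender data has 70000 rows and IDConverter has 600, so different "ZA.Betriebsnr." values may map to the same "AUI.ID".

My code: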

Read-in data and loading of libraries for the years 2009 to 2018

d9<-read.csv2("Feldkalender_2009.csv",header=T,stringsAsFactors = F)
d10<-read.csv2("Feldkalender_2010.csv",header=T,stringsAsFactors = F)
d11<-read.csv2("Feldkalender_2011.csv",header=T,stringsAsFactors = F)
d12<-read.csv2("Feldkalender_2012.csv",header=T,stringsAsFactors = F)
d13<-read.csv2("Feldkalender_2013.csv",header=T,stringsAsFactors = F)
d14<-read.csv2("Feldkalender_2014.csv",header=T,stringsAsFactors = F)
d15<-read.csv2("Feldkalender_2015.csv",header=T,stringsAsFactors = F)
d16<-read.csv2("Feldkalender_2016.csv",header=T,stringsAsFactors = F)
d17<-read.csv2("Feldkalender_2017.csv",header=T,stringsAsFactors = F)
d18<-read.csv2("Feldkalender_2018.csv",header=T,stringsAsFactors = F)

ds<-list(d9,d10,d11,d12,d13,d14,d15,d16,d17,d18)

Create a vector for the years 2009 to 2018

years<-c(2009,2010,2011,2012,2013,2014,2015,2016,2017,2018)

Load the file with the AUI.ID corresponding to each AUI.Betriebsnr.
IDConverter<-read.csv2("Zuweisung_BetrNr_AUI_ID.csv",header=T,stringsAsFactors = F)

Replace the ZA.Betriebsnr. identification variable with the unique AUI.ID identification variable

options("scipen"=100, "digits"=4)

for ( i in 1:length(years)){
  for (j in 1:length(ds[[i]]$ZA.Betriebsnr.)){
  for(k in 1:length(IDConverter$AUI.Betriebsnummer)){
  o3<-IDConverter
  o<-ds[[i]]
  o$ZA.Betriebsnr.[o$ZA.Betriebsnr.[j] == o3$AUI.Betriebsnummer[k]]<-o3$AUI.ID[k]
    }
    ds[[i]]$ZA.Betriebsnr.[j]<-o$ZA.Betriebsnr.[j]
  }
}

I wrote a triple loop, which works, but it takes a very long time to finish. Does anyone know how to make it run faster?

Data (ds[[1]] and IDConverter):

ds[[1]]:

 [1] 10100001151 10100001248 10100001252 10100001392 10100001445 10100001633
 [7] 10100048900 10100091095 10100091571 10200002050 10200002133 10200002162
 [13] 10200002248 10200002260 10200002280 10200002299 10200002300 10200002304
 [19] 10200002324 10200002328 10200002380 10200002509 10200002517 10200002518
 [25] 10200002549 10200002550 10200002553 10200002900 10200024541 10200049529
 [31] 10200049709 10200050203 10200050694 10200050716 10200123451 10300003001
 [37] 10300003195 10300003333 10300003379 10300003469 10300003546 10300003757
 [43] 10300003770 10300043107 10300043152 10300052076 10300053285 10300054306
 [49] 10300056544 10300058629 10400004248 10400004709 10400004894 10400004913
 [55] 10400004922 10400047954 10500005013 10500005027 10500005032 10500005069
 [61] 10500005105 10500005153 10500005160 10500006021 10500006030 10500006047
 [67] 10500006084 10500006098 10500007001 10500007027 10500007030 10500073183
 [73] 11000010059 11000010067 11000010071 11000010108 11000010142 11000010185
 [79] 11000010358 11000010367 11000010405 11000010413 11000010507 11000010537
 [85] 11000010542 11000010679 11000010694 11000010704 11000010721 11000010768
 [91] 11000010779 11000010789 11100011215 11100011293 11100011346 11100011354
 [97] 11100011356 11100011364 11100011517 11100011592 11100011643 11200012286
[103] 11200012567 11200012701 11200012738 11200012784 11200012793 11200012868

IDConverter:

structure(list(AUI.ID = c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 
8L, 9L, 10L, 11L, 12L, 12L, 13L, 14L, 14L, 15L, 16L, 17L, 18L, 
19L, 20L, 21L, 21L, 22L, 22L, 23L, 24L, 24L, 25L, 26L, 26L, 27L, 
28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 37L, 37L, 39L, 40L, 41L, 
42L, 43L, 45L, 46L, 47L, 48L, 49L, 50L, 50L, 51L, 52L, 53L, 54L, 
55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 
67L, 68L, 69L, 70L, 71L, 72L, 73L, 73L, 74L, 75L, 76L, 76L, 77L, 
78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 89L, 
90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 100L, 101L, 
102L, 103L, 104L, 105L, 106L, 107L, 108L, 110L, 111L, 112L, 113L, 
114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 124L, 
124L, 125L, 126L, 127L, 128L, 128L, 129L, 130L, 133L, 133L, 134L, 
134L, 134L, 135L, 135L, 136L, 136L, 137L, 137L, 138L, 138L, 139L, 
139L, 140L, 140L, 141L, 141L, 142L, 142L, 143L, 143L, 144L, 144L, 
145L, 146L, 146L, 147L, 147L, 148L, 149L, 149L, 150L, 151L, 151L, 
153L, 153L, 154L, 154L, 154L, 155L, 155L, 156L, 156L, 157L, 158L, 
159L, 159L, 160L, 161L, 161L, 162L, 162L, 163L, 164L, 164L, 165L, 
166L, 167L, 168L, 169L, 169L, 170L, 170L, 170L, 171L, 172L, 172L, 
173L, 174L, 174L, 175L, 175L, 176L, 176L, 177L, 178L, 179L, 180L, 
180L, 181L, 181L, 182L, 182L, 182L, 184L, 185L, 186L, 187L, 187L, 
187L, 188L, 189L, 189L, 190L, 191L, 192L, 193L, 195L, 196L, 196L, 
196L, 197L, 198L, 199L, 200L, 201L, 202L, 203L, 204L, 204L, 205L, 
205L, 205L, 206L, 207L, 208L, 208L, 209L, 210L, 213L, 213L, 214L, 
214L, 215L, 215L, 216L, 217L, 218L, 219L, 220L, 221L, 221L, 222L, 
222L, 223L, 224L, 226L, 227L, 228L, 230L, 231L, 232L, 232L, 233L, 
234L, 235L, 236L, 237L, 238L, 238L, 240L, 241L, 242L, 243L, 244L, 
245L, 245L, 246L, 247L, 248L, 249L, 251L, 252L, 253L, 254L, 255L, 
255L, 256L, 257L, 258L, 259L, 260L, 261L, 261L, 262L, 264L, 265L, 
266L, 267L, 268L, 268L, 269L, 270L, 271L, 271L, 272L, 272L, 273L, 
273L, 276L, 277L, 278L, 279L, 280L, 281L, 282L, 283L, 284L, 285L, 
285L, 286L, 287L, 288L, 289L, 290L, 290L, 291L, 292L, 293L, 293L, 
294L, 294L, 295L, 296L, 296L, 297L, 298L, 299L, 300L, 301L, 301L, 
302L, 303L, 304L, 304L, 305L, 306L, 306L, 307L, 308L, 308L, 309L, 
309L, 310L, 311L, 311L, 312L, 313L, 314L, 315L, 316L, 317L, 318L, 
319L, 320L, 321L, 322L, 323L, 323L, 324L, 325L, 326L, 327L, 328L, 
329L, 330L, 331L, 331L, 332L, 333L, 333L, 334L, 335L, 336L, 337L, 
337L, 338L, 339L, 339L, 340L, 341L, 341L, 342L, 342L, 342L, 343L, 
343L, 344L, 344L, 345L, 346L, 346L, 347L, 347L, 348L, 349L, 349L, 
349L, 350L, 351L, 351L, 352L, 352L, 353L, 354L, 354L, 355L, 356L, 
356L, 357L, 358L, 358L, 359L, 359L, 360L, 361L, 362L, 362L, 363L, 
364L, 364L, 364L, 365L, 366L, 367L, 368L, 369L, 370L, 371L, 372L, 
373L, 374L, 375L, 376L, 377L, 378L, 379L, 381L, 384L, 385L, 386L, 
387L, 388L, 389L, 392L, 393L, 394L, 395L, 396L, 397L, 402L, 402L, 
403L, 403L, 406L, 407L, 409L, 409L, 409L, 410L, 411L, 411L, 412L, 
413L, 414L, 415L, 415L, 416L, 417L, 418L, 420L, 421L, 422L, 423L, 
423L, 425L, 425L, 426L, 428L, 429L, 430L, 433L, 433L, 434L, 435L, 
436L, 437L, 438L, 439L, 440L, 441L, 442L, 443L, 444L, 445L, 445L, 
446L, 446L, 447L, 448L, 449L, 450L, 451L, 452L, 453L, 454L, 455L, 
456L, 457L, 457L, 458L, 459L, 460L, 461L, 463L, 464L, 465L, 466L, 
467L, 468L, 469L, 470L, 471L, 472L, 473L, 474L, 475L, 476L, 477L, 
478L, 479L, 480L, 481L, 482L, 483L, 484L, 485L, 486L, 487L, 488L, 
489L, 490L, 491L, 492L, 493L, 494L, 495L, 496L, 497L, 498L, 499L, 
500L, 501L, 502L, 503L, 504L, 505L, 506L, 507L, 508L, 509L, 510L, 
511L, 512L, 513L, 514L, 515L, 516L, 517L, 518L, 519L, 520L, 521L, 
522L, 523L, 524L, 525L, 526L, 527L, 528L, 529L, 530L, 531L, 532L, 
533L, 534L, 535L, 536L, 537L, 538L, 539L, 540L, 541L, 542L, 543L, 
544L, 545L, 546L, 547L, 548L, 549L, 550L), AUI.Betriebsnummer = c(10100001151, 
10100001248, 
10100001252, 10100001392, 10100001445, 50023000140, 10100001595, 
50023000141, 10100001633, 10100048900, 10100091095, 10100091571, 
10200002050, 10200002133, 10200052063, 10200002162, 10200002248, 
10200002560, 10200002260, 10200002280, 10200002299, 10200002300, 
10200002304, 10200002312, 10200002324, 10200002583, 10200002328, 
10200023281, 10200002380, 10200024541, 10200002454, 10200002461, 
10200123451, 10200002490, 10200002491, 10200002509, 10200002517, 
10200002518, 10200002541, 10200002545, 10200002549, 10200002550, 
10200002553, 10200002900, 10200092900, 10200049529, 10200049709, 
10200050203, 10200050694, 10200050716, 10300003001, 10300003195, 
10300003333, 10300003379, 10300003469, 10300003532, 10300067247, 
10300003546, 10300003757, 10300003770, 10300043107, 10300043152, 
10300052076, 10300053285, 10300054306, 10300056436, 10300056544, 
10300058629, 10400004248, 10400004709, 10400004894, 10400004913, 
10400004922, 12100054023, 10400004940, 10400047954, 10500005013, 
10500005027, 10500005032, 10500005069, 10500005105, 10500005106, 
10500005153, 10500005160, 10500006021, 40600003014, 10500006030, 
10500006047, 10500006084, 10500006098, 10500007001, 10500007027, 
10500007030, 10500073183, 11000010059, 11000010067, 11000010071, 
11000010107, 11000010108, 11000010796, 11000010142, 11000010185, 
11000010358, 11000010367, 11000010405, 11000010413, 11000010507, 
11000010537, 11000010542, 11000010679, 11000010694, 11000010704, 
11000010721, 11000010732, 11000010746, 11000010768, 11000010779, 
11000010789, 11000010792, 11100011215, 11100011293, 11100011346, 
11100011354, 11100011356, 11100011364, 11100011421, 11100011517, 
11100011592, 11100011643, 11100050411, 11100061583, 11200012286, 
11200012567, 11200012701, 40500012701, 11200012738, 11200012784, 
11200012793, 11200012868, 40500012868, 11200012878, 11200042615, 
11300013064, 11300001473, 11300013142, 11300013391, 11300004307, 
11300013153, 11300001142, 11300013160, 11300001355, 11300013173, 
11300001252, 11300013175, 11300001692, 11300013207, 11300001524, 
11300013253, 11300001691, 11300013275, 11300003062, 11300013297, 
11300001527, 11300013316, 11300001038, 11300013321, 11300002180, 
11300013334, 11300013344, 11300002012, 11300013358, 11300001424, 
11300013359, 11300113368, 11300013368, 11300013373, 11300013419, 
11300002032, 11500015038, 40500015038, 11500015053, 11600015053, 
11200015053, 11500015070, 40500015070, 11500015086, 40500015086, 
11500015161, 11600012207, 11600015183, 40500015183, 11600016019, 
11600016369, 40500016369, 11600016479, 40500016479, 11600016501, 
11600016505, 40500016505, 11600016511, 11600046326, 11600046631, 
11700017220, 11700017231, 11700017234, 12100012140, 10500006130, 
40600003031, 12100020125, 12100020129, 33800020129, 12100020234, 
12100202371, 12100020237, 12100020244, 40600003045, 12100020257, 
10500006119, 12100020285, 12100020302, 12100020309, 12100203641, 
12100020364, 12100076053, 12100020423, 12100044092, 10500006134, 
40600003004, 12100056058, 12100065060, 12100066090, 12100068039, 
10500006121, 40600003022, 12100068051, 12100068082, 12100021029, 
12100068186, 12100069117, 12100073027, 12100076004, 12100077055, 
12100078018, 10500006116, 40600002026, 12100078031, 12100078036, 
12100079017, 12100082016, 12100083070, 12100086025, 12100086206, 
12100087022, 10500006122, 12100087028, 33800087028, 33800040081, 
12100087039, 12100087054, 12100090030, 11000010857, 12100091003, 
12100095028, 20101001048, 20101002536, 20101011068, 20101522505, 
20101018528, 20101018556, 20101021506, 20101023523, 20101024001, 
20101025504, 20101025510, 20101045022, 20101025511, 20101039043, 
20101026536, 20101030020, 20101036505, 20101040508, 20101044050, 
20101045012, 20101045514, 20101047508, 20101049005, 20101049523, 
20101056043, 20101057501, 20101082014, 20101089004, 20101089046, 
20101089056, 20101089521, 20101104520, 20101105017, 20101105026, 
20101105502, 20101112004, 20101122031, 20101112523, 20101123002, 
20101125018, 20101501505, 20101510507, 20111000129, 20111003006, 
20111012905, 20300003110, 20300003139, 20300003928, 20300003196, 
20300003203, 20300003277, 20300003309, 20300003334, 20300003750, 
50023000133, 20300003850, 20300003930, 20400014491, 20400014572, 
20400014691, 20414603014, 20414611012, 20414605011, 20414606022, 
20414607001, 20414615005, 20414607009, 20414610012, 20414609013, 
20414611011, 20414681008, 20414689013, 20414693224, 20414693950, 
20414699906, 20500005113, 20500005407, 20600006010, 20600006688, 
20600006694, 20600006093, 20700007184, 20700007423, 20700007561, 
20700007592, 20700007630, 20600006342, 20700007739, 20700007925, 
22400024056, 20400024056, 22400024386, 22424615008, 22400024808, 
22400024813, 20400024813, 22400024819, 22400024873, 22400024892, 
22400024895, 22424605004, 20424605004, 22424605007, 22424606007, 
22424607001, 22424614003, 22424607006, 22424607009, 22424613017, 
22424607010, 22424683003, 20424683003, 20424690006, 22424690006, 
22424690008, 22424691007, 20424691007, 22424693915, 22424693916, 
22500025602, 22500025922, 22600026611, 33800010595, 33800038215, 
33800038245, 33800038293, 33800038297, 33800038650, 33800038710, 
40400038710, 33800038839, 33800038841, 33800038845, 33800038850, 
33800038869, 33800038879, 33800039342, 33800039455, 40400039455, 
33800039484, 33800039697, 33800038967, 33800039741, 33800039814, 
33800042204, 34000040001, 34000010001, 34000040007, 34000040014, 
34200040014, 34000040029, 34000040036, 34000010036, 34000040064, 
34000010064, 34000011077, 34000010071, 34000040071, 34000040142, 
34000010142, 34000040148, 34000040236, 34000010236, 34000010276, 
34000040276, 34000040330, 34000040349, 34200040349, 34000010349, 
34000040350, 34000040745, 34000010745, 34000040750, 34000010750, 
34200042003, 34200042192, 34000020533, 34200042237, 34200042282, 
34200020282, 34200042308, 34200042683, 34200020683, 34200042706, 
34200020706, 34200042744, 40100001069, 40100001129, 40700001129, 
40100001198, 40200002033, 40200033333, 40200064112, 40200014010, 
40200064073, 40300050012, 40300050040, 40300050050, 40300050060, 
40300050130, 40300050280, 40300050300, 40300050340, 40300050450, 
40300050465, 40300059999, 20101105510, 11100011432, 12100020374, 
22424610001, 40200014020, 40200014041, 40200014060, 40200014143, 
40200015003, 10200049005, 10300003015, 10300043087, 10300055348, 
10300056171, 10300058510, 11100021278, 11100051278, 11300013392, 
11300004444, 20400014298, 20414609026, 34000040174, 34200040174, 
34000010174, 34000040188, 40100001161, 40739621104, 40200064231, 
40300050640, 10200002336, 10200002487, 10200002601, 10200002493, 
10200002505, 10200002579, 10200049289, 10200050612, 10400050736, 
11600012905, 40500012905, 12100079050, 12100103904, 20101006001, 
20600006785, 20600006806, 22424613003, 34200042689, 34200030689, 
40200014119, 40200044205, 40300060005, 20484610012, 10200002347, 
10200002494, 10200002584, 10200051553, 10300233561, 10400004860, 
10500005080, 11200001330, 40500001330, 11600016543, 40500016543, 
12100020408, 12100021025, 12100047309, 20101015520, 20101032522, 
20101089525, 20101104527, 20101122023, 20600006798, 20600006808, 
34200042248, 34200020248, 40100001076, 40100001077, 40300050730, 
10400050794, 20101035542, 20101045535, 20111018901, 40300050740, 
34200020737, 10300061548, 10300062011, 10400047179, 10400050776, 
10500000001, 11100070131, 12100046094, 10400050787, 20101014026, 
20101021530, 20101043527, 20424608001, 34200030559, 34200030675, 
40300050467, 50023000101, 50023000102, 50023000104, 50023000105, 
50023000107, 50023000108, 50023000109, 50023000111, 50023000112, 
50023000113, 50023000114, 50023000115, 50023000117, 50023000119, 
50023000120, 50023000121, 50023000123, 50023000124, 50023000125, 
50023000126, 50023000127, 50023000129, 40200064160, 10300023157, 
10400047256, 10400004658, 40600002024, 40600002090, 40200044179, 
10400050790, 11100051027, 10300054270, 40200064215, 20111019049, 
50023000122, 50023000134, 50023000137, 50023000138, 50023000139, 
20101002552, 11000090575, 12100013085, 12100013088, 12100021051, 
12100248133, 20101107522, 20101112529, 34000010768, 34200030877, 
40200044188, 40200064025, 40400039414, 50023000136, 50023000142, 
50023000143, 50023000147, 50023000153, 50023000155, 50023000158, 
50023000165, 50023000166, 50023000167, 50023000170, 20600005170, 
11300004623, 50023000181, 50023000150, 50023000202)), class = "data.frame", 
row.names = c(NA, -636L))

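First, to streamline the data loading, you can build the path for each file in a loop (since the names follow a clear pattern) and put the files directly into a list instead of assigning them to variables and then listing them: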

ds = list()
for(i in 1:10){
  ds[[i]] = read.csv2(paste("Feldkalender_", 2008+i, ".csv", sep=""),
                      header=T,stringsAsFactors = F)}

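With this loop it is already clear that you have 10 years, so you don't need to create a vector and measure its length. If you want one anyway, a simpler way is years = 2009:2018.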
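As a small sketch of the same idea without an explicit loop: paste0() is vectorized, so all ten file names can be built in one call. The helper name read_all is hypothetical and only illustrates how the paths might then be read; it assumes the files sit in the working directory.

```r
# Build all ten file names at once; paste0() is vectorized over `years`.
years <- 2009:2018
paths <- paste0("Feldkalender_", years, ".csv")

# Hypothetical helper: read every file in `paths` into a list of data frames.
# Nothing is read from disk until the function is actually called.
read_all <- function(paths) {
  lapply(paths, read.csv2, header = TRUE, stringsAsFactors = FALSE)
}

print(paths[1])
print(length(paths))
```

Calling ds <- read_all(paths) would then give a list with one data frame per year.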

As for your triple loop, you need to give us some data so that we can run the code. Copy and paste the output of dput(ds[[1]]) (the first item in the list) and dput(IDConverter).

Suppose your IDConverter looks like this

> IDConverter
   AUI.Betriebsnummer  AUI.ID
1                   a  lovely
2                   b    flap
3                   c    drum
4                   d    cars
5                   e  flight
6                   f pretend
7                   g    fuel
8                   h    self
9                   i    line
10                  j  letter

Then you can convert it into a named vector that uses AUI.Betriebsnummer as its names.

> IDConverter <- with(IDConverter, setNames(AUI.ID, AUI.Betriebsnummer))
> IDConverter
        a         b         c         d         e         f         g         h         i         j 
 "lovely"    "flap"    "drum"    "cars"  "flight" "pretend"    "fuel"    "self"    "line"  "letter" 

After this conversion, you can map one set of items directly onto another, for example,

unname(IDConverter[c("a", "b", "f")])

which gives you

> unname(IDConverter[c("a", "b", "f")])
[1] "lovely"  "flap"    "pretend"

Using this approach together with a few other optimizations, your code becomes

years <- c(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018)
IDConverter <- read.csv2("Zuweisung_BetrNr_AUI_ID.csv", header = T, stringsAsFactors = F)
IDConverter <- with(IDConverter, setNames(AUI.ID, AUI.Betriebsnummer))

dfs <- lapply(years, function(y) {
  df <- read.csv2(paste0("Feldkalender_", y, ".csv"), header = T, stringsAsFactors = F)
  within(df, ZA.Betriebsnr. <- unname(IDConverter[ZA.Betriebsnr.]))
})

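This should be more efficient than your old version, because each column is translated with a single vectorized lookup instead of a scan of the whole conversion table for every row.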
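To illustrate the difference on a toy scale, here is a minimal sketch with made-up values. The nested loop is a simplified element-by-element lookup written for this comparison, not the code from the question. It also shows one behaviour to be aware of: a key that is missing from the table comes back as NA from the named-vector lookup.

```r
# Toy lookup table (made-up values), already converted to a named vector.
lookup <- c(a = "lovely", b = "flap", c = "drum")

# Column to translate; "z" has no entry in the lookup table.
keys <- c("b", "a", "z", "c", "a")

# Element-by-element version: scan the whole table for every key.
by_loop <- rep(NA_character_, length(keys))
for (j in seq_along(keys)) {
  for (k in seq_along(lookup)) {
    if (keys[j] == names(lookup)[k]) by_loop[j] <- lookup[[k]]
  }
}

# Vectorized version: one indexing call for the whole column.
by_name <- unname(lookup[keys])

print(by_name)
print(identical(by_loop, by_name))
```

Both versions return the same values; the vectorized one simply leaves the searching to R's internal name matching.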

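Update

You get NAs because ZA.Betriebsnr. contains numbers, not characters. Indexing a named vector with numbers selects by position rather than by name, and positions that large do not exist. You just need to convert the column to a character vector. Do something like this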

dfs <- lapply(years, function(y) {
  df <- read.csv2(paste0("Feldkalender_", y, ".csv"), header = T, stringsAsFactors = F)
  within(df, ZA.Betriebsnr. <- unname(IDConverter[as.character(ZA.Betriebsnr.)]))
})
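The effect can be reproduced with a small made-up table, assuming the keys are read in as numeric, as in the dput output of the question. The last line uses match(), which is a different technique from the named vector above and is shown only as a possible alternative that avoids the conversion to character.

```r
# Made-up conversion table with numeric keys.
conv <- data.frame(
  AUI.Betriebsnummer = c(10100001151, 10100001248, 50023000140),
  AUI.ID = c(1L, 2L, 5L)
)
lookup <- with(conv, setNames(AUI.ID, AUI.Betriebsnummer))

ids <- c(50023000140, 10100001151)

# Numeric index: treated as positions far beyond length(lookup), hence NA.
by_position <- unname(lookup[ids])

# Character index: matched against names(lookup).
by_name <- unname(lookup[as.character(ids)])

# Alternative without a named vector: match() compares the values directly.
by_match <- conv$AUI.ID[match(ids, conv$AUI.Betriebsnummer)]

print(by_position)
print(by_name)
print(by_match)
```

With match(), keys that are absent from the table also yield NA, so both variants treat unmatched rows the same way.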