Loop over a data frame and create new rows

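I have a dataset with two columns, AccountName and AccountNumber. It has 35 rows. I want to create a new data frame with AccountName, AccountNumber, and LocationNumber. LocationNumber is stored in another data frame that has 1 column and 350 rows.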

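So basically, for each account name and number, and for each location number, I want to add a row containing account name + account number + location number. With 35 accounts and 350 locations, the end goal is 35 × 350 = 12,250 rows. I have tried using a for loop, to no avail.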

Accounts (name | number):

STR EXP-VACATION ESTIMATE-0200900   200900
STR EXP-HOLIDAY PAY-0200920 200920
STR EXP-SICK PAY-0200930    200930
STR EXP-MISC TIME PAID,NOT WORKED-0200990   200990

Locations:

Lo.702-002
Lo.702-003
Lo.702-004
Lo.702-005

End result for each account number:

STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-002
STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-003
STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-004
STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-005

PHP code that would produce the result I want:

foreach($accounts as $name => $number) {
    foreach($locations as $location) {
        echo sprintf("%s,%s,%s\n", $name, $number, $location);
    }
}

My solution:

acc.run <- function() {
  locFileName <- 'location-list.csv'
  accFileName <- 'account-list.csv'

  locations <- read.csv(locFileName, sep=',', quote='\"', header=TRUE)

  accounts <- read.csv(accFileName, sep=',', quote='\"', header=TRUE)

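  # Add row numbers so the original account order can be restored after merging
  accounts$rowNum <- seq_len(nrow(accounts))

  # With no column names in common, merge() returns the Cartesian product
  merged <- merge(accounts, locations)

  # Group the rows by account, in the original account order
  sorted <- merged[order(merged$rowNum), ]

  # Drop the helper column again
  final <- sorted[, !(names(sorted) %in% c('rowNum'))]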

  # Random numeric suffix so that an existing output file is unlikely to be overwritten
  rExt <- paste(round(runif(6, 10, 100)), collapse='')

  write.csv(final, paste0('accounts-concat', rExt, '.csv'), row.names=FALSE)
}

How can I improve it?

This is an edited version of my original answer, revised to include your test data. Does this suit your needs?

# Generate some usable test data
accounts <- read.csv(text = "
AccountName|AccountNumber
STR EXP-VACATION ESTIMATE-0200900|200900
STR EXP-HOLIDAY PAY-0200920|200920
STR EXP-SICK PAY-0200930|200930
STR EXP-MISC TIME PAID,NOT WORKED-0200990|200990
", sep = "|")

locations <- read.table(header = TRUE, text = "
Location
Lo.702-002
Lo.702-003
Lo.702-004
Lo.702-005
")$Location

# Combine the data into wide format
df <- cbind(accounts, locations = t(locations))
# Restructure the data in long format
reshape(df, varying = grep("locations", names(df)), direction = "long" )
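Note that reshape() in long direction also adds `time` and `id` columns to the result. If only the three columns from the question are wanted, a cross join may be a shorter route. The following is a minimal sketch, not a definitive implementation: it assumes the locations are kept as a one-column data frame named `Location` (as in the test data above), it rebuilds the test data with data.frame() so the block stands on its own, and `crossed` is just an illustrative name.

```r
# Test data, rebuilt directly so that this block is self-contained
accounts <- data.frame(
  AccountName = c("STR EXP-VACATION ESTIMATE-0200900",
                  "STR EXP-HOLIDAY PAY-0200920",
                  "STR EXP-SICK PAY-0200930",
                  "STR EXP-MISC TIME PAID,NOT WORKED-0200990"),
  AccountNumber = c(200900, 200920, 200930, 200990),
  stringsAsFactors = FALSE
)

locations <- data.frame(
  Location = c("Lo.702-002", "Lo.702-003", "Lo.702-004", "Lo.702-005"),
  stringsAsFactors = FALSE
)

# by = NULL asks merge() for the Cartesian product:
# every account row is paired with every location row
crossed <- merge(accounts, locations, by = NULL)

# Group the rows by account (in the original account order),
# then by location (in the original location order)
crossed <- crossed[order(match(crossed$AccountNumber, accounts$AccountNumber),
                         match(crossed$Location, locations$Location)), ]
rownames(crossed) <- NULL

print(crossed)
```

With the real files (35 accounts and 350 locations) the same call should give 35 × 350 = 12,250 rows, and the result can be passed to write.csv(..., row.names = FALSE) as in the question.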