R: Reformatting data file

Question

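I suspect this is a simple data reformatting problem. The data file (txt) is structured with the observation number on a separate line from its values,

The expected output is,

Any suggestions on how to do the conversion would be much appreciated. Thanks.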

Answer 1

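We can read the file with readLines, create an index variable, and split 'lines' by it. Then drop the first element of each list element (the observation number), read the rest with read.table, and unnest.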

 lines <- readLines('file.txt')
 library(stringr)
 # remove leading/trailing spaces, if any
 lines <- str_trim(lines)
 # flag the lines with no inner white space (the observation numbers)
 indx  <- !grepl('\\s+', lines)
 # cumsum the above index to create the grouping
 indx1 <- cumsum(indx)
 # split the lines by group and name the list elements after the observation numbers
 lst <- setNames(split(lines, indx1), lines[indx])
 # use unnest after reading each group with read.table
 library(tidyr)
 unnest(lapply(lst, function(x) read.table(text=x[-1])), gr)
 #   gr V1 V2
 #1  1 45 65
 #2  1 78 56
 #3  2 89 34
 #4  2 39 55
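
To make the indexing step easier to follow, here is a minimal sketch that prints the intermediate values. The input vector is hypothetical: the original file is not shown, so it is reconstructed from the output above and stands in for the result of readLines.

```r
# Hypothetical input, reconstructed from the output shown above (an assumption)
lines <- c("1", "45 65", "78 56", "2", "89 34", "39 55")

# TRUE where a line holds only an observation number (no inner white space)
indx <- !grepl("\\s+", lines)
print(indx)   # TRUE FALSE FALSE TRUE FALSE FALSE

# Running count of the header lines assigns every line to its group
indx1 <- cumsum(indx)
print(indx1)  # 1 1 1 2 2 2

# One list element per group, named after the observation number
lst <- setNames(split(lines, indx1), lines[indx])
str(lst)
```

Each list element still starts with its observation number, which is why the answer passes x[-1] to read.table.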

Or we can use Map in a base R approach

 do.call(rbind,Map(cbind, gr=names(lst), 
             lapply(lst, function(x) read.table(text=x[-1]))))
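
The base R route can be tried end to end without a file or extra packages, which may also help if the installed tidyr does not accept a plain list in unnest. This is only a sketch: the input vector is a hypothetical stand-in for the file contents, and the names parsed and res are introduced here for illustration.

```r
# Hypothetical input standing in for readLines('file.txt') (an assumption)
lines <- c("1", "45 65", "78 56", "2", "89 34", "39 55")

# Group the lines by observation number, as in the answer
indx <- !grepl("\\s+", lines)
lst <- setNames(split(lines, cumsum(indx)), lines[indx])

# Parse the value lines of each group into a small data frame
parsed <- lapply(lst, function(x) read.table(text = x[-1]))

# Attach the group label to each data frame, then stack them
res <- do.call(rbind, Map(cbind, gr = names(lst), parsed))

# rbind builds row names from the list names; reset them to 1..n
rownames(res) <- NULL
print(res)
```

Resetting the row names is optional and only makes the result match the numbering in the tidyr output.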

