How to turn multiple tables in a text file into one table with an additional column?
My text file "myfile.txt" contains many tables with the same columns (name, age, weight, profession). It looks like:
table_ID 001
John | 38 | 165 | Computer scientist
Mary | 22 | 122 | Student
table_ID 002
Patric| 44 | 105 | Teacher
Kim | 56 | 155 | Salesman
Kate | 33 | 133 | Student
...
table_ID 100
Peter| 44 | 105 | Teacher
Han | 56 | 155 | Salesman
Ken | 33 | 133 | Student
I want to output a data.frame with an additional column ("table_ID"), which looks like:
table_ID name age weight profession
001 John 38 165 Computer scientist
001 Mary 22 122 Student
002 Patric 44 105 Teacher
002 Kim 56 155 Salesman
002 Kate 33 133 Student
...
100 Peter 44 105 Teacher
100 Han 56 155 Salesman
100 Ken 33 133 Student
How can I do this in R? Thanks a lot.
You can try:
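library(dplyr)
lines <- readLines('myfile.txt')
# flag the 'table_ID ...' header lines and start a new group at each one
indx <- grepl('table_ID', lines)
lst <- split(lines, cumsum(indx))
# name each group by its ID (e.g. '001'); note the escaped backslash in '\\D+'
names(lst) <- sub('\\D+', '', sapply(lst, `[`, 1))
# parse each table body and stack them, keeping the list names as table_ID;
# bind_rows(.id=) stands in for unnest(list, table_ID), which recent tidyr versions may reject
res <- bind_rows(lapply(lst, function(x)
read.table(text=x[-1], header=FALSE, sep="|", strip.white=TRUE,
col.names=c("name", "age", "weight", "profession"))), .id="table_ID")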
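If you would rather avoid extra packages, the same idea (find the header lines, carry each ID down to the rows beneath it, parse the rest) can be sketched in base R. This is only a sketch under a few assumptions: the file starts with a table_ID line, every data row has exactly four "|"-separated fields, and the inline `lines` vector is a made-up stand-in for `readLines('myfile.txt')`. The names `is_header` and `ids` are mine, not from the question.

```r
# Hypothetical sample standing in for readLines('myfile.txt')
lines <- c(
  "table_ID 001",
  "John | 38 | 165 | Computer scientist",
  "Mary | 22 | 122 | Student",
  "table_ID 002",
  "Patric| 44 | 105 | Teacher",
  "Kim | 56 | 155 | Salesman",
  "Kate | 33 | 133 | Student"
)

# TRUE for each 'table_ID ...' header line
is_header <- grepl("^table_ID", lines)

# Extract the IDs from the headers, then repeat each ID for every
# line that belongs to its table (cumsum gives the table number per line).
# Assumes the first line is a header, so cumsum never yields 0.
ids <- sub("\\D+", "", lines[is_header])[cumsum(is_header)]

# Parse only the data rows; strip.white trims the padding around '|'
res <- read.table(text = lines[!is_header], sep = "|", strip.white = TRUE,
                  col.names = c("name", "age", "weight", "profession"),
                  stringsAsFactors = FALSE)

# Prepend the ID column, kept as character so leading zeros survive
res <- cbind(table_ID = ids[!is_header], res, stringsAsFactors = FALSE)
print(res)
```

Keeping table_ID as a character column is deliberate: converting it to a number would turn "001" into 1.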