Data transformation from columns to rows in R
I have a data frame like this:
1954 <- c(a,b,c,d)#names of a person
X2 <- c(5,6,1,2)#their score
1955 <- c(e,f,g,h)
X3 <- c(2,4,6,9)
1956 <- c(j,k,l,m)
X4 <- c(1,3,6,8)
Girls <- data.frame(1954,X2,1955,X3,1956,X4)
The Girls data frame looks like this:
1954 X2 1955 X3 1956 X4 . . . . . . . n
a 5 e 2 j 1 . . . . . . . n
b 6 f 4 k 3 . . . . . . . . n
c 1 g 6 l 6 . . . . . . . . .n
d 2 h 9 m 8 . . . . . . . . . n
I would like the data frame to look like this:
`Name score year(#new col)
a 5 1954
b 6 1954
c 1 1954
d 2 1954
e 2 1955
f 4 1955
g 6 1955
h 9 1955
j 1 1956
k 3 1956
l 6 1956
m 8 1956
. . .
. . .
n n n`
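This is for a school project and I am struggling to transform the data. Could someone help me with this?
I had to make some changes to your code, because column names cannot start with a digit unless they are quoted with backticks. But this should do it: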
X1954 <- c("a","b","c","d")#names of a person
X2 <- c(5,6,1,2)#their score
X1955 <- c("e","f","g","h")
X3 <- c(2,4,6,9)
X1956 <- c("j","k","l","m")
X4 <- c(1,3,6,8)
Girls <- data.frame(X1954,X2,X1955,X3,X1956,X4, stringsAsFactors = FALSE)
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)
Girls %>%
  as_tibble() %>%
  gather(key = "year", value = "Name", X1954, X1955, X1956) %>%
  mutate(key = paste0(year, Name)) %>%
  gather(key = "key", value = "score", X2, X3, X4) %>%
  select(-key) %>%
  mutate(year = str_extract(year, "[:digit:]+$"))
#> # A tibble: 36 x 3
#> year Name score
#> <chr> <chr> <dbl>
#> 1 1954 a 5
#> 2 1954 b 6
#> 3 1954 c 1
#> 4 1954 d 2
#> 5 1955 e 5
#> 6 1955 f 6
#> 7 1955 g 1
#> 8 1955 h 2
#> 9 1956 j 5
#> 10 1956 k 6
#> # … with 26 more rows
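Note that the output above has 36 rows rather than 12, and `e` is paired with 5 instead of 2: the second `gather()` repeats every name once per score column. A minimal sketch of one way to keep each name with its own score, assuming the name and score columns alternate in the same left-to-right order (`names_long`, `scores_long`, `score_col` and `result` are illustrative names, not part of the original answer):

```r
library(tidyr)
library(dplyr)

Girls <- data.frame(
  X1954 = c("a", "b", "c", "d"), X2 = c(5, 6, 1, 2),
  X1955 = c("e", "f", "g", "h"), X3 = c(2, 4, 6, 9),
  X1956 = c("j", "k", "l", "m"), X4 = c(1, 3, 6, 8),
  stringsAsFactors = FALSE
)

# Stack the name columns; `year` records which column each name came from.
names_long <- Girls %>%
  gather(key = "year", value = "Name", X1954, X1955, X1956) %>%
  select(year, Name)

# Stack the score columns in the same left-to-right order (assumption:
# X2 belongs to X1954, X3 to X1955, X4 to X1956).
scores_long <- Girls %>%
  gather(key = "score_col", value = "score", X2, X3, X4) %>%
  select(score)

# Both stacks have 12 rows in matching order, so they can be bound side by side.
result <- bind_cols(names_long, scores_long) %>%
  mutate(year = sub("^X", "", year)) %>%
  select(Name, score, year)

print(result)
```

This relies purely on column order, so it is worth checking a few rows against the wide data before trusting it on the full data set.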
Good luck with your school project!
Created on 2019-02-09 by the reprex package (v0.2.1)
Without any extra packages, you can do this:
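setNames(
  cbind(
    # stack the name columns (those whose names contain four digits)
    stack(Girls[, grep("\\d{4}", names(Girls))]),
    # stack the score columns (X2, X3, ...) and keep only the values
    stack(Girls[, grep("^X", names(Girls))])[, 1, drop = FALSE]
  ),
  c("Name", "Year", "Score")
)
Output: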
Name Year Score
1 a 1954 5
2 b 1954 6
3 c 1954 1
4 d 1954 2
5 e 1955 2
6 f 1955 4
7 g 1955 6
8 h 1955 9
9 j 1956 1
10 k 1956 3
11 l 1956 6
12 m 1956 8
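Note that this requires some changes to the code you used to create the example, since you cannot use numbers directly as column names (they need to be inside backticks, and the letters need to be quoted).
The correct code is: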
`1954` <- c("a","b","c","d")
X2 <- c(5,6,1,2)
`1955` <- c("e","f","g","h")
X3 <- c(2,4,6,9)
`1956` <- c("j","k","l","m")
X4 <- c(1,3,6,8)
Girls <- data.frame(`1954`,X2,`1955`,X3,`1956`,X4,
stringsAsFactors = FALSE, check.names = FALSE)
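To illustrate why `check.names = FALSE` is needed here, the following small sketch compares the column names `data.frame()` produces with and without it. The object names `checked` and `unchecked` are only for illustration:

```r
# By default, data.frame() passes column names through make.names(),
# which prefixes names that start with a digit with "X".
checked <- data.frame(`1954` = c("a", "b"), stringsAsFactors = FALSE)

# With check.names = FALSE the non-syntactic name is kept as given.
unchecked <- data.frame(`1954` = c("a", "b"),
                        stringsAsFactors = FALSE, check.names = FALSE)

print(names(checked))    # "X1954"
print(names(unchecked))  # "1954"

# Non-syntactic columns must then be accessed with backticks or [[ ]].
print(unchecked$`1954`)
```

With the default `check.names = TRUE`, the pattern `"^X"` in the `grep()` call above would also match the year columns, so the two `stack()` calls would no longer separate names from scores.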