How can I transpose columns at the end of my data frame?
I have a data frame named data that looks like this:
Record Plot Row Column Cp Csp Entry Year Location Genotype STD V1 V2 V3 W1 W2 W3
521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 2 1 3 1 4 5 4
284 102 1 1 0 0 284 2019 Preston BxN-RIL-359-4 3 3 3 3 4 6 5
673 103 1 1 0 0 673 2019 Preston NxB-RIL-374-22 3 3 3 3 5 6 7
40 104 1 1 0 0 40 2019 Preston BxN-RIL-347-19 2 2 2 1 3 4 1
715 105 1 1 1 0 715 2019 Preston NorLin 3 2 3 2 3 5 0 0
108 106 1 1 0 0 108 2019 Preston BxN-RIL-351-2 2 2 3 2 5 6 5
456 107 1 1 0 0 456 2019 Preston NxB-RIL-365-18 2 2 4 3 0 3 2
What I want to do is reshape it into this:
Record Plot Row Column Cp Csp Entry Year Location Genotype Param Value
521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 STD 2
522 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 V1 2
523 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 V2 1
524 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 V3 3
525 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 W1 1
526 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 W2 4
527 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 W3 5
What I tried was splitting the data frame:
col_idx <- grep("Genotype", names(data))
val_start <- col_idx + 1
val_end<-length(data) #last attribute column
d1 <- data[1:col_idx]
d2 <- data[val_start:val_end]
headerd2<-names( d2 )
and then iterating over it to "reassemble" it:
for (colnum in 1:length(headerd2)) #colnum=2
{
for (row in 1:nrow(d1))
{
d1$Param <- paste(gsub(" ","_",headerd2[colnum]), sep="")#create an environment attribute in the dataframe
d1$Value <- paste(gsub(" ","_",d2[row, colnum]), sep="")#create an environment attribute in the dataframe
write.table(d1, DataFilenameConverted, sep = ",", col.names = !file.exists(DataFilenameConverted), append = T)
}
}
I also tried:
library(reshape2)
d4 <-recast(data, Genotype + variable ~ Genotype, id.var = c("Record", "Plot", "Row", "Column", "Cp", "Csp", "Entry", "Year" , "Location", "Genotype"))
Neither worked. Any suggestions on how to convert this data?
tidyr::pivot_longer(dat, STD:W3, names_to = "Param", values_to = "Value")
# # A tibble: 49 x 12
# Record Plot Row Column Cp Csp Entry Year Location Genotype Param Value
# <int> <int> <int> <int> <int> <int> <int> <int> <chr> <chr> <chr> <int>
# 1 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 STD 2
# 2 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 V1 1
# 3 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 V2 3
# 4 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 V3 1
# 5 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 W1 4
# 6 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 W2 5
# 7 521 101 1 1 0 0 521 2019 Preston NxB-RIL-368-16 W3 4
# 8 284 102 1 1 0 0 284 2019 Preston BxN-RIL-359-4 STD 3
# 9 284 102 1 1 0 0 284 2019 Preston BxN-RIL-359-4 V1 3
# 10 284 102 1 1 0 0 284 2019 Preston BxN-RIL-359-4 V2 3
# # ... with 39 more rows
Data
dat <- structure(list(Record = c(521L, 284L, 673L, 40L, 715L, 108L, 456L), Plot = 101:107, Row = c(1L, 1L, 1L, 1L, 1L, 1L, 1L), Column = c(1L, 1L, 1L, 1L, 1L, 1L, 1L), Cp = c(0L, 0L, 0L, 0L, 1L, 0L, 0L), Csp = c(0L, 0L, 0L, 0L, 0L, 0L, 0L), Entry = c(521L, 284L, 673L, 40L, 715L, 108L, 456L), Year = c(2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L), Location = c("Preston", "Preston", "Preston", "Preston", "Preston", "Preston", "Preston"), Genotype = c("NxB-RIL-368-16", "BxN-RIL-359-4", "NxB-RIL-374-22", "BxN-RIL-347-19", "NorLin-3", "BxN-RIL-351-2", "NxB-RIL-365-18"), STD = c(2L, 3L, 3L, 2L, 2L, 2L, 2L), V1 = c(1L, 3L, 3L, 2L, 3L, 2L, 2L), V2 = c(3L, 3L, 3L, 2L, 2L, 3L, 4L), V3 = c(1L, 3L, 3L, 1L, 3L, 2L, 3L ), W1 = c(4L, 4L, 5L, 3L, 5L, 5L, 0L), W2 = c(5L, 6L, 6L, 4L, 0L, 6L, 3L), W3 = c(4L, 5L, 7L, 1L, 0L, 5L, 2L)), class = "data.frame", row.names = c(NA, -7L))
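If adding a tidyr dependency is not an option, the same wide-to-long reshape can be sketched in base R. This is only an illustration, not part of the answer above: it assumes, as in the question, that every column after Genotype is a measurement column, and it uses a two-row, trimmed-down copy of dat to stay short. The names last_id, id_cols, val_cols, and long are hypothetical.

```r
# Two rows of the example data, trimmed to a few ID columns for brevity
dat <- data.frame(
  Record   = c(521L, 284L),
  Plot     = c(101L, 102L),
  Genotype = c("NxB-RIL-368-16", "BxN-RIL-359-4"),
  STD = c(2L, 3L), V1 = c(1L, 3L), V2 = c(3L, 3L), V3 = c(1L, 3L),
  W1  = c(4L, 4L), W2 = c(5L, 6L), W3 = c(4L, 5L),
  stringsAsFactors = FALSE
)

# Assumption: columns up to and including Genotype identify the plot;
# everything after Genotype is a measurement to be stacked.
last_id  <- match("Genotype", names(dat))
id_cols  <- names(dat)[seq_len(last_id)]
val_cols <- names(dat)[-seq_len(last_id)]

# Repeat each ID row once per measurement column, then attach the
# measurement name and its value. t() puts the values in row-by-row order,
# so they line up with the repeated ID rows.
long <- data.frame(
  dat[rep(seq_len(nrow(dat)), each = length(val_cols)), id_cols],
  Param = rep(val_cols, times = nrow(dat)),
  Value = as.vector(t(dat[val_cols])),
  stringsAsFactors = FALSE
)
rownames(long) <- NULL  # drop the "1", "1.1", ... names left by row repetition

print(long)
```

Because the long table is built in one step, it can be written out with a single write.csv(long, ...) call instead of appending to the file inside a loop.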