In R, how do I selectively 'copy and paste' a cell into another cell based on specific row/column criteria?
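I have some data (see below) in which my participants (the ID column) have scores on three variables (Name_A, Name_B and Name_C). These scores are currently recorded against the relevant variable level in the X1Score, X2Score and X3Score columns. I want these scores 'copied and pasted' (for lack of a better phrase) into the relevant columns, Name_A, Name_B and Name_C (currently filled with NA), so that I have my data in long format. How do I do this?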
ID X1 X1Score X2 X2Score X3 X3Score Name_A Name_B Name_C
1 Name_A 4.58 Name_C 4.79 Name_B 5.22 NA NA NA
2 Name_C 5.35 Name_B 5.33 Name_A 5.61 NA NA NA
3 Name_B 5.59 Name_C 5.48 Name_A 4.89 NA NA NA
4 Name_C 5.36 Name_B 5.04 Name_A 4.93 NA NA NA
5 Name_A 5.39 Name_B 5.27 Name_C 5.11 NA NA NA
6 Name_C 4.91 Name_A 4.99 Name_B 5.01 NA NA NA
df <- structure(list(ID = 1:6,
X1 = c("Name_A", "Name_C", "Name_B", "Name_C", "Name_A", "Name_C"),
X1Score = c(4.58, 5.35, 5.59, 5.36, 5.39, 4.91),
X2 = c("Name_C", "Name_B", "Name_C", "Name_B", "Name_B", "Name_A"),
X2Score = c(4.79, 5.33, 5.48, 5.04, 5.27, 4.99),
X3 = c("Name_B", "Name_A", "Name_A", "Name_A", "Name_C", "Name_B"),
X3Score = c(5.22, 5.61, 4.89, 4.93, 5.11, 5.01),
Name_A = c(NA, NA, NA, NA, NA, NA),
Name_B = c(NA, NA, NA, NA, NA, NA),
Name_C = c(NA, NA, NA, NA, NA, NA)),
row.names = c(NA, -6L), class = "data.frame")
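Edit: My original request above was too simple, and although the answer technically solved it, I could not work out how to generalize it. So here is a revised example, where the only major difference is the naming convention of the columns. Using the same code as for the example above, this example produces an error. I hope that with another example of my problem I will be able to understand the 'X\\d+(.*)' line, as it appears to be the key to making this work. Here is the updated example: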
df <- structure(list(ID = 1:6,
X1_Name = c("Name_A", "Name_C", "Name_B", "Name_C", "Name_A", "Name_C"),
X1_Score = c(4.58, 5.35, 5.59, 5.36, 5.39, 4.91),
X5_Name = c("Name_C", "Name_B", "Name_C", "Name_B", "Name_B", "Name_A"),
X5_Score = c(4.79, 5.33, 5.48, 5.04, 5.27, 4.99),
X19_Name = c("Name_B", "Name_A", "Name_A", "Name_A", "Name_C", "Name_B"),
X19_Score = c(5.22, 5.61, 4.89, 4.93, 5.11, 5.01)),
row.names = c(NA, -6L), class = "data.frame")
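# This call errors on the renamed columns: the captured group is now
# "_Name" / "_Score", so pivot_wider() finds no Name or Score column.
df %>%
  # Reshape to long format, creating two columns from the captured suffix
  pivot_longer(cols = -ID,
               names_to = '.value',
               names_pattern = 'X\\d+(.*)') %>%
  # Reshape back to wide format
  pivot_wider(names_from = Name, values_from = Score)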
You can reshape the data with pivot_longer/pivot_wider:
library(dplyr)
library(tidyr)

df %>%
  # Drop the empty NA columns
  select(-starts_with('Name')) %>%
  # Rename X1 to X1Name, X2 to X2Name and so on
  rename_with(~paste0(., 'Name'), matches('^X\\d+$')) %>%
  # Reshape to long format, creating two columns: Name and Score
  pivot_longer(cols = -ID,
               names_to = '.value',
               names_pattern = 'X\\d+(.*)') %>%
  # Reshape back to wide format, one column per name
  pivot_wider(names_from = Name, values_from = Score)
# ID Name_A Name_C Name_B
# <int> <dbl> <dbl> <dbl>
#1 1 4.58 4.79 5.22
#2 2 5.61 5.35 5.33
#3 3 4.89 5.48 5.59
#4 4 4.93 5.36 5.04
#5 5 5.39 5.11 5.27
#6 6 4.99 4.91 5.01
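Since the question asks what the 'X\\d+(.*)' line does, here is a rough sketch of the idea using base R's sub() on the column names alone. It is not how tidyr implements names_pattern internally; the vectors old_names and new_names are made up for illustration, and the ^ and $ anchors are only there so that sub() replaces the whole name. The pattern reads as: a literal X, one or more digits, then capture whatever is left. With names_to = '.value', the captured text becomes the name of the output column.

```r
# Column names from the first example (after the X1 -> X1Name rename step)
old_names <- c("X1Name", "X1Score", "X2Name", "X2Score", "X3Name", "X3Score")
# Column names from the edited example in the question
new_names <- c("X1_Name", "X1_Score", "X5_Name", "X5_Score", "X19_Name", "X19_Score")

# Keep only the captured group, i.e. everything after "X<digits>"
captured_old <- sub("^X\\d+(.*)$", "\\1", old_names)
captured_new <- sub("^X\\d+(.*)$", "\\1", new_names)

unique(captured_old)
# "Name"  "Score"
unique(captured_new)
# "_Name" "_Score"   <- the underscore is captured too

# Moving the underscore outside the parentheses drops it from the capture
captured_fixed <- sub("^X\\d+_(.*)$", "\\1", new_names)
unique(captured_fixed)
# "Name"  "Score"
```

So with the new naming convention the same pattern yields columns called _Name and _Score, which is why a later step that refers to Name and Score cannot find them.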
If you want to keep all the other columns in the data and simply add these 3 columns, you can join the result back to the original dataset.
...Code from above %>%
left_join(df %>% select(-starts_with('Name')), by = 'ID')
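For the renamed columns in the question's edit (X1_Name, X1_Score, and so on), a minimal sketch follows. It assumes the only change needed is an underscore before the capture group, so that the captured text is again Name and Score. The object names df2, wide and result are made up for this example, and since df2 has no empty Name_* columns, the select() and rename_with() steps are left out.

```r
library(dplyr)
library(tidyr)

# The updated example data from the question, under a hypothetical name
df2 <- data.frame(
  ID = 1:6,
  X1_Name   = c("Name_A", "Name_C", "Name_B", "Name_C", "Name_A", "Name_C"),
  X1_Score  = c(4.58, 5.35, 5.59, 5.36, 5.39, 4.91),
  X5_Name   = c("Name_C", "Name_B", "Name_C", "Name_B", "Name_B", "Name_A"),
  X5_Score  = c(4.79, 5.33, 5.48, 5.04, 5.27, 4.99),
  X19_Name  = c("Name_B", "Name_A", "Name_A", "Name_A", "Name_C", "Name_B"),
  X19_Score = c(5.22, 5.61, 4.89, 4.93, 5.11, 5.01)
)

wide <- df2 %>%
  # The underscore sits outside the parentheses, so the captured
  # group is "Name" / "Score" rather than "_Name" / "_Score"
  pivot_longer(cols = -ID,
               names_to = '.value',
               names_pattern = 'X\\d+_(.*)') %>%
  # One column per distinct value of Name, filled with Score
  pivot_wider(names_from = Name, values_from = Score)

# Keep the original columns alongside the three new ones
result <- wide %>%
  left_join(df2, by = 'ID')
```

Here wide has one row per ID with the columns Name_A, Name_C and Name_B, and result adds the six original X* columns next to them.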